[Mono-list] Re: [Mono-devel-list] String::GetHashCode speedup

Jonathan Gilbert 2a5gjx302@sneakemail.com
Wed, 25 Feb 2004 21:57:09


At 11:03 PM 24/02/2004 -0500, Ben wrote:
>Hey guys,
>
>I transformed String.GetHashCode into a managed function. It works
>fairly well, even for somewhat large strings:
[snip]
>So it appears the break-even point here is at ~ 38 chars.
[snip]

I have an idea: why not keep both implementations?

class System.String
{
  .
  .
  public override int GetHashCode()
  {
    if (Length >= 38) // the property can probably be bypassed here
      return icall_GetHashCode(); // long strings: native speed outweighs icall overhead

    // insert managed implementation (faster below the break-even point)
  }
  .
  .
}

The break-even point can be re-measured and tuned periodically. That way,
people who call GetHashCode on extremely large strings won't get
exceptionally slow code just because they are in the minority, and hash
tables with short string keys will still perform well.
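
As a rough, self-contained sketch of that dispatch outside of System.String
(the names HybridHash, ManagedHash and NativeHash are made up for
illustration, NativeHash is only a managed stand-in for the icall, and the
hash function is a placeholder rather than Mono's actual algorithm), assuming
the managed path wins below the break-even point and the icall above it:

```csharp
using System;

static class HybridHash
{
    // Break-even length taken from Ben's measurement (~38 chars);
    // it is expected to be re-tuned as the runtime changes.
    public const int BreakEven = 38;

    // Placeholder for the managed implementation (illustrative algorithm).
    public static int ManagedHash(string s)
    {
        int h = 0;
        unchecked
        {
            for (int i = 0; i < s.Length; i++)
                h = h * 31 + s[i];
        }
        return h;
    }

    // Stand-in for the icall. In the runtime this would be native code;
    // here it is just a second, independently written copy of the algorithm.
    public static int NativeHash(string s)
    {
        int h = 0;
        unchecked
        {
            foreach (char c in s)
                h = (h << 5) - h + c; // same value as h * 31 + c
        }
        return h;
    }

    // Short strings avoid the call overhead; long strings take the
    // (assumed) faster native path.
    public static int Hash(string s)
    {
        return s.Length < BreakEven ? ManagedHash(s) : NativeHash(s);
    }
}
```

Callers only ever see Hash, so the threshold can change without affecting
them, provided both paths return identical values for every input.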

While it is true that this would require two independent implementations of
the same hash algorithm to be kept in sync, it would give both short and
long strings the faster of the two code paths.
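
One way to guard against the two copies drifting apart would be a small
agreement check run as part of the test suite. The following is only a
sketch: HashSyncCheck and FirstMismatch are hypothetical names, and the two
implementations are passed in as delegates so the helper makes no
assumptions about how either one is written.

```csharp
using System;

static class HashSyncCheck
{
    // Returns the first string length (0..maxLength) at which the two
    // implementations disagree, or -1 if they agree on every sample.
    public static int FirstMismatch(Func<string, int> a,
                                    Func<string, int> b,
                                    int maxLength)
    {
        var rng = new Random(12345); // fixed seed so failures are reproducible
        for (int len = 0; len <= maxLength; len++)
        {
            var chars = new char[len];
            for (int i = 0; i < len; i++)
                chars[i] = (char)rng.Next(0x20, 0x7F); // printable ASCII sample
            string s = new string(chars);
            if (a(s) != b(s))
                return len;
        }
        return -1;
    }
}
```

Running it over lengths that straddle the break-even point exercises both
code paths; it samples one string per length, so it can catch drift but
cannot prove the two implementations equivalent.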

The same argument applies to the copy operation that was also under
discussion: the icall overhead may be high, but from a certain length up,
the speed of the hand-optimized memcpy function outweighs that overhead. If
the string copy operation is going to be called every time someone appends
to a string (e.g. via String.Concat or StringBuilder.Append), I can
certainly see a lot of cases where the speed would improve substantially
(e.g. people who don't realize that 'char' is a 16-bit type, not an 8-bit
integer, using a System.String as a buffer for incoming network data --
where have we seen this before? :-).
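
The same cut-over could be sketched for the copy as follows. HybridCopy,
CopyChars and the BulkThreshold value are all made up for illustration (the
real threshold would have to be measured), and Buffer.BlockCopy merely
stands in for the native memcpy path:

```csharp
using System;

static class HybridCopy
{
    // Hypothetical cut-over length; not a measured value.
    public const int BulkThreshold = 32;

    public static void CopyChars(char[] src, char[] dst, int count)
    {
        if (count < BulkThreshold)
        {
            // Short copies: a plain loop avoids the call overhead.
            for (int i = 0; i < count; i++)
                dst[i] = src[i];
        }
        else
        {
            // BlockCopy counts bytes, and a .NET char is a 16-bit
            // UTF-16 code unit, hence the multiplication by sizeof(char).
            Buffer.BlockCopy(src, 0, dst, 0, count * sizeof(char));
        }
    }
}
```

The sizeof(char) factor is also the point of the network-buffer remark: a
string used as a byte buffer spends two bytes per element.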

Jonathan