[Mono-list] Re: We have an implementation clash.

Serge serge@wildwestsoftware.com
Mon, 16 Jul 2001 14:31:53 +0400

Hi there,

> We should avoid to duplicate effort in the future.
Sorry for causing the collision, but when I started coding the Hashtable I
had no idea whether I will ever finish it or not. It so happened that it
took only several hours this weekend to code. So I think it's not a
collision at all, I was doing this for fun :-) Moreover, I think this will
be even useful to compare/merge implementations.

Well, to the point.
I checked John's implementation and ran it through my test code. Here is
what I have found out:
A tiny fix was applied to the code. It is necessary to cast the value
returned by GetHash(key) to uint, since it's possible that it'll be
negative. For instance "key-13243".GetHashCode() == -1177129432. This may
cause OverflowException or IndexOutOfRangeException to be thrown later in
the code. So, for example, index calculation will look like this:

int index = (int)((uint)GetHash(newEntry.Key) % newCapacity);
At first, I falled into the same trap with my own code as well.

The speed of both implementations is roughly the same. It also matches the
speed of Beta2 implementation. This latter is slightly faster than both our
implementations, but the difference is rather insignificant. Basically, this
proves that John's linked list implementation is not worse than my hack.
Initially, I thought the linked list should be slower. Of course, the test
is rough, the better one is needed. Also, it would be great to test the
memory consumption.

JB> There's a threadsafe wrapper in my implementation that might at least
Great! BTW, is there any ongoing work on System.Threading classes?

JB> Sergey, why generate the primes? Unless I've completely missed
JB> something, there are only 30 primes between 2 and Int32.MaxValue
JB> that meet necessary
JB> restrictions (p2 > p1 *2), and those should be all the ones we need.

I guess you're right, I missed this point. However, it seems that
CalcPrime/TestPrime is not a bottle-neck. Basically, I don't like hard-coded
tables ;-)
As for the primes table precalculated in the class constructor, I'm not sure
whether it's a good idea or not. The idea is to allocate some extra entries
upfront for the tables with the relatively small size (say, less than 8K
entries). Thus reduce the number of further rehashes for such tables.
Probably, such tables are used most often. It seems that it helps (again
only according to my rough tests).

John, could you undertake the merging of the implementations or whatever you
think is the best? Since you're already working on this. On my side, I will
try to find a couple of hours this week to test some other aspects, namely
memory consumption and more precise speed tests, and then let you know about
the results.

Also, I think I could implement SortedList the next weekend :-) According to
the status pages nobody is working on it. Am I right?

Kind regards,
Sergey Chaban