[Mono-dev] BitVector32 patch

Scott Peterson lunchtimemama at gmail.com
Wed Aug 6 11:24:19 EDT 2008


Thanks Rodrigo. I'll split the patch up tomorrow (almost my bed time). My
replies are inline.


@@ -84,14 +84,12 @@
>>
>>              public override int GetHashCode ()
>>              {
>> -                return (((Int16) mask).GetHashCode () << 16) +
>> -                       ((Int16) offset).GetHashCode ();
>> +                return mask << offset;
>>              }
>>
>>
>
> What is the possible range of offset? Can't it be bigger than 16 and
> discard even more bits
> of entropy from mask?
>

Sections are only instantiated in BitVector32.CreateSection which enforces
that the size of the bitmask plus the offset never exceeds 32. This means
that mask << offset will never truncate information in mask and will be
unique for any legal combination of mask and offset.


> @@ -184,9 +184,8 @@
>>              if (maxValue < 1)
>>                  throw new ArgumentException ("maxValue");
>>
>> -            int bit = HighestSetBit(maxValue) + 1;
>> -            int mask = (1 << bit) - 1;
>> -            int offset = previous.Offset + NumberOfSetBits
>> (previous.Mask);
>> +            int mask = (1 << HighestSetBit(maxValue)) - 1;
>> +            int offset = previous.Offset + HighestSetBit(previous.Mask);
>>
>>              if (offset > 32) {
>>                  throw new ArgumentException ("Sections cannot exceed 32
>> bits in total");
>>
>
> Shouldn't it be "int mask = (1 << (HighestSetBit(maxValue) + 1)) - 1;"?
>
> Besides that, I think the bit searching function should use a binary search
> which has better
> performance.
>

The new HighestSetBit algorithm returns what the old algorithm returned,
plus one. The new algorithm is actually more correct. The old algorithm, if
pass the number 1, would return 0. Obviously, one bit is set in the binary
reprisentation of the number 1.

As for a binary search, I did consider it but opted instead for the
four-line algorithm. The most loops the alogrithm will make is 31 (when
passed 2,147,483,647; negative numbers are not allowed). The smaller the
number, the fewer loops (or rather, the fewer bits used to represent the
nubmer, the fewer the loops). I imagine most people using BitVector32 are
going to pass relatively small maxValues since it is commonly used as a
compact store for bool and enum variables. At the very least, I think this
is a performance improvement over the old algorithm. Not only is it simpler,
but the old algorithm made two loops when passed 2,147,483,647 and 32 loops
when passed 1. I'm a fan of binary search, but I think it's overkill for
this situation. But I'm happy to be proven wrong :)

Cheers,
- Scott
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.ximian.com/pipermail/mono-devel-list/attachments/20080807/70985c29/attachment.html 


More information about the Mono-devel-list mailing list