[Mono-dev] BitVector32 patch

Wed Aug 6 13:55:15 EDT 2008

Hi

Please note that I don't know much about BitVector32, but I do think that
there is a better algorithm (if I understand correctly what you want to
accomplish):

Basically you want to do "int mask = (1 << HighestSetBit(maxValue)) - 1;"
where your HighestSetBit method has a loop... well, what about this:

private int MostSignificantBit(int x)
{
        x |= (x >> 1);
        x |= (x >> 2);
        x |= (x >> 4);
        x |= (x >> 8);
        x |= (x >> 16);
        return(x & ~(x >> 1));
}

and then just do "int mask = MostSignificantBit(maxValue) - 1"

The algorithm comes from
http://aggregate.org/MAGIC/#Most%20Significant%201%20Bit

No loop, no branch miss-predictions, looks fine to me.

Laurent Debacker.

2008/8/6 Scott Peterson <lunchtimemama at gmail.com>

> Thanks Rodrigo. I'll split the patch up tomorrow (almost my bed time). My
> replies are inline.
>
>
> @@ -84,14 +84,12 @@
>>>
>>>              public override int GetHashCode ()
>>>              {
>>> -                return (((Int16) mask).GetHashCode () << 16) +
>>> -                       ((Int16) offset).GetHashCode ();
>>> +                return mask << offset;
>>>              }
>>>
>>>
>>
>> What is the possible range of offset? Can't it be bigger than 16 and
>> discard even more bits
>> of entropy from mask?
>>
>
> Sections are only instantiated in BitVector32.CreateSection which enforces
> that the size of the bitmask plus the offset never exceeds 32. This means
> that mask << offset will never truncate information in mask and will be
> unique for any legal combination of mask and offset.
>
>
>> @@ -184,9 +184,8 @@
>>>              if (maxValue < 1)
>>>                  throw new ArgumentException ("maxValue");
>>>
>>> -            int bit = HighestSetBit(maxValue) + 1;
>>> -            int mask = (1 << bit) - 1;
>>> -            int offset = previous.Offset + NumberOfSetBits
>>> (previous.Mask);
>>> +            int mask = (1 << HighestSetBit(maxValue)) - 1;
>>> +            int offset = previous.Offset + HighestSetBit(previous.Mask);
>>>
>>>              if (offset > 32) {
>>>                  throw new ArgumentException ("Sections cannot exceed 32
>>> bits in total");
>>>
>>
>> Shouldn't it be "int mask = (1 << (HighestSetBit(maxValue) + 1)) - 1;"?
>>
>> Besides that, I think the bit searching function should use a binary
>> search which has better
>> performance.
>>
>
> The new HighestSetBit algorithm returns what the old algorithm returned,
> plus one. The new algorithm is actually more correct. The old algorithm, if
> pass the number 1, would return 0. Obviously, one bit is set in the binary
> reprisentation of the number 1.
>
> As for a binary search, I did consider it but opted instead for the
> four-line algorithm. The most loops the alogrithm will make is 31 (when
> passed 2,147,483,647; negative numbers are not allowed). The smaller the
> number, the fewer loops (or rather, the fewer bits used to represent the
> nubmer, the fewer the loops). I imagine most people using BitVector32 are
> going to pass relatively small maxValues since it is commonly used as a
> compact store for bool and enum variables. At the very least, I think this
> is a performance improvement over the old algorithm. Not only is it simpler,
> but the old algorithm made two loops when passed 2,147,483,647 and 32 loops
> when passed 1. I'm a fan of binary search, but I think it's overkill for
> this situation. But I'm happy to be proven wrong :)
>
> Cheers,
> - Scott
>
> _______________________________________________
> Mono-devel-list mailing list
> Mono-devel-list at lists.ximian.com
> http://lists.ximian.com/mailman/listinfo/mono-devel-list
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.ximian.com/pipermail/mono-devel-list/attachments/20080806/14f39d88/attachment.html