[Mono-dev] Mono's BitConverter.

Jonathan Pryor jonpryor at vt.edu
Sat Mar 24 10:23:02 EDT 2007


On Fri, 2007-03-16 at 18:28 -0400, Miguel de Icaza wrote:
>     So we have decided to implement an actually useful set of Bit
> conversion class and request developers to use this instead.
> 
> * Raw conversions (byte [] to datatype and back).
> 
> * LittleEndian to native (byte [] contains LE encoded data
>   and must be converted into a proper data type, as well
>   as routines for converting a datatype into a little-endian
>   encoded byte array).
> 
> * BigEndian to native (byte [] contains BE encoded data and
>   must be converted into native data type, as well as routines
>   for converting a datatype into a big-endian encoded byte
>   array).

We discussed this on IRC yesterday, and the last idea was to use methods
of the form:

	byte[] Get[Endian]Bytes ([Type] value);
	[Type] To[Endian][Type] (byte[] value);

We could also have Stream-oriented methods of the form:

	void Write[Endian] (Stream stream, [Type] value);
	[Type] Read[Endian][Type] (Stream stream);

Where [Endian] would be LittleEndian or BigEndian (as using LE or BE
just looks ugly, and code completion doesn't make it too difficult).  

Though thinking about it further, I'm not sure that "To[Endian][Type]"
is ideal, as "ToLittleEndianDouble()" implies that the double returned
will be in  Little-endian format, while it's the byte[] `value'
parameter which is little-endian and the return value will be
platform-endian.  Perhaps To[Type]From[Endian] would be better, e.g.
ToDoubleFromLittleEndian() would be better.  However, the pattern is
otherwise consistent between Get/To/Read/Write, so perhaps this can be
solved with documentation...

Other ideas were floated (use an extra enum parameter to specify
little/big endian conversion), but they were rejected (an extra enum
parameter would require `switch'ing on it, which would slow down the
method).

>     A few additional request have been made in addition: to provide
> methods for writing this to streams (it seems sensible).
> 
>     Paolo has also suggested Pack and Unpack methods:
> 
> byte [] Pack (string template, params object [] arguments)
> 
> ArrayList Unpack (string template, byte [] arguments)

I would suggest that IList be the return type instead of ArrayList, as
this would allow changing the actual collection if necessary (to e.g.
List<object> or an object[], etc. -- in fact, it should be possible to
return a object[] if each template parameter is a single character, as
the number of objects to return will always be template.Length, so the
overhead of a re-sizable array wouldn't be needed).

It might be useful to have an `IEnumerable' overload to Pack() as well,
so that if the arguments are already in a collection class they can be
reused as-is:

	byte[] Pack (string template, IEnumerable arguments);

The problem with this overload is that it causes problems with params
arrays: the following call would invoke the Pack(string, params
object[]) method instead of the Pack(string, IEnumerable) method:

	byte[] b = ByteConverter.Pack ("i", new object[]{1});

Given that, I'm not sure that the `params' parameter is desirable, and
it causes inconsistencies with my TryUnpack() suggestion (see below).

Finally, we need to specify the [Endian] for Pack, so it should be
PackLittleEndian() and PackBigEndian().

A Stream-oriented version would be useful as well:

	Stream Pack[Endian] (string template, Stream stream, params object[] arguments);
	Stream Pack[Endian] (string template, Stream stream, IEnumerable arguments);
	IList  Unpack[Endian] (string template, Stream value);

>     We would add in addition to that, Try versions that throw no
> exceptions on errors:
> 
> bool TryPack (string template, out byte [] result, params ...);
> bool TryUnpack (string template, out ArrayList result, ...);

As suggested above, IList should be used instead of ArrayList, and
TryPack() should have an IEnumerable overload.  Furthermore, `out'
arguments should generally be the last argument in the argument list
(when possible):

	bool TryPack[Endian] (string template, out byte[] result, params object[] args);
	bool TryPack[Endian] (string template, IEnumerable args, out byte[] result);
	bool TryUnpack[Endian] (string template, byte[] value, out IList result);

Note that if we have a `param's array, the `out' parameter can't be
last, which is inconsistent with the FxDG.

Personally, as nice as `param's arrays are, I think they should be
avoided here due to the confusion they bring to method overloading.

Putting it all together:

        static class Mono.ByteConverter {
                public static byte[] GetLittleEndianBytes (double value);
                public static double ToLittleEndianDouble (byte[] value);
                public static void   WriteLittleEndian (Stream stream, double value);
                public static double ReadLittleEndianDouble (Stream stream);
                
                public static byte[] GetBigEndianBytes (double value);
                public static double ToBigEndianDouble (byte[] value);
                public static void   WriteBigEndian (Stream stream, double value);
                public static double ReadBigEndianDouble (Stream stream);
                
                /* repeat above 8 methods for Byte, SByte, Char, Int16, UInt16, 
                   Int32, UInt32, Int64, UInt64, Single */
        }
        
        static class Mono.BytePacker {
                public static byte[] PackLittleEndian (string template, 
                        IEnumerable arguments);
                public static Stream PackLittleEndian (string template, 
                        Stream stream, IEnumerable arguments);
                
                public static byte[] PackBigEndian (string template, 
                        IEnumerable arguments);
                public static Stream PackBigEndian (string template, 
                        Stream stream, IEnumerable arguments);
                
                public static IList  UnpackLittleEndian (string template, byte[] value);
                public static IList  UnpackLittleEndian (string template, Stream value);
                
                public static IList  UnpackBigEndian (string template, byte[] value);
                public static IList  UnpackBigEndian (string template, Stream value);
                
                public static bool TryPackLittleEndian (string template, 
                        IEnumerable args, out byte[] result);
                public static bool TryPackLittleEndian (string template, 
                        Stream stream, IEnumerable args, out byte[] result);
                
                public static bool TryPackBigEndian (string template, 
                        IEnumerable args, out byte[] result);
                public static bool TryPackBigEndian (string template, 
                        Stream stream, IEnumerable args, out byte[] result);
                
                public static bool TryUnpackLittleEndian (string template, 
                        byte[] value, out IList result);
                public static bool TryUnpackLittleEndian (string template, 
                        Stream stream, out IList result);
                
                public static bool TryUnpackBigEndian (string template, 
                        byte[] value, out IList result);
                public static bool TryUnpackBigEndian (string template, 
                        Stream stream, out IList result);
        }

The problem with these suggestions is size -- ByteConverter would have
*88* methods.  It would likely be easier to just use BytePacker (which
only has 16 methods), but BytePacker has additional overhead for use,
due to the required creation of the IEnumerable argument and
interpretation of the `template' parameter,  In short, you wouldn't want
to use BytePacker in a time-critical loop, while ByteConverter probably
could (as it has less overhead per call, *especially* if you use the
Stream overloads, as you don't constantly re-create a byte[] return
value).

ByteConverter could be trimmed down by removing some of its methods --
it might be practical to remove the non-Stream overloads, as you can
always construct a System.IO.MemoryStream and get a byte[] from that
after converting *all* of your data -- but this might not be ideal
either.  I don't think the Stream overloads should be removed, as
they're an ideal way to keep GC pressure lower (fewer byte[] should need
to be created).  Alternatively, perhaps we don't need the unsigned
flavors, which would require only 56 methods.

 - Jon





More information about the Mono-devel-list mailing list