[Mono-devel-list] filename character conversions(util/strenc.c)

Paolo Molaro lupus at ximian.com
Wed Nov 5 07:12:51 EST 2003


On 11/04/03 Bernie Solomon wrote:
> > UTF16 is what the CLR specifies for "Unicode", and as the MS runtime
> > always seems to use LE (whether because it's x86-platform-endian or by
> > choice) it makes sense to always use LE in mono.  (Page 141 of my copy
> > of Partition II mentions x86-endian storage for enum array elements of
> > fixed args that are not bool or char.  It doesn't seem to specify the
> > endianness of unicode chars anywhere.)
> 
> This isn't how my big endian machines have been running and I can't really
> see how we can make this work (unless we make every native type LE which
> would be a big penalty). When would we make 'char' (which isn't a type in
> MSIL anyway) byte swap as it becomes an integer type? Indeed at the MSIL
> level short[] and char[] generate the same code (of course you can track
> data types and probably work out which is which but that is a lot of work
> and in unverifiable code that would be impossible).

Bernie is right, of course: the recent changes in metadata/locales.c and
util/strenc.c that unconditionally use UTF16LE are broken.
Runtime strings in mono are UTF-16 with the endianness matching the CPU
endianness.
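
To make the distinction concrete, here is a minimal sketch in C. It is
not the code in util/strenc.c or metadata/locales.c, and every function
name in it is hypothetical. It only contrasts storing UTF-16 code units
in host byte order (a plain copy) with forcing little-endian storage,
which yields the same bytes only on a little-endian CPU.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical helpers for illustration only. */

/* Returns 1 on a little-endian CPU, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Host-endian storage: the code units are copied unchanged. */
static void utf16_store_native(const uint16_t *units, size_t count,
                               uint8_t *out)
{
    memcpy(out, units, count * sizeof(uint16_t));
}

/* Unconditional UTF-16LE storage: low byte first, whatever the CPU is. */
static void utf16_store_le(const uint16_t *units, size_t count,
                           uint8_t *out)
{
    size_t i;

    for (i = 0; i < count; i++) {
        out[2 * i] = (uint8_t)(units[i] & 0xff);
        out[2 * i + 1] = (uint8_t)(units[i] >> 8);
    }
}

/* Reads code unit `index` back the way host code reading a 16-bit
 * value from memory would see it. */
static uint16_t utf16_load_native(const uint8_t *bytes, size_t index)
{
    uint16_t unit;

    memcpy(&unit, bytes + 2 * index, sizeof unit);
    return unit;
}
```

On a big-endian machine, reading the output of utf16_store_le() with
utf16_load_native() gives byte-swapped code units, which is the kind of
breakage Bernie describes.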

lupus

-- 
-----------------------------------------------------------------
lupus at debian.org                                     debian/rules
lupus at ximian.com                             Monkeys do it better


