[Mono-dev] Question about encodings - possible documentation bug?
Mads Bondo Dydensborg
mbd at dbc.dk
Wed Jul 4 08:28:49 EDT 2007
Hi there
It is my understanding that Mono strings are always UTF-16 internally.
But what encoding do the source files need to be in?
The documentation (man gmcs) suggests this:
"By default files will be processed in the Latin-1 code page."
As it happens, I have some source files in Latin-1. My locale is
en_US.UTF-8. It appears that some characters are ignored when strings
are constructed:
string test = "foo æøå bar";
(The middle three characters are Danish, with the byte representation
e6 f8 e5 in Latin-1, which is the encoding of the source file.)
At runtime, the string test is printed (and I have checked this by
traversing the string) as "foo bar".
Now, compiling with gmcs and -codepage:28591 (Latin-1) produces a string that
_does_ have the characters, suggesting that the documentation is in error.
So, isn't this a bug in the docs, and shouldn't (g)mcs complain if it finds
characters it does not support?
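
To illustrate the suspected cause, here is a minimal sketch. It assumes that
the compiler fell back to UTF-8 (the locale's encoding) when reading the file,
which nothing above confirms. The class name EncodingSketch and the byte array
are made up for illustration. It decodes the same bytes once as code page
28591 and once as UTF-8:

```csharp
using System;
using System.Text;

class EncodingSketch
{
    static void Main()
    {
        // "foo æøå bar" as stored in a Latin-1 file: æ=0xE6, ø=0xF8, å=0xE5.
        byte[] latin1Bytes = {
            0x66, 0x6F, 0x6F, 0x20, 0xE6, 0xF8, 0xE5, 0x20, 0x62, 0x61, 0x72
        };

        // Code page 28591 is ISO-8859-1 (Latin-1), the value given to -codepage.
        string asLatin1 = Encoding.GetEncoding(28591).GetString(latin1Bytes);

        // Assumption: the file is read as UTF-8 instead. The bytes e6 f8 e5 do
        // not form a valid UTF-8 sequence, so the three characters are lost.
        string asUtf8 = Encoding.UTF8.GetString(latin1Bytes);

        // True: decoding as Latin-1 recovers the original text.
        Console.WriteLine(asLatin1 == "foo \u00E6\u00F8\u00E5 bar");

        // True: none of the Danish characters survive the UTF-8 decode.
        Console.WriteLine(asUtf8.IndexOf('\u00E6') < 0
                          && asUtf8.IndexOf('\u00F8') < 0
                          && asUtf8.IndexOf('\u00E5') < 0);
    }
}
```

Whether the invalid bytes are dropped or replaced with a substitute character
depends on the decoder, so this sketch only checks that the original
characters are absent.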
Regards,
Mads
--
Med venlig hilsen/Regards
Systemudvikler/Systemsdeveloper cand.scient.dat, Ph.d., Mads Bondo Dydensborg
Dansk BiblioteksCenter A/S, Tempovej 7-11, 2750 Ballerup, Tlf. +45 44 86 77 34