[Mono-dev] Question about encodings - possible documentation bug?
Mads Bondo Dydensborg
mbd at dbc.dk
Wed Jul 4 08:28:49 EDT 2007
It is my understanding that Mono strings are always UTF-16 internally.
But what encoding do the source files need to be in?
The documentation (man gmcs) suggests this:
"By default files will be processed in the Latin-1 code page."
As it happens, I have some source files in Latin-1. My locale is
en_US.UTF-8. It appears that some characters are ignored in string literals
such as this one:
string test = "foo æøå bar";
(The middle three characters are Danish characters, with the byte
representation e6 f8 e5 in Latin-1, which is the encoding of the source file.)
At runtime, the string test is printed as "foo bar", with the three
characters missing (I have checked this by traversing the string).
Now, compiling with gmcs and -codepage:28591 (Latin-1) produces a string that
_does_ have the characters, suggesting that the documentation is in error.
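
To make the byte-level picture concrete, here is a minimal sketch in Python. It only illustrates the two decodings. It is not the compiler's code, and it rests on an assumption I have not verified: that without -codepage the file is decoded as UTF-8 (perhaps because of the locale) and that invalid bytes are silently skipped.

```python
# Bytes of the literal as they are stored in the Latin-1 source file.
raw = b"foo \xe6\xf8\xe5 bar"

# Decoding with the file's real encoding (ISO-8859-1, i.e. code page 28591).
as_latin1 = raw.decode("iso-8859-1")

# Assumption: a UTF-8 decoder that skips invalid sequences instead of failing.
# e6, f8 and e5 do not form valid UTF-8 here, so all three bytes are dropped.
as_utf8_lossy = raw.decode("utf-8", errors="ignore")

print(repr(as_latin1))
print(repr(as_utf8_lossy))
print(len(as_latin1), len(as_utf8_lossy))
```

The second result is three characters shorter than the first, which matches what I see when traversing the string.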
So, isn't this a bug in the docs, and shouldn't (g)mcs complain if it finds
bytes it cannot decode in the assumed encoding?
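
As a rough idea of what "complaining" could look like, here is a small Python sketch of a strict check that reports the first undecodable byte instead of dropping it. The function name check_source_bytes is hypothetical and made up for this illustration. It is not something (g)mcs provides.

```python
from typing import Optional


def check_source_bytes(raw: bytes, encoding: str = "utf-8") -> Optional[str]:
    """Return None if raw decodes cleanly, else a message naming the bad byte."""
    try:
        raw.decode(encoding)  # strict error handling is the default
    except UnicodeDecodeError as err:
        bad = raw[err.start]
        return f"byte 0x{bad:02x} at offset {err.start} is not valid {encoding}"
    return None


# The line from my source file, as Latin-1 bytes.
line = b'string test = "foo \xe6\xf8\xe5 bar";'
print(check_source_bytes(line))                # reports the first bad byte
print(check_source_bytes(line, "iso-8859-1"))  # None: every byte is valid Latin-1
```

A diagnostic along these lines would at least have pointed me at the encoding instead of silently producing a shorter string.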
Med venlig hilsen/Regards
Systemudvikler/Systemsdeveloper cand.scient.dat, Ph.d., Mads Bondo Dydensborg
Dansk BiblioteksCenter A/S, Tempovej 7-11, 2750 Ballerup, Tlf. +45 44 86 77 34