[Mono-bugs] [Bug 23541][Wis] Changed - mcs needs to deal with the encoding of source files

bugzilla-daemon@rocky.ximian.com bugzilla-daemon@rocky.ximian.com
4 Sep 2002 16:10:00 -0000


Please do not reply to this email- if you want to comment on the bug, go to the
URL shown below and enter your comments there.

Changed by miguel@ximian.com.

http://bugzilla.ximian.com/show_bug.cgi?id=23541

--- shadow/23541	Tue Aug 27 20:29:44 2002
+++ shadow/23541.tmp.23053	Wed Sep  4 12:10:00 2002
@@ -116,6 +116,27 @@
 
 So IMHO this must be done either by the runtime or by our StreamReader implementation.
 
 I don't want to "force" this bug back into the runtime, so keeping it here and setting priority to wishlist.
 
 
+
+------- Additional Comments From miguel@ximian.com  2002-09-04 12:10 -------
+CSC has an option called /codepage:XXX which is used for specifying
+the codepage for the input file. 
+
+The documentation for codepage claims that if the source code is in
+either the default code page, or Unicode or UTF-8 the compiler will be
+able to figure things out on its own.
+
+I am assuming they mean `UTF-16' when they say Unicode.  
+
+Distinguishing what Microsoft calls "Unicode" and "Unicode big endian"
+on Windows is easy: the first two bytes are 0xff 0xfe (Unicode) or
+0xfe 0xff (Unicode big endian).
+
+On *Windows* they use 0xef 0xbb 0xbf to mark UTF-8 encoded files.
+
+So I assume the rest is supposed to be encoded in the current
+"codepage", an interesting concept, because I do not know how code
+pages map to character sets on Unix or how to tell what the current
+codepage is.
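
The byte-order-mark check described in the comment above can be sketched
roughly as follows. This is only an illustration in Python, not what mcs or
CSC actually does: the names sniff_encoding and decode_source are
hypothetical, and the "latin-1" fallback is an assumption standing in for
whatever the current codepage turns out to be.

```python
import codecs

def sniff_encoding(head, fallback="latin-1"):
    """Guess an encoding from the first bytes of a source file.

    Returns (encoding_name, bom_length). `fallback` is a placeholder
    for the "current codepage"; choosing it is left to the caller.
    """
    if head.startswith(codecs.BOM_UTF8):      # 0xef 0xbb 0xbf
        return "utf-8", 3
    if head.startswith(codecs.BOM_UTF16_LE):  # 0xff 0xfe, "Unicode"
        return "utf-16-le", 2
    if head.startswith(codecs.BOM_UTF16_BE):  # 0xfe 0xff, "Unicode big endian"
        return "utf-16-be", 2
    return fallback, 0                        # no BOM: assume the codepage

def decode_source(data, fallback="latin-1"):
    """Decode raw file bytes, skipping the BOM if one was found."""
    encoding, bom_length = sniff_encoding(data[:4], fallback)
    return data[bom_length:].decode(encoding)
```

On Unix, one candidate for the fallback is the charset of the current
locale, which Python reports through locale.getpreferredencoding(); whether
that matches the Windows notion of a "codepage" is exactly the open question
raised above.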