[Mono-bugs] [Bug 76758][Maj] Changed - String with nonstandard characters is not handled correctly.

bugzilla-daemon at bugzilla.ximian.com
Wed Dec 7 02:40:28 EST 2005


Please do not reply to this email; if you want to comment on the bug, go to the
URL shown below and enter your comments there.

Changed by atsushi at ximian.com.

http://bugzilla.ximian.com/show_bug.cgi?id=76758

--- shadow/76758	2005-12-07 01:56:31.000000000 -0500
+++ shadow/76758.tmp.15178	2005-12-07 02:40:28.000000000 -0500
@@ -1,13 +1,13 @@
 Bug#: 76758
 Product: Mono: Class Libraries
 Version: 1.1
 OS: unknown
 OS Details: Tested under Windows XP SP2 and CentOS 3.4
-Status: REOPENED   
-Resolution: 
+Status: RESOLVED   
+Resolution: NOTABUG
 Severity: Unknown
 Priority: Major
 Component: CORLIB
 AssignedTo: mono-bugs at ximian.com                            
 ReportedBy: admin at svwebhosting.com               
 QAContact: mono-bugs at ximian.com
@@ -194,6 +194,71 @@
   }
 }
 
 
 ------- Additional Comments From admin at svwebhosting.com  2005-12-07 01:56 -------
 It says 1252 in both Windows XP and CentOS
+
+------- Additional Comments From atsushi at ximian.com  2005-12-07 02:40 -------
+OK, when the codepage is 1252, the corresponding Unicode code point
+is U+00FF as well. Now run the proof-of-concept code below on both
+MS.NET and Mono:
+
+using System;
+using System.Text;
+
+public class TestClass
+{
+        public static void Main ()
+        {
+                char [] chars = Encoding.GetEncoding (1252).GetChars (
+                        new byte [] {0xFF});
+                Console.Write ("CHARS: ");
+                foreach (char c in chars)
+                        Console.Write ("{0:X04} ", (int) c);
+                Console.WriteLine ();
+                Test (Encoding.ASCII, chars);
+                Test (Encoding.UTF7, chars);
+                Test (Encoding.UTF8, chars);
+                Test (Encoding.Unicode, chars);
+        }
+
+        private static void Test (Encoding e, char [] chars)
+        {
+                Console.Write (e.EncodingName);
+                Console.Write (" : ");
+                foreach (byte b in e.GetBytes (chars))
+                        Console.Write ("{0} ", b);
+                Console.WriteLine ();
+        }
+}
+
+$ ./conv.exe
+CHARS: 00FF
+US-ASCII : 63
+Unicode (UTF-7) : 43 65 80 56 45
+Unicode (UTF-8) : 195 191
+Unicode : 255 0
+
+$ mono ./conv.exe
+CHARS: 00FF
+US-ASCII : 63
+Unicode (UTF-7) : 43 65 80 56 45
+Unicode (UTF-8) : 195 191
+Unicode : 255 0
+
+The results were the same, thus no bug here.
+
+Since the default encoding in a GNOME 2.0 environment is UTF-8 (it
+depends on environment variables such as LANG), the result you got
+is unexpected for you but expected for me: you fed source files
+written in CP1252 into a UTF-8 decoder.
+
+If you still want to use Latin1 sources, set LANG=blah.Latin1 (e.g.
+en_US.Latin1).
+
+As I wrote above, it is incorrect to say that Encoding.Default returns
+the ANSI character set, since some encodings such as Shift_JIS are not
+compatible with ANSI. This must be a bug in the ECMA CLI specification,
+since not all platforms use an ANSI-compatible encoding as the default
+(it makes no sense for Encoding.Default to return an encoding
+different from the actual system default).
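
The decoder mismatch described in the comment above can be reproduced in isolation. The following is a minimal sketch, not part of the original report: the class name DecodeMismatch and the helper Dump are made up for illustration, and it assumes a runtime where code page 1252 is available out of the box (Mono and .NET Framework; newer .NET versions need the code pages provider registered first). It decodes the same single byte 0xFF once as CP1252 and once as UTF-8 and prints the resulting code points, along with the name of the encoding the runtime reports as its default.

```csharp
using System;
using System.Text;

public class DecodeMismatch
{
        public static void Main ()
        {
                // The encoding the runtime picked up from the environment.
                Console.WriteLine ("Default: {0}", Encoding.Default.WebName);

                // 0xFF is U+00FF in CP1252, but it is never a valid byte in UTF-8.
                byte [] bytes = new byte [] {0xFF};

                Dump ("CP1252", Encoding.GetEncoding (1252).GetString (bytes));
                Dump ("UTF-8", Encoding.UTF8.GetString (bytes));
        }

        // Hypothetical helper: prints each decoded char as a 4-digit hex code point.
        private static void Dump (string label, string s)
        {
                Console.Write (label + " : ");
                foreach (char c in s)
                        Console.Write ("{0:X4} ", (int) c);
                Console.WriteLine ();
        }
}
```

The CP1252 line should show 00FF, matching the CHARS output above. The UTF-8 line should not, since 0xFF is not a valid UTF-8 byte; what is printed instead (typically a replacement character) depends on the runtime version.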

