[Mono-bugs] [Bug 76758][Maj] Changed - String with nonstandard
characters is not handled correctly.
bugzilla-daemon at bugzilla.ximian.com
Wed Dec 7 02:40:28 EST 2005
Please do not reply to this email. If you want to comment on the bug, go to the
URL shown below and enter your comments there.
Changed by atsushi at ximian.com.
http://bugzilla.ximian.com/show_bug.cgi?id=76758
--- shadow/76758 2005-12-07 01:56:31.000000000 -0500
+++ shadow/76758.tmp.15178 2005-12-07 02:40:28.000000000 -0500
@@ -1,13 +1,13 @@
Bug#: 76758
Product: Mono: Class Libraries
Version: 1.1
OS: unknown
OS Details: Tested under Windows XP SP2 and CentOS 3.4
-Status: REOPENED
-Resolution:
+Status: RESOLVED
+Resolution: NOTABUG
Severity: Unknown
Priority: Major
Component: CORLIB
AssignedTo: mono-bugs at ximian.com
ReportedBy: admin at svwebhosting.com
QAContact: mono-bugs at ximian.com
@@ -194,6 +194,71 @@
}
}
------- Additional Comments From admin at svwebhosting.com 2005-12-07 01:56 -------
It says 1252 in both Windows XP and CentOS
+
+------- Additional Comments From atsushi at ximian.com 2005-12-07 02:40 -------
+OK, since the code page is 1252, the corresponding Unicode code point
+is U+00FF as well. Now run the proof-of-concept code below on both
+MS.NET and Mono:
+
+using System;
+using System.Text;
+
+public class TestClass
+{
+    public static void Main ()
+    {
+        char [] chars = Encoding.GetEncoding (1252).GetChars (
+            new byte [] {0xFF});
+        Console.Write ("CHARS: ");
+        foreach (char c in chars)
+            Console.Write ("{0:X04} ", (int) c);
+        Console.WriteLine ();
+        Test (Encoding.ASCII, chars);
+        Test (Encoding.UTF7, chars);
+        Test (Encoding.UTF8, chars);
+        Test (Encoding.Unicode, chars);
+    }
+
+    private static void Test (Encoding e, char [] chars)
+    {
+        Console.Write (e.EncodingName);
+        Console.Write (" : ");
+        foreach (byte b in e.GetBytes (chars))
+            Console.Write ("{0} ", b);
+        Console.WriteLine ();
+    }
+}
+
+$ ./conv.exe
+CHARS: 00FF
+US-ASCII : 63
+Unicode (UTF-7) : 43 65 80 56 45
+Unicode (UTF-8) : 195 191
+Unicode : 255 0
+
+$ mono ./conv.exe
+CHARS: 00FF
+US-ASCII : 63
+Unicode (UTF-7) : 43 65 80 56 45
+Unicode (UTF-8) : 195 191
+Unicode : 255 0
+
+The results are the same, so there is no bug here.
+
+In a GNOME 2.0 environment the default encoding (which depends on
+environment variables such as LANG) is UTF-8, so the result you got,
+however unexpected to you, is what I would expect when source files
+written in CP1252 are fed into a UTF-8 decoder.
+
+If you still want to use Latin1 sources, set LANG=blah.Latin1 (e.g.
+en_US.Latin1).
+
+As I wrote above, it is incorrect for Encoding.Default to return the
+ANSI character set, since some encodings, such as Shift_JIS, are not
+compatible with ANSI. This must be a bug in the ECMA CLI specification:
+not all platforms use an ANSI-compatible encoding by default, and it
+does not make sense for Encoding.Default to return an encoding
+different from the actual system default.
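
Editorial illustration (not part of the original comment): the mismatch
described above, CP1252 bytes fed into a UTF-8 decoder, can be sketched
as follows. This is a minimal sketch under stated assumptions: the class
name DecodeMismatch and the helper Print are hypothetical, and
ISO-8859-1 is used in place of code page 1252 because both map byte 0xFF
to U+00FF. The character a UTF-8 decoder substitutes for the invalid
byte may differ between runtimes and versions, so no output is shown.

```csharp
using System;
using System.Text;

// Hypothetical demo class; not taken from the bug report.
public class DecodeMismatch
{
    public static void Main ()
    {
        // The single byte 0xFF, as it would appear in a Latin1/CP1252 source file.
        byte [] source = new byte [] {0xFF};

        // Decoded with the encoding the file was written in: yields U+00FF.
        string asLatin1 = Encoding.GetEncoding ("iso-8859-1").GetString (source);

        // Decoded as UTF-8: a lone 0xFF is never valid UTF-8, so the decoder
        // cannot produce U+00FF and substitutes something else instead.
        string asUtf8 = Encoding.UTF8.GetString (source);

        Print ("Latin1", asLatin1);
        Print ("UTF-8", asUtf8);
    }

    // Writes each decoded character as a four-digit hexadecimal code point.
    private static void Print (string label, string s)
    {
        Console.Write (label + " : ");
        foreach (char c in s)
            Console.Write ("{0:X4} ", (int) c);
        Console.WriteLine ();
    }
}
```

If the two printed lines differ, the bytes were decoded with an encoding
other than the one the file was saved in, which is the situation the
comment attributes to a UTF-8 default locale.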