[Mono-bugs] [Bug 73086][Nor] Changed - The UTF-8 decoding problems

bugzilla-daemon@bugzilla.ximian.com bugzilla-daemon@bugzilla.ximian.com
Thu, 21 Apr 2005 07:28:20 -0400 (EDT)

Please do not reply to this email; if you want to comment on the bug, go to the
URL shown below and enter your comments there.

Changed by svetlanaz@mainsoft.com.


--- shadow/73086	2005-04-21 02:15:35.000000000 -0400
+++ shadow/73086.tmp.19207	2005-04-21 07:28:20.000000000 -0400
@@ -90,6 +90,21 @@
 can encode only up to 4 bytes per character). But that does not bother 
 me, and you can commit the patch.
 ------- Additional Comments From gonzalo@ximian.com  2005-04-21 02:15 -------
 Applying this patch breaks mcs.
+------- Additional Comments From svetlanaz@mainsoft.com  2005-04-21 07:28 -------
+In .NET, the UTF-8 decoder returns the '\uFEFF' character.
+In Mono, before my patch, that character was dropped.
+The patch corrects the problem.
+I think that the Decoder is a low-level API and should return all 
+encoded characters, and that it is the users' responsibility to decide 
+how to treat each character. So the problem is not with the patch 
+but with mcs itself, which uses the decoder incorrectly. mcs 
+should handle the logic for special characters such as '\uFEFF'