[Mono-devel-list] 7 regressions appeared for MS.VB.dll because of change in mcs
Rafael Teixeira
monoman at gmail.com
Mon Nov 1 18:00:25 EST 2004
Hi Jonathan,
> In Visual Studio .NET, one of the save options is "Unicode (UTF-8 with
> signature)." Obviously, mono cannot arbitrarily detect what encoding a
> given byte sequence is in (though there are some good heuristics out
> there), but if an explicit signature is present, will mono treat the file
> as UTF-8?
>
> Jonathan Gilbert
Well, first of all, in this thread we are talking about mcs, mono's C#
compiler, not mono itself. Nowadays mcs has this code in place:
    try {
        encoding = Encoding.GetEncoding (28591);
    } catch {
        Console.WriteLine ("Error: could not load encoding 28591, trying 1252");
        encoding = Encoding.GetEncoding (1252);
    }
and then
    SeekableStreamReader reader = new SeekableStreamReader (input,
        encoding, using_default_encoder);
So, to answer your question: since using_default_encoder starts out as
true, mcs will look for a byte order mark at the start of the file and
pick the encoding from it, BUT if no byte order mark is present it will
fall back to the ISO-8859-1 (28591) or Windows-1252 code page.
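
To make that behaviour concrete, here is a minimal sketch using the plain System.IO.StreamReader rather than mcs's own SeekableStreamReader. The class BomSketch and its Decode helper are hypothetical names made up for this illustration, and the sketch only assumes that StreamReader's byte order mark detection behaves like the mcs logic described above; it is not the compiler's actual code.

```csharp
using System;
using System.IO;
using System.Text;

static class BomSketch
{
    // Hypothetical helper (not part of mcs): decodes a buffer, letting
    // StreamReader honour a byte order mark if one is present and
    // otherwise falling back to ISO-8859-1 (code page 28591).
    public static string Decode (byte[] bytes, out Encoding used)
    {
        Encoding fallback = Encoding.GetEncoding (28591);
        using (var reader = new StreamReader (new MemoryStream (bytes), fallback, true)) {
            string text = reader.ReadToEnd ();
            // CurrentEncoding is only meaningful after the first read.
            used = reader.CurrentEncoding;
            return text;
        }
    }

    static void Main ()
    {
        // "é" encoded as UTF-8, once with the EF BB BF signature and once without.
        byte[] withBom = { 0xEF, 0xBB, 0xBF, 0xC3, 0xA9 };
        byte[] withoutBom = { 0xC3, 0xA9 };

        Encoding used;
        string a = Decode (withBom, out used);
        Console.WriteLine ("{0} char(s), code page {1}", a.Length, used.CodePage);

        string b = Decode (withoutBom, out used);
        Console.WriteLine ("{0} char(s), code page {1}", b.Length, used.CodePage);
    }
}
```

The second call shows the failure mode from the rant quoted below: the same two UTF-8 bytes, read without a signature, come out as two Latin-1 characters instead of one accented letter.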
That was the change in mcs I was talking about.
mbas still uses this code:
    // We are here forcing StreamReader to assume current system codepage,
    // because normally it defaults to UTF-8
    input = new StreamReader(fileName, System.Text.Encoding.Default);
that simply uses the codepage/encoding set for the system.
We will probably have to add a /codepage option to mbas as well. I filed
#69004 in bugzilla to that end.
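
As a rough idea of what handling such an option could look like, here is a sketch that maps a "/codepage:NNNN" argument to an Encoding. Everything in it is an assumption for illustration: the class CodepageOptionSketch, the Parse helper, and the numeric-only "/codepage:NNNN" syntax are hypothetical, and this is not what mbas or mcs actually implement.

```csharp
using System;
using System.Globalization;
using System.Text;

static class CodepageOptionSketch
{
    // Hypothetical helper: returns the Encoding named by a
    // "/codepage:NNNN" argument, or null when the argument is not a
    // codepage option or the number is not a usable code page.
    public static Encoding Parse (string arg)
    {
        const string prefix = "/codepage:";
        if (arg == null || !arg.StartsWith (prefix, StringComparison.OrdinalIgnoreCase))
            return null;

        int id;
        string value = arg.Substring (prefix.Length);
        // NumberStyles.None accepts plain digits only (no sign, no spaces).
        if (!int.TryParse (value, NumberStyles.None, CultureInfo.InvariantCulture, out id))
            return null;

        try {
            return Encoding.GetEncoding (id);
        } catch (ArgumentException) {
            return null;    // number out of range or not a valid code page
        } catch (NotSupportedException) {
            return null;    // code page not available on this platform
        }
    }

    static void Main (string[] args)
    {
        foreach (string arg in args) {
            Encoding enc = Parse (arg);
            if (enc != null)
                Console.WriteLine ("Sources would be read as code page {0}", enc.CodePage);
        }
    }
}
```

The returned Encoding would then replace System.Text.Encoding.Default in the StreamReader constructor call shown above.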
Fun,
On Sun, 31 Oct 2004 14:29:37 -0500, Jonathan Gilbert
<2a5gjx302 at sneakemail.com> wrote:
> At 12:08 PM 31/10/2004 -0500, Miguel de Icaza wrote:
> >Hello,
> >> <rant>
> >> As mcs no more defaults to the encoding set in the LANG environment
> >> variable (mine says LANG=en_US.UTF-8) one edits sources with, say,
> >> gedit like I do where you see every accented letter or another
> >> international character correctly represented in the source and then
> >> mcs compiles then all wrong.
> >
> >It never defaulted to it. You just upgraded your OS and that is why you
> >get that behavior.
> >
> >If you want the VB tests to pass completely, you should instead encode
> >any non-7bit characters using the \uXXXX syntax.
>
--
Rafael "Monoman" Teixeira
---------------------------------------
Just the 'crazy' me in a sane world, or would it be the reverse? I dunno...