[Mono-devel-list] 7 regressions appeared for MS.VB.dll because of change in mcs

Mon Nov 1 19:27:51 EST 2004

At 09:00 PM 01/11/2004 -0200, Rafael Teixeira wrote:
>Hi Jonathan,
>> In Visual Studio .NET, one of the save options is "Unicode (UTF-8 with
>> signature)." Obviously, mono cannot arbitrarily detect what encoding a
>> given byte sequence is in (though there are some good heuristics out
>> there), but if an explicit signature is present, will mono treat the file
>> as UTF-8?
>> 
>> Jonathan Gilbert
>
>Well, first in this thread we are talking about mcs, mono's C#
>compiler, not mono itself.
[snip]

Over the years I have come to think of the compiler as being a part of .NET
-- one does not have .NET without a C# compiler :-) I have even come to
think of Java in this way, even though it is patently untrue.

[snip]
>So answering as using_default_encoder starts as true, mcs will try to
>detect bytemarks and recognize the encoding, BUT if the byte marks
>aren't present it will default to the ISO-8859-1 or Windows-1252
>codepages.
[snip]
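
To check that I have understood the behaviour described above, here is a
rough sketch of it in Python. This is only an illustration, not the actual
mcs code: the names detect_encoding and decode_source are made up, and I
have assumed ISO-8859-1 as the fallback because it can decode any byte.

```python
import codecs

# Hypothetical helper, NOT the real mcs logic: choose an encoding from a
# leading byte mark, otherwise fall back to a single-byte default codepage.
def detect_encoding(data: bytes, default: str = "latin-1") -> str:
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec strips the signature while decoding
    # Test UTF-32 before UTF-16: the UTF-32 LE mark begins with the
    # same two bytes as the UTF-16 LE mark.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # No byte mark present: assume the default codepage (an assumption here).
    return default


def decode_source(data: bytes) -> str:
    """Decode source bytes using the detected (or default) encoding."""
    return data.decode(detect_encoding(data))
```

With this sketch, UTF-8 bytes that lack the signature are decoded with the
fallback codepage, so any non-ASCII character comes out as two wrong
characters instead of one right one.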

This answers my question :-) Thanks. The reason I wondered is that a
number of people have asked about the use of UTF-8, and while the use of
/codepage has been suggested many times, nobody has ever suggested placing
the byte marks at the start of the file, which would solve the problem
permanently without needing extra flags at compile time. Of course, that
also requires a code editor that recognizes the byte marks, but let's face
it: we are living in the 21st century now. :-)
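
For anyone whose editor cannot write the signature itself, something along
these lines should do it. Again this is only a sketch in Python, the
function name add_utf8_signature is my own invention, and it assumes the
file is already valid UTF-8 (it refuses to touch the file otherwise).

```python
import codecs
from pathlib import Path


def add_utf8_signature(path: Path) -> bool:
    """Prepend the UTF-8 byte mark if it is missing.

    Returns True if the file was rewritten, False if it already had one.
    """
    data = path.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        return False  # already signed, nothing to do
    # Sanity check: raises UnicodeDecodeError if the content is not UTF-8,
    # so a file in some other codepage is never mislabelled.
    data.decode("utf-8")
    path.write_bytes(codecs.BOM_UTF8 + data)
    return True
```

Running it a second time on the same file is harmless, since a file that
already starts with the signature is left alone.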

Thanks,

Jonathan Gilbert