[Mono-dev] mcs default encoding: Latin1 or not

Atsushi Eno atsushi at ximian.com
Mon Aug 29 01:19:56 EDT 2005


Hi,

> We shouldn't use non-ASCII characters inside code for identifiers but we 
> can
> use other characters in strings and comments. Of course we could use ASCII
> but I think UTF-8 is a better solution.

Actually, I don't quite see the reason why you shouldn't be able to use
non-ASCII identifiers, but in general that's fine.
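For what it's worth, the C# language itself accepts Unicode letters in
identifiers, so keeping identifiers ASCII-only is a project convention
rather than a compiler restriction. A minimal sketch (the names here are
made up for illustration, and of course the file has to be saved in an
encoding the compiler actually reads, which is the whole point of this
thread):

using System;

class 例
{
	static void Main ()
	{
		// non-ASCII identifier and non-ASCII string literal
		string grüße = "こんにちは";
		Console.WriteLine (grüße);
	}
}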

> Back to vim: I think that the fact that vim has no UTF-8 support tells that
> vim is a tool from the past or the developers of vim still live in the past
> as everything around vim has UTF-8 support.

What I said is that vim *on Cygwin* does not support UTF-8. Actually,
there is nothing vim can do about it, since Cygwin itself does not
support UTF-8 console output.

> At least on Windows you can open texts in any code page, edit them and when
> you save them no characters will be corrupted so you can open it again 
> using
> the correct code page. This is true for UTF-8 as well. Some control chars

This is simply not true. As I wrote before, Japanese text editors
(including Notepad and VS.NET) usually don't support the Latin1 encoding.
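To illustrate the round-trip argument, here is a small sketch (just an
illustration, not code from mcs; the byte values are arbitrary). A
single-byte code page such as Latin1 can decode and re-encode any byte
sequence without loss, but a multi-byte code page such as Shift_JIS
cannot, so "open in any code page, edit and save" is only safe for some
encodings. (On Mono the shift_jis encoding also needs the I18N
assemblies installed.)

using System;
using System.Text;

class RoundTrip
{
	// Decode the bytes with enc, encode them again, and check
	// whether the original byte sequence survived.
	static bool Survives (byte [] original, Encoding enc)
	{
		byte [] reencoded = enc.GetBytes (enc.GetString (original));
		return Convert.ToBase64String (original) == Convert.ToBase64String (reencoded);
	}

	static void Main ()
	{
		byte [] data = { 0x82, 0xA0, 0xE9 }; // arbitrary bytes

		// Latin1 maps every byte to a character, so this prints True.
		Console.WriteLine (Survives (data, Encoding.GetEncoding ("iso-8859-1")));
		// Shift_JIS is multi-byte; invalid sequences get replaced,
		// so this will typically print False.
		Console.WriteLine (Survives (data, Encoding.GetEncoding ("shift_jis")));
	}
}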

> But I'm still sure that mcs has a bug regarding UTF-8. As non UTF-8 encoded
> files can be read using UTF8Encoding (it skips invalid characters) but mcs
> throws error for the source code that should not be caused by ignored
> comment characters.

mcs, or UTF8Encoding, seems to have a problem processing BOM-less
UTF-8 files. As people on IRC will have seen, I don't like
SeekableStreamReader and would like to remove it. In any case, it is
simply a *fact* as of today that we shouldn't expect mcs to
handle BOM-less UTF-8 sources correctly.
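For what it's worth, here is roughly what the difference looks like with
UTF8Encoding (a sketch only; the file name is hypothetical and this uses
today's .NET API, not necessarily what mcs does internally). The lenient
decoder silently replaces invalid byte sequences, which matches the "it
skips invalid characters" behaviour mentioned above, while the strict one
reports them:

using System;
using System.IO;
using System.Text;

class ReadSource
{
	static void Main ()
	{
		byte [] bytes = File.ReadAllBytes ("Sample.cs"); // hypothetical file

		// Lenient decoder: invalid byte sequences are replaced silently.
		Encoding lenient = new UTF8Encoding (false, false);
		Console.WriteLine (lenient.GetString (bytes));

		// Strict decoder: invalid byte sequences raise an exception,
		// so a BOM-less file in some other encoding fails loudly.
		Encoding strict = new UTF8Encoding (false, true);
		try {
			Console.WriteLine (strict.GetString (bytes));
		} catch (DecoderFallbackException e) {
			Console.WriteLine ("not valid UTF-8: " + e.Message);
		}
	}
}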

Atsushi Eno



