[Mono-dev] mcs default encoding: Latin1 or not

Fri Aug 26 01:49:00 EDT 2005

Hi,

> If you don't like ISO 28591 because it's foreign, why do you want to use
> ASCII in source files?:)

Well, ASCII is not foreign to Japanese users. None of iso-2022-jp,
shift_jis or euc-jp contradicts ASCII; it is actually a part of
those encodings.

I know there used to be non-ASCII-based encodings such as Indian
ISCII-7, Arabic ASMO 449, Bangladeshi BDS 1520:1995 etc., but I
don't know of any modern encoding that contradicts ASCII (I don't
think it is possible to publish world-ready applications with
those encodings).

So AFAIK ASCII is safe: it is the GCM (greatest common measure) for
all of us. That is not the case for Latin1.
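
To illustrate the point, here is a small sketch in Python (used only for illustration; it has nothing to do with how mcs works). The codec names are Python's own, and the sample strings and the helper can_decode are made up for this example. It checks that an ASCII-only line has the same bytes in all of these encodings, and then tries to read a single Latin1 letter back with the other decoders.

```python
# ASCII-only source text: encode it with several encodings and compare bytes.
ascii_text = "class Foo { }"
encodings = ["ascii", "iso2022_jp", "shift_jis", "euc_jp", "utf-8", "latin-1"]

ascii_bytes = {enc: ascii_text.encode(enc) for enc in encodings}
# True when every encoding produced exactly the same byte sequence.
all_same = len(set(ascii_bytes.values())) == 1


def can_decode(data, enc):
    """Hypothetical helper: report whether `data` decodes cleanly as `enc`."""
    try:
        data.decode(enc)
        return True
    except UnicodeDecodeError:
        return False


# A Latin1-only letter: "é" is the single byte 0xE9 in Latin1.
latin1_bytes = "é".encode("latin-1")
latin1_ok = {enc: can_decode(latin1_bytes, enc)
             for enc in ["utf-8", "euc_jp", "shift_jis"]}

print("ASCII bytes identical everywhere:", all_same)
print("Latin1 byte readable as:", latin1_ok)
```

The ASCII line survives in every encoding, while the Latin1 byte is only meaningful to a reader that already knows the file is Latin1.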

> I personally hate the fact of having code pages but this has historical
> reasons. I think UTF-8 is a good solution as it is international,
> culture-neutral and ASCII compatible.
> 
> I think we are living in the age of Unicode. So there is no reason to use
> ASCII. It's OK to use only ASCII in identifiers and use English in comments
> and texts but I don't think we shouldn't take advantage of Unicode. We can
> use it for names for example.

Can we edit UTF-8 files in vim on Cygwin? No. That fact alone tells
us that we are not yet living in the age of Unicode.

I heard a story: there was a Japanese or Chinese person who used
Chinese characters in his (or her) blog, which was aggregated
somewhere (I don't remember the details), and that person got
blamed for using Chinese, even though the blog was written in the
UTF-8 encoding.

> I think mcs should use Encoding.Default as default encoding as I think this
> is nearest to the user's need and provides compatibility with csc.exe.

> But we should use UTF-8 without signature (BOM) for our .cs source code
> files and explicitly specify for mcs to use UTF-8.

Why? I think we *should* use a BOM since, as we discussed before,
neither mcs nor csc autodetects the encoding correctly.
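
As a rough sketch of why the signature helps (again in Python, only for illustration): the reader can check the first three bytes instead of guessing. This is not what mcs or csc actually do. The function names sniff_utf8_bom and decode_source and the fallback parameter are hypothetical, invented for this example.

```python
import codecs


def sniff_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 signature EF BB BF."""
    return data.startswith(codecs.BOM_UTF8)


def decode_source(data: bytes, fallback: str) -> str:
    """Hypothetical reader: trust the BOM if present, else use `fallback`."""
    if sniff_utf8_bom(data):
        # Strip the 3-byte signature before decoding the rest as UTF-8.
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    # Without a signature the reader has to rely on the configured default.
    return data.decode(fallback)


comment = "// 日本語"
with_bom = comment.encode("utf-8-sig")   # Python's codec that writes the BOM
without_bom = comment.encode("utf-8")

print(sniff_utf8_bom(with_bom), sniff_utf8_bom(without_bom))
print(decode_source(with_bom, "latin-1") == comment)
print(decode_source(without_bom, "latin-1") == comment)
```

With the BOM the text comes back intact whatever the fallback is; without it the result depends entirely on the default encoding the reader happens to use.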

Here I guess you are thinking that BOM-less UTF-8 sources could be
edited in Latin1 editors. But what happens if I put CJK ideographs
in them? Actually all (really all) of us Japanese hackers said that
we feel reluctant to edit files that contain Latin1 letters,
because our usual editors do not support Latin1 (not even as a
candidate encoding when saving a file).
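
A quick sketch of the CJK case (Python, for illustration only; the sample source line is made up): this is roughly what a Latin1-only editor would show for a BOM-less UTF-8 file that contains ideographs.

```python
# A made-up C# source line containing two CJK ideographs.
source_line = 'string s = "漢字";'

# Saved as UTF-8 without a signature ...
utf8_bytes = source_line.encode("utf-8")

# ... and opened by an editor that assumes Latin1.
seen_in_latin1_editor = utf8_bytes.decode("latin-1")

# Each ideograph is 3 bytes in UTF-8, so it shows up as 3 unrelated characters.
extra_chars = len(seen_in_latin1_editor) - len(source_line)

# The bytes survive if the editor writes them back untouched,
# but the ideographs themselves are unreadable and uneditable.
round_trip = seen_in_latin1_editor.encode("latin-1") == utf8_bytes

print(repr(seen_in_latin1_editor))
print("extra characters shown:", extra_chars)
print("bytes preserved on save:", round_trip)
```

So the file may survive a save, but nobody can read or change the ideographs in such an editor.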

Atsushi Eno