[Mono-dev] Re: [Mono-devel-list] mcs patch for default encoding

Atsushi Eno atsushi at ximian.com
Mon Aug 22 05:13:33 EDT 2005


Hi,

> I think using 1252 as a fallback is better than UTF-8 as it is a regular
> single-byte code page. UTF-8 should be detected (and I think it is
> detected) using byte order marks anyway.
> 
> I agree that using 28591 as the default encoding is a bad decision.

This guess is "Western centric" ;-) Neither 1252 nor 28591 is a
"regular" code page. For example, almost no Japanese text editor
supports iso-8859-1 in its "save as..." feature (usually, only
shift_jis (932/ANSI), iso-2022-jp (50221) and euc-jp (51932) are
supported). I assume the situation is similar in other Asian countries.

(Thus Japanese hackers were reluctant to edit files in which someone
had written letters that exist only in iso-8859-1.)
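
To make the point concrete, here is a small Python sketch (not mcs code;
the sample string and variable names are only illustrative). It shows that
Japanese text cannot be saved as iso-8859-1 at all, and that shift_jis
bytes read through a single-byte Western code page decode without any
error but come out as different text:

```python
# Illustrative sample only: any Japanese string would do.
text = "日本語"

# Bytes as a Japanese editor would typically save them.
sjis_bytes = text.encode("shift_jis")

# iso-8859-1 has no code points for these characters, so saving fails.
try:
    text.encode("iso-8859-1")
    representable_in_latin1 = True
except UnicodeEncodeError:
    representable_in_latin1 = False

# Every byte value is valid iso-8859-1, so a wrong fallback never raises;
# it silently produces the wrong characters instead.
misread = sjis_bytes.decode("iso-8859-1")

# utf-8 can represent the text and round-trips it unchanged.
utf8_roundtrip = text.encode("utf-8").decode("utf-8")
```

The silent misread is the troublesome part: with a single-byte fallback
the compiler gets no signal that the guess was wrong.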

UTF-8 should be detected, but that cannot be done perfectly (automatic
encoding detection is not always possible), and in fact we don't
support any auto detection beyond BOM lookup (csc seems to handle
them fine). This could be fixed, but as things stand it does not work.

So I think using utf-8 as the default would make more sense.
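
As a rough sketch of what "BOM lookup, then fall back to a default" means,
here is a Python version. This is not the mcs implementation; the names
pick_encoding and decode_source are hypothetical, and the utf-8 default is
simply the choice argued for above:

```python
import codecs

# BOM table. UTF-32 must be checked before UTF-16, because the UTF-32 LE
# BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def pick_encoding(data: bytes, fallback: str = "utf-8") -> str:
    """Return a codec name from a leading BOM, else the fallback."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    # No BOM: nothing further is detected, we just trust the fallback.
    return fallback

def decode_source(data: bytes, fallback: str = "utf-8") -> str:
    """Decode source bytes; these codecs strip the BOM themselves."""
    return data.decode(pick_encoding(data, fallback))
```

With this shape, the whole disagreement in this thread comes down to the
value of the fallback argument, e.g. pick_encoding(data, "cp1252") versus
pick_encoding(data, "utf-8").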

> What about using Encoding.Default instead of
> CultureInfo.CurrentCulture.TextInfo.ANSICodePage as it is really based on
> system code page?

Yeah, I'd change it that way.

Atsushi Eno


