[Mono-dev] Re: [Mono-devel-list] mcs patch for default encoding

Mon Aug 22 06:31:27 EDT 2005

In that case there is no need to provide a fallback case in mcs either. :)
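
The selection order this thread converges on (BOM lookup first, then the
system default code page, with UTF-8 when that code page is unsupported) can
be sketched roughly as below. This is only a Python illustration, not the mcs
code: the function name pick_source_encoding and the BOM table are
hypothetical, and locale.getpreferredencoding is assumed here as a stand-in
for Encoding.Default.

```python
import codecs
import locale
from typing import Optional

# BOMs checked by this sketch (hypothetical table, not the list mcs uses).
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def pick_source_encoding(data: bytes, system_default: Optional[str] = None) -> str:
    """Return the codec name to use for decoding a source file."""
    # 1. BOM lookup: the only automatic detection discussed in the thread.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name

    # 2. Otherwise use the system default code page
    #    (assumed analogue of Encoding.Default).
    if system_default is None:
        system_default = locale.getpreferredencoding(False)

    # 3. If that code page is not supported, fall back to UTF-8.
    try:
        codecs.lookup(system_default)
    except LookupError:
        return "utf-8"
    return system_default
```

With this order a separate hard-coded fallback (1252 or 28591) in the
compiler is not needed, since the last step already covers the unsupported
case.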

Kornél

----- Original Message -----
From: "Atsushi Eno" <atsushi at ximian.com>
To: "Kornél Pál" <kornelpal at hotmail.com>
Cc: "mono-devel mailing list" <mono-devel-list at lists.ximian.com>
Sent: Monday, August 22, 2005 12:15 PM
Subject: Re: [Mono-dev] Re: [Mono-devel-list] mcs patch for default encoding


> Hi,
>
> Ok, then now I'll stop pushing shift_jis as the default (well,
> I didn't ;-) and just use Encoding.Default. BTW, if the system code
> page is unsupported, this property falls back to returning UTF8.
>
> Atsushi Eno
>
> Kornél Pál wrote:
>> 1252 is far from Hungarian as well, although I think not as far as from
>> Japanese. :) But normally Encoding.Default will be used, which depends on
>> the hacker. If he likes Japanese code pages he can set them on the system
>> and they will be used by mcs. The second case is only a fallback. And I
>> think using a simple SBCS Latin code page is better.
>>
>> Kornél
>>
>> ----- Original Message -----
>> From: "Atsushi Eno" <atsushi at ximian.com>
>> To: "Kornél Pál" <kornelpal at hotmail.com>
>> Cc: "mono-devel mailing list" <mono-devel-list at lists.ximian.com>
>> Sent: Monday, August 22, 2005 11:13 AM
>> Subject: Re: [Mono-dev] Re: [Mono-devel-list] mcs patch for default
>> encoding
>>
>>
>>> Hi,
>>>
>>>> I think using 1252 as a fallback is better than UTF-8 as it is a
>>>> regular single-byte code page. UTF-8 should be detected (and I think
>>>> it is detected) using byte order marks anyway.
>>>>
>>>> I agree that using 28591 as the default encoding is a bad decision.
>>>
>>> This guess is "Western centric" ;-) Neither 1252 nor 28591 is a
>>> "regular" code page. For example, almost all Japanese text editors
>>> do not support iso-8859-1 in their "save as..." feature (usually, only
>>> shift_jis (932/ANSI), iso-2022-jp (50221) and euc-jp (50932) are
>>> supported). I assume this situation is common to other Asian nations.
>>>
>>> (Thus Japanese hackers were reluctant to edit files in which
>>> someone had written iso-8859-1-only letters.)
>>>
>>> UTF-8 should be detected, but this cannot be done perfectly (automatic
>>> encoding detection is not always possible), and actually we don't
>>> support any auto detection beyond BOM lookup (csc seems to handle
>>> them fine). This could be fixed, but in reality it's not working.
>>>
>>> So I think using utf-8 as the default would make better sense.
>>>
>>>> What about using Encoding.Default instead of
>>>> CultureInfo.CurrentCulture.TextInfo.ANSICodePage, as it is really
>>>> based on the system code page?
>>>
>>> Yeah, I'd change them as such.
>>>
>>> Atsushi Eno
>>> _______________________________________________
>>> Mono-devel-list mailing list
>>> Mono-devel-list at lists.ximian.com
>>> http://lists.ximian.com/mailman/listinfo/mono-devel-list
>>>
>>
>
>