[Mono-dev] mcs patch for default encoding
atsushi at ximian.com
Tue Aug 23 05:50:58 EDT 2005
I don't think this is acceptable because of its significant
performance loss (reading the entire stream)...
Kornél Pál wrote:
> Character set detection.
> This code uses a UTF8Encoding with throwOnInvalidBytes. StreamReader
> detects the BOM (UTF-8, Unicode, Unicode (Big-Endian)). UTF-8 is easy to
> validate as it has strict rules regarding the byte representation of
> characters, so it's safe to assume that a text is UTF-8 if it can be
> parsed as UTF-8. UTF8Encoding (with throwOnInvalidBytes) throws
> ArgumentException when the input is not UTF-8; in that case, fall back
> to Encoding.Default.
> Unicode (16-bit) is not detected by csc.exe without BOM so I think we
> shouldn't deal with it.
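
For illustration, the detection scheme quoted above could be sketched roughly as follows. This is not the patch under discussion: the class and method names are hypothetical, BOM handling (which the quoted text leaves to StreamReader) is omitted, and the fallback encoding is passed in by the caller. Note that the sketch decodes the whole buffer before deciding, which is the cost objected to at the top of this message.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not taken from the mcs patch.
static class SourceEncodingSketch
{
    // Decode as strict UTF-8 if possible, otherwise use the fallback encoding.
    public static string Decode(byte[] bytes, Encoding fallback)
    {
        // First argument: do not emit a BOM when encoding (irrelevant here).
        // Second argument: throwOnInvalidBytes = true, so malformed input throws.
        var strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            // The entire buffer is validated here.
            return strictUtf8.GetString(bytes);
        }
        catch (ArgumentException)
        {
            // Not well-formed UTF-8: decode with the fallback instead.
            return fallback.GetString(bytes);
        }
    }
}
```

A caller following the quoted proposal would pass Encoding.Default as the fallback.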