[Mono-dev] Fwd: [Mono-patches] r63710 - in trunk/mcs/class/System.Web: System.Web.UI.WebControls Test/System.Web.UI.WebControls

Atsushi Eno atsushi at ximian.com
Tue Aug 15 09:55:09 EDT 2006


Kornél Pál wrote:
> Hi,
> Atsushi Eno:
>> Is saving files in utf-8 without BOM possible in general western
>> editors land? If yes I like the idea. If not then maybe it is not
>> a good solution for us (yeah, not using non-ASCII letters is the
>> most pessimistic option).
>> (BTW I guess, with BOM you guys will get stuck, right?)
> Kornél Pál:
>> Usually I am using Windows XP that has support for UTF-8 and has no 
>> problem with BOM. For example Notepad has no support for saving UTF-8 
>> without BOM. Microsoft programs (Notepad, Visual Studio, csc, ...) can 
>> recognize UTF-8 without BOM (they try to parse the entire file as 
>> UTF-8 and they treat it as UTF-8 if it's valid UTF-8, otherwise they 
>> use the default ANSI code page). And they recognize a BOM of course. For 
>> example, Visual Studio saves a file with a BOM if it originally had 
>> one, and without a BOM if it did not.
> Miguel de Icaza:
>> Emacs can write files in UTF-8, I do not think it respects BOM though,
>> but I could be wrong.
> Jonathan Pryor:
>> vim also handles files in UTF-8 just fine.
>> Personally, I'd go for UTF-8 everywhere, but I know this has caused
>> problems before when it was attempted across the entire class library...
> Rafael Teixeira:
>> I normally use gedit for coding, and it works nicely with utf-8.
>> More recently I'm also using MonoDevelop that also deals with utf-8 in
>> the proper way.
>> BOM is just a visible space character for both editors and the
>> responsibility for preserving it is therefore in the user's hands.
> So using UTF-8 without BOM seems to be a better choice than UTF-8 with 
> BOM because some text editors handle BOM as a character and it will be 
> left in the middle of the text rather than being used in the beginning 
> of the file.
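
For reference, the detection behaviour Kornél describes for the Microsoft tools (honour a BOM if present, otherwise try the whole file as UTF-8 and fall back to the default ANSI code page) could be sketched roughly as below. This is only an illustration in Python, not what any of those tools actually run: the function names are made up, and "latin-1" merely stands in for whatever the default ANSI code page is on a given machine.

```python
import codecs
from typing import Tuple


def decode_source(data: bytes, fallback: str = "latin-1") -> Tuple[str, str, bool]:
    """Return (text, encoding_used, had_bom) for the raw bytes of a source file.

    Hypothetical helper: `fallback` is an assumed stand-in for the
    system's default ANSI code page.
    """
    # A UTF-8 BOM (EF BB BF) at the very start is taken as authoritative.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8", True
    try:
        # No BOM: accept the file as UTF-8 only if every byte sequence is valid.
        return data.decode("utf-8"), "utf-8", False
    except UnicodeDecodeError:
        # Otherwise treat it as the legacy single-byte encoding.
        return data.decode(fallback), fallback, False


def encode_source(text: str, had_bom: bool) -> bytes:
    """Write UTF-8 back out, keeping a BOM only if the file had one."""
    body = text.encode("utf-8")
    return codecs.BOM_UTF8 + body if had_bom else body
```

A file read with decode_source and written back with encode_source keeps its original BOM state, which is the Visual Studio behaviour mentioned above. A Latin1 file containing "Kornél Pál" fails the UTF-8 check (the byte for "é" is not followed by a continuation byte) and so takes the fallback path.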

Hmm, I rather read Miguel's comment as "emacs has no control over
BOM". If it does emit BOMs, we cannot expect emacs users to *not*
output them.

But if it doesn't - I love the idea of replacing Latin1 with UTF-8
(no worries; I wouldn't put my Kanji name there ;-).

Kornél, thanks for the investigation.

Atsushi Eno
