[Gtk-sharp-list] Re: [MonoDevelop] Encoding problems

Jonathan Pryor jonpryor@vt.edu
Fri, 16 Apr 2004 07:22:15 -0400

On Thu, 2004-04-15 at 17:38, Artur Brodowski wrote:
> On Thu, 15-04-2004, at 22:52, John Luke wrote: 
> > Does it work if you add -codepage:utf8 to the mcs compile line?
> Yes, it works, thanks :)
> But shouldn't that be taken care of by MonoDevelop? 
> And also - is UTF-16 a standard for Gtk# applications?

Not having used MonoDevelop yet (yes, I'm evil!), I can only guess...

I suspect the problem is the lack of a BOM (Byte Order Mark), which
would let the compiler know the encoding of the file.

UTF-16 files normally start with a BOM (the bytes 0xFE 0xFF for
big-endian, or 0xFF 0xFE for little-endian), so if the BOM is present
the compiler will know what encoding to use.
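
To make the BOM check concrete, here is a rough Python sketch of how a
tool could sniff the leading bytes of a file.  This is only an
illustration, not what mcs actually does; the name sniff_bom is made up
for this example, and it ignores UTF-32 (whose little-endian BOM also
starts with FF FE).

```python
import codecs

# BOM byte sequences to look for (UTF-32 is deliberately ignored here).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if absent."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

When this returns None, the caller has to fall back on something else,
such as an explicit option or the local codepage.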

UTF-8 doesn't require it.  That means that, without a BOM, there is no
reliable way to distinguish between a UTF-8 encoded file and a file
encoded in the local codepage.  Consequently, mcs assumes that the
local codepage is used.
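
A small Python sketch of why this goes wrong: the same bytes decode
without any error under both encodings, they just produce different
text.  ISO-8859-2 is only my assumption for the local codepage here,
and the sample string is arbitrary.

```python
text = "liście"                      # sample text with one non-ASCII char
raw = text.encode("utf-8")           # bytes as a UTF-8 editor saves them

as_utf8 = raw.decode("utf-8")        # round-trips to the original text
as_local = raw.decode("iso-8859-2")  # assumed local codepage: no error,
                                     # but every byte becomes a character

print(as_utf8 == text)    # decoding with the right encoding is lossless
print(as_local == text)   # decoding with the wrong one silently differs
```

Nothing fails in the second decode, which is why the compiler cannot
tell that it guessed wrong.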

The solution is to either tell mcs the correct codepage, which is what
-codepage:utf8 does, or to insert a UTF-8 encoded BOM at the beginning
of the file.
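
For the BOM route, something like the following Python sketch would
prepend the UTF-8 BOM (bytes EF BB BF) to a source file that lacks one.
The function name ensure_utf8_bom is hypothetical, and it assumes the
file really is UTF-8 already and is small enough to read into memory.

```python
import codecs


def ensure_utf8_bom(path: str) -> bool:
    """Prepend a UTF-8 BOM if the file lacks one. Return True if changed."""
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(codecs.BOM_UTF8):
        return False  # already marked, leave the file alone
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8 + data)
    return True
```

Running it twice on the same file is harmless, since the second call
sees the BOM and does nothing.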

Or, fix MonoDevelop so that it always passes -codepage:utf8, as it's
using GtkSourceView, which edits UTF-8 text, so it's unlikely that
another encoding would be used...

 - Jon