[Mono-devel-list] Marshalling UTF-8 strings

Jonathan Pryor jonpryor at vt.edu
Thu Jul 8 19:25:52 EDT 2004


On Wed, 2004-07-07 at 07:54, Hemanth Yamijala wrote:
> Currently, there is support in the Marshal and related classes
> to marshal and de-marshal strings to Unicode, Ansi or platform
> default encodings. Is there any chance that UTF-8 could be added to
> this list? Or is it already supported?

No, it is not currently supported.  Supporting it would require adding a
member to the System.Runtime.InteropServices.CharSet enumeration.  IIRC,
there were discussions with Microsoft about adding additional values,
but Microsoft didn't want to.

> In some earlier messages in the newsgroup there are examples
> which use the native g_utf16_to_utf8 function for this,
> but as UTF-8 is a standard enough encoding, can there be
> more direct support for it?

There's always the System.Text.UTF8Encoding class (also reachable
through the System.Text.Encoding.UTF8 property).  This would allow you
to stay within .NET and remain portable, instead of directly invoking
g_utf16_to_utf8.
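
A minimal sketch of that managed-only approach, for illustration.  The
Utf8Marshal class and its method names are hypothetical (they are not
part of Mono or Gtk#); only the System.Text and
System.Runtime.InteropServices calls are real.  It assumes the native
side expects a NUL-terminated UTF-8 buffer.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helper, not part of Mono or Gtk#.
public static class Utf8Marshal
{
    // Copies a managed string into unmanaged memory as NUL-terminated
    // UTF-8.  The caller must release the result with Marshal.FreeHGlobal.
    public static IntPtr StringToHGlobalUtf8(string s)
    {
        if (s == null)
            return IntPtr.Zero;

        byte[] bytes = Encoding.UTF8.GetBytes(s);
        IntPtr ptr = Marshal.AllocHGlobal(bytes.Length + 1);
        Marshal.Copy(bytes, 0, ptr, bytes.Length);
        Marshal.WriteByte(ptr, bytes.Length, 0);  // terminating NUL
        return ptr;
    }

    // Reads a NUL-terminated UTF-8 string back from unmanaged memory.
    // Does not free the buffer.
    public static string PtrToStringUtf8(IntPtr ptr)
    {
        if (ptr == IntPtr.Zero)
            return null;

        // Find the terminating NUL to learn the byte length.
        int len = 0;
        while (Marshal.ReadByte(ptr, len) != 0)
            len++;

        byte[] bytes = new byte[len];
        Marshal.Copy(ptr, bytes, 0, len);
        return Encoding.UTF8.GetString(bytes);
    }
}
```

With something like this, the native function would be declared as
taking or returning IntPtr rather than string, so the encoding no longer
depends on what CharSet.Ansi means on the platform.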

> Do the GTK# functions which require UTF-8 strings use the
> native functions to achieve this functionality?

From my quick perusal of the Gtk# sources, Gtk# assumes that Ansi is
UTF-8.  GLib.Marshaller.PtrToStringGFree() uses
System.Runtime.InteropServices.Marshal.PtrToStringAnsi(), so
CharSet.Ansi is assumed.

Furthermore, the DllImport declarations just declare `string' as the
parameter type, so the default marshalling, CharSet.Ansi, will be used
in the managed->unmanaged transition.

This assumption should hold on most current Linux distros, since UTF-8
should be the default encoding there.  I'm not sure what it will do on
Windows, where the Ansi code page is generally not UTF-8, though I
assume it won't be good...

 - Jon




