[Mono-devel-list] Marshalling UTF-8 strings
Jonathan Pryor
jonpryor at vt.edu
Thu Jul 8 19:25:52 EDT 2004
On Wed, 2004-07-07 at 07:54, Hemanth Yamijala wrote:
> Currently, there is support in the Marshal and related classes
> to marshal and de-marshal strings to Unicode, Ansi or platform
> default encodings. Is there any chance that UTF-8 gets added to this
> list? Or is it already supported?
No, it isn't supported. Adding it would require a new member in the
System.Runtime.InteropServices.CharSet enumeration. IIRC, there were
discussions with Microsoft about adding values, but Microsoft didn't
want to.
> In some earlier messages in the newsgroup there're examples
> which are using the native g_utf16_to_utf8 functions for this
> - but as UTF-8 is a standard enough encoding, can there be
> more direct support for it.
There's always the System.Text.UTF8Encoding class (also reachable as
System.Text.Encoding.UTF8). This would allow you to stay within .NET and
portable, instead of directly invoking g_utf16_to_utf8.
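
As a rough sketch of what that could look like: the helper below encodes a
string with Encoding.UTF8 and copies it into unmanaged memory, and reads a
NUL-terminated UTF-8 buffer back into a string. The class and method names
(Utf8Marshal, StringToHGlobalUtf8, PtrToStringUtf8) are made up for
illustration; only the Marshal and Encoding calls are real framework APIs.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helper, not part of Mono or Gtk#.
static class Utf8Marshal
{
    // Copies a managed string into unmanaged memory as NUL-terminated UTF-8.
    // The caller must release the result with Marshal.FreeHGlobal.
    public static IntPtr StringToHGlobalUtf8(string s)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(s);
        IntPtr p = Marshal.AllocHGlobal(bytes.Length + 1);
        Marshal.Copy(bytes, 0, p, bytes.Length);
        Marshal.WriteByte(p, bytes.Length, 0); // terminating NUL
        return p;
    }

    // Reads a NUL-terminated UTF-8 buffer into a managed string.
    // Does not free the buffer.
    public static string PtrToStringUtf8(IntPtr p)
    {
        if (p == IntPtr.Zero)
            return null;

        int len = 0;
        while (Marshal.ReadByte(p, len) != 0)
            len++;

        byte[] bytes = new byte[len];
        Marshal.Copy(p, bytes, 0, len);
        return Encoding.UTF8.GetString(bytes);
    }
}
```

Note that a string containing an embedded '\0' will come back truncated,
since the reader stops at the first NUL byte.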
> Do the GTK# functions which require UTF-8 strings use the
> native functions to achieve this functionality ?
From my quick perusal of the Gtk# sources, Gtk# assumes that Ansi is
UTF-8. GLib.Marshaller.PtrToStringGFree() uses
System.Runtime.InteropServices.Marshal.PtrToStringAnsi(), so
CharSet.Ansi is assumed.
Furthermore, DllImport statements just declare `string' as the parameter
type, so CharSet.Ansi will be used in the managed->unmanaged transition.
This should be correct for most current Linux distros, since UTF-8
should be the default encoding. I'm not sure what this will do on
Windows, though I assume it won't be good...
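
If you don't want to depend on what CharSet.Ansi means on a given
platform, one option is to keep `string' out of the DllImport signature
and pass the UTF-8 bytes yourself. The following is only a sketch under
some assumptions: it targets a glibc-based Linux system where the C
library is named libc.so.6, it uses strlen purely as a convenient native
function to call, and the NativeUtf8 class and its methods are
hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical example class, not taken from Gtk#.
static class NativeUtf8
{
    // Assumption: glibc-based Linux, where the C library is "libc.so.6".
    // Declaring the parameter as byte[] bypasses CharSet-based string
    // marshalling, so the bytes reach native code exactly as encoded.
    [DllImport("libc.so.6")]
    static extern UIntPtr strlen(byte[] utf8);

    // Encodes s as UTF-8 and appends the terminating NUL that C expects.
    public static byte[] ToNulTerminatedUtf8(string s)
    {
        int count = Encoding.UTF8.GetByteCount(s);
        byte[] buffer = new byte[count + 1]; // last byte stays 0
        Encoding.UTF8.GetBytes(s, 0, s.Length, buffer, 0);
        return buffer;
    }

    // Returns the length in bytes that native code sees for s.
    public static int NativeLength(string s)
    {
        return (int) strlen(ToNulTerminatedUtf8(s)).ToUInt32();
    }
}
```

With this, NativeUtf8.NativeLength("\u00e9") should report 2 on such a
system, since U+00E9 takes two bytes in UTF-8, independent of the
process locale.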
- Jon