[Mono-list] string encoding

Marcus mathpup@mylinuxisp.com
Sun, 22 Jun 2003 00:55:26 -0500


The standard marshaling provided by the runtime only permits conversion of 
System.String to "Ansi" (CharSet.Ansi) or Unicode (CharSet.Unicode). It's 
also possible to use CharSet.Auto, which selects the character type based on 
the platform. If you really want to convert System.String to UTF-8 (and not 
just Ansi), I think that you'll need to marshal your strings manually. One 
way is to call System.Text.Encoding.UTF8.GetBytes(...), allocate the 
unmanaged storage (for example with Marshal.AllocHGlobal), and copy the 
encoded bytes from the managed byte[] to the unmanaged storage, adding a 
terminating NUL if the native side expects a C string.
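
Here is a rough sketch of that manual approach. The class and method names 
(Utf8Interop, StringToUtf8, Utf8ToString) are made up for illustration; only 
the Encoding and Marshal calls are real framework APIs. It assumes the native 
code wants a NUL-terminated UTF-8 buffer and does not keep the pointer after 
the call returns.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helper class, not part of any library.
static class Utf8Interop
{
    // Encode a managed string as NUL-terminated UTF-8 in unmanaged memory.
    // The caller owns the result and must release it with Marshal.FreeHGlobal.
    public static IntPtr StringToUtf8(string s)
    {
        if (s == null)
            return IntPtr.Zero;

        byte[] bytes = Encoding.UTF8.GetBytes(s);
        IntPtr ptr = Marshal.AllocHGlobal(bytes.Length + 1);
        Marshal.Copy(bytes, 0, ptr, bytes.Length);   // managed -> unmanaged
        Marshal.WriteByte(ptr, bytes.Length, 0);     // terminating NUL
        return ptr;
    }

    // Decode a NUL-terminated UTF-8 buffer into a managed string.
    // Does not free the buffer.
    public static string Utf8ToString(IntPtr ptr)
    {
        if (ptr == IntPtr.Zero)
            return null;

        int len = 0;
        while (Marshal.ReadByte(ptr, len) != 0)      // find the terminator
            len++;

        byte[] bytes = new byte[len];
        Marshal.Copy(ptr, bytes, 0, len);            // unmanaged -> managed
        return Encoding.UTF8.GetString(bytes);
    }
}
```

You would declare the P/Invoke parameter as IntPtr, pass it the result of 
StringToUtf8, and call Marshal.FreeHGlobal on it in a finally block once the 
native call has returned.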

It's also possible to create a custom marshaler (a class implementing 
System.Runtime.InteropServices.ICustomMarshaler) to do this.
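
A custom marshaler along those lines might look roughly like the following. 
The class name Utf8StringMarshaler is invented here, and the sketch assumes 
that every native buffer it is handed was allocated with AllocHGlobal by the 
marshaler itself. If the native library returns strings that it owns, 
CleanUpNativeData must not free them, so treat this as a starting point only.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical marshaler: System.String <-> NUL-terminated UTF-8.
class Utf8StringMarshaler : ICustomMarshaler
{
    static readonly Utf8StringMarshaler instance = new Utf8StringMarshaler();

    // The runtime locates this static factory method by name.
    public static ICustomMarshaler GetInstance(string cookie)
    {
        return instance;
    }

    public IntPtr MarshalManagedToNative(object managedObj)
    {
        string s = managedObj as string;
        if (s == null)
            return IntPtr.Zero;

        byte[] bytes = Encoding.UTF8.GetBytes(s);
        IntPtr ptr = Marshal.AllocHGlobal(bytes.Length + 1);
        Marshal.Copy(bytes, 0, ptr, bytes.Length);
        Marshal.WriteByte(ptr, bytes.Length, 0);     // terminating NUL
        return ptr;
    }

    public object MarshalNativeToManaged(IntPtr pNativeData)
    {
        if (pNativeData == IntPtr.Zero)
            return null;

        int len = 0;
        while (Marshal.ReadByte(pNativeData, len) != 0)
            len++;

        byte[] bytes = new byte[len];
        Marshal.Copy(pNativeData, bytes, 0, len);
        return Encoding.UTF8.GetString(bytes);
    }

    // Assumption: the buffer came from MarshalManagedToNative above.
    public void CleanUpNativeData(IntPtr pNativeData)
    {
        if (pNativeData != IntPtr.Zero)
            Marshal.FreeHGlobal(pNativeData);
    }

    public void CleanUpManagedData(object managedObj) { }

    // -1 signals that the native size is not fixed.
    public int GetNativeDataSize() { return -1; }
}

// Usage on a P/Invoke parameter ("mylib" and print_utf8 are placeholders):
//
//   [DllImport("mylib")]
//   static extern void print_utf8(
//       [MarshalAs(UnmanagedType.CustomMarshaler,
//                  MarshalTypeRef = typeof(Utf8StringMarshaler))] string s);
```

The advantage over the manual approach is that the P/Invoke signature keeps 
taking a plain string, and the allocation and cleanup happen in one place.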


On Saturday 21 June 2003 11:33 pm, Havoc Pennington wrote:
> Hi,
>
> As best I can tell from C# docs a string is a sequence of char, and a
> char is a 16-bit Unicode character. So strings are in UCS-2
> encoding. Trying to figure out then how to marshal/unmarshal UTF-8 via
> PInvoke.