[Mono-dev] Marshalling CharSet=CharSet.Unicode under Linux.

Jonathan Pryor jonpryor at vt.edu
Thu Aug 16 07:09:40 EDT 2007


On Thu, 2007-08-09 at 10:15 -0400, Gary M. Smithrud wrote:
> I have several unmanaged functions defined in libraries that use UTF-32
> under Linux.  Marshalling does not appear to be working because it is
> providing strings as UTF-16.  This is based upon the test code I found
> with the Mono source.  This appears to be completely wrong and makes
> Marshalling pointless.  UTF-32 is the native Unicode format under UNIX
> (it is under HPUX, Solaris, AIX (64-bit), and Linux) and thus the
> Marshalling should provide strings in that format…or am I missing
> something?

Surprising as this may be, not all Unix libraries use UTF-32. :-)

(In particular, Qt uses UTF-16 strings, and I believe Mozilla does as
well.  There are likely others.)

Furthermore, keeping CharSet.Unicode as UTF-16 has a performance
advantage: managed strings are already UTF-16, so no conversion is
necessary during the P/Invoke.  The runtime need only pass a pointer to
the (pinned) managed string instead of copying the entire string.
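
To make the size difference concrete, here is a minimal sketch that
compares the byte counts of the same text in both encodings.  The class
and method names (EncodingSizes, Utf16Bytes, Utf32Bytes) are made up for
illustration; only the System.Text.Encoding calls are real API.

```csharp
using System;
using System.Text;

static class EncodingSizes
{
    // Bytes needed for s in UTF-16, the in-memory form of a managed string.
    public static int Utf16Bytes(string s)
    {
        return Encoding.Unicode.GetByteCount(s);
    }

    // Bytes needed for the same text in UTF-32; producing this form
    // always requires converting into a separate buffer.
    public static int Utf32Bytes(string s)
    {
        return Encoding.UTF32.GetByteCount(s);
    }

    static void Main()
    {
        string s = "mono";
        Console.WriteLine(Utf16Bytes(s)); // 8
        Console.WriteLine(Utf32Bytes(s)); // 16
    }
}
```

A UTF-32 callee therefore cannot be handed the managed string's own
buffer; something has to allocate and fill a second one.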

When you need UTF-32, there are two solutions: manually marshal the
string, or use a custom marshaler.  Both rely on
Mono.Unix.UnixMarshal.StringToHeap():

http://www.go-mono.org/docs/index.aspx?tlink=0@ecma%3a106%23UnixMarshal%2fM%2fStringToHeap%2f14
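
As a rough sketch of what the manual route involves, the following does
the copy by hand with System.Runtime.InteropServices.Marshal rather than
UnixMarshal.StringToHeap(), so that it does not depend on Mono.Posix.
The names Utf32Manual, StringToUtf32 and NativeLength are hypothetical.
Calling wcslen() from "libc" is only a convenient stand-in for a real
UTF-32 library and assumes a Linux system where wchar_t is 32 bits.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class Utf32Manual
{
    // Stand-in native function (assumption: Linux, 32-bit wchar_t,
    // C library reachable under the name "libc").
    [DllImport("libc")]
    static extern UIntPtr wcslen(IntPtr s);

    // Copy a managed string into unmanaged memory as null-terminated
    // UTF-32 in native byte order.  The caller must release the
    // result with Marshal.FreeHGlobal.
    public static IntPtr StringToUtf32(string s)
    {
        var utf32 = new UTF32Encoding(!BitConverter.IsLittleEndian, false);
        byte[] bytes = utf32.GetBytes(s);
        IntPtr mem = Marshal.AllocHGlobal(bytes.Length + 4);
        Marshal.Copy(bytes, 0, mem, bytes.Length);
        Marshal.WriteInt32(mem, bytes.Length, 0); // 4-byte terminator
        return mem;
    }

    // Marshal by hand: convert, call, then free in a finally block.
    public static int NativeLength(string s)
    {
        IntPtr mem = StringToUtf32(s);
        try {
            return (int) wcslen(mem).ToUInt64();
        } finally {
            Marshal.FreeHGlobal(mem);
        }
    }

    static void Main()
    {
        Console.WriteLine(NativeLength("hello"));
    }
}
```

The cost of this approach is that every call site has to repeat the
convert/call/free pattern, which is what a custom marshaler avoids.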

A demonstration of manual marshaling vs. custom marshaling for UTF-32
strings is at:

    http://lists.ximian.com/pipermail/mono-list/2007-July/035633.html
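
For comparison, a custom marshaler might look roughly like the sketch
below.  It is not the code from the linked message: Utf32Marshaler and
Demo are hypothetical names, the conversion is done with plain Marshal
calls, and wcslen() from "libc" again stands in for a real UTF-32
function under the assumption of a Linux system with 32-bit wchar_t.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical marshaler that passes string arguments as UTF-32.
class Utf32Marshaler : ICustomMarshaler
{
    static readonly Encoding utf32 =
        new UTF32Encoding(!BitConverter.IsLittleEndian, false);
    static readonly Utf32Marshaler instance = new Utf32Marshaler();

    // The runtime obtains the marshaler through this static method.
    public static ICustomMarshaler GetInstance(string cookie)
    {
        return instance;
    }

    // Managed -> native: allocate a null-terminated UTF-32 copy.
    public IntPtr MarshalManagedToNative(object managedObj)
    {
        string s = managedObj as string;
        if (s == null)
            return IntPtr.Zero;
        byte[] bytes = utf32.GetBytes(s);
        IntPtr mem = Marshal.AllocHGlobal(bytes.Length + 4);
        Marshal.Copy(bytes, 0, mem, bytes.Length);
        Marshal.WriteInt32(mem, bytes.Length, 0);
        return mem;
    }

    // Called after the native call to release the copy.
    public void CleanUpNativeData(IntPtr nativeData)
    {
        if (nativeData != IntPtr.Zero)
            Marshal.FreeHGlobal(nativeData);
    }

    // Native -> managed: scan for the 32-bit terminator, then decode.
    public object MarshalNativeToManaged(IntPtr nativeData)
    {
        if (nativeData == IntPtr.Zero)
            return null;
        int len = 0;
        while (Marshal.ReadInt32(nativeData, len * 4) != 0)
            len++;
        byte[] bytes = new byte[len * 4];
        Marshal.Copy(nativeData, bytes, 0, bytes.Length);
        return utf32.GetString(bytes);
    }

    public void CleanUpManagedData(object managedObj) {}

    public int GetNativeDataSize() { return -1; }
}

static class Demo
{
    // The call site now takes a plain string; the conversion and the
    // cleanup happen inside the marshaler.
    [DllImport("libc")]
    static extern UIntPtr wcslen(
        [MarshalAs(UnmanagedType.CustomMarshaler,
                   MarshalTypeRef = typeof(Utf32Marshaler))]
        string s);

    static void Main()
    {
        Console.WriteLine(wcslen("hello"));
    }
}
```

The conversion code is the same as in the manual case; the difference
is that it is written once and attached to the parameter declaration.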

 - Jon