[Mono-list] Trouble with utf-16 marshaling

Maser, Dan Dan.Maser at inin.com
Fri Jun 29 18:23:05 EDT 2007


   I have debugged this some more and found the following (though I'm
not yet sure how to turn this information into something actionable).
 
I was browsing some of the mono source code and found this function (and
its sisters):
      MonoString* mono_string_new_utf16 (MonoDomain *domain,
                                         const guint16 *text, gint32 len);
 
which seem to be the function(s) that initialize internal C# strings
from C data.  This one in particular appears to be invoked when internal
C# strings are created from UTF-16 "C" data.   I hacked in a simple loop
that printf'd the hex values of the UTF-16 data (the 'text' parameter).
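A loop of that kind could look roughly like the sketch below.  This is
not the code that was actually hacked into Mono: the helper names
format_utf16_hex and dump_utf16 are hypothetical, and uint16_t/int32_t
stand in for Mono's guint16/gint32.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format `len` 16-bit units as 4-digit hex words
 * separated by two spaces. Returns the number of characters written,
 * not counting the terminating NUL. Stops early if `out` is full. */
int format_utf16_hex(const uint16_t *text, int32_t len,
                     char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return 0;
    out[0] = '\0';

    for (int32_t i = 0; i < len; i++) {
        int n = snprintf(out + used, out_size - used,
                         i == 0 ? "%04x" : "  %04x", (unsigned)text[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[used] = '\0';   /* drop the partially written word */
            break;
        }
        used += (size_t)n;
    }
    return (int)used;
}

/* Hypothetical helper: print a dump shaped like the DBG lines quoted
 * in this message. */
void dump_utf16(const uint16_t *text, int32_t len)
{
    char buf[256];

    format_utf16_hex(text, len, buf, sizeof buf);
    printf("DBG: invocation of mono_string_new_utf16 with data:\n"
           "    %s\n", buf);
}
```

Calling dump_utf16 (text, len) at the top of the function under
inspection is enough to see the raw code units before any conversion
happens.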
 
  What I see in my console window is interesting.  (After a bunch of
unrelated stuff) I see my C library returning a UTF-16 string that gets
correctly interpreted as a MonoString:
 
    DBG: invocation of mono_string_new_utf16 with data:
                   002f  0068  006f  006d  0065  002f  0064  0061  006e
                   006d  002f  0069  006e  0074 ...
 
which is the correct string.  The next thing I see in the console window
is this:
 
    DBG: invocation of mono_string_new_utf16 with data:
                   682f  6d6f  2f65  6164  6d6e  692f  746e
 
Notice that the second dump is similar to the first: each 2-byte unit in
the second corresponds to *4* bytes (two units) of the first, with the
zero bytes dropped and the remaining two bytes re-ordered as if there
were some endian issue.  For example, 002f 0068 in the first dump shows
up as 682f in the second.  Clearly this second string is supposed to be
the same as the first string but it's been damaged by some translation
process.
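One possible reading of the two dumps, offered as a guess rather than a
diagnosis: the first dump spells "/home/danm/int" in UTF-16, and the
second is what you get if that same text is held as 8-bit characters and
those bytes are then read as little-endian 16-bit units.  The sketch
below only reproduces that arithmetic; bytes_as_utf16le is a
hypothetical helper, not a Mono function, and it says nothing about
where in the marshaling path such a narrowing might happen.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reinterpret a byte string as little-endian
 * 16-bit units, two bytes per unit. Writes at most out_cap units and
 * returns how many were written. A trailing odd byte is ignored. */
size_t bytes_as_utf16le(const char *bytes, size_t nbytes,
                        uint16_t *out, size_t out_cap)
{
    size_t n = 0;

    for (size_t i = 0; i + 1 < nbytes && n < out_cap; i += 2) {
        uint16_t lo = (unsigned char)bytes[i];      /* first byte  */
        uint16_t hi = (unsigned char)bytes[i + 1];  /* second byte */
        out[n++] = (uint16_t)(lo | (hi << 8));
    }
    return n;
}
```

Feeding it the 14 bytes of "/home/danm/int" gives seven units, the
first of which is 0x682f ('/' is 0x2f, 'h' is 0x68), matching the start
of the second dump.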
 
   Does that information mean anything to anyone?   As always, thanks
for any help.
        Dan Maser.


________________________________

From: Maser, Dan 
Sent: Friday, June 29, 2007 1:10 PM
To: Maser, Dan; 'mono-list at lists.ximian.com'
Subject: RE: [Mono-list] Trouble with utf-16 marshaling


   Furthermore, I see in the Mono source code that there is a test
function in mono/mono/tests/libtest.c:
 
STDCALL unsigned short*
test_lpwstr_marshal (unsigned short* chars, long length)
{
...
}
 
   which is basically the same thing I'm doing, further indicating that
this should work.

________________________________

From: mono-list-bounces at lists.ximian.com
[mailto:mono-list-bounces at lists.ximian.com] On Behalf Of Maser, Dan
Sent: Friday, June 29, 2007 9:13 AM
To: mono-list at lists.ximian.com
Subject: [Mono-list] Trouble with utf-16 marshaling




   My situation is this:  I've got a C library that has a lot of UTF-16
inputs and outputs.  The C type is always "unsigned short*" or "const
unsigned short*" (wchar_t* clearly isn't portable, since wchar_t is 4
bytes on Linux).   All of my C# code has the
"[MarshalAs(UnmanagedType.LPWStr)]" attribute specified.
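One consequence of using "unsigned short*" instead of wchar_t* is that
the wcs* functions can't be used on these buffers, so helpers have to
be written by hand.  A minimal sketch of a length helper, assuming
unsigned short is 16 bits on the target platform; the name utf16_len is
hypothetical and not taken from my library:

```c
#include <stddef.h>

/* Hypothetical helper: count the 16-bit code units before the
 * terminating 0. This counts units, not characters: a surrogate pair
 * counts as two. */
size_t utf16_len(const unsigned short *s)
{
    size_t n = 0;

    while (s[n] != 0)
        n++;
    return n;
}
```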

   It works properly on Windows with MS .NET, but doesn't work for me on
Linux with Mono.   I've verified in gdb that the C library is returning
the correct string, but immediately after the C DLL returns and Mono
does the LPWStr marshaling, the string is total garbage.   I am under
the impression from previous posts that 2-byte UTF-16 should marshal
properly to Mono with the LPWStr attribute.  In fact it looks like some
of the gdiplus calls use that same thing and work... any ideas what I
can check, since mine doesn't?

   For more clarification my C library has a function signature like
this: 

void my_function(unsigned short* myArg); 

    And my C# code looks like this: 


[DllImport("myCLib")]
public static extern void my_function(
    [MarshalAs(UnmanagedType.LPWStr)] string myArg);

   Thanks in advance for any ideas on what to check! 
      Dan Maser 


