[Mono-bugs] [Bug 75374][Wis] New - P/Invoke calls with CharSet=CharSet.Unicode do not properly convert returned string

bugzilla-daemon at bugzilla.ximian.com bugzilla-daemon at bugzilla.ximian.com
Fri Jun 24 19:17:18 EDT 2005

Please do not reply to this email- if you want to comment on the bug, go to the
URL shown below and enter your comments there.

Changed by chastamar at yahoo.com.


--- shadow/75374	2005-06-24 19:17:18.000000000 -0400
+++ shadow/75374.tmp.4554	2005-06-24 19:17:18.000000000 -0400
@@ -0,0 +1,65 @@
+Bug#: 75374
+Product: Mono: Runtime
+Version: 1.1
+OS Details: Linux 2.6 x86
+Status: NEW   
+Priority: Wishlist
+Component: interop
+AssignedTo: mono-bugs at ximian.com                            
+ReportedBy: chastamar at yahoo.com               
+QAContact: mono-bugs at ximian.com
+TargetMilestone: ---
+Summary: P/Invoke calls with CharSet=CharSet.Unicode do not properly convert returned string
+Please fill in this template when reporting a bug, unless you know what you
+are doing.
+Description of Problem:
+When CharSet=CharSet.Unicode is specified in the DllImport attribute of a
+P/Invoke method, the expected behaviour is for string parameters and the
+return value to be marshalled to/from Unicode (UTF-16). However, Mono seems
+to ignore this attribute for the return value and resorts to the default
+string conversion.
+I will attach a test case which includes:
+1. A .c file with a simple function (unicode_str_func) that just copies the
+given string as-is to a newly-allocated buffer and returns it.
+2. A .cs file which calls unicode_str_func (with DllImport specifying
+CharSet=CharSet.Unicode). The original and returned strings are printed.
+The returned string is truncated after the first character, a sign that the
+wrong (8-bit, possibly UTF-8) conversion has occurred.
+Steps to reproduce the problem:
+1. Compile the attached .cs file with:
+mcs -codepage:utf8 TestUnicodeMarshalling.cs
+2. Compile the attached .c file with:
+gcc -shared TestUnicodeFunc.c -o unicode_str_func.so
+3. Run (with LD_LIBRARY_PATH including "."):
+mono TestUnicodeMarshalling.exe
+4. Watch and enjoy :-)
+Actual Results:
+The returned unicode string is converted as an 8-bit string.
+Expected Results:
+The returned string should be converted as UTF-16 (which amounts to copying
+it verbatim, as UTF-16 is Mono's native string encoding).
+How often does this happen? 
+All the time.
+Additional Information:
+ECMA-335 says: (14.5.2, asterisks added)
+"The attributes ansi, autochar, and unicode are mutually exclusive.  They
+govern how strings will be marshaled for calls to this method: ansi
+indicates that the native code will receive (and possibly ***return***) a
+platform-specific representation that corresponds to a string encoded in
+the ANSI character set (typically this would match the representation of a
+C or C++ string constant); autochar indicates a platform-specific
+representation that is “natural” for the underlying platform; and unicode
+indicates a platform-specific representation that corresponds to a string
+encoded for use with Unicode methods on that platform."
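
The attached test case is not reproduced in this message. As a rough illustration only, the native side described in the report might look like the sketch below. The uint16_t representation of a UTF-16 code unit and the helper names utf16_len and narrow_len are assumptions made for this example, not taken from the attachment. narrow_len shows why an 8-bit conversion would stop after the first character: for ASCII-range text on a little-endian machine such as x86, the second byte of the first UTF-16 code unit is zero.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Length in UTF-16 code units, excluding the 16-bit terminator. */
static size_t utf16_len(const uint16_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Hypothetical stand-in for the report's unicode_str_func: copies the
 * given UTF-16 string as-is into a newly allocated buffer and returns it.
 * The caller owns the result; NULL is returned if allocation fails. */
uint16_t *unicode_str_func(const uint16_t *in)
{
    assert(in != NULL);
    size_t bytes = (utf16_len(in) + 1) * sizeof(uint16_t);
    uint16_t *out = malloc(bytes);
    if (out != NULL)
        memcpy(out, in, bytes);
    return out;
}

/* Number of bytes an 8-bit (char-based) conversion would consume before
 * hitting a NUL byte. On little-endian machines this is 1 for text whose
 * first character is in the ASCII range, because the high byte of that
 * code unit is zero. */
size_t narrow_len(const uint16_t *s)
{
    return strlen((const char *)s);
}
```

Built as a shared library as in step 2, a function of this shape hands back intact UTF-16 data, so truncation seen on the managed side would point at the return-value marshalling rather than at the native code.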
