[Mono-dev] PtrToStringAnsi

Paolo Molaro lupus at ximian.com
Thu Mar 9 09:51:51 EST 2006


On 03/08/06 Joshua Tauberer wrote:
> While debugging a SqliteClient issue, I came across an interesting bug.
>  The following returns null when I'm pretty sure it should not (it
> doesn't on Windows):
> 
> Marshal.PtrToStringAnsi(Marshal.StringToCoTaskMemAnsi("ü"))
> 
> In case the encoding of this email gets messed up, that's a u with
> umlauts, (char)0xFC.
> 
> The encoding half "works" (Marshal.ReadByte reports the bytes (0xFC
> 0x00)), on the assumption that I'm supposed to get ANSI out of this
> method.  Internally, g_utf16_to_utf8 is used, which means that (besides
> being surprised this call doesn't actually do ANSI encoding) I would

We don't do ANSI, because "ANSI encoding" doesn't mean anything.
Actually it means "whatever crap encoding is used on the current
Windows system", which is not useful.
In Mono we defined ANSI to mean UTF-8 on unixy systems, because UTF-8 is
the only sane option. On Windows we probably try to use the same
unspecified encoding, and this likely caused some bugs to creep in.
The proper way to deal with this is to always specify the encoding,
either implicitly or explicitly, and to flag the *Ansi versions as
obsolete (and no, using flags as suggested is not enough and it's wrong).
We could do this and expose the additional functionality from corlib, but
then people may be worried about portability issues on the MS runtime.
Usually it's easy enough to just use the methods from the 2.0 encoding
classes with a bit of unsafe context.
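
A minimal sketch of that approach, assuming UTF-8 is the encoding you
want on the native side. The class and method names (Utf8Marshal,
StringToHGlobalUtf8, PtrToStringUtf8Bytes) are made up for illustration
and are not corlib API; the sketch uses Marshal.Copy/ReadByte/WriteByte
in place of an unsafe block, which amounts to the same thing:

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class Utf8Marshal
{
    // Hypothetical helper: copy s into unmanaged memory as
    // NUL-terminated UTF-8. The caller frees it with Marshal.FreeHGlobal.
    public static IntPtr StringToHGlobalUtf8(string s)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(s);
        IntPtr ptr = Marshal.AllocHGlobal(bytes.Length + 1);
        Marshal.Copy(bytes, 0, ptr, bytes.Length);
        Marshal.WriteByte(ptr, bytes.Length, 0); // terminating NUL
        return ptr;
    }

    // Hypothetical helper: read a NUL-terminated UTF-8 string back.
    public static string PtrToStringUtf8Bytes(IntPtr ptr)
    {
        if (ptr == IntPtr.Zero)
            return null;

        // Scan for the terminating NUL to find the byte length.
        int len = 0;
        while (Marshal.ReadByte(ptr, len) != 0)
            len++;

        byte[] bytes = new byte[len];
        Marshal.Copy(ptr, bytes, 0, len);
        return Encoding.UTF8.GetString(bytes);
    }
}
```

With these, Utf8Marshal.PtrToStringUtf8Bytes(Utf8Marshal.StringToHGlobalUtf8("\u00FC"))
round-trips the string on either runtime, because the encoding is named
explicitly on both sides instead of being left to whatever "ANSI"
happens to mean.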

In your case the bug is in StringToCoTaskMemAnsi(): it always uses the
Latin-1 encoding, but it should use UTF-8 on unix.
Could you file a bug in bugzilla with your test case?
Thanks.
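
To illustrate the mismatch at the byte level, here is a small sketch
using only the managed encoding classes. It assumes, as the discussion
above implies, that the decoding side expects UTF-8 and gives up on
invalid input; it does not call into the runtime's marshalling code
itself, and the class name is made up:

```csharp
using System;
using System.Text;

static class AnsiMismatchDemo
{
    // Returns true if the bytes decode as strict UTF-8.
    public static bool IsValidUtf8(byte[] bytes)
    {
        try
        {
            // throwOnInvalidBytes: true makes invalid sequences throw
            // instead of being replaced silently.
            new UTF8Encoding(false, true).GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        string s = "\u00FC"; // u with umlaut

        byte[] latin1 = Encoding.GetEncoding("iso-8859-1").GetBytes(s);
        byte[] utf8 = Encoding.UTF8.GetBytes(s);

        Console.WriteLine(BitConverter.ToString(latin1)); // FC
        Console.WriteLine(BitConverter.ToString(utf8));   // C3-BC

        Console.WriteLine(IsValidUtf8(latin1)); // False
        Console.WriteLine(IsValidUtf8(utf8));   // True
    }
}
```

The lone 0xFC byte written by the Latin-1 encoder is not a valid UTF-8
sequence, which is consistent with the decode side returning null.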

lupus

-- 
-----------------------------------------------------------------
lupus at debian.org                                     debian/rules
lupus at ximian.com                             Monkeys do it better


