[Mono-list] string encoding
Havoc Pennington
hp@redhat.com
Sun, 22 Jun 2003 10:51:48 -0400
Hi,
Hmm, from reading the source code, the MonoString stuff looks like it
needs some love...
First it looks like "ANSI" means Windows 1252, which isn't quite
Latin-1 and isn't ASCII either.
http://www.hclrss.demon.co.uk/demos/ansi.html
Hopefully "ANSI" doesn't mean "the 8-bit encoding for this local
version of Windows" and is always the 1252 flavor.
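For illustration, here is a minimal sketch of what a fixed Windows-1252
decoder could look like, assuming the 1252 reading of "ANSI" is right.
cp1252_to_utf16 is a made-up name, not a Mono or GLib function. Only
bytes 0x80-0x9F differ from Latin-1; the five unassigned ones are passed
through as the same-numbered control codes (which is what I understand
the Windows tables do), so every input byte has an output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Windows-1252 code points for bytes 0x80..0x9F.  Bytes 0x81, 0x8D,
 * 0x8F, 0x90 and 0x9D are unassigned in 1252; they are mapped here to
 * the same-numbered C1 control codes (an assumption about the desired
 * behavior) so that the lookup never fails. */
static const uint16_t cp1252_high[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

/* Hypothetical helper: decode len bytes of Windows-1252 into dst, which
 * must have room for len UTF-16 code units.  Each byte becomes exactly
 * one code unit.  Returns the number of code units written. */
static size_t
cp1252_to_utf16 (const char *src, size_t len, uint16_t *dst)
{
    size_t i;

    assert (src != NULL && dst != NULL);

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) src[i];

        /* Outside 0x80..0x9F, 1252 agrees with Latin-1 and so with
         * the first 256 Unicode code points. */
        dst[i] = (c >= 0x80 && c <= 0x9F) ? cp1252_high[c - 0x80] : c;
    }
    return len;
}
```

Since the output length always equals the input length, the caller can
size the destination up front and decode straight into it.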
Mono looks like it uses UTF-8 instead of ANSI, see appended code for
example.
A couple other issues there:
- the "ANSI to UTF-16" conversion can't fail, but the conversion from
UTF-8 can, so Mono's PtrToStringAnsi has a failure mode (it returns
NULL on invalid input) that isn't in the docs.
- making one copy of the data in g_utf8_to_utf16 and then another copy
of that string seems kind of inefficient.
- you can pass in NULL for the GError** if you don't care which
error occurs and are just going to free it; g_utf8_to_utf16 returns
NULL on failure, so the return value can be checked instead.
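To illustrate the first point above: a byte string that is perfectly
good in an 8-bit encoding need not be valid UTF-8. The sketch below is a
simplified structural check only (lead bytes and continuation bytes); it
does not reject overlong forms or surrogates the way a full validator
would. utf8_structure_ok is a made-up name, not part of GLib or Mono.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if buf[0..len) has well-formed UTF-8
 * lead/continuation structure, 0 otherwise.  Simplified on purpose:
 * overlong encodings and surrogate code points are not rejected. */
static int
utf8_structure_ok (const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t extra;   /* continuation bytes expected after c */
        size_t k;

        if (c < 0x80)
            extra = 0;
        else if ((c & 0xE0) == 0xC0)
            extra = 1;
        else if ((c & 0xF0) == 0xE0)
            extra = 2;
        else if ((c & 0xF8) == 0xF0)
            extra = 3;
        else
            return 0;   /* stray continuation byte or invalid lead */

        if (extra > len - i - 1)
            return 0;   /* sequence runs past the end of the buffer */

        for (k = 1; k <= extra; k++)
            if ((buf[i + k] & 0xC0) != 0x80)
                return 0;   /* expected a 10xxxxxx byte */

        i += extra + 1;
    }
    return 1;
}
```

For instance, "caf" followed by the single byte 0xE9 (e-acute in both
Latin-1 and 1252) fails this check, because 0xE9 looks like the start of
a three-byte sequence that never arrives, while the UTF-8 spelling
0xC3 0xA9 passes.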
I dunno. Anyhow, I guess I'll just use string for now without a custom
marshaller, and file a bug report.
If "ANSI" does change to mean a different encoding on different local
versions of Windows, then just pretending Linux is a strange Windows
version that uses UTF-8 instead maybe isn't breaking things more than
they are already. But I can't tell if that's how it works.
Havoc
MonoString*
mono_string_new (MonoDomain *domain, const char *text)
{
    GError *error = NULL;
    MonoString *o = NULL;
    guint16 *ut;
    glong items_written;
    int l;

    l = strlen (text);

    ut = g_utf8_to_utf16 (text, l, NULL, &items_written, &error);

    if (!error)
        o = mono_string_new_utf16 (domain, ut, items_written);
    else
        g_error_free (error);

    g_free (ut);

    return o;
}

MonoString *
ves_icall_System_Runtime_InteropServices_Marshal_PtrToStringAnsi (char *ptr)
{
    MONO_ARCH_SAVE_REGS;

    return mono_string_new (mono_domain_get (), ptr);
}