[Mono-list] Non ASCII characters in filenames / command line parameters

Uwe Oertel uweo@voelcker.com
Mon, 13 Oct 2003 19:12:52 +0200


Hello there,

regarding the problem of using language-specific characters,
like the German ÄÖÜ, in filenames or command line parameters:
mono (0.28) only supports UTF-8 character encoding for
filenames and command line parameters
(see: http://bugzilla.ximian.com/show_bug.cgi?id=30781).

Therefore you will have to enable UTF-8 mode on the console (e.g. with:
echo -e "\033%G") if you want to call the following C# program:

// ---------------- 8< ------------------ >8 ---------------------

using System;
using System.IO;

namespace CreatePath
{
	public class CreatePath
	{
		public static void Main(string []args)
		{
			Directory.CreateDirectory(args[0]);
		}
	}
}

// ---------------- 8< ------------------ >8 ---------------------

with:

$ mono CreatePath.exe DirÄÖÜäöüß001

otherwise you will get the following exception:

Unhandled Exception: System.ArgumentNullException: Argument cannot be null
Parameter name: path
in <0x00036> System.IO.Directory:CreateDirectory (string)
in <0x00020> CreatePath.CreatePath:Main (string[])

.
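
The exception is consistent with the argument bytes not being valid UTF-8:
on an ISO-8859-1/-15 console, ÄÖÜ arrive as the single bytes 0xC4 0xD6 0xDC,
which do not form a legal UTF-8 sequence. A minimal sketch of such a check
in C follows. The function name looks_like_utf8 is hypothetical, and the
check is structural only, so this is an illustration and not how mono
itself validates its input.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the len bytes at s are structurally
 * valid UTF-8, else 0. It checks lead bytes and continuation bytes and
 * rejects the lead bytes 0xC0, 0xC1 and 0xF5..0xFF. It does NOT catch
 * every overlong form or UTF-16 surrogate, so it is not a full validator. */
int looks_like_utf8(const unsigned char *s, size_t len)
{
	size_t i = 0;
	size_t k;

	while (i < len) {
		unsigned char c = s[i];
		size_t need; /* number of continuation bytes expected */

		if (c < 0x80)
			need = 0;
		else if (c >= 0xC2 && c <= 0xDF)
			need = 1;
		else if (c >= 0xE0 && c <= 0xEF)
			need = 2;
		else if (c >= 0xF0 && c <= 0xF4)
			need = 3;
		else
			return 0; /* stray continuation byte or illegal lead byte */

		if (need > len - i - 1)
			return 0; /* sequence is cut off at the end of the input */

		for (k = 1; k <= need; k++)
			if ((s[i + k] & 0xC0) != 0x80)
				return 0; /* expected a 10xxxxxx continuation byte */

		i += need + 1;
	}
	return 1;
}
```

Called on the Latin-1 bytes "Dir\xC4\xD6\xDC" the function returns 0,
because 0xC4 announces a two-byte sequence but 0xD6 is not a continuation
byte. Called on the UTF-8 bytes "Dir\xC3\x84\xC3\x96\xC3\x9C" it returns 1.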

In our Linux environment we partly use "de_DE@euro" as the locale
(LC_CTYPE = de_DE@euro). That works fine with Samba 2.2.7a, which does not
support UTF-8. I made a small patch to unicode.c (mono/io-layer/unicode.c)
and unicode.h (mono/io-layer/unicode.h) to test whether it works with the
locale encoding for directory and file names, and it does.

If you use the following conversion functions (primarily the first and
second):

/*  ---------------- 8< ------------------ >8 --------------------- */

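#include <glib.h>
#include <glib/gconvert.h>

/* Prototypes (these would normally live in unicode.h), needed because
 * the first two functions call the last two before they are defined. */
gchar *_wapi_locale_to_utf8(const gchar *locale);
gchar *_wapi_utf8_to_locale(const gchar *utf8);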

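/* Convert a UTF-16 string to the current locale's encoding.
 * The caller frees the result with g_free(). */
gchar *_wapi_unicode_to_locale(const gunichar2 *uni)
{
	GError *error = NULL;
	gchar *res = NULL;
	gchar *utf8_ret;

	utf8_ret = g_utf16_to_utf8 (uni, -1, NULL, NULL, &error);

	if (utf8_ret)
	{
		res = _wapi_utf8_to_locale (utf8_ret);
		/* The intermediate UTF-8 copy is no longer needed. */
		g_free (utf8_ret);
	}

	g_assert(!error);

	return res;
}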

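/* Convert a string in the current locale's encoding to UTF-16.
 * The caller frees the result with g_free(). */
gunichar2 *_wapi_locale_to_unicode(const gchar *locale)
{
	GError *error = NULL;
	gunichar2 *res = NULL;
	gchar *utf8_ret;

	utf8_ret = _wapi_locale_to_utf8 (locale);

	if (utf8_ret)
	{
		res = g_utf8_to_utf16(utf8_ret, -1, NULL, NULL, &error);
		/* The intermediate UTF-8 copy is no longer needed. */
		g_free (utf8_ret);
	}

	g_assert(!error);

	return res;
}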

gchar *_wapi_locale_to_utf8(const gchar *locale)
{
	GError *error = NULL;
	gchar *res;
	gsize bytes_written = 0;

	res = g_locale_to_utf8 (locale, -1, NULL,
		&bytes_written, &error);

	g_assert(!error);

	return res;
}

gchar *_wapi_utf8_to_locale(const gchar *utf8)
{
	GError *error = NULL;
	gchar *res;
	gsize bytes_written = 0;

	res = g_locale_from_utf8 (utf8, -1, NULL,
		&bytes_written, &error);

	g_assert (!error);

	return res;
}

/*  ---------------- 8< ------------------ >8 --------------------- */

it should work both for reading command line parameters and for reading /
creating directories and files with any locale character encoding
(even UTF-8).
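
To make the locale-to-UTF-8 step concrete at the byte level, here is a small
sketch in plain C that does by hand what g_locale_to_utf8 would do for one
specific locale encoding. It assumes ISO-8859-1 input. For ÄÖÜäöüß the byte
values are the same in ISO-8859-15 (the encoding behind de_DE@euro), but the
two differ elsewhere, e.g. the euro sign at 0xA4. The helper name
latin1_to_utf8 is hypothetical; it is not part of the patch or of glib.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts a NUL-terminated ISO-8859-1 string into a
 * newly allocated UTF-8 string. The caller releases it with free().
 * Returns NULL if the allocation fails. */
char *latin1_to_utf8(const char *in)
{
	size_t len = strlen(in);
	size_t i;
	/* Worst case: every input byte expands to two output bytes. */
	char *out = malloc(2 * len + 1);
	char *p = out;

	if (!out)
		return NULL;

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)in[i];

		if (c < 0x80) {
			/* ASCII is encoded identically in UTF-8. */
			*p++ = (char)c;
		} else {
			/* U+0080..U+00FF become 110xxxxx 10xxxxxx. */
			*p++ = (char)(0xC0 | (c >> 6));
			*p++ = (char)(0x80 | (c & 0x3F));
		}
	}
	*p = '\0';
	return out;
}
```

For the input bytes "Dir\xC4\xD6\xDC" this yields
"Dir\xC3\x84\xC3\x96\xC3\x9C", i.e. the UTF-8 form of DirÄÖÜ. The glib
functions in the patch do the same job for whatever encoding the current
locale specifies, which is why they are the better choice in real code.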

It would be great if such a conversion could be implemented in mono.

Thanks a lot,

Uwe Oertel