[Mono-devel-list] patch for: Non ASCII characters in filenames/ command line parameters

Jörg Rosenkranz joergr at voelcker.com
Sat Oct 18 10:08:55 EDT 2003


Hello Dick,

> -----Original Message-----
> From: Dick Porter [mailto:dick at ximian.com] 
> 
> I'm not convinced this is necessary, but even if it is this patch won't
> be applied.  The reason is that the only interface to the io-layer
> functionality has to be via the windows API, because otherwise it will
> break portability to windows.

I don't understand this statement. The interface defined in unicode.h
has not been changed; it has only been extended. Uwe used glib functions
for character set conversion, as you did in _wapi_unicode_to_utf8, and
changed this function to convert to the local character set instead
of always UTF-8. Maybe this is only useful when running under Linux, but
there are other operating system differences that Mono has to handle
too. How is this done? Is there no way to do this file name
conversion under Linux/Unix only?
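
To illustrate the kind of conversion being discussed, here is a minimal
sketch in Python. It is not the actual patch (which is C code using glib);
the function name utf16_to_filename_bytes and its parameters are
hypothetical and exist only for this example. It assumes the runtime holds
strings as UTF-16 and that the locale character set is passed in explicitly.

```python
def utf16_to_filename_bytes(utf16: bytes, locale_encoding: str,
                            unix_like: bool = True) -> bytes:
    """Convert a UTF-16LE string to the bytes handed to the file system.

    Hypothetical helper: on Linux/Unix the target is the locale character
    set; on other platforms this sketch simply keeps UTF-8.
    """
    text = utf16.decode("utf-16-le")
    # Assumption for this sketch: only Unix-like systems use the locale charset.
    target = locale_encoding if unix_like else "utf-8"
    return text.encode(target)


name = "Jörg.txt".encode("utf-16-le")
latin1_bytes = utf16_to_filename_bytes(name, "iso-8859-1")
utf8_bytes = utf16_to_filename_bytes(name, "utf-8")
```

With an ISO-8859-1 locale the "ö" becomes the single byte 0xF6; with a
UTF-8 locale the result is identical to what an unconditional UTF-8
conversion would produce. In a real program the locale encoding might be
obtained from locale.getpreferredencoding(False) rather than hard-coded.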

> 
> I can see the benefits of dealing with non-utf8 encodings for filenames
> and command line parameters (indeed, I had a long argument with a glib
> developer this week over this very issue), but it needs to be solved in
> a portable way, and in a way that doesn't break passing arguments in
> utf8.

If UTF-8 is set as the locale, it should work too. But command line
parameters are not the main problem. The problem we can't work around is
Mono's file name handling.

> 
> Unfortunately text encodings are a mess at the moment, and trying to
> guess the encoding of a stream of bytes is problematic at best.  The
> main linux distributions are trying to move everything to utf8.  That
> doesn't help legacy systems or data, but it is definitely the way
> forward.

You don't have to guess any encodings in this case. All other
applications create file names using the configured locale. Mono has to
read and write them using this locale too, or it simply doesn't work on
non-English/non-UTF-8 systems. Using UTF-8 is not an option for us
because we cannot switch to this encoding.
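
A small Python sketch of the mismatch, assuming an ISO-8859-1 locale and a
file name created by some other application under that locale. The helper
decode_filename is hypothetical and only serves this example.

```python
def decode_filename(raw: bytes, locale_encoding: str) -> str:
    """Decode raw file name bytes using the configured locale character set."""
    return raw.decode(locale_encoding)


# Bytes as another application would write them under an ISO-8859-1 locale.
raw = "Jörg.txt".encode("iso-8859-1")

# Treating those bytes as UTF-8 fails, because 0xF6 followed by "r" is not
# a valid UTF-8 sequence.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the locale character set recovers the original name.
recovered = decode_filename(raw, "iso-8859-1")
```

No guessing is involved here: the encoding is taken from the locale
setting, not inferred from the bytes.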

Jörg.


