[Mono-devel-list] The first (attempt to checkin) managed collation patch

Atsushi Eno atsushi at ximian.com
Mon Jul 25 10:45:40 EDT 2005


Hello,

Paolo Molaro wrote:
> On 07/24/05 Miguel de Icaza wrote:
> 
>>>>It can be done in two ways: embed the files in the mono binary like we do
>>>>with the char tables or load the files from where mscorlib was loaded.
>>>>Both are trivial to implement.
>>>
>>>For now I took the latter approach. That means however we need some
>>>love on the build system.
>>
>>I personally like the idea of having this data on the dll files
>>themselves more, you can embed the resources with -resource and then get
>>the same IntPtr without adding a new icall like this:
>>
>>int size;
>>Module module;
>>IntPtr data = GetManifestResourceInternal (name, out size, out module);
>>
>>We reuse the existing framework, do not require build system love and do
>>not have to introduce a new icall.
> 
> 
> All of that is fine, except that you get incorrect data, since we have arrays
> of shorts or ints, so we either give incorrect results on bigendian systems
> or we introduce code to byteswap at runtime, needlessly slowing down the code.
> 
> The code should do what we already do for the char data: embed the arrays
> in mono so the endianness is correct and there are no build system changes.
> Using managed resources for this is a net loss. Using and mmapping external files
> instead of embedding them has the same (well, less) build system complexity
> of embedding the correct endian files as managed resources.

Oh, I see. Then maybe we had better use a C header. However, as far as
I know, having the tables in a C header introduces some problems:

	- The file becomes too big. I generated the C header output
	  and it came to about 800 KB:
	  http://monkey.workarea.jp/tmp/20050725/collation-tables.h

	- Constants built into the runtime are fixed across versions.
	  For example, Char.GetUnicodeCategory() returns different
	  values in .NET 1.x and 2.0 (because they depend on different
	  Unicode versions). So if the collation results differed
	  between versions too, we would need some kind of hack to
	  fill the gap.

	  Since Microsoft seems to have lost the source of the
	  collation tables, that is not likely to happen.
	  http://blogs.msdn.com/michkap/archive/2005/06/30/434223.aspx
	  But I am not sure that they will never find the sources
	  and fix bugs (or even that the person above wrote the truth).

But I think the former is not critical and the latter *won't* happen
(I hope).
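
To make the endianness point concrete, here is a rough sketch in C. It is
only an illustration, not the patch itself: the table contents and all the
names (sample_table, sample_blob, read_u16_le, read_u16_raw) are made up.
It compares an array compiled from a C header, which is always in host byte
order, with the same values serialized little-endian as a resource file
would store them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table as a generated C header would define it: the
   compiler stores each element in the host's own byte order. */
static const uint16_t sample_table[] = { 0x0102, 0x0304, 0xA1B2 };

/* The same three values serialized little-endian, the way a resource
   blob produced on an x86 build machine would contain them. */
static const unsigned char sample_blob[] = {
    0x02, 0x01, 0x04, 0x03, 0xB2, 0xA1
};

/* Portable read: correct on any host, but it costs a shift and an OR
   on every lookup. */
static uint16_t read_u16_le(const unsigned char *blob, size_t index)
{
    const unsigned char *p = blob + index * 2;
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Naive read: copies the bytes as-is into a host-order integer. This
   only matches the table on little-endian hosts. */
static uint16_t read_u16_raw(const unsigned char *blob, size_t index)
{
    uint16_t value;
    memcpy(&value, blob + index * 2, sizeof value);
    return value;
}

/* Returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

With the header approach a lookup is just sample_table[i]. With the blob,
either every lookup goes through something like read_u16_le, or the whole
blob has to be swapped once at load time on big-endian hosts.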

Atsushi Eno



