[Mono-devel-list] patch for Unicode Normalization support

Atsushi Eno atsushi at ximian.com
Wed Aug 3 15:18:59 EDT 2005


Hello,

As mentioned in the managed collation thread, I have been writing
Unicode Normalization support (String.Normalize() and
String.IsNormalized() for the NET_2_0 build), and today I finished
the implementation (apart from possible bug fixes).

The attached patches are required. I compressed the C header, which
is about 165 KB.

This time, I *really* need a runtime C header file to be added to
our metadata code, since it contains a substantial chunk of
int32/int16 arrays. The header file is generated in a similar way to
how I create the collation resources/C header files in
Mono.Globalization.Unicode. (The sources are from unicode.org.)
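
To illustrate what "a C header of int32/int16 arrays" can look like,
here is a minimal sketch. It is not taken from the attached patches:
the names decomp_chars, decomp_keys, decomp_index and
lookup_decomposition are hypothetical, and the layout (a flat array of
zero-terminated runs plus a sorted key table) is an assumption. The
three entries are well-known canonical decompositions from the Unicode
data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical excerpt of a generated table (assumed layout): every
   decomposition is a zero-terminated run in one flat int32 array. */
static const int32_t decomp_chars[] = {
    0,                  /* offset 0 is unused */
    0x0041, 0x0300, 0,  /* U+00C0 -> A + combining grave accent */
    0x0045, 0x0301, 0,  /* U+00C9 -> E + combining acute accent */
    0x0065, 0x0301, 0   /* U+00E9 -> e + combining acute accent */
};

/* Sorted code points, and the start offset of each one's run. */
static const int32_t decomp_keys[]  = { 0x00C0, 0x00C9, 0x00E9 };
static const int16_t decomp_index[] = { 1, 4, 7 };

/* Binary search over decomp_keys. Returns a pointer to the
   zero-terminated decomposition, or NULL if cp has none here. */
static const int32_t *lookup_decomposition(int32_t cp)
{
    size_t lo = 0;
    size_t hi = sizeof decomp_keys / sizeof decomp_keys[0];

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (decomp_keys[mid] == cp)
            return &decomp_chars[decomp_index[mid]];
        if (decomp_keys[mid] < cp)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}
```

Because the arrays are const data compiled into the runtime, they take
the host's byte order automatically and are not allocated on the
managed heap.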

For performance, I measured a pure managed version (the source
generator can also emit C# arrays) that holds arrays of the same size:

Total mem Method
########################
     127 KB Mono.Globalization.Unicode.Normalization::.cctor()
          70 KB        2 System.Int16[]
  Callers (with count) that contribute at least for 1%:
           1  100 %

Now I need a decision and approval for both collation and normalization:

	- Should I really use a C header for collation, which is mostly
	  byte arrays? I now think that having an 800 KB header file
	  is not nice.
	- Should I really use a C header for normalization, which is
	  totally unused in a non-2.0 environment? Since it is just
	  165 KB as a C header, I think we could hold two normalization
	  resources (one for little endian, one for big endian).
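
For comparison only, the endianness concern with a binary resource can
be sketched as below. This is not what the patches do: read_le16 and
decode_le16_table are hypothetical helpers, and the sketch assumes a
single resource stored as little-endian int16 values that is decoded
byte by byte, so the same file would work on either kind of host at
the cost of a copy at load time.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one little-endian int16 from raw bytes. The result does not
   depend on the byte order of the host. */
static int16_t read_le16(const unsigned char *p)
{
    int32_t v = (int32_t)p[0] | ((int32_t)p[1] << 8);
    if (v >= 0x8000)
        v -= 0x10000;   /* restore the sign of two's complement values */
    return (int16_t)v;
}

/* Decode count little-endian int16 values from raw into out. */
static void decode_le16_table(const unsigned char *raw, size_t count,
                              int16_t *out)
{
    for (size_t i = 0; i < count; i++)
        out[i] = read_le16(raw + 2 * i);
}
```

Shipping one pre-swapped resource per byte order, as suggested above,
avoids this decoding step but doubles the resource data.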

If you have any comments on the patches, please let me know.

TIA,
Atsushi Eno
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: load-normalization-resource.patch
Url: http://lists.ximian.com/pipermail/mono-devel-list/attachments/20050804/782e13ff/attachment.pl 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: normalization-support.patch
Url: http://lists.ximian.com/pipermail/mono-devel-list/attachments/20050804/782e13ff/attachment-0001.pl 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: normalization-tables.bz2
Type: application/octet-stream
Size: 27317 bytes
Desc: not available
Url : http://lists.ximian.com/pipermail/mono-devel-list/attachments/20050804/782e13ff/attachment.obj 

