[Mono-devel-list] How to handle huge string collation resources?
Atsushi Eno
atsushi at ximian.com
Tue Jun 21 15:26:13 EDT 2005
Hello,
Finally I got my managed collation engine working, though it is far
from the complete form I am aiming for and is mostly conceptual for
now (it does not handle many things and performs badly). For now it
handles ASCII case sensitivity, a large part of the CompareOptions
flags, and a large part of diacritical mark processing.
Here are the steps to make it available:
1. Apply the attached patch against mcs/class/corlib.
2. Go to mcs/class/corlib/Mono.Globalization.Unicode.
3. Run "make". It will automatically download some files
from some sites. For now the build breaks without this
step.
4. Build corlib as usual.
5. Set the MONO_USE_MANAGED_COLLATION environment variable
to "yes".
Here is a serious problem. Step 3 generates a 1.2MB C# source
file, which results in a 500KB increase in the size of mscorlib.dll.
It could be generated as a C header instead, i.e. as runtime source,
like the existing culture-info-table.h. But it is still huge.
About 200KB of the data is only for CJK cultures, so it won't be
used unless we use those cultures to handle culture-sensitive CJK
collation. That is mostly a waste of memory.
One possible solution is to put the tables in a separate assembly
and load them like this:
- CompareInfo (or whatever class) holds those tables in static
variables.
- If the variable is null, it tries to load the "internally
stored table" via a runtime icall_1. The first time, however,
this returns null, since nothing is stored yet.
- Then CompareInfo (or whatever class) loads the "table-only
assembly" via reflection, loads the table into memory, and
invokes an icall_2 that registers the table as the runtime
internal table.
- The next time CompareInfo tries to fill the table, icall_1
will return the table.
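
To make the flow above concrete, here is a rough sketch of the
protocol in C. It is only an illustration under my own assumptions:
icall_get_table and icall_set_table stand in for "icall_1" and
"icall_2", compare_info_statics stands in for the static variables
of one CompareInfo, and load_table_assembly is a hypothetical
placeholder for loading the table-only assembly via reflection.
None of these names exist in Mono.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Runtime-side slot; empty until some caller registers a table. */
static const uint8_t *stored_table = NULL;
static size_t stored_table_len = 0;

/* Stand-in for "icall_1": return the stored table, or NULL if none. */
const uint8_t *icall_get_table(size_t *len)
{
    if (len != NULL)
        *len = stored_table_len;
    return stored_table;
}

/* Stand-in for "icall_2": register a table as the runtime internal table. */
void icall_set_table(const uint8_t *table, size_t len)
{
    stored_table = table;
    stored_table_len = len;
}

/* Hypothetical placeholder for loading the table-only assembly.
 * load_count records how often the expensive load actually runs. */
static int load_count = 0;

static const uint8_t *load_table_assembly(size_t *len)
{
    static const uint8_t demo_table[] = { 1, 2, 3, 4 };
    load_count++;
    *len = sizeof demo_table;
    return demo_table;
}

/* Stand-in for the static variables held by one CompareInfo. */
typedef struct {
    const uint8_t *table;
    size_t len;
} compare_info_statics;

/* Fill the static on first use: ask the runtime first, and only
 * load and register the table when the runtime has nothing stored. */
const uint8_t *get_collation_table(compare_info_statics *d, size_t *len)
{
    assert(d != NULL && len != NULL);
    if (d->table == NULL) {
        d->table = icall_get_table(&d->len);
        if (d->table == NULL) {
            d->table = load_table_assembly(&d->len);
            icall_set_table(d->table, d->len);
        }
    }
    *len = d->len;
    return d->table;
}
```

With two separate compare_info_statics values, the first call to
get_collation_table does the load and registers the result, and the
second one gets the same table back through the icall_1 stand-in
without loading anything again. The sketch ignores locking, which a
real runtime would need.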
In fact the same discussion also applies to the string normalization
tables (needed to support String.Normalize(), introduced in .NET 2.0).
Any good ideas for this problem?
Thanks,
Atsushi Eno
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: managed-collation-20050621.diff
Url: http://lists.ximian.com/pipermail/mono-devel-list/attachments/20050622/a9a0cee5/attachment.pl