[Mono-dev] Why do we need separate I18N assemblies?

Fri Jun 30 16:26:59 EDT 2006

Hi,

I think it would be worth it to try to remove the reflection overhead, 
because it is not just taking some time but also quite some amount of 
additional memory. Personally I would recommend to put the small encodings 
directly into corelib and the larger and seldom used ones in one or more 
additional assemblies that could be referenced without reflection and simply 
be deleted when not needed.
I did some simple testing on the potential benefits:
The basic overhead of a Mono Application is a little bit below this:
-----------------------------------------------------------------
Mono Jit statistics
Compiled methods:       35
Methods from AOT:       0
Methods cache lookup:   12
Method trampolines:     868
Basic blocks:           196
Max basic blocks:       14
Allocated vars:         141
Analyze stack repeat:   0
Compiled CIL code size: 1594
Native code size:       5095
Max code size ratio:    32,00 (Object::.ctor)
Biggest method:         1126 (Hashtable::.cctor)
Code reallocs:          3
Allocated code size:    5095
Inlineable methods:     0
Inlined methods:        0

Created object count:   51
Initialized classes:    53
Used classes:           30
Static data size:       96
VTable data size:       5160

Generic instances:      0
Initialized classes:    0
Inflated methods:       0 / 0
Inflated types:         0
Generics metadata size: 0
Total time spent compiling 35 methods (sec): 0
-----------------------------------------------------------------
if you do a single
Console.WriteLine ("Test");
It becomes:
-----------------------------------------------------------------
Test
Mono Jit statistics
Compiled methods:       466
Methods from AOT:       0
Methods cache lookup:   511
Method trampolines:     3333
Basic blocks:           3939
Max basic blocks:       237
Allocated vars:         3102
Analyze stack repeat:   0
Compiled CIL code size: 40443
Native code size:       93100
Max code size ratio:    32,00 (Object::.ctor)
Biggest method:         4930 (SimpleCollator::CompareInternal)
Code reallocs:          40
Allocated code size:    93100
Inlineable methods:     0
Inlined methods:        5

Created object count:   1800
Initialized classes:    235
Used classes:           153
Static data size:       510
VTable data size:       24312

Generic instances:      0
Initialized classes:    0
Inflated methods:       0 / 0
Inflated types:         0
Generics metadata size: 0
Total time spent compiling 466 methods (sec): 0,0625
Slowest method to compile (sec): 0,01563: I18N.Common.Handlers::.cctor()
-----------------------------------------------------------------

So this means you have about 20 times as much Native Code Size, need 150-200 
msec additional time (See attached textfile for JITTime+Runtime, run on a 
3500+ Athlon) and more than 100kb additional memory. Moreover the GC has to 
manage/kill 1800! Objects already, with no single line of Application code 
run yet.

The overhead actually comes mostly from two places:
a) I18N
b) Globalisation

b is triggered because of String handling mistakes within corelib, however 
the biggest share in runtime is I18N (see textfile - 78 ms out of 93 ms 
total app time).
In terms of memory it is more complicated to estimate, but from looking at 
the profile one could assume that there are also big potential 
optimizations.

mfg
Andreas

>I once dreamed to change encoding implementation like you but
> I noticed that it helps few people other than my personal
> satisfaction and spends too much time just for such a niche
> (at least for me).
>
> When you split conversion map data from I18N.*.dll which is mostly
> extraneous for people who don't use those them, feel free to try
> merging it into mscorlib. Otherwise, I don't like your idea.

Originally I was thinking of simply moving Encoding classes to corlib but
you and Jonathan are right that there are cases when it has advantages if
such large data can be excluded.

But you are right this would need a lot of time...

> Why do you quote Microsoft mscorlib size here? It is far from
> something to do with it. Stop making misleading. To my understanding
> they have different files for encoding maps (*.nlp).

I just tried to glorify the size of our mscorlib.:) If we add the size of
I18N assemblies to the size of our mscorlib our is sill smaller than
Microsoft's one. (And you are right that it has external dependencies so the
difference may be even more.) As long as our mscorlib can do the same as
their this only means that ours is better and nothing more.:)

> "Mono should be MS.NET compatible" is simply wrong as usual.
> We have broader supported environment which requires different
> thinking.

>From http://www.mono-project.com/FAQ:_General:
"Its objective is to enable UNIX developers to build and deploy
cross-platform .NET Applications."

And note that this is why I like Mono.:) This goal cannot be achieved
without MS.NET compatibility. Of course I don't mean compatiblitiy at any
costs. Or for example I don't like the bug compatibility at any cost policy
of MWF altough I admit that it helps run GUI applications that often relay
on weird SWF behaviours.

Kornél

_______________________________________________
Mono-devel-list mailing list
Mono-devel-list at lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: With.txt
Url: http://lists.ximian.com/pipermail/mono-devel-list/attachments/20060630/a0965d95/attachment.txt 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Test.cs
Url: http://lists.ximian.com/pipermail/mono-devel-list/attachments/20060630/a0965d95/attachment.pl