[Mono-dev] Replacing/Removing I18N

Andreas Nahr ClassDevelopment at A-SoftTech.com
Wed Oct 11 15:10:25 EDT 2006


> On 10/09/06 Andreas Nahr wrote:
>> Current situation:
>> I18N is located in multiple separate assemblies that contain encoding
>> classes that are autogenerated. The single-byte encodings (my current
>> focus) use a potentially big CASE-Structure to compute the output.
>>
>> Problems:
> [...]
>> Considerations for change:
>> * Data must not be in private memory but shared as much as possible
>> (currently most is shareable)
>> * If possible avoid internal-calls and other direct runtime-support
>> (currently does not need any)
>
> I'm all for dumping the current I18N assemblies, the problems you list
> are real, we just need someone to write and test all the code.
> What I don't agree with is avoiding internal calls and runtime support.
> The reason is simple: the most efficient way to export arrays of data
> to the managed world is to have them as C arrays in the mono runtime or
> as data files that mono loads and mmaps, returning a pointer for use in
> unsafe code in the assembly implementation. See the locale data for
> examples. I18n tables are best put outside the mono binary, usually

Doing it this way was my first thought as well. However, internal calls
reduce the maintainability of the code, because you have to manually ensure
that the managed and runtime sides stay in sync, and they make using corelib
much harder for other projects. So I looked for another way to achieve the
same goal, and my proposal could be the solution (again: if I'm making wrong
assumptions somewhere, please correct me).

It seems I didn't make clear enough how this should work, so here is a
simplified pseudocode example of how I imagine it:

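class Enc
{
    // Pointer to the start of the mapped resource data (unsafe code).
    byte* table;

    public Enc (int codepage)
    {
        // Pseudocode: look up the embedded resource for this code page.
        IntPtrStream s = Assembly.GetManifestResourceStream (codepage);
        table = s.memStart;
    }

    public byte GetByte (char c)
    {
        // Plain table lookup, indexed by the character value.
        return table[c];
    }
}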

The real implementation wouldn't be much more complicated than this.
It would also obviously be (much) faster than the current solution (I
don't know how Atsushi concluded that this would be slower than the
current code).
Moreover, it has the advantage that removing resources from the file is
much simpler than removing code from the C source files...
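
To make the lookup idea concrete, here is a minimal runnable sketch. It is
not the proposed implementation: it copies the table into a managed byte[]
instead of keeping a pointer into the mapped resource, a MemoryStream stands
in for the manifest resource stream, and the names TableEncoderDemo,
BuildLatin1Table and LoadTable are made up for illustration. The table
contents are a stand-in for ISO-8859-1 only.

```csharp
using System;
using System.IO;

// Hypothetical sketch: a single-byte encoder driven by one 64 KB table.
static class TableEncoderDemo
{
    // Stand-in table for ISO-8859-1: U+0000..U+00FF map to themselves,
    // everything else maps to '?'. A real table would be generated offline.
    public static byte[] BuildLatin1Table ()
    {
        var table = new byte[65536];
        for (int i = 0; i < table.Length; i++)
            table[i] = i < 256 ? (byte) i : (byte) '?';
        return table;
    }

    // Reads the whole table from a stream. In the proposal this would be a
    // manifest resource stream; any Stream works for the sketch.
    public static byte[] LoadTable (Stream s)
    {
        var table = new byte[65536];
        int read = 0;
        while (read < table.Length) {
            int n = s.Read (table, read, table.Length - read);
            if (n == 0)
                throw new EndOfStreamException ("table is truncated");
            read += n;
        }
        return table;
    }

    // The actual encoding step is a single array lookup.
    public static byte GetByte (byte[] table, char c)
    {
        return table[c];
    }

    public static void Main ()
    {
        byte[] table = LoadTable (new MemoryStream (BuildLatin1Table ()));
        Console.WriteLine (GetByte (table, 'A'));      // 65
        Console.WriteLine (GetByte (table, '\u00E9')); // 233
        Console.WriteLine (GetByte (table, '\u20AC')); // 63, i.e. '?'
    }
}
```

Copying into a byte[] gives up the sharing between processes that the
proposal is after; it is used here only so the sketch runs without unsafe
code or runtime internals.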

Assumed benefits:
* Simple solution
* Simple code
* Fast
* Data completely shareable between processes

In fact it seems to be such a simple and good solution that I'm almost
waiting for someone to say: Gotcha, this doesn't work because...

> (except latin1 support). The aim is to optionally move outside also
> other tables that are currently hardcoded inside mono.
> Note: resources in assemblies are mmapped, but still using data directly
> from mono is more efficient.
>
> A small comment on your considerations, though: using a 64KB table when
> a much smaller one is sufficient would be a waste. We need to keep in
> mind also disk-space issues.

So far my focus was on finding out whether this could work at all; the
change to two tables (which I already wrote about) would only be a small
change.
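
As an illustration of how little would change, here is one possible
two-table layout, sketched under my own assumptions (it is not necessarily
the exact layout from the earlier mail): a 256-entry index selects a
256-byte block by the high byte of the char, and all unmappable ranges
share a single block. TwoLevelTableDemo, Build and GetByte are hypothetical
names, and the data is again a stand-in for ISO-8859-1.

```csharp
using System;

// Hypothetical sketch: two small tables instead of one 64 KB table.
static class TwoLevelTableDemo
{
    const int BlockSize = 256;

    // index[hi] picks a block; block 0 is the shared "unmappable" block.
    public static void Build (out byte[] index, out byte[] blocks)
    {
        index = new byte[256];            // all zero: every range -> block 0
        blocks = new byte[2 * BlockSize];
        for (int i = 0; i < BlockSize; i++) {
            blocks[i] = (byte) '?';             // block 0: unmappable
            blocks[BlockSize + i] = (byte) i;   // block 1: U+0000..U+00FF
        }
        index[0] = 1;                     // high byte 0x00 -> block 1
    }

    // Two lookups: high byte selects the block, low byte the entry.
    public static byte GetByte (byte[] index, byte[] blocks, char c)
    {
        int block = index[c >> 8];
        return blocks[block * BlockSize + (c & 0xFF)];
    }

    public static void Main ()
    {
        byte[] index, blocks;
        Build (out index, out blocks);
        Console.WriteLine (GetByte (index, blocks, 'A'));      // 65
        Console.WriteLine (GetByte (index, blocks, '\u20AC')); // 63, i.e. '?'
        Console.WriteLine (index.Length + blocks.Length);      // 768
    }
}
```

For this stand-in data the two tables take 768 bytes instead of 65536, at
the cost of one extra lookup per character.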



