[Mono-dev] Replacing/Removing I18N
Andreas Nahr
ClassDevelopment at A-SoftTech.com
Wed Oct 11 15:10:25 EDT 2006
> On 10/09/06 Andreas Nahr wrote:
>> Current situation:
>> I18N is located in multiple separate assemblies that contain encoding
>> classes that are autogenerated. The single-byte encodings (my current
>> focus) use a potentially big CASE-Structure to compute the output.
>>
>> Problems:
> [...]
>> Considerations for change:
>> * Data must not be in private memory but shared as much as possible
>> (currently most is shareable)
>> * If possible avoid internal-calls and other direct runtime-support
>> (currently does not need any)
>
> I'm all for dumping the current I18N assemblies, the problems you list
> are real, we just need someone to write and test all the code.
> What I don't agree with is avoiding internal calls and runtime support.
> The reason is simple: the most efficient way to export to the managed
> world arrays of data is to have them as C arrays in the mono runtime or
> as data files that mono loads and mmaps, returning a pointer for use in
> unsafe code in the assembly implementation. See the locale data for
> examples. I18n tables are best put outside the mono binary, usually
Well, doing it this way was my first thought too. However, internal calls
reduce the maintainability of the code, because you have to manually ensure
that both sides stay in sync, and they make using corelib much harder for
other projects. So I looked for another way to achieve the same goal, and my
proposal could be the solution (again: if I'm making wrong assumptions
somewhere, please correct me).
It seems I didn't make it clear enough how this should work, so here is a
simplified pseudocode example of how I imagine it:
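unsafe class Enc
{
    byte* table;    // one output byte per char, read straight from the mapped resource

    public Enc (int codepage)
    {
        // Pseudocode: look up the table resource that belongs to this code page.
        IntPtrStream s = Assembly.GetManifestResourceStream (codepage);
        table = s.memStart;
    }

    public byte GetByte (char c)
    {
        return table[c];
    }
}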
The real implementation wouldn't be much more complicated than this.
It should also be (much) faster than the current solution, because each
lookup is a single indexed read instead of a big CASE structure (I don't
know how Atsushi concluded that this would be slower than the current code).
Moreover, it has the advantage that you could remove resources from the file
much more easily than you could remove code from the C source files...
Assumed benefits:
* Simple solution
* Simple code
* Fast
* Data completely shareable between processes
In fact, it seems to be such a simple and good solution that I'm somehow even
waiting for someone to say: Gotcha, this doesn't work because...
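To make the pseudocode above a bit more concrete, here is a minimal sketch that uses only public class-library API. It is not the proposed corlib implementation: the names TableEncoder and FromResource are made up for illustration, and it assumes that the runtime returns an unmanaged-memory-backed stream for an embedded resource and that the resource holds a full 64 KB table (one output byte per char).

```csharp
using System;
using System.IO;
using System.Reflection;

// Sketch only: TableEncoder and FromResource are hypothetical names.
public sealed unsafe class TableEncoder
{
    // One output byte per UTF-16 code unit: the full 64 KB table.
    public const int TableSize = 65536;

    private readonly UnmanagedMemoryStream stream; // keeps the backing stream reachable
    private readonly byte* table;

    public TableEncoder (UnmanagedMemoryStream stream)
    {
        if (stream == null)
            throw new ArgumentNullException (nameof (stream));
        if (stream.Length - stream.Position < TableSize)
            throw new ArgumentException ("table data is too small", nameof (stream));
        this.stream = stream;
        // Pointer into the stream's memory at the current position; nothing is copied.
        table = stream.PositionPointer;
    }

    public static TableEncoder FromResource (Assembly assembly, string resourceName)
    {
        // Assumption: the runtime hands back an unmanaged-memory-backed stream here.
        var s = assembly.GetManifestResourceStream (resourceName) as UnmanagedMemoryStream;
        if (s == null)
            throw new NotSupportedException ("resource is missing or not memory-backed");
        return new TableEncoder (s);
    }

    public byte GetByte (char c)
    {
        // A char converts implicitly to int, so it can index the table directly.
        return table[c];
    }
}
```

The code has to be compiled with unsafe code enabled. The length check in the constructor is what makes the unchecked read in GetByte safe for every possible char value.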
> (except latin1 support). The aim is to optionally move outside also
> other tables that are currently hardcoded inside mono.
> Note: resources in assemblies are mmapped, but still using data directly
> from mono is more efficient.
>
> A small comment on your considerations, though: using a 64KB table when
> a much smaller one is sufficient would be a waste. We need to keep in
> mind also disk-space issues.
So far my focus has been on finding out whether this could work at all;
the change to two tables (I already wrote about it) would only be a small
one.
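As an illustration of how a smaller table could replace the 64KB one, here is a sketch of a two-stage lookup. The layout is an assumption and not necessarily the two-table scheme from the earlier mail: a 256-entry index selected by the high byte of the char, plus the distinct 256-byte blocks it points to. TwoStageTable is a made-up name, and plain managed arrays stand in for the mapped resource data to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Sketch only: TwoStageTable and its layout are assumptions for illustration.
public sealed class TwoStageTable
{
    private readonly byte[] index;  // 256 entries: high byte of the char -> block number
    private readonly byte[] blocks; // distinct 256-byte blocks, stored back to back

    private TwoStageTable (byte[] index, byte[] blocks)
    {
        this.index = index;
        this.blocks = blocks;
    }

    public int SizeInBytes { get { return index.Length + blocks.Length; } }

    // Compress a full 65536-entry table by storing each distinct 256-byte block once.
    public static TwoStageTable FromFullTable (byte[] full)
    {
        if (full == null || full.Length != 65536)
            throw new ArgumentException ("expected a 65536-entry table", nameof (full));

        var index = new byte[256];
        var blocks = new List<byte> ();
        var seen = new Dictionary<string, int> (); // block contents -> block number

        for (int hi = 0; hi < 256; hi++) {
            string key = Convert.ToBase64String (full, hi * 256, 256);
            int number;
            if (!seen.TryGetValue (key, out number)) {
                number = seen.Count; // at most 255, so it always fits in a byte
                seen.Add (key, number);
                for (int lo = 0; lo < 256; lo++)
                    blocks.Add (full[hi * 256 + lo]);
            }
            index[hi] = (byte) number;
        }
        return new TwoStageTable (index, blocks.ToArray ());
    }

    public byte GetByte (char c)
    {
        // First stage picks the block, second stage picks the byte inside it.
        return blocks[index[c >> 8] * 256 + (c & 0xFF)];
    }
}
```

The lookup costs one extra indexed read compared to the flat table. How much space it saves depends on how many 256-char ranges the encoding actually maps, since all ranges that are entirely unmapped share a single block.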