[Mono-devel-list] String constants and localization

Tue Jul 15 11:41:43 EDT 2003

Hello Andreas,

> > object a = MyEnum.Value;
> >
> > And given the compiler structure, it is a lot easier to implement what I
> > described before than removing an enum after it has been used.
> 
> As I said before: I do not want to change *anything* in the compiler. Just
> remove the code after compiling. If someone uses
> object a = MyEnum.Value;
> He will get a typeload exception or something like that.
> Also you can just leave the thing in (you definitely would do that for
> desktop systems) which would bring the whole thing up to 175KB, which still
> is a lot less than 1000KB

So how do you plan on removing things afterwards?  If that involves some
kind of magic along the lines of disassembling, removing/patching and
reassembling, I am very much against that path.

As you point out, people who do:

	object a = MyEnum.Value;

Would get code that breaks, but I have a problem with having an
enumeration with special restrictions just because it happens to be part
of the localization process.

Not to mention that it still makes the build more complicated and
deviates from the obvious path, which, again, I do not want to do for
maintenance reasons.

> Please don't forget one thing: In .Net you can set the language *PER
> THREAD*. I don't see a chance to replace the hardcoded string with that.
> Also replacing all hardcoded strings will probably make the code much harder
> to maintain.
> It is true that for a solution where you are not doing any localization
> simply including the strings into the assembly produces an optimum result
> (disregarding the loss that comes with encoding)

For an embedded device with limited resources, I do not think it will be
a problem to support only a single language.  In particular, I do not believe
that a portable device needs to support a different language on each
thread.  It is perfectly fine to have runtime restrictions for a smaller
profile of the framework.

> I wrote above : *full memory cache* - under this circumstance it is *true*

OK, but that is not the case we should implement.  As we have seen,
there are viable alternatives, and comparing a viable alternative to
the worst-case scenario is what I was pointing out as inadequate.

> > As I repeated a number of times: you do *not* need to use a Hashtable.
> > In fact, Microsoft .NET does *not* use a Hashtable, they use an
> > "internal" method that maps strings to their index, using a binary
> > search.
> 
> MS *does* load *everything* into memory at the first access of the first
> string. So they are using a binary search *in memory*

Nope.  Microsoft creates a Stream that *points* to an mmapped region of
the file, which will let the kernel do on-demand-loading, and will let
the kernel discard the pages when it needs the memory.

We should, btw, implement this feature.

> As I wrote in the last post if you load it directly from HDD you can easily
> get situations with thousands of HDD/ Network Seeks+Reads per second - if
> your HDD is fast enough to deliver that in a second ;)

I tested this with NFS using Monodoc with binary search on an index with
2971 entries, and it took 14 NFS accesses on the first search.  Searches
after that used at most two accesses (sometimes none; I am guessing the
cache is at play here).

It is working very well here.  
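
As a rough sanity check on those numbers: a binary search over n sorted
entries inspects at most floor(log2(n)) + 1 of them, which is 12 for
n = 2971.  That is in the same ballpark as the 14 accesses above, if one
assumes roughly one access per probe plus a little overhead (that mapping
from probes to NFS accesses is an assumption, not something measured
here).  The sketch below is plain Python with made-up names and keys; it
is not Monodoc code.

```python
def max_probes(n):
    """Worst-case entries inspected by a binary search over n sorted items."""
    # For n > 0, n.bit_length() equals floor(log2(n)) + 1.
    return n.bit_length()


def count_probes(sorted_keys, target):
    """Binary search returning (index or -1, number of entries inspected)."""
    lo, hi = 0, len(sorted_keys) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] == target:
            return mid, probes
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes


# Hypothetical index of the same size as the one in the test above.
keys = [f"key{i:05d}" for i in range(2971)]
worst = max(count_probes(keys, k)[1] for k in keys)
print(max_probes(2971), worst)
```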

That being said, very little software runs from a network share, and in
those cases, people already have solutions for the problem (the
/opt/depot structure for example).

> It would use most memory in *every* scenario except the one where no
> localization takes place at all

A binary search with a full string would use slightly more memory (and
by slightly I mean, less than 40 bytes on average) than the
Microsoft.NET version.

> I'm wondering if this solution with directly performing a binary search on
> HDD would work then why MS decided to load the whole thing into memory.

They do not load the thing in memory.  They mmap the file (which looks
like loading into memory), but in reality the kernel is in control, and
they *do* perform a binary search on the stream they loaded.

The proof is in the Rotor source code.

The kernel can provide the page on demand when a part of it is touched,
and can safely drop it at any point (since it has the backing store
available, and can hence re-load it at any point).
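
To make the idea concrete, here is a minimal sketch of a binary search
over an mmapped index file.  It is written in Python for brevity and is
not the Rotor or Mono implementation: the record layout (a 32-byte
NUL-padded key followed by a 4-byte value) and the names KEY_SIZE,
write_index and lookup are assumptions made up for this example.  The
point is only that the search touches a handful of records, so the kernel
only has to page in the parts of the file that are actually read.

```python
import mmap
import os
import tempfile

KEY_SIZE = 32                 # hypothetical fixed key width, in bytes
RECORD_SIZE = KEY_SIZE + 4    # key followed by a 4-byte little-endian value


def pad(key):
    """Encode a key and NUL-pad it to KEY_SIZE bytes."""
    raw = key.encode("utf-8")
    if len(raw) > KEY_SIZE:
        raise ValueError("key too long for this toy format")
    return raw.ljust(KEY_SIZE, b"\0")


def write_index(path, entries):
    """Write {key: value} as fixed-width records sorted by padded key."""
    records = sorted((pad(k), v) for k, v in entries.items())
    with open(path, "wb") as f:
        for key, value in records:
            f.write(key)
            f.write(value.to_bytes(4, "little"))


def lookup(mapped, key):
    """Binary search the mapped file; returns the value or None."""
    needle = pad(key)
    lo, hi = 0, len(mapped) // RECORD_SIZE - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        offset = mid * RECORD_SIZE
        candidate = mapped[offset:offset + KEY_SIZE]  # reads one record's key
        if candidate == needle:
            value = mapped[offset + KEY_SIZE:offset + RECORD_SIZE]
            return int.from_bytes(value, "little")
        if candidate < needle:
            lo = mid + 1
        else:
            hi = mid - 1
    return None


def demo():
    """Build a throwaway index, map it read-only, and run two lookups."""
    entries = {f"message{i:04d}": i for i in range(2971)}
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_index(path, entries)
        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
                return lookup(mapped, "message1234"), lookup(mapped, "no-such-key")
    finally:
        os.remove(path)


print(demo())
```

Nothing here copies the file into a private buffer up front.  The mapping
is read-only and file-backed, which is what lets the kernel drop pages
under memory pressure and re-read them later.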

Miguel.