[Mono-devel-list] String constants and localization
Piers Haken
piersh at friskit.com
Mon Jul 14 09:37:51 EDT 2003
You're forgetting enum.ToString().
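
A minimal C# sketch of the point, assuming an enum shaped like the one proposed in the quoted message below (the enum here is hypothetical and only reuses identifiers mentioned in the thread). Calling ToString() on an enum value returns the member's name, so code that does this relies on the names still being present in the assembly metadata.

```csharp
using System;

// Hypothetical enum built from identifiers mentioned in the thread.
enum MonoString
{
    GenericENullNotAllowed,
    Null_Not_Provided
}

static class EnumNameDemo
{
    static void Main()
    {
        MonoString key = MonoString.Null_Not_Provided;

        // The compiled call site only carries the integer value...
        Console.WriteLine((int)key);        // prints 1

        // ...but ToString() maps that value back to the member name,
        // which requires the name to be stored with the enum type.
        Console.WriteLine(key.ToString());  // prints Null_Not_Provided
    }
}
```

So the enum names cannot be treated as free, or stripped, wherever ToString() may be called on them.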
Piers.
> -----Original Message-----
> From: Andreas Nahr [mailto:ClassDevelopment at A-SoftTech.com]
> Sent: Monday, July 14, 2003 1:39 AM
> To: Miguel de Icaza
> Cc: mono-devel-list at lists.ximian.com
> Subject: Re: [Mono-devel-list] String constants and localization
>
>
> Hi,
>
> I've read your answer, but it seems that at quite a few points
> you overlooked advantages (maybe I'm wrong about some of
> these, but I don't think so). So I added some additional
> comments to it.
>
> > > right now there is nearly no localization support in the Mono
> > > class libraries and all strings (mainly for errors) are
> > > hardcoded into the source-files.
> >
> > Thanks for this proposal, I have some comments in this email
> > about the specifics of the proposal.
> >
> > Initially, I wanted to use it, but it meant that we would have to:
> >
> > * deviate from the standard practice (something I would not
> > mind, if there were strong enough arguments for)
>
> The basic arguments are:
> * Much faster
> * Much smaller Assembly size (see below)
> * Much smaller RAM need
> * Safer when programming, because e.g. typos become compile errors
>
> > * Create and maintain a new infrastructure for localization.
> > Not bad per-se, but it would minimize the reuse of existing
> > knowledge that people might acquire or obtain from the NET.
> >
> > * Reimplement the chunks we already have for handling resources
> > in corlib to cope with all the CultureInfo bits.
>
> This is not necessarily true. The sample implementation I did
> uses the System.Resources namespace to get its localized
> data internally. More specifically, it ALLOWS you to use it if you
> want, but does not force you to if there is a better solution
> somebody wants to implement. And you can change this solution
> at any time without having to change anything in the sources.
> You could even use the string tables used now and still save
> a little memory (see below: Strings as 16bit).
>
> > I also want to avoid loading all the strings in memory, but it is
> > possible to do so:
> >
> > > with a call like
> > > Print (MS.GetString (MonoString.GenericENullNotAllowed));
> >
> > We should use the Resource infrastructure in .NET here: there are
> > many issues related to loading the proper assembly given the
> > selected CultureInfo, and the code is mostly implemented.
> >
> > The file format for resources allows for this case: it is possible
> > to fetch the information without having to load all the strings to
> > localize.
> >
> > What we need to do is improve the implementation of
> > ResourceSet.GetObject. Basically we should define an internal
> > method in the ResourceReader that can do lookups based on strings,
> > without having to use the resource enumerator.
>
> OK - but IMHO your solution has two flaws:
> * Reimplement the chunks we already have for handling resources
>   in corlib to cope with all the CultureInfo bits (which is
>   exactly what you wanted to avoid above)
> * Sooner or later you will always come to the GetResourceStream
>   function, which actually provides a memory stream, which means
>   loading everything into memory (and if you want to provide a
>   complete second infrastructure for strings, that would IMHO be
>   FAR more work than anything you might have to do to implement
>   something like my suggested solution)
>
> > We already have an API that can load a string from an index, so
> > the only thing we have to do is perform a binary search on the
> > strings in the file (like Monodoc does now for its help).
>
> Sorry but IMHO this is total overkill. You want to perform a
> binary search DIRECTLY on a file containing an estimated
> 200KB of string values EVERY time we do a string lookup. Are you
> sure this won't totally fry your HDD? And what if the
> assembly we are accessing is on e.g. a network share that has
> slow access times? IMHO you will need to load that string index
> into memory in any case to perform a binary search (or
> probably ANY other search)
>
> As I already said: Even with a binary search you will just
> get search speeds of O(log n) while my solution would get O(1)
> and that is without taking into account that you have to do
> the binary search on STRINGS, not on ints
>
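
The two lookup strategies being compared can be sketched as follows. This is only an illustration in C#: the MS.GetString and MonoString.GenericENullNotAllowed names come from the proposal quoted above, while NullNotProvided, the sample texts, the in-memory arrays, and GetStringByName are assumptions made up for the example. It does not show how either side would actually read resources from disk.

```csharp
using System;

// Hypothetical message identifiers; the values double as array indexes.
enum MonoString
{
    GenericENullNotAllowed = 0,
    NullNotProvided = 1
}

static class MS
{
    // Enum-indexed table: one slot per MonoString member (sample texts).
    static readonly string[] ByIndex =
    {
        "Null is not allowed here",
        "Null not provided"
    };

    // String-keyed table: keys sorted ordinally. In this small example the
    // sorted order matches the enum order, so ByIndex serves as the values.
    static readonly string[] SortedKeys =
    {
        "GenericENullNotAllowed",
        "NullNotProvided"
    };

    // O(1): a single array access using the integer behind the enum.
    public static string GetString(MonoString id)
    {
        return ByIndex[(int)id];
    }

    // O(log n) string comparisons: binary search over the sorted keys.
    public static string GetStringByName(string key)
    {
        int i = Array.BinarySearch<string>(SortedKeys, key, StringComparer.Ordinal);
        return i >= 0 ? ByIndex[i] : null;
    }
}

static class LookupDemo
{
    static void Main()
    {
        Console.WriteLine(MS.GetString(MonoString.GenericENullNotAllowed));
        Console.WriteLine(MS.GetStringByName("NullNotProvided"));
    }
}
```

Both calls return the same kind of text; the difference under discussion is only the cost of finding it.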
> > > The Advantages are:
> > > * Smaller Assemblies (probably leads to faster runtime
> > > performance in Jit also because Jiting a constant int should be
> > > faster than Jiting a constant string)
> >
> > Well, the space that you save on strings, say the string:
> >
> > "Null not provided"
> >
> > Would be encoded into an enumeration:
> >
> > Null_Not_Provided
> >
> > And that would end up in the metadata as well, so the change in
> > size is only half the size (strings are stored in 16-bit ucs-2
> > encoding).
>
> I didn't even think about savings from not-having-to store as
> unicode ;) - that even adds to data savings :)
>
> I think you are overlooking a LOT of things here:
>
> First example:
> 1. Mono now: Key = "Null not provided",
>    Translation = "Null not provided"
> 2. Suggestion: Key = Null_Not_Provided,
>    Translation = "Null not provided"
>
> In that case key equals about the size of Translation. As you
> said we only need half the size for the enum value. So we need:
> 1. Memory: SuggestionKey * 2 * 2 (we also need it in the lookup
>    table) + Translation
> 2. Memory: SuggestionKey * 1 + Translation
> SAVING is: SuggestionKey * 3
>
> If you want to store the string somewhere to not have to hardcode
> it into each individual class to prevent e.g. spelling errors
> (seems to be what MS does) this even grows to a saving of:
> SuggestionKey * 4
> with inlining? active at compiling to a saving of: SuggestionKey * 6
>
> Second example (IMHO somewhere about what it could be in reality):
> 1. Mono now: Key = "Null not provided because we have never
>    provided null", Translation = "Null not provided because we
>    have never provided null"
> 2. Suggestion: Key = Null_Never_Provided, Translation = "Null not
>    provided because we have never provided null"
>
> This should show the savings at a SuggestionKeySize about half of
> the size of the string itself (I would estimate this to be a good
> total average):
> 1. Memory: SuggestionKey * 2 * 2 (we also need it in the lookup
>    table) * 2 (double the size) + Translation
> 2. Memory: SuggestionKey * 1 + Translation
> SAVING is: SuggestionKey * 7 !!!!!!
> For the other options described above it would even save:
> SuggestionKey * 8 or SuggestionKey * 12
>
> A saving of SuggestionKey * 7 (with the settings of the
> second example) would in reality mean a saving of about 70%
> of the TOTAL size (including the translation).
> In the first example we would save about 60% of the size.
>
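
The 60% and 70% figures can be checked with a little arithmetic. The C# sketch below measures everything in units of one enum key (SuggestionKey = 1). It assumes the translation costs about one unit in the first example and about two units in the second; the message does not state this, but it is the assumption under which the quoted percentages come out. The SavingPercent helper is made up for this illustration.

```csharp
using System;

public static class SavingsCheck
{
    // Percentage saved when a string key costing keyFactor units is replaced
    // by an enum key costing 1 unit, with the translation cost unchanged.
    public static int SavingPercent(int keyFactor, int translation)
    {
        int current = keyFactor + translation;   // string key + translation
        int proposed = 1 + translation;          // enum key + translation
        return 100 * (current - proposed) / current;
    }

    public static void Main()
    {
        // First example: key factor 2 * 2 = 4, translation about 1 unit.
        Console.WriteLine(SavingPercent(2 * 2, 1));      // prints 60

        // Second example: key factor 2 * 2 * 2 = 8, translation about 2 units.
        Console.WriteLine(SavingPercent(2 * 2 * 2, 2));  // prints 70
    }
}
```

The result depends heavily on how the translation is stored; a different encoding or a longer translation shifts both percentages.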
> Also, for extremely memory-limited devices you can probably
> remove the enumeration completely after compiling (all enum
> members are compiled into ints), which increases savings even more.
>
> Everything I have stated so far is just savings in assembly size. At
> runtime the savings are EVEN HIGHER! At runtime Mono should
> never need to access the enumeration keys (everything is int
> now) so the need for RAM is probably about 80% LESS than the
> current solution!!!! All that with more programming safety and
> much higher access speeds at much lower CPU usage.
>
> Andreas
>
> _______________________________________________
> Mono-devel-list mailing list
> Mono-devel-list at lists.ximian.com
> http://lists.ximian.com/mailman/listinfo/mono-devel-list
>