[Mono-devel-list] String constants and localization

Piers Haken piersh at friskit.com
Mon Jul 14 10:34:19 EDT 2003


Yes, but using an enum doesn't save you any space in the assembly
because its definition must include the names of its members.
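
A minimal sketch of what I mean, assuming a hypothetical MonoString
enum with two made-up members (only GenericENullNotAllowed appears in
the proposal). The call site compiles down to an integer, but the member
names stay in the assembly's metadata, which is why ToString() and
Enum.GetNames() can still return them:

```csharp
using System;

// Hypothetical enum standing in for the proposed string keys.
enum MonoString
{
    GenericENullNotAllowed,
    NullNotProvided
}

static class EnumNamesDemo
{
    static void Main ()
    {
        MonoString key = MonoString.NullNotProvided;

        // At the call site the key is just an integer constant.
        Console.WriteLine ((int) key);

        // The member names are read back from metadata at run time,
        // so they have to be stored in the assembly.
        Console.WriteLine (key.ToString ());
        Console.WriteLine (String.Join (",", Enum.GetNames (typeof (MonoString))));
    }
}
```

So the key text is not gone, it has only moved from the string
constants into the enum definition.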

Piers.

> -----Original Message-----
> From: A - Soft Technologies [mailto:A-Soft at A-SoftTech.com] 
> Sent: Monday, July 14, 2003 6:43 AM
> To: Piers Haken; Miguel de Icaza
> Cc: mono-devel-list at lists.ximian.com
> Subject: Re: [Mono-devel-list] String constants and localization
> 
> 
> Hi,
> 
> I'm not sure what you mean by this. But if you mean that you
> cannot remove the enumeration after compiling because somebody
> might call enum.ToString(), then you are right.
> However, there is no case in which anybody should call ToString
> on that enum anywhere. The enum in this case basically serves
> as an auto-(re)indexing int field.
> 
> Andreas
> 
> ----- Original Message ----- 
> From: "Piers Haken" <piersh at friskit.com>
> To: "Andreas Nahr" <ClassDevelopment at A-SoftTech.com>; "Miguel 
> de Icaza" <miguel at ximian.com>
> Cc: <mono-devel-list at lists.ximian.com>
> Sent: Monday, July 14, 2003 3:37 PM
> Subject: RE: [Mono-devel-list] String constants and localization
> 
> 
> You're forgetting enum.ToString().
> 
> Piers.
> 
> > -----Original Message-----
> > From: Andreas Nahr [mailto:ClassDevelopment at A-SoftTech.com]
> > Sent: Monday, July 14, 2003 1:39 AM
> > To: Miguel de Icaza
> > Cc: mono-devel-list at lists.ximian.com
> > Subject: Re: [Mono-devel-list] String constants and localization
> >
> >
> > Hi,
> >
> > I've read your answer, but it seems that at quite a few points you
> > overlooked advantages (maybe I'm wrong about some of these, but I
> > don't think so). So I added some additional comments to it.
> >
> > > > right now there is nearly no localization support in the Mono class
> > > > libraries and all strings (mainly for errors) are hardcoded into the
> > > > source-files.
> > >
> > > Thanks for this proposal, I have some comments in this email about the
> > > specifics of the proposal.
> > >
> > > Initially, I wanted to use it, but it meant that we would have to:
> > >
> > > * deviate from the standard practice (something I would not
> > >   mind, if there were strong enough arguments for)
> >
> > The basic arguments are:
> > * Much faster
> > * Much smaller assembly size (see below)
> > * Much lower RAM usage
> > * Safer when programming, because e.g. typos become compile errors
> >
> > > * Create and maintain a new infrastructure for localization.
> > >   Not bad per-se, but it would minimize the reuse of existing
> > >   knowledge that people might acquire or obtain from the NET.
> > >
> > > * Reimplement the chunks we already have for handling resources
> > >   in corlib to cope with all the CultureInfo bits.
> >
> > This is not necessarily true. The sample implementation I did uses the
> > System.Resources namespace to get its localized data internally. More
> > specifically, it ALLOWS you to use it if you want, but does not force
> > you to if there is a better solution somebody wants to implement. And
> > you can change this solution at any time without having to change
> > anything in the sources. You could even use the string tables used now
> > and still save a little memory (see below: Strings as 16bit).
> >
> > > I also want to avoid loading all the strings in memory, but it is 
> > > possible to do so:
> > >
> > > > with a call like
> > > > Print (MS.GetString (MonoString.GenericENullNotAllowed));
> > >
> > > We should use the Resource infrastructure in .NET here: there are many
> > > issues related to loading the proper assembly given the selected
> > > CultureInfo, and the code is mostly implemented.
> > >
> > > The file format for resources allows for this case: it is possible to
> > > fetch the information without having to load all the strings to
> > > localize.
> > >
> > > What we need to do is improve the implementation of
> > > ResourceSet.GetObject.  Basically we should define an internal method
> > > in the ResourceReader that can do lookups based on strings, without
> > > having to use the resource enumerator.
> >
> > OK - but IMHO your solution has two flaws:
> > * Reimplement the chunks we already have for handling resources
> >    in corlib to cope with all the CultureInfo bits (which is exactly
> >    what you wanted to avoid above)
> > * Sooner or later you will always come to the GetResourceStream
> >    function, which actually provides a memory stream, which means
> >    loading everything into memory (and if you want to provide a
> >    complete second infrastructure for strings, then IMHO that would be
> >    FAR more work than anything you might have to do to implement
> >    something like my suggested solution)
> >
> > > We already have an API that can load a string from an index, so the
> > > only thing we have to do is perform a binary search on the strings in
> > > the file (like Monodoc does now for its help).
> >
> > Sorry, but IMHO this is total overkill. You want to perform a binary
> > search DIRECTLY on a file containing an estimated 200KB of string values
> > EVERY time we do a string lookup. Are you sure this won't totally fry
> > your HDD? And what if the assembly we are accessing is on e.g. a
> > network share that has slow access times? IMHO you will need to load
> > that string index into memory in any case to perform a binary search
> > (or probably ANY other search).
> >
> > As I already said: even with a binary search you will only get search
> > speeds of O(log n), while my solution would get O(1), and that is
> > without taking into account that you have to do the binary search on
> > STRINGS, not on ints.
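> >
> > To make the comparison concrete, here is a rough sketch of the two
> > lookup styles. All names and table contents are hypothetical, and
> > both tables are held in memory here, so this only illustrates the
> > O(1) versus O(log n) difference and not the file access:
> >
> > ```csharp
> > using System;
> >
> > // Hypothetical key enum; the values double as array indexes.
> > enum MonoString { GenericENullNotAllowed, NullNotProvided }
> >
> > static class LookupDemo
> > {
> >     // Enum-indexed table: one array access per lookup, O(1).
> >     static readonly string[] byIndex = {
> >         "Null is not allowed here",
> >         "Null not provided"
> >     };
> >
> >     // String-keyed table: keys kept in ordinal order for binary search.
> >     static readonly string[] sortedKeys = {
> >         "GenericENullNotAllowed", "NullNotProvided"
> >     };
> >     static readonly string[] sortedValues = {
> >         "Null is not allowed here", "Null not provided"
> >     };
> >
> >     public static string GetString (MonoString id)
> >     {
> >         return byIndex [(int) id];
> >     }
> >
> >     public static string GetString (string key)
> >     {
> >         // O(log n) string comparisons per lookup.
> >         int i = Array.BinarySearch (sortedKeys, key, StringComparer.Ordinal);
> >         return i >= 0 ? sortedValues [i] : null;
> >     }
> >
> >     static void Main ()
> >     {
> >         Console.WriteLine (GetString (MonoString.NullNotProvided));
> >         Console.WriteLine (GetString ("NullNotProvided"));
> >     }
> > }
> > ```
> >
> > Both calls print the same text; the difference is only in how much
> > work each lookup does and in what a typo in the key turns into (a
> > compile error for the enum, a null result for the string).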
> >
> > > > The Advantages are:
> > > > * Smaller Assemblies (probably leads to faster runtime performance
> > > > in the JIT also, because JITing a constant int should be faster than
> > > > JITing a constant string)
> > >
> > > Well, the space that you save on strings, say the string:
> > >
> > > "Null not provided"
> > >
> > > Would be encoded into an enumeration:
> > >
> > > Null_Not_Provided
> > >
> > > And that would end up in the metadata as well, so the change in size
> > > is only half the size (strings are stored in 16-bit ucs-2 encoding).
> >
> > I didn't even think about the savings from not having to store the
> > keys as Unicode ;) - that adds to the data savings even more :)
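> >
> > A quick way to check the "half the size" figure for the example
> > above. This is only a sketch: it assumes that metadata names are
> > stored as UTF-8, and it ignores length prefixes, terminators and
> > table overhead:
> >
> > ```csharp
> > using System;
> > using System.Text;
> >
> > static class SizeDemo
> > {
> >     static void Main ()
> >     {
> >         string literal = "Null not provided"; // string constant
> >         string member = "Null_Not_Provided";  // enum member name
> >
> >         // String constants use 16-bit code units: 2 bytes per char.
> >         Console.WriteLine (Encoding.Unicode.GetByteCount (literal)); // 34
> >
> >         // Assumption: the member name is stored as UTF-8 in metadata.
> >         Console.WriteLine (Encoding.UTF8.GetByteCount (member));     // 17
> >     }
> > }
> > ```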
> >
> > I think you are overlooking a LOT of things here:
> >
> > First example:
> > 1. Mono now: Key = "Null not provided",
> >    Translation = "Null not provided"
> > 2. Suggestion: Key = Null_Not_Provided,
> >    Translation = "Null not provided"
> >
> > In that case the key is about the size of the Translation. As you said,
> > we only need half the size for the enum value. So we need:
> > 1. Memory: SuggestionKey * 2 * 2 (we also need it in the lookup table)
> >    + Translation
> > 2. Memory: SuggestionKey * 1 + Translation
> > SAVING is: SuggestionKey * 3
> > If you want to store the string somewhere to not have to hardcode it
> > into each individual class to prevent e.g. spelling errors (seems to be
> > what MS does) this even grows to a saving of: SuggestionKey * 4
> > with inlining? active at compile time to a saving of: SuggestionKey * 6
> >
> > Second example (IMHO somewhere near what it could be in reality):
> > 1. Mono now: Key = "Null not provided because we have never provided
> >    null", Translation = "Null not provided because we have never
> >    provided null"
> > 2. Suggestion: Key = Null_Never_Provided, Translation = "Null not
> >    provided because we have never provided null"
> >
> > This should show the savings at a SuggestionKeySize of about half the
> > size of the string itself (I would estimate this to be a good total
> > average):
> > 1. Memory: SuggestionKey * 2 * 2 (we also need it in the lookup table)
> >    * 2 (double the size) + Translation
> > 2. Memory: SuggestionKey * 1 + Translation
> > SAVING is: SuggestionKey * 7 !!!!!!
> > For the other options described above it would even save:
> > SuggestionKey * 8 or SuggestionKey * 12
> >
> > A saving of SuggestionKey * 7 (with the settings of the second
> > example) would in reality mean a saving of about 70% of the TOTAL size
> > (including the translation).
> > In the first example we would save about 60% of the size.
> >
> > Also, for extremely memory-limited devices you can probably remove the
> > enumeration completely after compiling (all enum members are compiled
> > into ints), which increases the savings even more.
> >
> > Everything I have stated so far is just savings in assembly size. At
> > runtime the savings are EVEN HIGHER! At runtime Mono should never need
> > to access the enumeration keys (everything is an int by then), so the
> > need for RAM is probably about 80% LESS than with the current
> > solution!!!! All that with more programming safety and much higher
> > access speeds at much lower CPU usage.
> >
> > Andreas
> >
> > _______________________________________________
> > Mono-devel-list mailing list
> > Mono-devel-list at lists.ximian.com 
> > http://lists.ximian.com/mailman/listinfo/mono-devel-list
> >
