[Mono-list] Re: bug report for new JIT

Paolo Molaro lupus@ximian.com
Wed, 9 Apr 2003 13:14:11 +0200

On 04/09/03 dietmar wrote:
> On Tue, 2003-03-25 at 19:23, Zoltan.2.Varga@nokia.com wrote: 
> > class Bug {
> > 
> >         static public void Main() {
> >                 string s2 = "C";
> >                 string s = "AB" + s2;
> > 
> >                 switch (s) {
> >                 case "ABC":
> >                         Console.WriteLine ("OK.");
> >                         break;
> >                 default:
> >                         Console.WriteLine ("NOT OK");
> >                         break;
> >                 }
> >         }
> > }
> mcs creates the following code for the switch (s)
>         IL_001a: ldloc.2
>         IL_001b: call string valuetype [corlib]System.String::IsInterned(string)
>         IL_0020: stloc.2
>         IL_0021: ldloc.2
>         IL_0022: ldstr "ABC"
>         IL_0027: bne.un IL_0040
> The problem is that "ABC" is not interned when IsInterned is called.
> csc produces the following code instead:
> /* generate a ldstr for each case label*/ 
>         IL_0012: ldstr "ABC"
> 	/* pop all values from evaluation stack */
>         IL_0017: leave.s IL_0019
> 	...
>         IL_001e: ldloc.2
>         IL_001f: call string valuetype [corlib]System.String::IsInterned(string)
>         IL_0024: stloc.2
>         IL_0025: ldloc.2
>         IL_0026: ldstr "ABC"
>         IL_002b: beq.s IL_002f
> This code makes sure that all case values are interned before IsInterned
> is called. This code also works with AppDomains.
> The AOT compiler also triggers this problem.
> Paolo said the the correct fix would be to have some runtime magic to
> intern all used strings before a method is called. I agree that this
> would work, but its complex to implement and produces slow code.

Here is the answer I got from the MS people a year ago when I brought up
the issue:
	Easiest way is just to intern each string that is loaded by ldstr when
	you JIT the method containing the ldstr opcode. That way the JITted
	version of ldstr can just load the reference that is already held by the
	intern table.

I think it's easier for them, because they use a sort of root appdomain
where mscorlib is loaded and I guess they intern the strings in this
root appdomain: maybe someone can write a test that uses ldstr to load
the same string in two different appdomains and check the string objects have
the same address. We can either implement this same solution (it would
solve also other issues, like the need for the icall for the
process_guid lluis found recently and it would lower memory usage).
Or we can do it just for the string intern pool: have it shared across
appdomains (it should not be too complex).

And since this is done at compile time it's way faster than your
proposed workaround that needs to call ldstr for each of the case stmts
every time the code is executed.

> My suggestion is to modify mcs to produce the same code as csc, which
> should be easy to implement.

But this would not solve the issue, since one can always write the
(correct) IL code mcs currently produces and the bug resurfaces.


lupus@debian.org                                     debian/rules
lupus@ximian.com                             Monkeys do it better