[Mono-list] Performance / array access
Piers Haken
piersh@friskit.com
Sun, 19 Jan 2003 13:29:31 -0800
This is a multi-part message in MIME format.
------_=_NextPart_001_01C2C001.DACA076E
Content-Type: text/plain;
charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Also significant optimizations that MS does here are frame-pointer
omission and passing params in registers.
Tom, can you try your tests on MS with the following change: ?
- for(int i=3D0; i< size; i++)
---
+ for(int i=3D0; i< array.Length; i++)
Piers.
> -----Original Message-----
> From: Tom Fransen [mailto:t.fransen@mailned.nl]=20
> Sent: Sunday, January 19, 2003 8:02 AM
> To: Mitchell Skinner
> Cc: Mono-list@ximian.com
> Subject: RE: [Mono-list] Performance / array access
>=20
>=20
> Hello all,
>=20
> Mitch, to come back on the issue of bound checking. I looked=20
> further into the problem and my first conclusion was wrong.=20
> Piers already stated that he did not expect that the MS=20
> compiler would be able to optimize all these check out of the loop.
>=20
> I again started looking into the assembly code and although=20
> I'm not a assembly programmer (it has been too long since I=20
> done some 6502/Z80 :-) I think the performance difference=20
> (>2) comes from the optimization performered by the MS JIT=20
> compiler (I am using version 1.0 of the SDK).
>=20
> The MS JIT seems to be better at allocating processor=20
> registers to certain variables resulting in much faster code.=20
> The execution speed of the Mono code is in the same order as=20
> the unoptimized code generated by the MS JIT compiler (for=20
> tight loops accessing arrays). In case of my bubble sorting=20
> algorithm this results in a speed difference of a factor 3. I=20
> have made other tests like applying matrix multiplications on=20
> a 2-dimensional array (to simulate an image processing=20
> operation) and also here Mono lacks behind. I haven't sorted=20
> this out but my first guess is that its again the register=20
> optimization that makes the difference.
>=20
> For the code snippets below both the MS and Mono code is given:
>=20
> private void TestMethod(int size)
> {
> array =3D new int[size];
> for(int i=3D0; i< size; i++)
> {
> array[i] =3D size - i;
> }
> Console.WriteLine("Done" );
> string s =3D Console.ReadLine();
> }
>=20
> The MS JIT generates the following code. Note that
> I first run the program and latter when the program reads
> from the console attach the debugger. If I run the program=20
> from the MS development environment the JIT seems to=20
> generate unoptimized code so I need this trick.
>=20
>=20
>=20
> MS Code generated by the JIT unoptimized=20
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>=20
>=20
> private void TestMethod(int size)
> {
> array =3D new int[size];
> 00000000 push ebp
> 00000001 mov ebp,esp
> 00000003 sub esp,14h
> 00000006 push edi
> 00000007 push esi
> 00000008 push ebx
> 00000009 mov ebx,ecx
> 0000000b mov esi,edx
> 0000000d xor edi,edi
> 0000000f mov dword ptr [ebp-10h],0
> 00000016 test esi,80000000h
> 0000001c je 00000025
> 0000001e xor ecx,ecx
> 00000020 call 763A2CFF
> 00000025 mov edx,esi
> 00000027 mov ecx,2DE6B3Ah
> 0000002c call FD4A2040
> 00000031 lea edx,[ebx+4]
> 00000034 call 76479768
> for(int i=3D0; i< size; i++)
> 00000039 xor edi,edi
> 0000003b nop
> 0000003c jmp 00000056
> array[i] =3D size - i;
> 0000003e mov eax,dword ptr [ebx+4] <=3D=3D=3D=3D Start=20
> loop, fetch the
> size of the array
> 00000041 cmp edi,dword ptr [eax+4] <=3D=3D=3D=3D Check if=20
> within bounds
> 00000044 jb 0000004D
> 00000046 xor ecx,ecx
> 00000048 call 762EED87 <=3D=3D=3D=3D=20
> Trigger exception
> 0000004d mov edx,esi <=3D=3D=3D=3D=20
> esi contain variable size
> 0000004f sub edx,edi
> 00000051 mov dword ptr [eax+edi*4+8],edx
> for(int i=3D0; i< size; i++)
> 00000055 inc edi <=3D=3D=3D=3D edi=20
> contain variable i
> 00000056 cmp edi,esi
> 00000058 jl 0000003E <=3D=3D=3D=3D Jump to=20
> start of for
> loop
> Console.WriteLine("Done" );
> 0000005a mov ecx,dword ptr ds:[01C40178h]
> 00000060 call 76AA3250
> string s =3D Console.ReadLine();
> 00000065 call 76AA2F10
> 0000006a mov dword ptr [ebp-14h],eax
> 0000006d mov eax,dword ptr [ebp-14h]
> 00000070 mov dword ptr [ebp-10h],eax
> }
> 00000073 nop
> 00000074 pop ebx
> 00000075 pop esi
> 00000076 pop edi
> 00000077 mov esp,ebp
> 00000079 pop ebp
> 0000007a ret
>=20
>=20
> MS Code generated by the JIT optimized=20
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>=20
>=20
> private void TestMethod(int size)
> {
> array =3D new int[size];
> 00000000 push edi
> 00000001 push esi
> 00000002 mov edi,ecx
> 00000004 mov esi,edx
> 00000006 test esi,80000000h
> 0000000c jne 0000005C
> 0000000e mov edx,esi
> 00000010 mov ecx,2DE6A22h
> 00000015 call FD5B2098
> 0000001a lea edx,[edi+4]
> 0000001d call 765897C0
> for(int i=3D0; i< size; i++)
> 00000022 xor ecx,ecx
> 00000024 cmp esi,0
> 00000027 jle 00000040
> 00000029 mov edx,dword ptr [edi+4]
> 0000002c mov edi,dword ptr [edx+4]
> array[i] =3D size - i;
> 0000002f cmp ecx,edi <=3D=3D=3D=20
> start of loop
> 00000031 jae 00000064 =09
> <=3D=3D=3D index out of bounds
> 00000033 mov eax,esi <=3D=3D=3D=20
> esi contain
> variable size
> 00000035 sub eax,ecx
> 00000037 mov dword ptr [edx+ecx*4+8],eax
> for(int i=3D0; i< size; i++)
> 0000003b inc ecx <=3D=3D=3D=3D=20
> ecx contain variable i
> 0000003c cmp ecx,esi
> 0000003e jl 0000002F =09
> <=3D=3D=3D=3D jump to start of for loop
> Console.WriteLine("Done" );
> 00000040 mov ecx,dword ptr ds:[01C4086Ch]
> 00000046 mov edx,dword ptr ds:[01C40110h]
> 0000004c mov eax,dword ptr [ecx]
> 0000004e call dword ptr [eax+000000D8h]
> string s =3D Console.ReadLine();
> 00000054 call 76BB2F68
> 00000059 pop esi
> }
> 0000005a pop edi
> 0000005b ret
> 0000005c xor ecx,ecx
> 0000005e call 764B2D57
> 00000063 int 3
> 00000064 xor ecx,ecx =09
> <=3D=3D=3D=3D Exception
> 00000066 call 763FEDDF
> 0000006b int 3
>=20
>=20
> The Mono (0.18) JIT generates the following code (--dump-asm)=20
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>=20
> Disassembly of section .text:
>=20
> 00000000 <SpeedBenchmarks.BubbleSort_TestMethod>:
> 0: 55 push %ebp
> 1: 8b ec mov %esp,%ebp
> 3: 53 push %ebx
> 4: 56 push %esi
> 5: 83 ec 28 sub $0x28,%esp
> 8: c7 45 d8 00 00 00 00 movl $0x0,0xffffffd8(%ebp)
> f: 8b 45 08 mov 0x8(%ebp),%eax
> 12: 05 08 00 00 00 add $0x8,%eax
> 17: 83 38 00 cmpl $0x0,(%eax)
> 1a: 8b 4d 0c mov 0xc(%ebp),%ecx
> 1d: 50 push %eax
> 1e: 51 push %ecx
> 1f: 52 push %edx
> 20: 51 push %ecx
> 21: 68 54 7e 18 08 push $0x8187e54
> 26: e8 b5 c2 e4 ff call ffe4c2e0
> <SpeedBenchmarks.BubbleSort_TestMethod+0xffe4c2e0>
> 2b: 83 c4 08 add $0x8,%esp
> 2e: 5a pop %edx
> 2f: 59 pop %ecx
> 30: 8b c8 mov %eax,%ecx
> 32: 58 pop %eax
> 33: 89 08 mov %ecx,(%eax)
> 35: be 00 00 00 00 mov $0x0,%esi
> <=3D=3D=3D esi contain variable i
> 3a: e9 29 00 00 00 jmp 68
> <SpeedBenchmarks.BubbleSort_TestMethod+0x68>
> 3f: 8b 45 08 mov 0x8(%ebp),%eax
> <=3D=3D=3D start loop
> 42: 8b 40 08 mov 0x8(%eax),%eax
> \
> 45: 8b ce mov %esi,%ecx
> |
> 47: 3b 48 0c cmp 0xc(%eax),%ecx
> | Bounds check
> 4a: 72 0a jb 56
> <SpeedBenchmarks.BubbleSort_TestMethod+0x56>/
> 4c: 68 e1 26 11 08 push $0x81126e1
> 51: e8 ce 56 ed ff call ffed5724
> <SpeedBenchmarks.BubbleSort_TestMethod+0xffed5724> <=3D=3D=3D=20
> Generate exception
> 56: 8d 44 88 10 lea 0x10(%eax,%ecx,4),%eax
> 5a: 8b 4d 0c mov 0xc(%ebp),%ecx
> <=3D=3D retrieve size
> 5d: 2b ce sub %esi,%ecx
> 5f: 89 08 mov %ecx,(%eax)
> 61: 8b c6 mov %esi,%eax
> <=3D=3D=3D move i to eax
> 63: 40 inc %eax =09
> <=3D=3D=3D i++
> 64: 8b d8 mov %eax,%ebx
> <=3D=3D=3D (detour??)
> 66: 8b f3 mov %ebx,%esi
> <=3D=3D=3D move i back to esi
> 68: 8b 45 0c mov 0xc(%ebp),%eax
> <=3D=3D=3D retrieve size
> 6b: 3b f0 cmp %eax,%esi
> <=3D=3D=3D check if i < size
> 6d: 0f 8c cc ff ff ff jl 3f
> <SpeedBenchmarks.BubbleSort_TestMethod+0x3f> <=3D=3D=3D end loop
> 73: 68 38 da 16 08 push $0x816da38
> 78: 8b 05 78 df 16 08 mov 0x816df78,%eax
> 7e: 50 push %eax
> 7f: 83 38 00 cmpl $0x0,(%eax)
> 82: 8b 00 mov (%eax),%eax
> 84: ff 90 a0 00 00 00 call *0xa0(%eax)
> 8a: 83 c4 08 add $0x8,%esp
> 8d: 8b 05 78 df 16 08 mov 0x816df78,%eax
> 93: 50 push %eax
> 94: 83 38 00 cmpl $0x0,(%eax)
> 97: 8b 00 mov (%eax),%eax
> 99: ff 90 c0 00 00 00 call *0xc0(%eax)
> 9f: 83 c4 04 add $0x4,%esp
> a2: 8b 05 80 df 16 08 mov 0x816df80,%eax
> a8: 50 push %eax
> a9: 83 38 00 cmpl $0x0,(%eax)
> ac: 8b 00 mov (%eax),%eax
> ae: ff 90 7c 00 00 00 call *0x7c(%eax)
> b4: 83 c4 04 add $0x4,%esp
> b7: 8b f0 mov %eax,%esi
> b9: 8b de mov %esi,%ebx
> bb: 8d 65 f8 lea 0xfffffff8(%ebp),%esp
> be: 5e pop %esi
> bf: 5b pop %ebx
> c0: c9 leave
> c1: c3 ret
>=20
>=20
> -----Original Message-----
> From: Mitchell Skinner [mailto:mitchskin@attbi.com]
> Sent: Wednesday, January 15, 2003 12:42 AM
> To: Tom Fransen
> Cc: Mono-list@ximian.com
> Subject: RE: [Mono-list] Performance / array access
>=20
>=20
> Hello,
>=20
> The runtime inserts array bounds checks to prevent things=20
> like buffer overflows. So the following code:
> int len =3D array.Length;
> for (int i =3D 0; i < len; ++i)
> sum +=3D array [i];
> becomes something like:
> int len =3D array.Length;
> for (int i =3D 0; i < len; ++i) {
> if (i >=3D array.Length) throw new Exception; //forgot which
> sum +=3D array [i];
> }
>=20
> In the fast case, the MS JIT is smart enough to analyze the=20
> following code and figure out that "i" will not go out of the=20
> bounds of the array (because i starts at zero and stays below=20
> array.Length):
> for (int i =3D 0; i < array.Length; ++i)
> sum +=3D array [i];
>=20
> If the for loop isn't set up exactly like that, I don't think=20
> the MS JIT is able to eliminate the bounds check (unless=20
> perhaps there are constants involved). Is the code you have=20
> below (comparing i with a "size" variable) the exact code you=20
> tested? If "size" is not a constant, I would be suprised to=20
> find that there was a major difference between MS and mono.
>=20
> Are you using the MS 1.0 release or the 1.1 release?
>=20
> Mitch
>=20
>=20
> On Tue, 2003-01-14 at 13:38, Tom Fransen wrote:
> > Piers,
> >
> > can you spend a few more words on this. Why is the second=20
> case slower?
> >
> > Furthermore if I use a simple loop to fill an array the
> > speed difference is a factor 3. I have a bubble sort method that I=20
> > execute a larger number of times. On Mono it takes 19=20
> seconds on the=20
> > MS stuff ~6 seconds. This is a big difference. So why I am losing a=20
> > factor 3 on the following loop.
> >
> > // Fill aray worst case, all elements need=20
> to be swapped
> > for(int i=3D0; i< size; i++)
> > {
> > array[i] =3D size - i;
> > }
> >
> > I am testing with some small benchmarks, but real=20
> applications (e.g.=20
> > an
> MP3
> > decoder)
> > often use tables stored in arrays to do certain calculations. So=20
> > although the benchmark maybe syntetic certain application=20
> will suffer=20
> > from these penalties.
> >
> > regards,
> > Tom
> >
> > -----Original Message-----
> > From: Miguel de Icaza [mailto:miguel@ximian.com]
> > Sent: Monday, January 13, 2003 3:22 AM
> > To: Piers Haken
> > Cc: Tom Fransen; Mono-list@ximian.com
> > Subject: RE: [Mono-list] Performance / array access
> >
> >
> > Hello,
> >
> > > Yeah, Microsoft's JIT lifts invariant bounds-checks. But=20
> I believe=20
> > > it's pretty limited.
> > >
> > > For example, the check is removed in the following case:
> > >
> > > for (int i =3D 0; i < array.Length; ++i)
> > > sum +=3D array [i];
> > >
> > > But not here:
> > >
> > > int len =3D array.Length;
> > > for (int i =3D 0; i < len; ++i)
> > > sum +=3D array [i];
> > >
> > > So the first case is (counter-intuitively) faster than the second.
> > >
> > > I don't believe Mono's JIT makes this optimization. Maybe the new=20
> > > JIT will ;-)
> >
> > This is a very good observation. Because it seems that this=20
> > particular kind of loop is detected by the JIT engine.
> >
> > Array-bounds-check elimination is something we want to do=20
> with the new=20
> > JIT, but it is not implemented at this point. The new JIT=20
> features a=20
> > new intermediate representation that simplifies=20
> implementing this sort=20
> > of thing, but it is still on its bootstrapping phases of life.
> >
> > Miguel
> >
> >
> > _______________________________________________
> > Mono-list maillist - Mono-list@ximian.com=20
> > http://lists.ximian.com/mailman/listinfo/mono-list
> --
> Mitchell Skinner <mitchskin@attbi.com>
>=20
>=20
> _______________________________________________
> Mono-list maillist - Mono-list@ximian.com=20
> http://lists.ximian.com/mailman/listinfo/mono-list
>=20
------_=_NextPart_001_01C2C001.DACA076E
Content-Type: text/html;
charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV=3D"Content-Type" CONTENT=3D"text/html; =
charset=3Dus-ascii">
<META NAME=3D"Generator" CONTENT=3D"MS Exchange Server version =
6.0.4417.0">
<TITLE>RE: [Mono-list] Performance / array access</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/plain format -->
<P><FONT SIZE=3D2>Also significant optimizations that MS does here are =
frame-pointer omission and passing params in registers.</FONT>
</P>
<P><FONT SIZE=3D2>Tom, can you try your tests on MS with the following =
change: ?</FONT>
</P>
<P><FONT =
SIZE=3D2>- &nb=
sp; for(int i=3D0; i< size; i++)</FONT>
<BR><FONT SIZE=3D2>---</FONT>
<BR><FONT =
SIZE=3D2>+ &nb=
sp; for(int i=3D0; i< array.Length; i++)</FONT>
</P>
<P><FONT SIZE=3D2>Piers.</FONT>
</P>
<P><FONT SIZE=3D2>> -----Original Message-----</FONT>
<BR><FONT SIZE=3D2>> From: Tom Fransen [<A =
HREF=3D"mailto:t.fransen@mailned.nl">mailto:t.fransen@mailned.nl</A>] =
</FONT>
<BR><FONT SIZE=3D2>> Sent: Sunday, January 19, 2003 8:02 AM</FONT>
<BR><FONT SIZE=3D2>> To: Mitchell Skinner</FONT>
<BR><FONT SIZE=3D2>> Cc: Mono-list@ximian.com</FONT>
<BR><FONT SIZE=3D2>> Subject: RE: [Mono-list] Performance / array =
access</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> Hello all,</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> Mitch, to come back on the issue of bound =
checking. I looked </FONT>
<BR><FONT SIZE=3D2>> further into the problem and my first conclusion =
was wrong. </FONT>
<BR><FONT SIZE=3D2>> Piers already stated that he did not expect that =
the MS </FONT>
<BR><FONT SIZE=3D2>> compiler would be able to optimize all these =
check out of the loop.</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> I again started looking into the assembly code =
and although </FONT>
<BR><FONT SIZE=3D2>> I'm not a assembly programmer (it has been too =
long since I </FONT>
<BR><FONT SIZE=3D2>> done some 6502/Z80 :-) I think the performance =
difference </FONT>
<BR><FONT SIZE=3D2>> (>2) comes from the optimization performered =
by the MS JIT </FONT>
<BR><FONT SIZE=3D2>> compiler (I am using version 1.0 of the =
SDK).</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> The MS JIT seems to be better at allocating =
processor </FONT>
<BR><FONT SIZE=3D2>> registers to certain variables resulting in much =
faster code. </FONT>
<BR><FONT SIZE=3D2>> The execution speed of the Mono code is in the =
same order as </FONT>
<BR><FONT SIZE=3D2>> the unoptimized code generated by the MS JIT =
compiler (for </FONT>
<BR><FONT SIZE=3D2>> tight loops accessing arrays). In case of my =
bubble sorting </FONT>
<BR><FONT SIZE=3D2>> algorithm this results in a speed difference of =
a factor 3. I </FONT>
<BR><FONT SIZE=3D2>> have made other tests like applying matrix =
multiplications on </FONT>
<BR><FONT SIZE=3D2>> a 2-dimensional array (to simulate an image =
processing </FONT>
<BR><FONT SIZE=3D2>> operation) and also here Mono lacks behind. I =
haven't sorted </FONT>
<BR><FONT SIZE=3D2>> this out but my first guess is that its again =
the register </FONT>
<BR><FONT SIZE=3D2>> optimization that makes the difference.</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> For the code snippets below both the MS and Mono =
code is given:</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> =
private void TestMethod(int size)</FONT>
<BR><FONT SIZE=3D2>> =
{</FONT>
<BR><FONT =
SIZE=3D2>> =
array =3D new int[size];</FONT>
<BR><FONT =
SIZE=3D2>> =
for(int i=3D0; i< size; i++)</FONT>
<BR><FONT =
SIZE=3D2>> =
{</FONT>
<BR><FONT =
SIZE=3D2>> =
array[i] =3D size - i;</FONT>
<BR><FONT =
SIZE=3D2>> =
}</FONT>
<BR><FONT =
SIZE=3D2>> =
Console.WriteLine("Done" );</FONT>
<BR><FONT =
SIZE=3D2>> =
string s =3D Console.ReadLine();</FONT>
<BR><FONT SIZE=3D2>> =
}</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> The MS JIT generates the following code. =
Note that</FONT>
<BR><FONT SIZE=3D2>> I first run the program and latter when =
the program reads</FONT>
<BR><FONT SIZE=3D2>> from the console attach the debugger. If I =
run the program </FONT>
<BR><FONT SIZE=3D2>> from the MS development environment the =
JIT seems to </FONT>
<BR><FONT SIZE=3D2>> generate unoptimized code so I need this =
trick.</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> MS Code generated by the JIT unoptimized =
</FONT>
<BR><FONT SIZE=3D2>> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> =
private void TestMethod(int size)</FONT>
<BR><FONT SIZE=3D2>> =
{</FONT>
<BR><FONT =
SIZE=3D2>> =
array =3D new int[size];</FONT>
<BR><FONT SIZE=3D2>> 00000000 =
push ebp</FONT>
<BR><FONT SIZE=3D2>> 00000001 =
mov ebp,esp</FONT>
<BR><FONT SIZE=3D2>> 00000003 =
sub esp,14h</FONT>
<BR><FONT SIZE=3D2>> 00000006 =
push edi</FONT>
<BR><FONT SIZE=3D2>> 00000007 =
push esi</FONT>
<BR><FONT SIZE=3D2>> 00000008 =
push ebx</FONT>
<BR><FONT SIZE=3D2>> 00000009 =
mov ebx,ecx</FONT>
<BR><FONT SIZE=3D2>> 0000000b =
mov esi,edx</FONT>
<BR><FONT SIZE=3D2>> 0000000d =
xor edi,edi</FONT>
<BR><FONT SIZE=3D2>> 0000000f =
mov dword ptr =
[ebp-10h],0</FONT>
<BR><FONT SIZE=3D2>> 00000016 =
test esi,80000000h</FONT>
<BR><FONT SIZE=3D2>> 0000001c =
je 00000025</FONT>
<BR><FONT SIZE=3D2>> 0000001e =
xor ecx,ecx</FONT>
<BR><FONT SIZE=3D2>> 00000020 =
call 763A2CFF</FONT>
<BR><FONT SIZE=3D2>> 00000025 =
mov edx,esi</FONT>
<BR><FONT SIZE=3D2>> 00000027 =
mov ecx,2DE6B3Ah</FONT>
<BR><FONT SIZE=3D2>> 0000002c =
call FD4A2040</FONT>
<BR><FONT SIZE=3D2>> 00000031 =
lea edx,[ebx+4]</FONT>
<BR><FONT SIZE=3D2>> 00000034 =
call 76479768</FONT>
<BR><FONT =
SIZE=3D2>> =
for(int i=3D0; i< size; i++)</FONT>
<BR><FONT SIZE=3D2>> 00000039 =
xor edi,edi</FONT>
<BR><FONT SIZE=3D2>> 0000003b nop</FONT>
<BR><FONT SIZE=3D2>> 0000003c =
jmp 00000056</FONT>
<BR><FONT =
SIZE=3D2>> =
array[i] =3D size - i;</FONT>
<BR><FONT SIZE=3D2>> 0000003e =
mov eax,dword ptr =
[ebx+4] <=3D=3D=3D=3D Start </FONT>
<BR><FONT SIZE=3D2>> loop, fetch the</FONT>
<BR><FONT SIZE=3D2>> size of the array</FONT>
<BR><FONT SIZE=3D2>> 00000041 =
cmp edi,dword ptr =
[eax+4] <=3D=3D=3D=3D Check if </FONT>
<BR><FONT SIZE=3D2>> within bounds</FONT>
<BR><FONT SIZE=3D2>> 00000044 =
jb 0000004D</FONT>
<BR><FONT SIZE=3D2>> 00000046 =
xor ecx,ecx</FONT>
<BR><FONT SIZE=3D2>> 00000048 =
call 762EED87 =
=
=
<=3D=3D=3D=3D </FONT>
<BR><FONT SIZE=3D2>> Trigger exception</FONT>
<BR><FONT SIZE=3D2>> 0000004d =
mov edx,esi =
=
=
<=3D=3D=3D=3D </FONT>
<BR><FONT SIZE=3D2>> esi contain variable size</FONT>
<BR><FONT SIZE=3D2>> 0000004f =
sub edx,edi</FONT>
<BR><FONT SIZE=3D2>> 00000051 =
mov dword ptr =
[eax+edi*4+8],edx</FONT>
<BR><FONT =
SIZE=3D2>> =
for(int i=3D0; i< size; i++)</FONT>
<BR><FONT SIZE=3D2>> 00000055 =
inc edi =
=
<=3D=3D=3D=3D edi </FONT>
<BR><FONT SIZE=3D2>> contain variable i</FONT>
<BR><FONT SIZE=3D2>> 00000056 =
cmp edi,esi</FONT>
<BR><FONT SIZE=3D2>> 00000058 =
jl =
0000003E  =
; <=3D=3D=3D=3D Jump to </FONT>
<BR><FONT SIZE=3D2>> start of for</FONT>
<BR><FONT SIZE=3D2>> loop</FONT>
<BR><FONT =
SIZE=3D2>> =
Console.WriteLine("Done" );</FONT>
<BR><FONT SIZE=3D2>> 0000005a =
mov ecx,dword ptr =
ds:[01C40178h]</FONT>
<BR><FONT SIZE=3D2>> 00000060 =
call 76AA3250</FONT>
<BR><FONT =
SIZE=3D2>> =
string s =3D Console.ReadLine();</FONT>
<BR><FONT SIZE=3D2>> 00000065 =
call 76AA2F10</FONT>
<BR><FONT SIZE=3D2>> 0000006a =
mov dword ptr =
[ebp-14h],eax</FONT>
<BR><FONT SIZE=3D2>> 0000006d =
mov eax,dword ptr =
[ebp-14h]</FONT>
<BR><FONT SIZE=3D2>> 00000070 =
mov dword ptr =
[ebp-10h],eax</FONT>
<BR><FONT SIZE=3D2>> =
}</FONT>
<BR><FONT SIZE=3D2>> 00000073 nop</FONT>
<BR><FONT SIZE=3D2>> 00000074 =
pop ebx</FONT>
<BR><FONT SIZE=3D2>> 00000075 =
pop esi</FONT>
<BR><FONT SIZE=3D2>> 00000076 =
pop edi</FONT>
<BR><FONT SIZE=3D2>> 00000077 =
mov esp,ebp</FONT>
<BR><FONT SIZE=3D2>> 00000079 =
pop ebp</FONT>
<BR><FONT SIZE=3D2>> 0000007a ret</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> MS Code generated by the JIT optimized </FONT>
<BR><FONT SIZE=3D2>> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> =
private void TestMethod(int size)</FONT>
<BR><FONT SIZE=3D2>> =
{</FONT>
<BR><FONT =
SIZE=3D2>> =
array =3D new int[size];</FONT>
<BR><FONT SIZE=3D2>> 00000000 =
push edi</FONT>
<BR><FONT SIZE=3D2>> 00000001 =
push esi</FONT>
<BR><FONT SIZE=3D2>> 00000002 =
mov edi,ecx</FONT>
<BR><FONT SIZE=3D2>> 00000004 =
mov esi,edx</FONT>
<BR><FONT SIZE=3D2>> 00000006 =
test esi,80000000h</FONT>
<BR><FONT SIZE=3D2>> 0000000c =
jne 0000005C</FONT>
<BR><FONT SIZE=3D2>> 0000000e =
mov edx,esi</FONT>
<BR><FONT SIZE=3D2>> 00000010 =
mov ecx,2DE6A22h</FONT>
<BR><FONT SIZE=3D2>> 00000015 =
call FD5B2098</FONT>
<BR><FONT SIZE=3D2>> 0000001a =
lea edx,[edi+4]</FONT>
<BR><FONT SIZE=3D2>> 0000001d =
call 765897C0</FONT>
<BR><FONT =
SIZE=3D2>> =
for(int i=3D0; i< size; i++)</FONT>
<BR><FONT SIZE=3D2>> 00000022 =
xor ecx,ecx</FONT>
<BR><FONT SIZE=3D2>> 00000024 =
cmp esi,0</FONT>
<BR><FONT SIZE=3D2>> 00000027 =
jle 00000040</FONT>
<BR><FONT SIZE=3D2>> 00000029 =
mov edx,dword ptr =
[edi+4]</FONT>
<BR><FONT SIZE=3D2>> 0000002c =
mov edi,dword ptr =
[edx+4]</FONT>
<BR><FONT =
SIZE=3D2>> =
array[i] =3D size - i;</FONT>
<BR><FONT SIZE=3D2>> 0000002f =
cmp ecx,edi =
=
=
<=3D=3D=3D </FONT>
<BR><FONT SIZE=3D2>> start of loop</FONT>
<BR><FONT SIZE=3D2>> 00000031 =
jae 00000064 =
=
=
=
</FONT>
<BR><FONT SIZE=3D2>> <=3D=3D=3D index out of bounds</FONT>
<BR><FONT SIZE=3D2>> 00000033 =
mov =
eax,esi =
&=
nbsp; <=3D=3D=3D </FONT>
<BR><FONT SIZE=3D2>> esi contain</FONT>
<BR><FONT SIZE=3D2>> variable size</FONT>
<BR><FONT SIZE=3D2>> 00000035 =
sub eax,ecx</FONT>
<BR><FONT SIZE=3D2>> 00000037 =
mov dword ptr =
[edx+ecx*4+8],eax</FONT>
<BR><FONT =
SIZE=3D2>> =
for(int i=3D0; i< size; i++)</FONT>
<BR><FONT SIZE=3D2>> 0000003b =
inc ecx =
=
=
<=3D=3D=3D=3D </FONT>
<BR><FONT SIZE=3D2>> ecx contain variable i</FONT>
<BR><FONT SIZE=3D2>> 0000003c =
cmp ecx,esi</FONT>
<BR><FONT SIZE=3D2>> 0000003e =
jl 0000002F =
=
=
=
</FONT>
<BR><FONT SIZE=3D2>> <=3D=3D=3D=3D jump to start of for =
loop</FONT>
<BR><FONT =
SIZE=3D2>> =
Console.WriteLine("Done" );</FONT>
<BR><FONT SIZE=3D2>> 00000040 =
mov ecx,dword ptr =
ds:[01C4086Ch]</FONT>
<BR><FONT SIZE=3D2>> 00000046 =
mov edx,dword ptr =
ds:[01C40110h]</FONT>
<BR><FONT SIZE=3D2>> 0000004c =
mov eax,dword ptr =
[ecx]</FONT>
<BR><FONT SIZE=3D2>> 0000004e =
call dword ptr =
[eax+000000D8h]</FONT>
<BR><FONT =
SIZE=3D2>> =
string s =3D Console.ReadLine();</FONT>
<BR><FONT SIZE=3D2>> 00000054 =
call 76BB2F68</FONT>
<BR><FONT SIZE=3D2>> 00000059 =
pop esi</FONT>
<BR><FONT SIZE=3D2>> =
}</FONT>
<BR><FONT SIZE=3D2>> 0000005a =
pop edi</FONT>
<BR><FONT SIZE=3D2>> 0000005b ret</FONT>
<BR><FONT SIZE=3D2>> 0000005c =
xor ecx,ecx</FONT>
<BR><FONT SIZE=3D2>> 0000005e =
call 764B2D57</FONT>
<BR><FONT SIZE=3D2>> 00000063 =
int 3</FONT>
<BR><FONT SIZE=3D2>> 00000064 =
xor ecx,ecx =
=
=
=
</FONT>
<BR><FONT SIZE=3D2>> <=3D=3D=3D=3D Exception</FONT>
<BR><FONT SIZE=3D2>> 00000066 =
call 763FEDDF</FONT>
<BR><FONT SIZE=3D2>> 0000006b =
int 3</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> The Mono (0.18) JIT generates the following code =
(--dump-asm) </FONT>
<BR><FONT SIZE=3D2>> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> Disassembly of section .text:</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> 00000000 =
<SpeedBenchmarks.BubbleSort_TestMethod>:</FONT>
<BR><FONT SIZE=3D2>> 0: =
55  =
; push =
%ebp</FONT>
<BR><FONT SIZE=3D2>> 1: 8b =
ec  =
; mov %esp,%ebp</FONT>
<BR><FONT SIZE=3D2>> 3: =
53  =
; push =
%ebx</FONT>
<BR><FONT SIZE=3D2>> 4: =
56  =
; push =
%esi</FONT>
<BR><FONT SIZE=3D2>> 5: 83 ec =
28  =
; sub $0x28,%esp</FONT>
<BR><FONT SIZE=3D2>> 8: c7 45 d8 00 00 00 00 =
movl $0x0,0xffffffd8(%ebp)</FONT>
<BR><FONT SIZE=3D2>> f: 8b 45 =
08  =
; mov 0x8(%ebp),%eax</FONT>
<BR><FONT SIZE=3D2>> 12: 05 08 00 00 =
00 =
add $0x8,%eax</FONT>
<BR><FONT SIZE=3D2>> 17: 83 38 =
00  =
; cmpl $0x0,(%eax)</FONT>
<BR><FONT SIZE=3D2>> 1a: 8b 4d =
0c  =
; mov 0xc(%ebp),%ecx</FONT>
<BR><FONT SIZE=3D2>> 1d: =
50  =
; push =
%eax</FONT>
<BR><FONT SIZE=3D2>> 1e: =
51  =
; push =
%ecx</FONT>
<BR><FONT SIZE=3D2>> 1f: =
52  =
; push =
%edx</FONT>
<BR><FONT SIZE=3D2>> 20: =
51  =
; push =
%ecx</FONT>
<BR><FONT SIZE=3D2>> 21: 68 54 7e 18 =
08 push =
$0x8187e54</FONT>
<BR><FONT SIZE=3D2>> 26: e8 b5 c2 e4 =
ff call =
ffe4c2e0</FONT>
<BR><FONT SIZE=3D2>> =
<SpeedBenchmarks.BubbleSort_TestMethod+0xffe4c2e0></FONT>
<BR><FONT SIZE=3D2>> 2b: 83 c4 =
08  =
; add $0x8,%esp</FONT>
<BR><FONT SIZE=3D2>> 2e: =
5a  =
; pop =
%edx</FONT>
<BR><FONT SIZE=3D2>> 2f: =
59  =
; pop =
%ecx</FONT>
<BR><FONT SIZE=3D2>> 30: 8b =
c8  =
; mov %eax,%ecx</FONT>
<BR><FONT SIZE=3D2>> 32: =
58  =
; pop =
%eax</FONT>
<BR><FONT SIZE=3D2>> 33: 89 =
08  =
; mov =
%ecx,(%eax)</FONT>
<BR><FONT SIZE=3D2>> 35: be 00 00 00 =
00 =
mov $0x0,%esi</FONT>
<BR><FONT SIZE=3D2>> <=3D=3D=3D esi contain variable i</FONT>
<BR><FONT SIZE=3D2>> 3a: e9 29 00 00 =
00 =
jmp 68</FONT>
<BR><FONT SIZE=3D2>> =
<SpeedBenchmarks.BubbleSort_TestMethod+0x68></FONT>
<BR><FONT SIZE=3D2>> 3f: 8b 45 =
08  =
; mov 0x8(%ebp),%eax</FONT>
<BR><FONT SIZE=3D2>> <=3D=3D=3D start loop</FONT>
<BR><FONT SIZE=3D2>> 42: 8b 40 =
08  =
; mov 0x8(%eax),%eax</FONT>
<BR><FONT SIZE=3D2>> \</FONT>
<BR><FONT SIZE=3D2>> 45: 8b =
ce  =
; mov %esi,%ecx</FONT>
<BR><FONT SIZE=3D2>> |</FONT>
<BR><FONT SIZE=3D2>> 47: 3b 48 =
0c  =
; cmp 0xc(%eax),%ecx</FONT>
<BR><FONT SIZE=3D2>> | Bounds check</FONT>
<BR><FONT SIZE=3D2>> 4a: 72 =
0a  =
; jb 56</FONT>
<BR><FONT SIZE=3D2>> =
<SpeedBenchmarks.BubbleSort_TestMethod+0x56>/</FONT>
<BR><FONT SIZE=3D2>> 4c: 68 e1 26 11 =
08 push =
$0x81126e1</FONT>
<BR><FONT SIZE=3D2>> 51: e8 ce 56 ed =
ff call =
ffed5724</FONT>
<BR><FONT SIZE=3D2>> =
<SpeedBenchmarks.BubbleSort_TestMethod+0xffed5724> <=3D=3D=3D =
</FONT>
<BR><FONT SIZE=3D2>> Generate exception</FONT>
<BR><FONT SIZE=3D2>> 56: 8d 44 88 =
10 =
lea 0x10(%eax,%ecx,4),%eax</FONT>
<BR><FONT SIZE=3D2>> 5a: 8b 4d =
0c  =
; mov 0xc(%ebp),%ecx</FONT>
<BR><FONT SIZE=3D2>> <=3D=3D retrieve size</FONT>
<BR><FONT SIZE=3D2>> 5d: 2b =
ce  =
; sub %esi,%ecx</FONT>
<BR><FONT SIZE=3D2>> 5f: 89 =
08  =
; mov =
%ecx,(%eax)</FONT>
<BR><FONT SIZE=3D2>> 61: 8b =
c6  =
; mov %esi,%eax</FONT>
<BR><FONT SIZE=3D2>> <=3D=3D=3D move i to eax</FONT>
<BR><FONT SIZE=3D2>> 63: =
40  =
; inc =
%eax =
</FONT>
<BR><FONT SIZE=3D2>> =
=
<=3D=3D=3D i++</FONT>
<BR><FONT SIZE=3D2>> 64: 8b =
d8  =
; mov %eax,%ebx</FONT>
<BR><FONT SIZE=3D2>> <=3D=3D=3D (detour??)</FONT>
<BR><FONT SIZE=3D2>> 66: 8b =
f3  =
; mov %ebx,%esi</FONT>
<BR><FONT SIZE=3D2>> <=3D=3D=3D move i back to esi</FONT>
<BR><FONT SIZE=3D2>> 68: 8b 45 =
0c  =
; mov 0xc(%ebp),%eax</FONT>
<BR><FONT SIZE=3D2>> <=3D=3D=3D retrieve size</FONT>
<BR><FONT SIZE=3D2>> 6b: 3b =
f0  =
; cmp %eax,%esi</FONT>
<BR><FONT SIZE=3D2>> <=3D=3D=3D check if i < size</FONT>
<BR><FONT SIZE=3D2>> 6d: 0f 8c cc ff ff =
ff jl 3f</FONT>
<BR><FONT SIZE=3D2>> =
<SpeedBenchmarks.BubbleSort_TestMethod+0x3f> <=3D=3D=3D =
end loop</FONT>
<BR><FONT SIZE=3D2>> 73: 68 38 da 16 =
08 push =
$0x816da38</FONT>
<BR><FONT SIZE=3D2>> 78: 8b 05 78 df 16 =
08 mov =
0x816df78,%eax</FONT>
<BR><FONT SIZE=3D2>> 7e: =
50  =
; push =
%eax</FONT>
<BR><FONT SIZE=3D2>> 7f: 83 38 =
00  =
; cmpl $0x0,(%eax)</FONT>
<BR><FONT SIZE=3D2>> 82: 8b =
00  =
; mov =
(%eax),%eax</FONT>
<BR><FONT SIZE=3D2>> 84: ff 90 a0 00 00 =
00 call *0xa0(%eax)</FONT>
<BR><FONT SIZE=3D2>> 8a: 83 c4 =
08  =
; add $0x8,%esp</FONT>
<BR><FONT SIZE=3D2>> 8d: 8b 05 78 df 16 =
08 mov =
0x816df78,%eax</FONT>
<BR><FONT SIZE=3D2>> 93: =
50  =
; push =
%eax</FONT>
<BR><FONT SIZE=3D2>> 94: 83 38 =
00  =
; cmpl $0x0,(%eax)</FONT>
<BR><FONT SIZE=3D2>> 97: 8b =
00  =
; mov =
(%eax),%eax</FONT>
<BR><FONT SIZE=3D2>> 99: ff 90 c0 00 00 =
00 call *0xc0(%eax)</FONT>
<BR><FONT SIZE=3D2>> 9f: 83 c4 =
04  =
; add $0x4,%esp</FONT>
<BR><FONT SIZE=3D2>> a2: 8b 05 80 df 16 =
08 mov =
0x816df80,%eax</FONT>
<BR><FONT SIZE=3D2>> a8: =
50  =
; push =
%eax</FONT>
<BR><FONT SIZE=3D2>> a9: 83 38 =
00  =
; cmpl $0x0,(%eax)</FONT>
<BR><FONT SIZE=3D2>> ac: 8b =
00  =
; mov =
(%eax),%eax</FONT>
<BR><FONT SIZE=3D2>> ae: ff 90 7c 00 00 =
00 call *0x7c(%eax)</FONT>
<BR><FONT SIZE=3D2>> b4: 83 c4 =
04  =
; add $0x4,%esp</FONT>
<BR><FONT SIZE=3D2>> b7: 8b =
f0  =
; mov %eax,%esi</FONT>
<BR><FONT SIZE=3D2>> b9: 8b =
de  =
; mov %esi,%ebx</FONT>
<BR><FONT SIZE=3D2>> bb: 8d 65 =
f8  =
; lea 0xfffffff8(%ebp),%esp</FONT>
<BR><FONT SIZE=3D2>> be: =
5e  =
; pop =
%esi</FONT>
<BR><FONT SIZE=3D2>> bf: =
5b  =
; pop =
%ebx</FONT>
<BR><FONT SIZE=3D2>> c0: =
c9  =
; leave</FONT>
<BR><FONT SIZE=3D2>> c1: =
c3  =
; ret</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> -----Original Message-----</FONT>
<BR><FONT SIZE=3D2>> From: Mitchell Skinner [<A =
HREF=3D"mailto:mitchskin@attbi.com">mailto:mitchskin@attbi.com</A>]</FONT=
>
<BR><FONT SIZE=3D2>> Sent: Wednesday, January 15, 2003 12:42 =
AM</FONT>
<BR><FONT SIZE=3D2>> To: Tom Fransen</FONT>
<BR><FONT SIZE=3D2>> Cc: Mono-list@ximian.com</FONT>
<BR><FONT SIZE=3D2>> Subject: RE: [Mono-list] Performance / array =
access</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> Hello,</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> The runtime inserts array bounds checks to =
prevent things </FONT>
<BR><FONT SIZE=3D2>> like buffer overflows. So the following =
code:</FONT>
<BR><FONT SIZE=3D2>> int len =3D =
array.Length;</FONT>
<BR><FONT SIZE=3D2>> for (int i =3D 0; i < len; =
++i)</FONT>
<BR><FONT SIZE=3D2>> sum +=3D =
array [i];</FONT>
<BR><FONT SIZE=3D2>> becomes something like:</FONT>
<BR><FONT SIZE=3D2>> int len =3D =
array.Length;</FONT>
<BR><FONT SIZE=3D2>> for (int i =3D 0; i < len; =
++i) {</FONT>
<BR><FONT SIZE=3D2>> if (i =
>=3D array.Length) throw new Exception; //forgot which</FONT>
<BR><FONT SIZE=3D2>> sum +=3D =
array [i];</FONT>
<BR><FONT SIZE=3D2>> }</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> In the fast case, the MS JIT is smart enough to =
analyze the </FONT>
<BR><FONT SIZE=3D2>> following code and figure out that "i" =
will not go out of the </FONT>
<BR><FONT SIZE=3D2>> bounds of the array (because i starts at zero =
and stays below </FONT>
<BR><FONT SIZE=3D2>> array.Length):</FONT>
<BR><FONT SIZE=3D2>> for (int i =3D 0; i < =
array.Length; ++i)</FONT>
<BR><FONT SIZE=3D2>> sum +=3D =
array [i];</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> If the for loop isn't set up exactly like that, =
I don't think </FONT>
<BR><FONT SIZE=3D2>> the MS JIT is able to eliminate the bounds check =
(unless </FONT>
<BR><FONT SIZE=3D2>> perhaps there are constants involved). Is =
the code you have </FONT>
<BR><FONT SIZE=3D2>> below (comparing i with a "size" =
variable) the exact code you </FONT>
<BR><FONT SIZE=3D2>> tested? If "size" is not a =
constant, I would be suprised to </FONT>
<BR><FONT SIZE=3D2>> find that there was a major difference between =
MS and mono.</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> Are you using the MS 1.0 release or the 1.1 =
release?</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> Mitch</FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> On Tue, 2003-01-14 at 13:38, Tom Fransen =
wrote:</FONT>
<BR><FONT SIZE=3D2>> > Piers,</FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> > can you spend a few more words on this. Why =
is the second </FONT>
<BR><FONT SIZE=3D2>> case slower?</FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> > Furthermore if I use a simple loop to fill =
an array the</FONT>
<BR><FONT SIZE=3D2>> > speed difference is a factor 3. I have a =
bubble sort method that I </FONT>
<BR><FONT SIZE=3D2>> > execute a larger number of times. On Mono =
it takes 19 </FONT>
<BR><FONT SIZE=3D2>> seconds on the </FONT>
<BR><FONT SIZE=3D2>> > MS stuff ~6 seconds. This is a big =
difference. So why I am losing a </FONT>
<BR><FONT SIZE=3D2>> > factor 3 on the following loop.</FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> =
> &nb=
sp; // Fill aray worst case, all elements need =
</FONT>
<BR><FONT SIZE=3D2>> to be swapped</FONT>
<BR><FONT SIZE=3D2>> =
> &nb=
sp; for(int i=3D0; i< size; i++)</FONT>
<BR><FONT SIZE=3D2>> =
> &nb=
sp; {</FONT>
<BR><FONT SIZE=3D2>> =
> &nb=
sp; array[i] =3D size - =
i;</FONT>
<BR><FONT SIZE=3D2>> =
> &nb=
sp; }</FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> > I am testing with some small benchmarks, =
but real </FONT>
<BR><FONT SIZE=3D2>> applications (e.g. </FONT>
<BR><FONT SIZE=3D2>> > an</FONT>
<BR><FONT SIZE=3D2>> MP3</FONT>
<BR><FONT SIZE=3D2>> > decoder)</FONT>
<BR><FONT SIZE=3D2>> > often use tables stored in arrays to do =
certain calculations. So </FONT>
<BR><FONT SIZE=3D2>> > although the benchmark maybe syntetic =
certain application </FONT>
<BR><FONT SIZE=3D2>> will suffer </FONT>
<BR><FONT SIZE=3D2>> > from these penalties.</FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> > regards,</FONT>
<BR><FONT SIZE=3D2>> > Tom</FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> > -----Original Message-----</FONT>
<BR><FONT SIZE=3D2>> > From: Miguel de Icaza [<A =
HREF=3D"mailto:miguel@ximian.com">mailto:miguel@ximian.com</A>]</FONT>
<BR><FONT SIZE=3D2>> > Sent: Monday, January 13, 2003 3:22 =
AM</FONT>
<BR><FONT SIZE=3D2>> > To: Piers Haken</FONT>
<BR><FONT SIZE=3D2>> > Cc: Tom Fransen; =
Mono-list@ximian.com</FONT>
<BR><FONT SIZE=3D2>> > Subject: RE: [Mono-list] Performance / =
array access</FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> > Hello,</FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> > > Yeah, Microsoft's JIT lifts invariant =
bounds-checks. But </FONT>
<BR><FONT SIZE=3D2>> I believe </FONT>
<BR><FONT SIZE=3D2>> > > it's pretty limited.</FONT>
<BR><FONT SIZE=3D2>> > ></FONT>
<BR><FONT SIZE=3D2>> > > For example, the check is removed in =
the following case:</FONT>
<BR><FONT SIZE=3D2>> > ></FONT>
<BR><FONT SIZE=3D2>> > > for (int i =3D 0; i < =
array.Length; ++i)</FONT>
<BR><FONT SIZE=3D2>> > > sum +=3D array =
[i];</FONT>
<BR><FONT SIZE=3D2>> > ></FONT>
<BR><FONT SIZE=3D2>> > > But not here:</FONT>
<BR><FONT SIZE=3D2>> > ></FONT>
<BR><FONT SIZE=3D2>> > > int len =3D =
array.Length;</FONT>
<BR><FONT SIZE=3D2>> > > for (int i =3D 0; i < =
len; ++i)</FONT>
<BR><FONT SIZE=3D2>> > > sum +=3D array =
[i];</FONT>
<BR><FONT SIZE=3D2>> > ></FONT>
<BR><FONT SIZE=3D2>> > > So the first case is =
(counter-intuitively) faster than the second.</FONT>
<BR><FONT SIZE=3D2>> > ></FONT>
<BR><FONT SIZE=3D2>> > > I don't believe Mono's JIT makes this =
optimization. Maybe the new </FONT>
<BR><FONT SIZE=3D2>> > > JIT will ;-)</FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> > This is a very good observation. =
Because it seems that this </FONT>
<BR><FONT SIZE=3D2>> > particular kind of loop is detected by the =
JIT engine.</FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> > Array-bounds-check elimination is something =
we want to do </FONT>
<BR><FONT SIZE=3D2>> with the new </FONT>
<BR><FONT SIZE=3D2>> > JIT, but it is not implemented at this =
point. The new JIT </FONT>
<BR><FONT SIZE=3D2>> features a </FONT>
<BR><FONT SIZE=3D2>> > new intermediate representation that =
simplifies </FONT>
<BR><FONT SIZE=3D2>> implementing this sort </FONT>
<BR><FONT SIZE=3D2>> > of thing, but it is still on its =
bootstrapping phases of life.</FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> > Miguel</FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> ></FONT>
<BR><FONT SIZE=3D2>> > =
_______________________________________________</FONT>
<BR><FONT SIZE=3D2>> > Mono-list maillist - =
Mono-list@ximian.com </FONT>
<BR><FONT SIZE=3D2>> > <A =
HREF=3D"http://lists.ximian.com/mailman/listinfo/mono-list">http://lists.=
ximian.com/mailman/listinfo/mono-list</A></FONT>
<BR><FONT SIZE=3D2>> --</FONT>
<BR><FONT SIZE=3D2>> Mitchell Skinner =
<mitchskin@attbi.com></FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> </FONT>
<BR><FONT SIZE=3D2>> =
_______________________________________________</FONT>
<BR><FONT SIZE=3D2>> Mono-list maillist - =
Mono-list@ximian.com </FONT>
<BR><FONT SIZE=3D2>> <A =
HREF=3D"http://lists.ximian.com/mailman/listinfo/mono-list">http://lists.=
ximian.com/mailman/listinfo/mono-list</A></FONT>
<BR><FONT SIZE=3D2>> </FONT>
</P>
</BODY>
</HTML>
------_=_NextPart_001_01C2C001.DACA076E--