[Mono-dev] Performance of calls

Dax Dax at daxxfiles.net
Mon Jan 7 14:34:57 EST 2008


Aefvadh,

Some recent benchmarks have led me to something I find quite funny:
when deriving a class from MarshalByRefObject, static and virtual calls
are faster than "regular" calls using simple classes derived from
object. Instance calls, however, are much slower.

Here are some results from my machine (Debian/Sid x64 running Mono 1.2.6):

mono --optimize=-all test.exe
1 billion calls each
Action                  object                  MarshalByRefObject
static call             4908.102ms              4849.006ms
instance call           5385.432ms              9879.17ms
virtual call            5646.535ms              5632.558ms

mono --optimize=all,-inline test.exe
1 billion calls each
Action                  object                  MarshalByRefObject
static call             3822.344ms              3784.685ms
instance call           3244.855ms              9231.223ms
virtual call            3809.689ms              3251.055ms

mono --optimize=all test.exe
1 billion calls each
Action                  object                  MarshalByRefObject
static call             3798.973ms              3757.991ms
instance call           3220.736ms              8053.65ms
virtual call            3758.477ms              3220.936ms

mono ./test.exe # after mono --aot --optimize=all test.exe
1 billion calls each
Action                  object                  MarshalByRefObject
static call             588.39ms                3220.995ms
instance call           1074.873ms              9191.883ms
virtual call            3758.875ms              3757.815ms


Running this (gmcs-compiled) program inside a Windows XP VM (32-bit)
(not comparable, I know) yields different results:

Z:\test.exe
1 billion calls each
Action                  object                  MarshalByRefObject
static call             569,3399ms              560,3502ms
instance call           558,1868ms              6902,4121ms
virtual call            3960,7318ms             3954,0762ms

Now the same code, compiled with csc:

Z:\Main.exe
1 billion calls each
Action                  object                  MarshalByRefObject
static call             1142,0789ms             1129,467ms
instance call           1123,7126ms             7471,9289ms
virtual call            4066,8118ms             4063,5181ms

Same .exe, but run under Mono:

mono --optimize=-all Main.exe
1 billion calls each
Action                  object                  MarshalByRefObject
static call             4888.398ms              5368.096ms
instance call           5905.224ms              10200.218ms
virtual call            5628.401ms              5680.349ms

mono --optimize=all,-inline Main.exe
1 billion calls each
Action                  object                  MarshalByRefObject
static call             4248.302ms              4389.968ms
instance call           4409.458ms              8891.732ms
virtual call            4272.95ms               4390.157ms

mono --optimize=all Main.exe
1 billion calls each
Action                  object                  MarshalByRefObject
static call             3810.54ms               3784.025ms
instance call           4320.234ms              8089.157ms
virtual call            3776.239ms              3774.333ms

./Main.exe # after mono --aot --optimize=all Main.exe
1 billion calls each
Action                  object                  MarshalByRefObject
static call             1975.254ms              4326.379ms
instance call           2204.766ms              9190.099ms
virtual call            4324.108ms              3782.83ms


Letting Mono AOT-compile the various exes brings static call
performance close to .NET level, while instance calls are only a bit
slower. Virtual calls in some cases even slow down by about 0.5 seconds
compared to maximum optimization.

The big difference between object instance calls and MarshalByRefObject
instance calls seems to be a matter of architecture, but it's striking
how much slower the csc-generated assembly is at times, especially
after AOT.

Attached are the benchmark program used to time the calls (Main.cs),
the gmcs-compiled test.exe, and the csc-compiled Main.exe.
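
Since the archive scrubbed the attachments, here is a minimal sketch of what such a benchmark could look like. It is not the attached Main.cs: the names Plain, Remotable and CallBench are hypothetical, and timing with Stopwatch and incrementing a static counter in each method are assumptions made for illustration.

```csharp
using System;
using System.Diagnostics;

// Hypothetical stand-in for a "simple" class derived from object.
class Plain
{
    public static long Counter;
    public static void StaticCall() { Counter++; }
    public void InstanceCall() { Counter++; }
    public virtual void VirtualCall() { Counter++; }
}

// The same three methods on a class derived from MarshalByRefObject.
class Remotable : MarshalByRefObject
{
    public static long Counter;
    public static void StaticCall() { Counter++; }
    public void InstanceCall() { Counter++; }
    public virtual void VirtualCall() { Counter++; }
}

static class CallBench
{
    // Each method times n calls of one kind and returns elapsed milliseconds.
    public static double PlainStatic(int n)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++) Plain.StaticCall();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static double PlainInstance(Plain p, int n)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++) p.InstanceCall();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static double PlainVirtual(Plain p, int n)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++) p.VirtualCall();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static double MbrStatic(int n)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++) Remotable.StaticCall();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static double MbrInstance(Remotable r, int n)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++) r.InstanceCall();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static double MbrVirtual(Remotable r, int n)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++) r.VirtualCall();
        return sw.Elapsed.TotalMilliseconds;
    }

    // Prints one table row: action, time for object, time for MarshalByRefObject.
    static void Row(string action, string plain, string mbr)
    {
        Console.WriteLine("{0,-24}{1,-24}{2}", action, plain, mbr);
    }

    static void Main(string[] args)
    {
        // Defaults to 1 billion calls; pass a smaller count for a quick run.
        int n = args.Length > 0 ? int.Parse(args[0]) : 1000000000;
        Plain p = new Plain();
        Remotable r = new Remotable();

        Console.WriteLine("{0} calls each", n);
        Row("Action", "object", "MarshalByRefObject");
        Row("static call", PlainStatic(n) + "ms", MbrStatic(n) + "ms");
        Row("instance call", PlainInstance(p, n) + "ms", MbrInstance(r, n) + "ms");
        Row("virtual call", PlainVirtual(p, n) + "ms", MbrVirtual(r, n) + "ms");
    }
}
```

The methods are deliberately tiny, so whether the runtime inlines them will dominate the numbers. That is why comparing runs with and without -inline is useful.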

What actually causes this behaviour (csc assemblies and gmcs assemblies
differ in code efficiency, Marshal* instance calls are slower after AOT,
while Marshal* non-instance calls are faster without AOT, ...)?


bedah

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Main.cs
Type: text/x-csharp
Size: 1585 bytes
Desc: not available
Url : http://lists.ximian.com/pipermail/mono-devel-list/attachments/20080107/5b50cea6/attachment.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Main.exe
Type: application/x-msdos-program
Size: 4608 bytes
Desc: not available
Url : http://lists.ximian.com/pipermail/mono-devel-list/attachments/20080107/5b50cea6/attachment-0001.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.exe
Type: application/x-msdos-program
Size: 4608 bytes
Desc: not available
Url : http://lists.ximian.com/pipermail/mono-devel-list/attachments/20080107/5b50cea6/attachment-0002.bin 

