[Mono-dev] difference in performance between mono OSX and linux?

Jonathan Shore jonathan.shore at gmail.com
Mon Jan 23 14:18:10 UTC 2012


On Jan 23, 2012, at 7:27 AM, Rodrigo Kumpera wrote:

> Mono on OSX has some performance issues due to multiple factors, some of which Reimer pointed out, but it's hard to tell without a good profiler run.
> 
> Boehm doesn't support the fast memory allocator on OSX.
> The lack of compiler support for fast TLS is another pain point; this has been mostly addressed on trunk, so 2.12 will be closer.
> Another issue is that OSX primitives are much worse than linux's. Things like mutexes and semaphores perform much worse.
> Finally, the 32-bit vs 64-bit issue can make a huge difference depending on your code. If you're doing a lot of 64-bit math or your
> code is register happy, 32-bit will be slower.
> 

I see. I am using a SpinLock around every (very short-lived) transaction. Perhaps the primitives used in SpinWait have a different performance profile on OSX vs Linux? Also, all of the transactions involve longs and doubles (comparisons, load/store, some arithmetic). I am not at all clear on how longs and doubles would be rendered differently in terms of instructions on 32bit vs 64bit. I thought even in the 32bit world, Intel CPUs had a 64bit pipeline for doubles, though probably not for longs.
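
To make the 32-bit vs 64-bit question concrete, here is a minimal sketch in C of what a 64-bit integer add and compare look like when only 32-bit integer operations are available. It is an illustration only, not what Mono's JIT actually emits; the names (`u64_pair`, `add64_via_32`, `less64_via_32`, `check_against_native`) are made up for this example, and it uses unsigned values for well-defined wraparound, whereas a C# long is signed. A double is not split this way, since the x87 and SSE2 registers hold a full 64-bit double even in 32-bit mode.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value as a 32-bit target must hold it: two 32-bit halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

static u64_pair split64(uint64_t v) {
    u64_pair p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

static uint64_t join64(u64_pair p) {
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* 64-bit add from 32-bit operations: add the low halves, then add the
 * high halves plus the carry out of the low add (the ADD followed by
 * ADC pattern on 32-bit x86). */
static u64_pair add64_via_32(u64_pair a, u64_pair b) {
    u64_pair r;
    r.lo = a.lo + b.lo;
    uint32_t carry = (r.lo < a.lo) ? 1u : 0u; /* low add wrapped around */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* Unsigned 64-bit less-than from 32-bit compares: the high halves
 * decide unless they are equal, in which case the low halves decide. */
static int less64_via_32(u64_pair a, u64_pair b) {
    if (a.hi != b.hi) {
        return a.hi < b.hi;
    }
    return a.lo < b.lo;
}

/* Cross-check the split versions against the native 64-bit operators. */
static void check_against_native(uint64_t x, uint64_t y) {
    assert(join64(add64_via_32(split64(x), split64(y))) == x + y);
    assert(less64_via_32(split64(x), split64(y)) == (x < y));
}
```

Each long therefore costs two registers and roughly twice the integer instructions on a 32-bit target, and 32-bit x86 has only 8 general-purpose registers against 16 on x86-64, which is one way to read the "register happy" remark above.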

What sort of instrumentation would be useful? 

> Jonathan, if you could do an Instruments run on OSX and a perf run on linux of your app and send the decoded results, it would help a lot in troubleshooting
> the issue.
> 
> 
> 
> On Sat, Jan 21, 2012 at 5:28 PM, Jonathan Shore <jonathan.shore at gmail.com> wrote:
> I am running a program that does millions of transactions on FIFO queues, etc.   Running on my OSX box (which is on a dual CPU / 8 core Xeon E5520) and on a linux box which is a Core i7-950, I found a significant performance difference which is difficult to explain.
> 
> On a single threaded run, the OSX based E5520 took 21 seconds to process 8 million "transactions" and the linux box took only 8 seconds.   The i7-950 is marginally faster (~10% faster) in typical benchmarks than the E5520.
> 
> Both runs were done with --llvm enabled and with the same garbage collector.    There are probably 16 - 32 million small objects created during the course of this evaluation, but generally with good locality, so they should be easy to GC.
> 
> The OSX-based mono performance for this app is 3x slower than the linux performance (whereas for typical programs, say in C, the difference would be about 10%).   Both the linux and OSX boxes have significant memory (in fact the OSX box has more memory).
> 
> Finally, the OSX version is using 2.10.8 32bit and the linux version 2.10.6 64bit.  I doubt that there is a regression in performance between 2.10.6 and 2.10.8.  I think it is more likely to be an implementation difference between OSX and linux?    I would also think that the 32bit vs 64bit would not make all that much difference?
> 
> So I am wondering whether there are differences in implementation between mono on these platforms that could account for a significant performance difference?
> 
> Thanks
> Jonathan
> 
> _______________________________________________
> Mono-devel-list mailing list
> Mono-devel-list at lists.ximian.com
> http://lists.ximian.com/mailman/listinfo/mono-devel-list
> 
> 
