> On Jan 23, 2012, at 7:27 AM, Rodrigo Kumpera wrote:
> Mono on OSX has some performance issues due to multiple factors, some of
> which Reimer pointed out, but it's hard to tell without a good profiler run.
> Boehm doesn't support the fast memory allocator on OSX.
> The lack of compiler support for fast TLS is another pain point. This has
> been mostly addressed on trunk, so 2.12 will be closer.
> Another issue is that OSX primitives are much worse than Linux's. Things
> like mutexes and semaphores perform much worse.
> Finally, the 32-bit vs 64-bit issue can make a huge difference depending on
> your code. If you're doing a lot of 64-bit math or your
> code is register happy, 32-bit will be slower.
>
> I see.  I am using a SpinLock around every (very short-lived) transaction.
> Perhaps the primitives used in SpinWait have a different performance
> profile on OSX vs Linux?  Also, all of the transactions involve longs and
> doubles (comparisons, load/store, some arithmetic).  I am not at all clear
> on how longs and doubles would be rendered differently in terms of
> instructions on 32-bit vs 64-bit.  I thought even in the 32-bit world, Intel
> CPUs had a 64-bit pipeline for doubles, though probably not for longs.

Doubles have hardware support on both 32 and 64 bits. Longs don't have any
under 32 bits, and things like division are done in software.
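
To give a feel for what "done in software" means, here is a minimal C
sketch of unsigned 64-bit division by shift-and-subtract. The name
soft_udiv64 is made up for illustration, and this is not Mono's or any
compiler's actual helper; it only shows that a single long division turns
into a loop of many simpler operations when there is no native instruction
for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): unsigned 64-bit division done
 * bit by bit, the way a software routine has to do it when the CPU has
 * no native 64-bit divide. The divisor d must be non-zero. */
static uint64_t soft_udiv64(uint64_t n, uint64_t d, uint64_t *rem)
{
    uint64_t q = 0; /* quotient, built one bit at a time */
    uint64_t r = 0; /* running remainder, always < d between steps */

    assert(d != 0);

    for (int i = 63; i >= 0; i--) {
        /* Remember the bit that the shift below pushes out of r. */
        int carry = (int)(r >> 63);

        /* Bring down the next bit of the dividend. */
        r = (r << 1) | ((n >> i) & 1u);

        /* If the (65-bit) remainder is at least d, subtract and set
         * the matching quotient bit. Wrap-around gives the right value
         * when carry is set, because the true result is below d. */
        if (carry || r >= d) {
            r -= d;
            q |= (uint64_t)1 << i;
        }
    }

    if (rem != NULL)
        *rem = r;
    return q;
}
```

Calling soft_udiv64(100, 7, &r) returns 14 and leaves 2 in r. The loop
always runs 64 iterations, which is the kind of cost you avoid when the
hardware can divide 64-bit values directly.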

> What sort of instrumentation would be useful?

Nothing fancy, just high-frequency sampling to see what runtime code shows up.
In Instruments this is the template under: Mac OS X -> CPU -> Time Profiler.
Then customize it from a sample every 1 ms to every 100 us.
