[Mono-dev] More updates on Mono (before the call)
Sergei Dyshel
qyron.private at gmail.com
Thu Sep 16 13:15:47 EDT 2010
I'm very sorry, this post was intended for another mailing-list
(non-public).
Moderators, please delete it.
--
Regards,
Sergei Dyshel
On Wed, Sep 15, 2010 at 22:59, Sergei Dyshel <qyron.private at gmail.com> wrote:
> Hi,
> I've almost finished tuning Mono's Altivec performance. The results are,
> as usual, in this table:
>
> https://spreadsheets.google.com/ccc?key=0AhjvSAvEoHopdG1LUE9Zdkd1TTZIQ0FCWl82bU5Fa1E&hl=en&authkey=COqyrPMD
>
> There are many more "blue" ratios now, but there are still some optimization
> issues I couldn't solve:
>
> 1) 'mmm_intrchage' uses a different expression for alignment checking
> (versioning), and this expression somehow isn't constant-folded during
> JITing. This results in code twice as large, and the register allocator just
> can't act effectively there. By enabling full optimizations in Mono I could
> partially solve this problem, but this is not the best solution (since it
> increases compilation time).
>
> 2) 'video_dissolve_fp', 'saxpy_fp', 'dscal_fp' are all variations of a simple
> 'a[i]=b*c[i]+d[i]' floating-point loop. The aligned version, generated by
> the vectorizer, looks (in Gimple) like: "*(&a+i) = b* (*(&c+i)) + *(&d+i)" and
> this is converted further to CIL. Since Mono has no inter-bb constant
> propagation and all the arrays' addresses are known at JIT time, all 3
> addresses are generated by Mono in each iteration (and it takes 3 PPC
> instructions for each address). I think this is the reason for the bad
> results, although the ratios for these benchmarks behave rather differently.
> Anyway, it would be much better if the arrays' addresses were saved to locals
> in the loop prolog and then used in each iteration.
>
> 'video_dissolve_s8' and 'small_sad' still need to be implemented/analyzed.
> Tomorrow I'll update the numbers for SSE. I anticipate an improvement after
> the recent tweaks I've added to Mono, but it won't be as good as with Altivec,
> mostly because the x87 instruction set is more stack-based, so floating-point
> code doesn't get optimized as simply as on PowerPC. Anyway, let's
> wait until tomorrow's results...
>
> That's all, folks! (c)
> --
> Regards,
> Sergei Dyshel
>
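
For reference, the address-hoisting idea from point 2 of the quoted message
can be sketched in C. This is only an illustration under stated assumptions,
not the Gimple or CIL that was actually generated: the array size N, the global
arrays a, c, d, and the function names saxpy_naive and saxpy_hoisted are all
hypothetical. The first function mirrors the "*(&a+i) = b* (*(&c+i)) + *(&d+i)"
shape, where each iteration starts from the array symbols; the second saves the
three addresses to locals in the loop prolog, as the message suggests.

```c
#include <stddef.h>

/* Hypothetical global arrays standing in for the benchmark data.
   Their addresses are fixed, so a JIT would see them as constants. */
#define N 1024
static float a[N], c[N], d[N];

/* Shape described in the message: every iteration is written in terms
   of the array symbols. A code generator without constant propagation
   across basic blocks may rebuild all three addresses per iteration. */
void saxpy_naive(float b, size_t n) {
    for (size_t i = 0; i < n; i++)
        *(a + i) = b * *(c + i) + *(d + i);
}

/* Suggested shape: the addresses are saved to locals once, in the loop
   prolog, and the loop body only indexes through those locals. */
void saxpy_hoisted(float b, size_t n) {
    float *pa = a, *pc = c, *pd = d;
    for (size_t i = 0; i < n; i++)
        pa[i] = b * pc[i] + pd[i];
}
```

An optimizing C compiler will normally produce the same code for both
functions; the distinction matters only for a code generator that, as described
above, does not carry the constant addresses across basic blocks.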