[Mono-dev] [PATCH] A "fastpath" dead code elimination
vargaz at gmail.com
Thu Nov 17 07:23:29 EST 2005
The general JIT changes in the patch look harmless to me. There is a problem
tough: the patch makes tests/marhal2.exe crash when run with -O=all.
On 11/15/05, Massimiliano Mantione <massi at ximian.com> wrote:
> The attached patch implements deadce, but without using SSA.
> Instead, a simple alias analysis pass takes care of understanding
> when local variables [may] change value, and liveness computation
> has been modified to use this aliasing info if present.
> The alias analysis pass has O(n) complexity (n = code size), it is
> just a linear sweep on the list of instructions.
> Then, deadce operates one BB at a time, scanning the code linearly
> and using the liveness bits as start/end conditions (so it is O(n)
> as well).
> I positioned the deadce pass just between the liveness computation
> and the linear scan register allocator, so that:
> - it can reuse the liveness computation already computed for the
> linear regalloc, without requiring a new pass, and
> - it is "downstream" in the compilation pipeline, so it has the
> possibility to catch as many dead instructions as possible.
> The name "fastpath deadce" comes from Patrik Torstensson :-)
> He called it like this on IRC because it is on the fast compilation
> path (the one without SSA).
> But now, without other technical stuff, some numbers...
> First of all, the JIT overhead: the time to JIT mscorlib.dll with
> various compilation options, in seconds (time mono --compile-all):
> Options -all -all,ssa
> JIT time 1.05 1.35
> Here we see that just computing SSA has a 30% overhead with respect
> to a compilation with no optimizations at all.
> Options default ssa deadce ssa,deadce
> JIT time 1.32 1.55 1.45 1.6
> And here we see that the overhead of fastpath deadce on a "default"
> compilation is almost 10%, while ssa deadce has 21% (and note that
> in practice, since ssa does not work with aliasing, the new fastpath
> deadce is applied to more methods).
> So, the JIT overhead is reasonably small.
> And finally, some lovely benchmark...
> I tried SciMark, and here are the results:
> (MFlops, just the composite score)
> default 205.92
> -O=ssa,deadce 210.61
> -O=deadce 207.50
> -O=consprop,copyprop,deadce 207.72
> -O=consprop,copyprop,deadce,inline 209.82
> -O=inline 213.63
> -O=deadce,inline 214.52
> So, the idea is that fastpath deadce has some effect :-)
> It is not as effective as ssa deadce, but it is useful anyway and
> most of all it does not incur the same JIT time overhead.
> BTW, I spent a *lot* of time trying to be sure that those numbers
> are accurate. I had to kill evolution, gaim, beagle, the wireless
> card, and carefully monitor system load otherwise the results were
> erratic (just of a few percent, but this is what we're looking at).
> I then executed the benchmark several (>10) times for every case
> and took the *best* score for each (with the idea that it is the
> one where other things interfered less), and anyway all the best
> scores were almost identical.
> A "funny" thing: these are the results I have on a SciMark.exe
> compiled with MS .NET 2.0:
> default: 136.65
> -O=consprop,copyprop,deadce: 137.52
> As you can see, they are *low*!!!
> I still have to understand what the 2.0 csc does exactly, but it
> looks a real disaster for us :-(
> Watch out this quirk when doing/examining benchmarks!
> OK, that's all... please approve the patch :-)
> We can always refine it later...
> Mono-devel-list mailing list
> Mono-devel-list at lists.ximian.com
More information about the Mono-devel-list