[Mono-dev] CIL to CIL optimizer

Bjarke Hammersholt Roune bjarke.roune at gmail.com
Thu Aug 10 06:41:23 EDT 2006


Hi,

I am considering implementing a CIL to CIL optimizer as suggested at
http://www.mono-project.com/Cecil. I have some questions and some ideas 
up for criticism.

Having previously implemented an optimizing compiler in ML in a course, 
I think that SML.net (http://www.cl.cam.ac.uk/Research/TSG/SMLNET/) 
would be better suited to the task than C#. F# would be nice too, but it 
doesn't have an open source compiler as far as I can tell.

The proof-of-concept I have in mind would do the following steps for a 
subset of the full CIL. This subset would include a conditional branch 
opcode, the addition opcode and full support of exception handling 
including (the dreaded) try...finally.

1. Read in an assembly using Cecil
2. Construct the control flow graph while converting to a quadruple 
format (i.e. eliminate the stack in favor of registers)
3. Convert to SSA form using the algorithm from 
http://www.hipersoft.rice.edu/grads/publications/dom14.pdf
4. Do intra-procedural conditional constant propagation as described on
page 447 of "modern compiler implementation in ML". 
(http://portal.acm.org/citation.cfm?id=103136)
5. Eliminate phi-nodes using moves and convert to CIL (i.e. reintroduce 
the stack)
6. Output assembly using Cecil

The next step would be to support at least the subset of CIL that is 
output by the Mono C# compiler. Then it would be nice to extend the 
optimization to be inter-procedural. Then I would focus on outputting 
better CIL code like eliminating some of those moves coming from SSA. I 
have plenty ideas for what to do after that, but I'll have to see how 
much time I have.

Having taken a look at the ECMA CIL spec, I don't see anything in there 
that should present really big problems other than exceptions and the 
finally construct (which is a problem even without exceptions).

Exceptions are annoying because of the way they complicate analyzing 
control flow, but after having thought and read about how to handle them 
for some time I think they will be manageable.

The try ... finally construct, on the other hand, now that just seems 
painful to me. As in red-hot-pincers-jammed-into-the-eyes- 
while-being-electrocuted-naked-in-a-snow-storm painful. Having scoured 
the web to the best of my ability, I have find PLENTY of material on how 
to deal with try ... catch in Java bytecode and CIL, but somehow people 
usually never bother to explain how they handle try ... finally. I guess 
how to efficiently construct a precise CFG and do SSA form in the face 
of try ... finally is just obvious to everyone but me.

A solution to the problem involving higher-order functions is described 
at http://flint.cs.yale.edu/flint/publications/lamjvm.pdf, though that 
solution seems unsuitable for an imperative representation.

A different solution used in SafeTSA is described at 
http://www.google.dk/url?sa=t&ct=res&cd=1&url=http%3A%2F%2Fwww.cs.utsa.edu%2F~vonronne%2Fpubs%2Fjvr-dissertation.pdf&ei=f43aRKLBJ7nWwgHZvsGhCA&sig2=nhWafzjm_synZsuIHpXhqg
which consists of making several copies of the code in the finally 
clause. The potential for code size explosion is huge, but on the other 
hand finally clauses are usually small... but it is still not a 
satisfying solution.

An obvious alternative would be to represent finally blocks as what they 
are and then having every single pass of the optimizer be aware of how 
to deal with this in some ad-hoc fashion. I don't like that either 
because all the details just seem horrendous to deal with as far as I 
can tell.

Another idea is putting the finally code in a function call with plenty 
of ref parameters. This should work nicely together with 
inter-procedural optimizations. It ought to be possible to reinsert 
these calls as a finally block in most cases when outputting the CIL. 
I'm not sure this will work well, though.

So far the best solution I can think of is to copy the finally code if 
it is small and insert function calls if the finally code is large. 
Always inserting function calls and having a general inliner pass might 
work well too.

If anyone has any ideas or experience they would like to share, I would 
love to hear it. Especially on the topic of try ... finally!

Regards
Bjarke Roune



More information about the Mono-devel-list mailing list