[Mono-list] [Off topic?] Java Bytecode -> IL Bytecode Compiler

Dominic Cooney dominic@dcooney.com
Sat, 12 Jan 2002 08:26:14 +1000


I had a stab at something like this for a CS project last semester. It
could convert "Hello, world!" but otherwise wasn't very successful.

In general, I would recommend implementing the IL converter, rather than
a Java compiler. This is because: (a) there are a lot of binary Java
components in the wild; (b) there are a lot of languages that target the
JVM other than Java; (c) Sun continues to change the Java language, yet
the JVM is remarkably static.

Here are some issues you will need to be aware of:

- .NET locals and arguments are typed; whereas JVM ones are not. Hence
you will probably need some kind of "locals/arguments manager" that
hands out .NET locs/args for a particular type and recycles them when
the Java bytecode starts treating them as a different type.

- .NET locals and arguments are different; whereas JVM ones are not. So
a JVM "load" may map to an IL ldarg or ldloc.

- You will need to simulate the type of the JVM evaluation stack. Some
JVM instructions (e.g. dup2) behave differently depending on the type of
what is on the top of the stack (i.e. for dup2, if the top of the stack
is a long or double the stack delta is 1 and maps to an IL dup; but
otherwise the stack detla is 2 and maps to an il stloc tmp1, stloc tmp2,
ldloc tmp1, ldloc tmp2, ldloc tmp1, ldloc tmp2; there are a few like
this).

- Some kind of optimizer would be quite good, because a lot of JVM
instructions (swap, the weird dups and weird pops) end up thrashing a
lot of locals in IL.

- Compiling finally is like unscrambling an egg, because Java uses a
kind of nasty go-local-sub instruction to do it. If language X abused
this feature to implement things other than finally, it could be tricky
to implement.

Supporting the API and JVM semantics is another matter...

J# includes a wrapper library that simulates the JDK 1.1.4 API atop
.NET. Another approach would be to do some kind of semantic translation
in the converter (e.g. map System.out.println to Console.WriteLine;
j.l.String to System.String; etc.)

Ultimately, though, I think implementing a JVM for .NET is the way to
go. This would probably provide the best support for the JVM semantics.
Unfortunately .NET interop may suffer some, but it is unlikely Java
would ever be a full CLS producer or consumer anyway (without some
serious language hacks).

Feel free to email me off the list (unless others thing this is not
offensively off-topic) if you want to discuss anything.

Just my $0.02,

Dominic Cooney