[Mono-list] JVM performance: JVM as a basis for CLR

Fergus Henderson fjh@cs.mu.oz.au
Wed, 25 Jul 2001 17:26:51 +1000


On 22-Jul-2001, Jay Freeman saurik" <saurik@saurik.com> wrote:
> 
> The standard basis of the argument for the CLR's fundmental speed
> improvements over Java is... well, now isn't that interesting.  There isn't
> one; at least one that I know of (so if anyone knows one, speak up).

There are several reasons that I can think of as to why it should be possible
to make the CLR faster than the JVM:

	- support for tail calls
	- less heap allocation, due to
		- value types
		- function pointer types (rather than heap-allocated closures)
		- byref arguments (rather than returning multiple values
		  in heap-allocated objects or arrays)
	- support for unverifiable code (which can avoid the need for some
		runtime checks)

> To me
> a lack of argument is a pretty good indication of something being seriously
> wrong with the assumption.  Some might claim that "tail." is an important
> feature for speed of functional languages, but that isn't an advantage in
> the slightest.

You're wrong.

Some functional languages, e.g. ML, need a *guarantee* that the
implementation will use constant stack space for tail-recursive calls.
If the VM doesn't have an explicit "tail call" instruction, then the
function language compiler has to resort to techniques such having every
function return the address of the next function to call to a driver loop.
These techniques have very significant effects on both efficiency and
ease of interoperability.

> It is important to note that many JVM's do tailcall
> optimization, detecting functions that would benefit and _automatically_
> emitting the right code, even if the original byte-code didn't explicitely
> specify it (and AFAIK C# never does anyway).  ORP, obviously my favorite
> JIT, happens to be one of the ones that does this.

There's two kinds of tail calls.  One is directly recursive tail calls,
where the callee is the very same function as the caller.  These are the
easy ones.  For this kind, there is no real need for a VM instruction,
or even for the JIT to optimize them, since they are easy to optimize
in the front-end, by compiling them to loops.

The other kind are when the tail call is to a different function.
These are known in GCC as "sibling calls".  
I'm pretty sure that ORP does *not* optimize these ones.
These kind are much harder to optimize, both for the front-end
and for the JIT.  To optimize these in the front-end, you need to resort
to techniques such as driver loops that slow down the code dramatically.
They can be optimized in the JIT, but I don't know if any existing
JITs perform this optimization -- from what I've heard, they don't.

(There is also another CLR-specific reason why a "tail call" instruction
is needed.  In the presence of unverifiable code and stack allocation,
it is very difficult for the VM to optimize tail calls which the front-end
may know are safe, because the VM may not be sure that the stack allocated
variables aren't still live (e.g. pointed to by unmanaged pointers that
have been stored in static variables).  This doesn't apply to the JVM,
because the JVM doesn't support stack allocation or unverifiable code.
But the lack of a tail call instruction in the JVM is still a problem
because of the issues with sibling call optimization mentioned earlier.)

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.