[Mono-devel-list] Trampolines...

Tue Feb 22 16:07:43 EST 2005

> The patching of the trampoline is done to optimize cases when the caller address
> cannot be patched (as in the case with delegates). You are right that
> it means other
> calls to the method will not get the caller address patched. So this
> is a tradeoff.

Quite a serious one currently: because
gpointer
mono_create_jit_trampoline (MonoMethod *method)
{
	MonoDomain *domain = mono_domain_get ();
	gpointer tramp;

does not have code like this:

	gpointer code = mono_jit_find_compiled_method (domain, method);
	if (code && (!method->klass->valuetype) &&
	    !(method->iflags & METHOD_IMPL_ATTRIBUTE_SYNCHRONIZED)) {
		return code;
	}

practically every call to an already compiled method goes through the stub 
too! This means Mono practically never emits a direct call to a method; 
everything goes through the stub's jump indirection:
	mov StubOffset, %r11
	call *%r11
	mov MethodOffset, %r11
	jmp *%r11
is the default calling sequence for non-virtual methods on amd64!

That's also why I thought that my refcounting was way off - instead it 
was correct! :-)

While I'm at it, I have some questions regarding the design of the 
trampolines altogether:
Why was it decided to emit a stub (34 bytes on AMD64), instead of using 
jump tables like: <argptr_func1><target_ptr_func1> for the jit trampolines?
Initially, the argptr argument can be filled with the one pointer 
argument currently loaded by the stub and later on be replaced by a 
pointer to MonoJitInfo. The target_ptr argument would either be a 
pointer to a trampoline (which would also effectively load the argptr), 
or - when compiled - the pointer to the final method. Recompilation/code 
moving would also be a piece of cake (and platform independent!), as it 
would mean exchanging one pointer at a known location. This jump table would 
require 16 bytes per method on AMD64 instead of 34 for the stub, would 
make code patching almost unnecessary and due to the known jump table 
location, AMD64 could work with call-offset opcodes (that won't 
change).. Not to mention that the page would always be in memory due to 
the frequent accesses and mixing AOT/nonAOT code would be no problem at 
all. (We could even swap out seldom-used methods and read them back in 
when necessary.. ;-) )
I guess the AOT code already uses such tables..
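
To make the jump table idea concrete, here is a rough sketch in plain C. 
Everything in it is hypothetical (JumpTableEntry, FakeMethod, 
jit_trampoline and compiled_square are made-up names, not Mono API), and 
a real implementation would emit machine code rather than call through C 
function pointers. It only shows that one slot per method is enough and 
that "patching" becomes a single pointer store:

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; this is not Mono's actual API. */
typedef int (*TargetFn) (void *argptr, int x);

/* One jump table slot: <argptr><target_ptr>, two pointers per method. */
typedef struct {
	void     *argptr;  /* initially the method; later e.g. a JIT-info pointer */
	TargetFn  target;  /* initially the trampoline; later the compiled method */
} JumpTableEntry;

/* Stand-in for a method descriptor that knows its own table slot. */
typedef struct {
	JumpTableEntry *slot;
	int             compile_count;
} FakeMethod;

/* Stand-in for the JIT-compiled method body. */
static int
compiled_square (void *argptr, int x)
{
	(void) argptr;
	return x * x;
}

/* Stand-in for the JIT trampoline: "compiles" the method, patches the
 * slot, then forwards the original call to the compiled code. */
static int
jit_trampoline (void *argptr, int x)
{
	FakeMethod *method = argptr;

	method->compile_count++;
	method->slot->target = compiled_square;  /* the single pointer exchange */
	return method->slot->target (method->slot->argptr, x);
}

/* What every call site would do: an indirect call through the slot. */
static int
call_via_table (JumpTableEntry *entry, int x)
{
	return entry->target (entry->argptr, x);
}
```

The first call_via_table () on a slot whose target is jit_trampoline 
runs the trampoline once; every later call lands directly in 
compiled_square without any call site having been touched. Moving or 
recompiling the method would just be another store to slot->target.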

The only trampolines really requiring stub code are the jump trampolines 
AFAIK.

Second question: Why do we have those unbox trampolines? I understand 
that virtual methods of value types need to get an unboxed 'this' 
pointer, but why generate a trampoline for this kind of code? Can't 
this be done when the method is JITed? (As a function prologue?)
These unbox trampolines are a pain in the arse, as you constantly have 
to check for this case, and it limits code patching: I cannot recompile 
virtual methods of value types..
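
To illustrate what I mean by a prologue, a small C sketch follows. The 
types and names (FakeObjectHeader, BoxedPoint, point_sum_boxed_entry) 
are invented for the example and the two-pointer header is only an 
assumption about the object layout. The point is that the unboxing is 
just a fixed offset added to 'this', so it could be a second entry point 
belonging to the method instead of a separately allocated trampoline:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: a boxed value type is a header followed by the value. */
typedef struct { void *vtable; void *sync; } FakeObjectHeader;
typedef struct { int x; int y; } Point;
typedef struct { FakeObjectHeader header; Point value; } BoxedPoint;

/* The method body proper: expects an unboxed 'this'. */
static int
point_sum (Point *self)
{
	return self->x + self->y;
}

/* Entry used by virtual calls, which pass the boxed object. It skips the
 * header and falls into the normal body. This assumes the value starts
 * directly after the header, which holds for the layout above. */
static int
point_sum_boxed_entry (void *boxed_this)
{
	Point *self = (Point *) ((char *) boxed_this + sizeof (FakeObjectHeader));

	return point_sum (self);
}
```

The vtable slot would point at point_sum_boxed_entry and direct calls at 
point_sum; both live in the same compiled method, so recompiling it 
replaces both at once.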

Third (and final) question: Any ideas/thoughts about generating 
exception tables and getting rid of stack frames (freeing bp, partially 
sp)? We could also include information about the 'this' pointer, which 
would open up a way of retrieving MonoJitInfo quite fast - even if the 
method is virtual. I know this is a really serious change...

BTW: I'm pretty sure that by including a little C++ exception 
handling, we could get rid of LMFs on AMD64 too: AFAIK this would speed 
up P/Invoke..

> We might change this if it is a problem. In the meantime, you can
> disable this in
> your working copy.

No problem with that. Actually it simplifies code moving, as it disables 
code patching and all calls go over the stub.. Just two pointers to 
exchange and the old method can be freed! (Not in the virtual case, of 
course)

Thanks!
   Willi