[Mono-devel-list] RAPGO Proposal

Mon Nov 29 05:48:07 EST 2004

On 11/26/04 Willibald Krenn wrote:
> 	IMO it would be beneficial to somehow cache compiled code on
> 	disk along with the executable, so that the first time
> 	compilation may be replaced by loading the cached version...
> 	(Some sort of implicit AOT compilation?) Of course this can't
> 	be done with fully optimized versions..

Zoltan already has some code that does this, though I don't remember
if it's finished and cleaned up. Look in aot.c.

> 	I've come up with following 'idea': Each method is called
> 	indirectly via	call *rax (where rax points to some GOT). So by
> 	changing the offset every call will go to the new location.
> 	Another technique would be to replace the existing method by
> 	some code that patches the caller's address to jump to the new
> 	code the next time directly. This however means that we would
> 	have to take care how long a given 'Patcher' needs to be
> 	preserved... (some problem GC could take care of..)

We already have the code to deal with this.
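
To illustrate the first technique quoted above (every call goes through
a slot, so updating the slot redirects all later calls), here is a
minimal sketch in C. It is not the runtime's actual code: the names
method_slot, call_via_slot and install_recompiled are hypothetical, and
a real multi-threaded runtime would need the slot update to be an atomic
store.

```c
#include <stddef.h>

/* Hypothetical types and names; not Mono's actual data layout. */
typedef int (*method_fn)(int);

/* One slot per method. Every call site loads the target from the slot. */
typedef struct {
    method_fn target;
} method_slot;

/* Stand-in for the first, unoptimized compilation of a method. */
static int square_slow(int x)
{
    int n = x < 0 ? -x : x;
    int r = 0;
    for (int i = 0; i < n; i++)
        r += n;
    return r;
}

/* Stand-in for the recompiled, optimized version of the same method. */
static int square_fast(int x)
{
    return x * x;
}

/* What a call site does: an indirect call through the slot. */
static int call_via_slot(const method_slot *slot, int arg)
{
    return slot->target(arg);
}

/* Redirect all future calls by overwriting the slot.
 * Assumption: single-threaded here; a runtime would use an atomic store. */
static void install_recompiled(method_slot *slot, method_fn new_code)
{
    slot->target = new_code;
}
```

Starting from a slot that holds square_slow, a single
install_recompiled(&slot, square_fast) makes every later
call_via_slot(&slot, x) run the new code without touching the callers.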

> 	Before freeing/overwriting a method we also have to ensure no
> 	thread is executing this piece of code anymore. Simple
> 	Entry/Exit counters should be able to handle that..

Counters are not needed and would be too slow. We can simply walk
the stacks of the various threads and see whether any of them is inside
the method. At first there is no need to free the code at all: we'll
only recompile a method once or twice, so the 'leak' is bounded.
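
As a rough sketch of the stack-walk check (not the runtime's code), assume
the frame addresses of each stopped thread have already been collected
into an array. The types code_range and thread_stack and the function
method_is_active are hypothetical names for illustration only:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Address range occupied by one compiled method (hypothetical type). */
typedef struct {
    uintptr_t start;
    size_t size;
} code_range;

/* Instruction pointer and return addresses gathered from one thread's
 * frames. Assumption: the thread is stopped while we look at it. */
typedef struct {
    const uintptr_t *ips;
    size_t n_ips;
} thread_stack;

/* True if ip lies in [start, start + size). */
static bool ip_in_range(uintptr_t ip, const code_range *r)
{
    return ip >= r->start && ip - r->start < r->size;
}

/* True if any frame of any thread is executing inside the method,
 * in which case its old code must not be freed or overwritten yet. */
static bool method_is_active(const code_range *r,
                             const thread_stack *threads, size_t n_threads)
{
    for (size_t t = 0; t < n_threads; t++)
        for (size_t i = 0; i < threads[t].n_ips; i++)
            if (ip_in_range(threads[t].ips[i], r))
                return true;
    return false;
}
```

The cost is only paid when a method is about to be replaced, which is
why this is cheaper than entry/exit counters that run on every call.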

> 	(In case of an endless loop, code could be patched so that this
> 	thread generates a signal..)

One of the issues is how to handle methods that are never exited. If a
method is called many times, it's easy to recompile it and make the
callers use the new, faster version. However, if most of the time is
spent inside a single method that is entered only once, for example,
just recompiling it and changing the call sites is not going to work.
We'd need to transfer the state from the old stack/registers to the
state as needed by the new code: this is far from trivial, and it is one
of the reasons I prefer statistical profiling over counter-based
profiling, where the counting code is embedded in the slow compiled
version of the method (though I guess in such cases that code could be
overwritten with nops). With statistical profiling the code is at least
not slowed down by the profiling overhead.
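
A minimal sketch of the statistical approach, under the assumption that
something (a timer signal, say) periodically hands us the interrupted
instruction pointer. The table layout and the names method_entry,
record_sample and hottest_method are made up for this example:

```c
#include <stddef.h>
#include <stdint.h>

/* One compiled method and the number of samples that landed in it.
 * Hypothetical layout; the table must be sorted by start address. */
typedef struct {
    uintptr_t start;
    size_t size;
    unsigned long hits;
} method_entry;

/* Binary search for the method whose code contains ip, or NULL. */
static method_entry *find_method(method_entry *table, size_t n, uintptr_t ip)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (ip < table[mid].start)
            hi = mid;
        else if (ip - table[mid].start >= table[mid].size)
            lo = mid + 1;
        else
            return &table[mid];
    }
    return NULL;
}

/* Called once per sample; the profiled code itself is not modified. */
static void record_sample(method_entry *table, size_t n, uintptr_t ip)
{
    method_entry *m = find_method(table, n, ip);
    if (m)
        m->hits++;
}

/* Pick the method with the most samples, if it has at least min_hits. */
static method_entry *hottest_method(method_entry *table, size_t n,
                                    unsigned long min_hits)
{
    method_entry *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (table[i].hits >= min_hits && (!best || table[i].hits > best->hits))
            best = &table[i];
    return best;
}
```

Note that a method entered once and never exited still accumulates
samples here, whereas an entry counter would only ever see a count of 1.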

> VI Placement
> ~~~~~~~~~~~~
> 	Currently every code that is being emitted gets copied to it's
> 	final location - smells like overhead to me..
> 	What about mmap and direct emit into this area?
> 	This would also save time for freeing/allocating memory for
> 	replacement code.

I don't think any of this is an issue: a memcpy of the code is going
to be a very tiny fraction of the time spent compiling. And I don't see
any relation with the freeing/allocating.

Thanks.
lupus

-- 
-----------------------------------------------------------------
lupus at debian.org                                     debian/rules
lupus at ximian.com                             Monkeys do it better