[Mono-bugs] [Bug 487846] PPC code gen is inefficient in several areas

Mon Mar 23 15:36:12 EDT 2009

https://bugzilla.novell.com/show_bug.cgi?id=487846

User munroesj at us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=487846#c1

--- Comment #1 from Steven Munroe <munroesj at us.ibm.com>  2009-03-23 13:36:09 MST ---
In the normal PPC compiler and linker, the compiler always generates a single
branch and link (bl) for calls. Later the linker will either resolve the bl
directly or generate a call stub and resolve the bl to the call stub.

Also the compiler never uses the 5-instruction (addis, ori, sldi, oris, ori)
sequence to generate a 64-bit pointer or constant. Instead it allocates
slots in the GOT/TOC for the data/pointer and generates a single load
instruction.
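
To make the cost of the 5-instruction form concrete, here is a minimal sketch
in C that models what the sequence leaves in the target register. The name
load_imm64_sequence is hypothetical and not part of mono; the code only
simulates the arithmetic and does not emit any instructions.

```c
#include <stdint.h>

/* Hypothetical helper: models the register contents after the
 * addis (lis), ori, sldi, oris, ori sequence for a 64-bit immediate. */
static uint64_t load_imm64_sequence(uint64_t value)
{
    uint16_t c3 = (uint16_t)(value >> 48); /* bits 48..63 */
    uint16_t c2 = (uint16_t)(value >> 32); /* bits 32..47 */
    uint16_t c1 = (uint16_t)(value >> 16); /* bits 16..31 */
    uint16_t c0 = (uint16_t)value;         /* bits  0..15 */

    /* addis rD,0,c3: c3 shifted left 16, sign-extended to 64 bits */
    uint64_t r = (uint64_t)c3 << 16;
    if (c3 & 0x8000u)
        r |= 0xFFFFFFFF00000000ULL;

    r |= c2;                 /* ori  rD,rD,c2                        */
    r <<= 32;                /* sldi rD,rD,32 (drops the sign bits)  */
    r |= (uint64_t)c1 << 16; /* oris rD,rD,c1                        */
    r |= c0;                 /* ori  rD,rD,c0                        */
    return r;
}
```

A TOC-based load replaces all five steps with one load relative to the TOC
pointer, at the price of reserving an 8-byte slot per constant.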

These techniques seem to be problematic to generate in the mono JIT. The JIT
does not have relocation infrastructure or the ability to defer relocations
until the final locations for all functions/data are known. In mono, methods
are generated incrementally and some sequences might need to be resolved
multiple times (to trampolines and then to final code).

The current design is also not thread safe as it requires patching multiple
instruction sequences. If the target is in reach (within +/- 32MB) it should be
possible to patch the final blrl into a simple bl instruction. Then as a second
step we can patch the addis, ori, mtlr sequence to nops. This would be thread
safe but is still slower than it needs to be (1 to 3 cycle bubble to execute
the nops).
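
The "in reach" test and the bl encoding can be sketched as follows. This is
an illustration only: bl_in_reach and encode_bl are hypothetical names, not
existing mono functions, and addresses are passed as plain integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* bl carries a 24-bit word displacement, so the byte displacement is a
 * signed 26-bit value that must be a multiple of 4 (+/- 32MB). */
static bool bl_in_reach(uint64_t insn_addr, uint64_t target)
{
    int64_t disp = (int64_t)(target - insn_addr);
    return (disp & 3) == 0 && disp >= -0x2000000 && disp <= 0x1FFFFFC;
}

/* Encode "bl target": primary opcode 18, AA = 0 (relative), LK = 1.
 * The caller is expected to have checked bl_in_reach() first. */
static uint32_t encode_bl(uint64_t insn_addr, uint64_t target)
{
    uint32_t disp = (uint32_t)(target - insn_addr);
    return (18u << 26) | (disp & 0x03FFFFFCu) | 1u;
}
```

When the test fails, the call site has to keep using an indirect sequence or
branch to a nearby stub instead.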

There is a first stage of fixup that seems to occur after
mono_arch_emit_epilog(), where the *code buffer is moved to its final
location. This seems to be based on call-backs to mono_arch_patch_code() set
up by mono_add_patch_info().

I think a better strategy is to generate simple bl instructions into the body.
These still require fixup, so they have to be recorded via
mono_add_patch_info(). Then during mono_arch_emit_epilog(), generate the call
stubs following the final return sequence. The bl instructions in the body can
be fixed up to transfer to the matching call stub at this time, but they still
need to be marked with mono_add_patch_info() so they can be retargeted into a
direct call if possible (within +/- 32MB). We also need to mark the call stubs
with mono_add_patch_info() so we can fix up the ppc_load_sequence for when the
target is out of reach.

This may cost some extra space (1 additional word per static call) and some of
the call stubs will be generated and not used. But there may be some
opportunity to common-up call stubs and save space that way. This sequence can
be made thread safe because only the single bl instruction has to change.
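
A rough sketch of the single-word retargeting step, assuming C11 atomics are
available. The function retarget_call and its parameters are hypothetical and
do not correspond to existing mono code; addresses are passed separately from
the pointer to the instruction word so the logic can be followed on any host.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical: point one call site at the final target when it is within
 * +/- 32MB, otherwise at its call stub. Only one 32-bit word is written,
 * so other threads see either the old bl or the new bl, never a mix. */
static void retarget_call(_Atomic uint32_t *site, uint64_t site_addr,
                          uint64_t stub_addr, uint64_t target_addr)
{
    int64_t disp = (int64_t)(target_addr - site_addr);
    int direct = (disp & 3) == 0 && disp >= -0x2000000 && disp <= 0x1FFFFFC;
    uint64_t dest = direct ? target_addr : stub_addr;

    /* bl: primary opcode 18, AA = 0, LK = 1 */
    uint32_t insn = (18u << 26)
                  | ((uint32_t)(dest - site_addr) & 0x03FFFFFCu) | 1u;

    atomic_store_explicit(site, insn, memory_order_release);
    /* On real hardware the instruction cache must also be synchronized
     * for this word; that step is omitted from the sketch. */
}
```

The stub is assumed to sit right after the method body and therefore always
to be in reach of the call site.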
