[Mono-dev] Ping on internal call builders

Paolo Molaro lupus at ximian.com
Wed Jul 30 13:20:20 EDT 2008

On 07/30/08 Kornél Pál wrote:
> >Zoltan Varga wrote:
> >   This patch replaces a small, fast, simple piece of code
> > in mono_emit_inst_for_method () with something far more complex. Also,
> The three icall builders are reference implementations for demonstrating
> what icall builders can do. I don't insist on OffsetToStringData being
> replaced with an icall builder. Note, however, that my tests showed
> no difference in executing the code. (I didn't do tests on JIT time.)

If you propose a considerable amount of increased complexity and your
reference 'improvements' are actually a lot worse, you should at least
choose better references to show.

> > about replacing
> > icalls with generated IL code:
> > - the code to generate the icalls is usually much bigger and more
> > complex than the icall
> >   itself.
> That's true but UnsafeAddrOfPinnedArrayElement (which is such simple code
> that it should do exactly the same in C and IL) for example was running
> 5.7x faster and I believe only because of omitting the managed to native
> transition.

These cases are rare and there are far better ways to solve this
particular issue: it could be done with a simple burg rule for the old
jit or with a few generated instructions with the new IR.
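
As a rough sketch of why only a few instructions are needed: the address
of an array element is the start of the element storage plus the index
times the element size. The C below uses a hypothetical SketchArray layout
and a hypothetical helper name (neither is Mono's actual array definition
or JIT code); it only illustrates the arithmetic that an inlined
instruction sequence would have to perform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical array layout, for illustration only; not Mono's real one. */
typedef struct {
    void    *vtable;     /* stand-in for the object header */
    uint32_t length;     /* number of elements */
    double   vector[8];  /* element storage (fixed size to keep the sketch small) */
} SketchArray;

/* Address of element `index`: base + offsetof(vector) + index * elem_size.
 * This is one constant add plus one multiply-add, with no call needed. */
static void *
sketch_addr_of_element (SketchArray *arr, size_t index, size_t elem_size)
{
    return (char *) arr + offsetof (SketchArray, vector) + index * elem_size;
}
```

No bounds or security checks are shown here; a real implementation would
likely need them.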

> > - it replaces code generated by the gcc optimizing C compiler with
> > code generated by our JIT,
> >   which is of much lower quality. So even though we incur a
> > managed-to-native transition
> > overhead by calling native code, we get some (or all) of it back by
> > having gcc compiled code
> >   in the icall.
> I don't believe that methods like UnsafeAddrOfPinnedArrayElement (and
> there are more methods similarly using fields not exposed to corlib that
> could benefit from this as well) could be optimized better by gcc than
> the JIT.

There are not many cases like UnsafeAddrOfPinnedArrayElement (the
implementation also likely needs security checks, so in the end the
implementation will be more complex and the transition overhead will be
less significant).

> I believe that memcpy for example should be implemented using native
> assembly but it still is implemented in C#.

For most common cases the current implementation is faster than a
generic memcpy implementation (I tested this at the time by simply using
the libc implementation as managed code).

> I assume you mean OffsetToStringData. When I wanted to implement
> UnsafeAddrOfPinnedArrayElement in mono_emit_inst_for_method it was not
> accepted.

You needed to introduce your own opcode with the old jit.
With the linear IR switch we can allow implementing more methods and
icalls by emitting multiple instructions instead of a single tree as in
the old code.
Also note another consideration: we'll add special cases only for
methods and icalls that actually have a perf impact in a real
application and so far it doesn't look like
UnsafeAddrOfPinnedArrayElement is a bottleneck (feel free to show us
applications where it is significant). There is no point in adding
complexity to mono for a microbenchmark only.

> Moving OffsetToStringData out of mono_emit_inst_for_method
> seems not to be accepted either.

Because you added lots of complexity for no gain and slowed down the

> I see OffsetToStringData and UnsafeAddrOfPinnedArrayElement being
> similar in that they do nothing complex and just care about internal
> structure details of the runtime.

Sure, but your OffsetToStringData change makes things worse and
the UnsafeAddrOfPinnedArrayElement change is better done in other ways.
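
To illustrate the first point: OffsetToStringData is, conceptually, a
constant fixed by the runtime's string layout, so a single constant load
is hard to improve on. A minimal sketch, assuming a hypothetical
SketchString layout (not Mono's real string definition):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical string layout, for illustration only; not Mono's real one. */
typedef struct {
    void    *vtable;    /* stand-in for the object header */
    int32_t  length;    /* number of UTF-16 code units */
    uint16_t chars[1];  /* first character; the rest follow in memory */
} SketchString;

/* The offset is known at compile time, so it can be emitted as a constant. */
enum { SKETCH_OFFSET_TO_STRING_DATA = offsetof (SketchString, chars) };
```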

> icall builders are my proposed solution but I'm open to any other
> solution as well.

See above.


lupus at debian.org                                     debian/rules
lupus at ximian.com                             Monkeys do it better
