[Mono-devel-list] Cross application domain call optimization

Fri Oct 15 18:40:27 EDT 2004

On Fri, 2004-10-15 at 22:53 +0200, Lluis Sanchez wrote:
> Here is the first version of a patch that improves the performance of
> cross app domain calls. I still have a small regression when using
> ContextBoundObjects but I hope I'll fix it soon. Other than that, the
> patch is fully working.

WOW! WOW! WOW! WOW!

You are my hero!

That having been said, let me pick apart your patch :-).

> The performance improvement varies depending on the signature of the
> method, since different types have different marshalling needs, and some
> marshalling operations can be faster than others.
> 
> Here are some numbers. I ran a series of 100.000 method invocations
> using different method signatures, which are the following: <...>
> We are getting an important speed up, 10x in some cases. The worst case
> is the one with the Data instance, since it needs to use serialization
> and most of the time is spent in BinaryFormatter. The other cases
> (primitive types and arrays of primitive types) don't need serialization
> and the improvement is really noticeable.
Can you please check this benchmark into mono/mono/benchmarks? You
should do it before we check in the patch.

> However, there is a drawback: we need to generate two additional methods
> for each remotable method. As an example, a monodevelop run creates
> around 80 wrapper methods + 80 helper methods, averaging 435 bytes of IL
> per couple, which means 34kb of IL code (plus the memory of internal
> data structures). And all this is dead code, since monodevelop does not
> use app domains, but the wrappers are generated for every call to a non-
> virtual MarshalByRefObject method. The reason is that those calls are
> made through the remoting-invoke-with-check wrapper, which has a
> reference to the other wrappers.

Nope, this doesn't have to be done. Instead, you emit this code:

call trampoline

The trampoline goes into the runtime, generates the IL for the wrapper
method, and jits it. Then it takes the jitted code and patches the call
address so that it refers to the new method. The wrappers only get
generated for methods that are actually called this way, and later calls
avoid the lookup.
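
To make the idea concrete, here is a rough sketch of lazy patching. It is
not the runtime's trampoline code: every name in it is made up for
illustration, and a plain function pointer stands in for the address that
the real thing would patch inside the emitted call instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not Mono's trampoline code. */
typedef int (*call_target) (int);

static int trampoline (int arg);

/* Stands in for the address embedded in the emitted call instruction. */
static call_target call_site = trampoline;

/* Counts how often the expensive "generate IL + JIT" step runs. */
static int compile_count = 0;

/* Stands in for the jitted wrapper method. */
static int
jitted_wrapper (int arg)
{
	return arg * 2;
}

/* Stands in for the runtime generating the wrapper IL and jitting it. */
static call_target
compile_wrapper (void)
{
	compile_count++;
	return jitted_wrapper;
}

/* First call lands here: compile, patch the call site, forward the call. */
static int
trampoline (int arg)
{
	call_target compiled = compile_wrapper ();

	assert (compiled != NULL);
	call_site = compiled;
	return compiled (arg);
}

/* The caller always goes through the (patchable) call site. */
static int
invoke (int arg)
{
	return call_site (arg);
}
```

Only the first invoke () pays for compile_wrapper (); after that the call
site points straight at the compiled code, and a method that is never
called never gets a wrapper at all.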

One other option would be to always call methods on MBROs virtually.
It may end up being faster than having to do the call, etc.

Ok, now specific patch comments:

> +	object get_xappdomain_target (RealProxy rp)
> +	{

Why not actually write this method in C# and call it from the marshal
stuff?

> +	Exception mono_serialize_exception (Exception ex)
> +	{
> +		Exception loc_exc = ex;
> +		byte[] loc_data;
> +		int retry = 4;
> +		
> +		do {
> +			try {
> +				mono_thread_force_interruption_checkpoint ();
> +				loc_data = RemotingServices.SerializeObject (loc_exc);
> +				return loc_data;

Same comment here. Also, why do we need the interruption checkpoint?

> -static MonoMethod *
> -look_for_method_by_name (MonoClass *klass, const gchar *name);
> ...
> +MonoMethod *
> +mono_class_get_method_from_name (MonoClass *klass, const char *name, int param_count)

We should do this in another patch. Splitting it out should help keep
the size of this patch down.

Also, you added this to the public header. I think this is a good idea,
but it needs a bit more approval.

> +	mono_mb_emit_byte (mb, CEE_LDC_I4_M1);
Use mono_mb_emit_icon (MonoMethodBuilder *mb, gint32 value); it is a bit
cleaner (it picks the short opcode form for you).
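
For reference, the kind of selection such a helper makes looks roughly like
the sketch below. It is not the body of mono_mb_emit_icon; the function
name, the enum names and the raw byte buffer are assumptions for
illustration. Only the opcode values are the standard CIL encodings
(ldc.i4.m1 = 0x15, ldc.i4.0 through ldc.i4.8 = 0x16 through 0x1E,
ldc.i4.s = 0x1F, ldc.i4 = 0x20).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard CIL opcode values; the enum names are local to this sketch. */
enum {
	OP_LDC_I4_0 = 0x16, /* ldc.i4.m1 is 0x15, ldc.i4.8 is 0x1E */
	OP_LDC_I4_S = 0x1F, /* followed by one signed byte */
	OP_LDC_I4   = 0x20  /* followed by four little-endian bytes */
};

/*
 * Hypothetical helper: writes the shortest "load int32 constant" encoding
 * into buf (which must hold at least 5 bytes) and returns the byte count.
 */
static size_t
emit_icon_sketch (uint8_t *buf, int32_t value)
{
	uint32_t bits = (uint32_t) value;

	assert (buf != NULL);

	if (value >= -1 && value <= 8) {
		/* Single-byte forms: -1 maps to 0x15, 0..8 to 0x16..0x1E. */
		buf [0] = (uint8_t) (OP_LDC_I4_0 + value);
		return 1;
	}
	if (value >= -128 && value <= 127) {
		/* Two-byte form with a signed 8-bit operand. */
		buf [0] = OP_LDC_I4_S;
		buf [1] = (uint8_t) bits;
		return 2;
	}
	/* General five-byte form with a 32-bit little-endian operand. */
	buf [0] = OP_LDC_I4;
	buf [1] = (uint8_t) (bits & 0xff);
	buf [2] = (uint8_t) ((bits >> 8) & 0xff);
	buf [3] = (uint8_t) ((bits >> 16) & 0xff);
	buf [4] = (uint8_t) ((bits >> 24) & 0xff);
	return 5;
}
```

With a helper like this, the caller just passes -1 and gets the one-byte
opcode without having to spell it out by hand.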

> +		val->vtable = mono_class_vtable (domain, val->vtable->klass);
Use the mono_object_class macro.

> +		return (MonoObject *) mono_string_new_utf16 (domain, (guint16 *) &(str->chars), str->length);
Use the macro for strings and chars.

> +			int i, len = mono_array_length (acopy);
> +			for (i = 0; i < len; i++) {
GCC is smart enough to do CSE. No need to do that for it.

That's all for now; I'll think of more later.

Great job.

-- 
Ben Maurer <bmaurer at ximian.com>