[Mono-list] FW: [Mono-hackers-list] More implementation of the Marshal class

Daniel Morgan danmorg@sc.rr.com
Thu, 30 May 2002 11:23:26 -0400


I thought others may be interested in how marshalling support could be
implemented in Mono.

Paolo gave me permission to email his message to the Mono-List.

-----Original Message-----
From: mono-hackers-list-admin@ximian.com
[mailto:mono-hackers-list-admin@ximian.com] On Behalf Of Paolo Molaro
Sent: Wednesday, May 29, 2002 7:11 AM
To: Mono Hackers
Subject: Re: [Mono-hackers-list] More implementation of the Marshal
class

On 05/28/02 Daniel Morgan wrote:
> I assume PtrToStructure() would be implemented as an internal call and
> placed in mono/mono/metadata/icall.c

Yes, but the logic needed to implement it needs to go into
metadata/marshal.c, because it's used when marshaling the structure
to P/Invoke methods.

> However, to gain a better understanding of what I may have to do,
> I would need to look at:
> /mono/mono/metadata/class.c and /mono/mono/metadata/class.h
> 
> Particularly, these functions are helpful:
> class_compute_field_layout ()
> mono_class_init ()
> mono_get_class()

Actually, you don't need to look at that, but only at the result of those
calls, i.e. the class->fields array.
The MonoClassField structure has some of the info you need: field type
and field offset inside the _object_. We still need to load the
marshaling info from the metadata tables, though.
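
To illustrate how the two pieces of information could be combined, here
is a minimal C sketch. Everything in it is an assumption made for
illustration: FieldDesc, NativeKind and compute_native_layout() are
hypothetical names, not Mono's real MonoClassField or any existing
runtime function, and natural alignment is assumed (no Pack handling).
It pairs each field's managed offset with a native type and derives the
offset the field would have in the C structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptors; NOT Mono's real structures. */
typedef enum { NATIVE_LPSTR, NATIVE_BOOL4 } NativeKind;

typedef struct {
    const char *name;
    size_t      managed_offset; /* as found via the class->fields array */
    NativeKind  native_kind;    /* would come from the marshaling metadata */
    size_t      native_offset;  /* filled in by compute_native_layout() */
} FieldDesc;

/* Size in bytes of a field on the native (C) side. */
static size_t native_size(NativeKind kind)
{
    switch (kind) {
    case NATIVE_LPSTR: return sizeof(char *);
    case NATIVE_BOOL4: return sizeof(int32_t);
    }
    return 0;
}

/* Assign native offsets assuming natural alignment (alignment == size)
 * and return the total size rounded up to the largest alignment. */
static size_t compute_native_layout(FieldDesc *fields, size_t count)
{
    size_t offset = 0, max_align = 1;

    for (size_t i = 0; i < count; i++) {
        size_t size = native_size(fields[i].native_kind);
        assert(size != 0);
        offset = (offset + size - 1) / size * size; /* round up */
        fields[i].native_offset = offset;
        offset += size;
        if (size > max_align)
            max_align = size;
    }
    return (offset + max_align - 1) / max_align * max_align;
}
```

For a struct with a string field followed by a bool field, the computed
native offsets should match what the C compiler picks for
{ char *name; int32_t val; } on common ABIs.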

> At first, I thought align was the same thing as the Pack described in
> the MSDN docs for StructLayoutAttribute, but Lupus pointed out that
> pack is not the same thing as align.  I think Lupus recently committed
> something for a pack directive to cvs.  I don't know if it has
> anything to do with the Pack field in StructLayoutAttribute.

Yes, it's the same thing, though it has no relevance to the
implementation of PtrToStructure().

> Any ideas on how to implement PtrToStructure() ?

I have some rough design ideas about how it should all work.
You can get away with a small hack in the meantime to get you started,
but I'd like the final code to look something like the following
proposal (if there are no objections).

Problem: implement the marshaling facilities (both for PtrToStructure()
and the reverse and for P/Invoke invocations).
There are a lot of details on how it should be implemented and some of
the details are quite messy: I want this code to be as generic as
possible so that it can be used by all the runtimes on all the
architectures if possible.

The basic idea is: for each type that we need to marshal we write the
code needed in a special custom bytecode. This bytecode can be either
'interpreted' or it can be easily translated into native code.
PtrToStructure() will likely interpret it, but the jit and the interp
will translate it to native code to implement P/Invoke: this way, all
the messy logic is in a single place and the arch or runtime specific
code only needs to do a simple translation.

Consider:
<C#>
struct S {
	[MarshalAs(UnmanagedType.LPStr)]
	string name;
	[MarshalAs(UnmanagedType.Bool)]
	bool val;
}
<C>
struct S {
	char* name;
	gboolean val;
}

Here the bool value takes one byte in the object, but it takes 4 bytes
in the C structure: this is only one of the several issues we need to be
aware of.
So, here is what the bytecode may look like to go from the C# object
("from") to the C struct ("dest"):

	ldobj dest
	ld.ptr c_name_offset
	free_if_needed
	ldobj dest
	add c_name_offset
	ldobj from
	ld.str name_offset
	conv.lpstr
	stind.ptr
	ldobj dest
	add c_val_offset
	ldobj from
	ld.bool val_offset
	conv.native_bool
	stind.native_bool

The code will have a small header to know in advance the stack size
needed. 

	ldobj 	loads the address of either the source or destination.
	add 	adds an integer offset to a pointer.
	ld.ptr 	loads a pointer from the pointer on the stack
	free_if_needed calls g_free/free on the pointer on the stack if
		required
	ld.str 	loads a MonoString* from the pointer on the stack
	conv.lpstr converts a MonoString* to a char*
	stind.ptr indirect stores a pointer
	ld.bool loads a C# bool value
	stind.native_bool indirect stores a native bool

and so on...
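
A minimal sketch of how the opcodes listed above might be interpreted,
written in C. This is not code from the Mono tree: the ManagedString
stand-in (real MonoString data is UTF-16), the Insn encoding, the OP_*
names, MAX_STACK and run_marshal() are all hypothetical, and
free_if_needed is simplified to a plain free() on the assumption that
the destination owns its old string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a managed string (real data is UTF-16). */
typedef struct { size_t length; const char *chars; } ManagedString;

/* Managed layout: 1-byte bool. Native layout: 4-byte bool (gboolean). */
typedef struct { ManagedString *name; uint8_t val; } ManagedS;
typedef struct { char *name; int32_t val; } NativeS;

typedef enum {
    OP_LDOBJ_DEST, OP_LDOBJ_FROM, OP_ADD, OP_LD_PTR, OP_FREE_IF_NEEDED,
    OP_LD_STR, OP_CONV_LPSTR, OP_STIND_PTR, OP_LD_BOOL,
    OP_CONV_NATIVE_BOOL, OP_STIND_NATIVE_BOOL, OP_END
} Opcode;

typedef struct { Opcode op; size_t arg; } Insn; /* arg: byte offset, if any */
typedef union { void *p; int32_t i; } Slot;     /* one stack slot */

enum { MAX_STACK = 8 }; /* the proposal would read this from a header */

static void run_marshal(const Insn *code, void *from, void *dest)
{
    Slot stack[MAX_STACK];
    int sp = 0;

    for (const Insn *ip = code; ip->op != OP_END; ip++) {
        assert(sp >= 0 && sp < MAX_STACK);
        switch (ip->op) {
        case OP_LDOBJ_DEST: stack[sp++].p = dest; break;
        case OP_LDOBJ_FROM: stack[sp++].p = from; break;
        case OP_ADD:
            stack[sp - 1].p = (char *)stack[sp - 1].p + ip->arg;
            break;
        case OP_LD_PTR: /* both load a pointer stored at base + arg */
        case OP_LD_STR: {
            void *loaded;
            memcpy(&loaded, (char *)stack[sp - 1].p + ip->arg, sizeof loaded);
            stack[sp - 1].p = loaded;
            break;
        }
        case OP_FREE_IF_NEEDED:
            free(stack[--sp].p); /* free(NULL) is a no-op */
            break;
        case OP_CONV_LPSTR: {
            const ManagedString *s = stack[sp - 1].p;
            char *copy = malloc(s->length + 1);
            if (copy == NULL)
                abort();
            memcpy(copy, s->chars, s->length);
            copy[s->length] = '\0';
            stack[sp - 1].p = copy;
            break;
        }
        case OP_STIND_PTR: {
            void *value = stack[--sp].p;
            void *addr = stack[--sp].p;
            memcpy(addr, &value, sizeof value);
            break;
        }
        case OP_LD_BOOL: /* read the 1-byte managed bool */
            stack[sp - 1].i = *((uint8_t *)stack[sp - 1].p + ip->arg);
            break;
        case OP_CONV_NATIVE_BOOL:
            stack[sp - 1].i = stack[sp - 1].i ? 1 : 0;
            break;
        case OP_STIND_NATIVE_BOOL: { /* write the 4-byte native bool */
            int32_t value = stack[--sp].i;
            int32_t *addr = stack[--sp].p;
            *addr = value;
            break;
        }
        case OP_END: break;
        }
    }
}

/* The bytecode for struct S, with offsets taken from the two layouts. */
static const Insn marshal_S[] = {
    { OP_LDOBJ_DEST, 0 }, { OP_LD_PTR, offsetof(NativeS, name) },
    { OP_FREE_IF_NEEDED, 0 },
    { OP_LDOBJ_DEST, 0 }, { OP_ADD, offsetof(NativeS, name) },
    { OP_LDOBJ_FROM, 0 }, { OP_LD_STR, offsetof(ManagedS, name) },
    { OP_CONV_LPSTR, 0 }, { OP_STIND_PTR, 0 },
    { OP_LDOBJ_DEST, 0 }, { OP_ADD, offsetof(NativeS, val) },
    { OP_LDOBJ_FROM, 0 }, { OP_LD_BOOL, offsetof(ManagedS, val) },
    { OP_CONV_NATIVE_BOOL, 0 }, { OP_STIND_NATIVE_BOOL, 0 },
    { OP_END, 0 }
};
```

Given a ManagedS whose name is "hello" and whose val is nonzero,
run_marshal(marshal_S, &from, &dest) leaves dest.name pointing at a
freshly allocated copy of the string and dest.val set to 1 in a 4-byte
slot. The caller is then responsible for freeing dest.name.
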
There are about 50 or so opcodes needed, but they should be easy to
implement.
The same idea could be used for the P/Invoke calls with a couple more
special opcodes. For example when calling:

	double stof (string val);

we could have:

	localloc.ptr 0
	localloc.double 1
	local_addr 0
	ldarg.str 0
	conv.lpstr
	stind.ptr
	local_addr 0
	ld.ptr
	call func
	local_addr 1
	stind.double 1 // store retval from FP stack
	local_addr 0
	free
	local_addr 1
	ld.double

The bytecode->binary code translator could be the same for the interp
and the jit on each arch, because the only difference would be in how
the ldarg.* opcodes retrieve the incoming arguments.
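
For comparison, the wrapper bytecode for stof corresponds roughly to
the following hand-written C. This is only a sketch under assumptions:
ManagedString (real MonoString data is UTF-16), StofFunc and
call_stof() are hypothetical names, and the argument arrives as a plain
function parameter rather than through a runtime-specific ldarg.* path.
The comments map each step back to the opcodes in the listing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a managed string (real data is UTF-16). */
typedef struct { size_t length; const char *chars; } ManagedString;

/* Signature of the native function being called. */
typedef double (*StofFunc)(const char *);

/* Hand-written equivalent of the wrapper for: double stof (string val) */
static double call_stof(StofFunc func, const ManagedString *val)
{
    char *local0;  /* localloc.ptr 0 */
    double local1; /* localloc.double 1 */

    assert(func != NULL && val != NULL);

    /* ldarg.str 0; conv.lpstr; stind.ptr */
    local0 = malloc(val->length + 1);
    if (local0 == NULL)
        abort();
    memcpy(local0, val->chars, val->length);
    local0[val->length] = '\0';

    local1 = func(local0); /* ld.ptr; call func; stind.double */
    free(local0);          /* local_addr 0; free */
    return local1;         /* local_addr 1; ld.double */
}
```

Any function with a matching signature can stand in for the native
target when trying this out, for example the C library's atof().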

As you see, these are just rough ideas that still need much thinking.
Suggestions and comments are welcome.

lupus