[Mono-devel-list] Managed/Unmanaged Code Interop Documentation

Wed Sep 3 05:21:45 EDT 2003

I would also always use plain DllImport in the sample code instead of
the full type name; it's enough to mention "using
System.Runtime.InteropServices;" somewhere.

>    <monodoc:example id="simple-dllimport">
>       [System.Runtime.InteropServices.DllImport ("libc.so")]
>       private static extern int getpid ();
>    </monodoc:example>
> 
>    <p>The above C# function declaration would invoke the POSIX
>    getpid(2) system call on platforms that have the libc.so library (other
>    platforms would generate a 
>    <a href="T:System.MissingMethodException">MissingMethodException</a>).  
>    Simple.  Straightforward.  What could be easier?</p>

You may also want to note that library names are platform-specific,
and that mono provides the dllmap mechanism in its config file to
solve the issue.

>    <p>How does code interop work?  Given a managed call site (the function
>    call), and an unmanaged callee site (the function that's being called), each
>    parameter in the call site is "marshaled" (converted) into an unmanaged
>    equivalent.  The marshaled data is in turn placed on the runtime stack
>    (along with other data), and the unmanaged function is invoked.</p>
> 
>    <p>The complexity is due to the marshaling.  For simple types, such as 
>    integers and floating-point numbers, marshaling is a bitwise-copy
>    ("blitting"), just as would be the case for unmanaged code.  String types

In some cases no marshaling and no copy of the data happens at all:
for example, when the data is blittable, most of the time it's enough
to pass a pointer to the managed representation.
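
To make the no-copy case concrete, here is a minimal C sketch of the
unmanaged side. The function name scale_in_place is hypothetical and not
taken from any real library; the only point is that a callee that gets a
pointer to blittable data reads and writes the caller's memory directly,
which is what allows the runtime to skip the copy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical unmanaged callee: multiplies each element by `factor`.
 * It works directly on the caller's buffer, so if the runtime hands it
 * a pointer to pinned managed data, no copy is needed in either
 * direction. */
void scale_in_place(double *values, size_t count, double factor)
{
    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        values[i] *= factor;
}
```

The caller sees the modified values as soon as the function returns,
since there is only one buffer involved.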

>    <p>What's the memory management policy for using "string" as a return
>    value?  Does the runtime expect to free it?</p>

Yes, though I don't remember if we insert the free() yet. Note that we
will use free() by default on unix-like platforms and the MS-specified
free routine on windows.
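
As a rough illustration of what that implies for the unmanaged side on
unix-like platforms, here is a small C sketch. The function
make_greeting is a made-up example, not part of any real library; it
only shows a string return value that is safe to release with free().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical unmanaged function that returns a newly allocated string.
 * The buffer comes from malloc(), so a caller (or a runtime) that
 * releases it with free() is using the matching deallocator. */
char *make_greeting(const char *name)
{
    assert(name != NULL);
    size_t len = strlen("hello, ") + strlen(name) + 1;
    char *result = malloc(len);
    if (result == NULL)
        return NULL;            /* allocation failed */
    snprintf(result, len, "hello, %s", name);
    return result;
}
```

A function that returned a pointer to static storage or to a string
literal instead would not be safe to free() this way.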

>    <p>Conceptually, classes and structures are marshalled to native code
>    by:</p>
> 
>    <ol>
>       <li>The runtime allocates a chunk of unmanaged memory.</li>
>       <li>The managed class data is copied into the unmanaged memory.</li>
>       <li>The unmanaged function is invoked, passing it the unmanaged memory
>          information instead of the managed memory information.  This must be
>          done so that if a GC occurs, the unmanaged function doesn't need to
>          worry about it.  (And yes, you need to worry about GCs, as the
>          unmanaged function could call back into the runtime, generating a
>          GC.)</li>
>       <li>The unmanaged memory is copied into managed memory.</li>
>    </ol>
> 
>    <p>The principal difference between class and structure marshaling is which
>    of these conceptual steps actually occurs. :-)</p>
> 
>    <h4>Class Marshaling</h4>
> 
>    <p>Remember that classes are heap-allocated and garbage-collected in the
>    CLI.  As such, you cannot pass classes by value to unmanaged functions,
>    only by reference:</p>
> 
>    <monodoc:example id="class-marshal-example">
>       struct UnmanagedStruct {
>          int a, b, c;
>       };
> 
>       void WRONG (struct UnmanagedStruct pass_by_value)
>       {
>       }
> 
>       void RIGHT (struct UnmanagedStruct *pass_by_reference)
>       {
>       }
>    </monodoc:example>
> 
>    <p>This means that you cannot use classes to invoke unmanaged functions
>    that expect a stack-allocated variable (such as the WRONG function,

Use the term pass-by-value here; stack allocation is orthogonal.
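
A short C sketch of why the two notions are independent: the struct
below lives on the heap and is still passed by value. The struct is the
one from your example; the function names are mine and purely
illustrative.

```c
#include <assert.h>
#include <stdlib.h>

struct UnmanagedStruct {
    int a, b, c;
};

/* Receives a copy: changes are not visible to the caller. */
int sum_by_value(struct UnmanagedStruct s)
{
    s.a = 0;                    /* modifies only the local copy */
    return s.a + s.b + s.c;
}

/* Receives a pointer: changes are visible to the caller. */
void clear_by_reference(struct UnmanagedStruct *s)
{
    assert(s != NULL);
    s->a = s->b = s->c = 0;
}

/* Passes a heap-allocated struct by value.
 * Returns 1 if the heap object was left untouched by the call. */
int heap_struct_unchanged_after_by_value_call(void)
{
    struct UnmanagedStruct *heap = malloc(sizeof *heap);
    if (heap == NULL)
        return 0;
    heap->a = 1;
    heap->b = 2;
    heap->c = 3;
    int sum = sum_by_value(*heap);  /* by value, yet not stack allocated */
    int ok = (sum == 5 && heap->a == 1);
    free(heap);
    return ok;
}
```

What matters to the callee is whether it gets a copy or a pointer, not
where the caller keeps the original.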

>    <h4>Structure Marshaling</h4>
> 
>    <p>There are two primary differences between classes and structures.
>    First, structures do not need to be allocated on the heap; they can be
>    stack allocated.  Secondly, they use Sequential LayoutKind by default, so
>    structure declarations do not need any additional attributes to use them
>    with unmanaged code (assuming that the default sequential layout rules are
>    correct for the unmanaged structure).</p>
> 
>    <p>These differences permit structures to be passed by-value to unmanaged
>    functions, unlike classes.  Additionally, since structures are typically
>    located on the stack (unless they're boxed or part of a class instance), if
>    you pass a structure to an unmanaged function by-reference, the structure
>    will be passed directly to the unmanaged function, without an intermediate
>    unmanaged memory copy.  This means that you may not need to specify the Out

The unmanaged copy may not happen for classes either: the object is
simply pinned in memory and a pointer to the start of the data is passed
(if the type is blittable and no marshaling is needed).
The main difference is that classes can't be passed by value, while
structs are passed by value unless ref or out is used. As a return type,
if you use a class, the unmanaged function is considered to return a
pointer to the unmanaged representation of the object. If you use a
struct, the data is supposed to be returned by value. It should be noted
that you can't return a struct by reference, while you can pass a struct
by reference. If you need to return a struct by reference, you can make
the P/Invoke function return an IntPtr and use Marshal.PtrToStructure ().
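
For reference, the two return conventions look like this on the C side.
This is only a sketch; struct Point and both functions are hypothetical
names I made up for illustration.

```c
#include <assert.h>

struct Point {
    int x, y;
};

/* Returned by value: a managed declaration whose return type is a
 * struct corresponds to this form. */
struct Point make_point(int x, int y)
{
    struct Point p = { x, y };
    return p;
}

/* Returned by reference: the managed declaration would use IntPtr as
 * the return type and copy the data out with Marshal.PtrToStructure ().
 * The object has static storage, so the pointer stays valid after the
 * call returns. */
const struct Point *get_origin(void)
{
    static const struct Point origin = { 0, 0 };
    return &origin;
}
```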

>    <h3>Marshaling Class and Structure Members</h3>
> 
>    <p>Aside from the major differences between classes and structures outlined
>    above, the members of classes and structures are marshaled identically.</p>
> 
>    <p>The general rule of advice is this: never pass classes or structures
>    containing members of reference type (classes) to unmanaged code.
>    This is because unmanaged code can't do anything safely with the unmanaged 
>    reference (pointer), and the CLI runtime doesn't do a "deep marshal"
>    (marshal members of marshaled classes, and their members, ad
>    infinitum).</p>
> 
>    <p>The immediate net effect of this is that you can't have string and array
>    members of marshaled classes.</p>
> 
>    <p>It's not quite as bad as this makes out.  You can't pass strings and
>    arrays BY DEFAULT.  If you help the runtime marshaler by adorning the

I'm not sure this is right: you can use strings and arrays in types that
will be marshaled, and by default they will be converted to pointers to
the data.

	string -> char* (or gunichar2*, depending on the charset property)
	int[]  -> gint32*
	etc.
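
A sketch of what the unmanaged code would then see, assuming the default
mapping above. struct Record and its functions are hypothetical, and I
use int32_t from <stdint.h> in place of glib's gint32. The count member
is my own assumption: a bare pointer carries no length, so the length
has to travel separately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical unmanaged view of a type with a string and an int[]
 * member. */
struct Record {
    const char *name;       /* string -> char*                       */
    const int32_t *values;  /* int[]  -> pointer to 32-bit integers  */
    int32_t count;          /* assumed: element count, sent alongside */
};

/* Length of the name, not counting the terminating NUL. */
size_t record_name_length(const struct Record *r)
{
    assert(r != NULL && r->name != NULL);
    return strlen(r->name);
}

/* Sum of the first `count` elements of `values`. */
int64_t record_sum(const struct Record *r)
{
    assert(r != NULL && (r->values != NULL || r->count == 0));
    int64_t total = 0;
    for (int32_t i = 0; i < r->count; i++)
        total += r->values[i];
    return total;
}
```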

>    <monodoc:example id="">
>       typedef struct _neo_err
>       {
>         int error;
>         int err_stack;
>         int flags;
>         char desc[256];
>         const char *file;
>         const char *func;
>         int lineno;
>         /* internal use only */
>         struct _neo_err *next;
>       } NEOERR;
>    </monodoc:example>
> 
>    <p>My philosophy of using unsafe struct pointers, and just accessing the
>    struct out in unmanaged memory is great, and it's exactly what I want
>    to do. However, handling "char desc[256]" is not straightforward.</p>
> 
>    <p>In C# arrays are reference types. Using one makes the struct a managed
>    type, and I can't put the array size in. The following is conceptually
>    what I want to do, however, it's obviously invalid:</p>
> 
>    <monodoc:example id="">
>       [StructLayout(LayoutKind.Sequential)]
>       unsafe struct NEOERR {
>         public int error;
>         public int err_stack;
>         public int flags;
>         public byte[256] desc;  // this is invalid, can't contain size

	[MarshalAs (UnmanagedType.ByValArray, SizeConst=256)]
	public byte[] desc;
should work.

	[MarshalAs (UnmanagedType.ByValTStr, SizeConst=256)]
	public string desc;
may work as well.
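
The reason a by-value attribute is needed is that the 256 bytes are
stored inside the struct itself, not behind a pointer. A small C sketch
that checks this for the NEOERR definition quoted above (the helper
neoerr_set_desc is hypothetical, and static_assert needs a C11
compiler):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct _neo_err
{
  int error;
  int err_stack;
  int flags;
  char desc[256];
  const char *file;
  const char *func;
  int lineno;
  /* internal use only */
  struct _neo_err *next;
} NEOERR;

/* The array is inline: it occupies 256 bytes of the struct and starts
 * right after the three ints (char has no alignment padding). */
static_assert(sizeof(((NEOERR *)0)->desc) == 256,
              "desc is an inline 256-byte array");
static_assert(offsetof(NEOERR, desc) == 3 * sizeof(int),
              "desc follows the three ints directly");

/* Hypothetical helper: copies `text` into desc, truncating if needed
 * and always leaving the buffer NUL-terminated. */
void neoerr_set_desc(NEOERR *err, const char *text)
{
    assert(err != NULL && text != NULL);
    strncpy(err->desc, text, sizeof err->desc - 1);
    err->desc[sizeof err->desc - 1] = '\0';
}
```

The SizeConst=256 in the managed declaration has to match the array
length here, otherwise every member after desc ends up at the wrong
offset.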

>    <p>UGH! First, this is obviously annoying. Second, the only way I can
>    figure to get access to "char desc[256]" is to use "char* dest =
>    &amp;nerr-&gt;dest_first_char;" and then just use dest as a pointer to the
>    string. I've dug through the documentation, and I can't find any
>    better solution.</p>
> 
>    <p>Obviously it would be ideal if there were a way to represent a
>    value-type array. I wonder how Managed C++ handles "char foo[256];" in
>    a struct.</p>

Using ByValArray is probably the best option here.
Thanks for writing this document: it looks like a good start.

lupus

-- 
-----------------------------------------------------------------
lupus at debian.org                                     debian/rules
lupus at ximian.com                             Monkeys do it better