[Mono-list] Re: PInvoke:TNG

Robert Deviasse rdeviasse@hotmail.com
Thu, 26 Jul 2001 23:44:41 -0400


> > How exactly is the VM engine supposed to handle the [PosixType] custom
> > attribute?
> > Is it supposed to have a hard-coded list of types and their layouts,
>
>That's the assumption, yes, and it _is_ a drawback of this
>approach.  Any other approach will need something equally
>yucky.
>

Personally, I think it's possible to have a hybrid approach that won't
have that yuckiness. If the VM is too complicated, Mono will take as long
to complete as the first Ada compiler took.

> > and to report an error if the type name doesn't match one of the types
> > in its hard-coded list?
>
>There's two ways to handle this case: raise an error, or
>make a guess.

Personally, I'd rather not guess. Computers are notoriously bad at
making guesses and when a guess is made silently, you get a class of
errors that are notoriously difficult to debug because you have no
idea where the problem is.

I'd propose that an error be generated whenever there is an ambiguity,
and that the user be given a way to override the default explicitly.
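
A rough sketch of the "error unless overridden" policy, in C#. Everything
here is made up for illustration: the class name NativeTypeResolver, its
methods, and the byte sizes are assumptions, not part of any existing VM.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: none of these names come from Mono or from the proposal.
public sealed class NativeTypeResolver
{
    // Types the runtime knows about: type name -> size in bytes (illustrative values).
    private readonly Dictionary<string, int> known = new Dictionary<string, int>();

    // Explicit user-supplied overrides; these always win over the built-in table.
    private readonly Dictionary<string, int> overrides = new Dictionary<string, int>();

    public void AddKnown(string name, int size) { known[name] = size; }

    public void AddOverride(string name, int size) { overrides[name] = size; }

    public int SizeOf(string name)
    {
        int size;
        if (overrides.TryGetValue(name, out size)) return size;
        if (known.TryGetValue(name, out size)) return size;

        // No guessing: an unknown type is reported instead of silently mapped.
        throw new NotSupportedException(
            "Unknown native type '" + name + "'; supply an explicit override.");
    }
}
```

The point is only that the failure is loud and the fix is in the user's
hands; how the real table would be populated is a separate question.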

>  In the definitions in my previous message,
>the "value" field was declared as "int" for "int_t" and as
>"long" for "off_t".  This information could be used to guess
>the native counterpart in the absence of other information.
>For example, consider the following definition of "size_t":
>
>[PosixType]
>public struct size_t
>{
>     private uint value;
>     ... constructor and conversion operators ...
>};
>
>If the VM didn't know about "size_t", it could guess
>from the type of "value" that a "native unsigned int"
>was intended.  It's another one of those "least yucky"
>things.


Let's think of a simpler approach. These are the things I'd look for:
1. simple implementation so something can be implemented *now* instead of
   five years from now.
2. it should be possible to write specifically for one OS yet provide a
   mechanism to have it work for another OS if portability becomes a
   concern.
3. it should be syntactically clean
4. it should be efficient and allow optimization so the implementation
   doesn't have to put in a lot of overhead in conversion. This overhead
   can be very large for reference or out arrays of PosixTypes.
5. it should not give any silent errors, particularly if they can lead to
   data corruption or core dumps.
6. it should allow you to work around any typing errors

Let's look at your example to figure out how to do this.

    [PosixStructType]
    public struct stat
    {
        public dev_t st_dev;
        public ino_t st_ino;
        public mode_t st_mode;
        public nlink_t st_nlink;
        public uid_t st_uid;
        public gid_t st_gid;
        public dev_t st_rdev;
        public off_t st_size;
        public blksize_t st_blksize;
        public time_t st_atime;
        public time_t st_ctime;
        public time_t st_mtime;
    };

These are the issues you face when porting code between platforms:
a. The order of the members may be different.
b. There may be other members within the struct, changing the offsets of
   some members.
c. Some types may not be defined in some implementations.
d. Some types may be different (e.g. st_uid might be a string pointer on
   one OS and a struct on another).
e. Silent conversions can generate a class of problems that's very
   difficult to debug.
So any scheme we come up with must have the following properties:
a. The order of the fields of a PosixStructType is implementation defined.
   It's a default, not a required representation. It must be possible to
   change this default.
b. The offsets of the fields of a PosixStructType are implementation
   defined. They're a default, not a required representation. It must be
   possible to change this default.
c. It must be possible to define member types that are not otherwise
   defined.
d. It must be possible to convert between member types.
e. All ambiguities must be flagged explicitly.
So here's my proposal for dealing with this. Suppose we define an attribute
called "SystemDependent" that takes one parameter, the name of a namespace.
Your example would look like the following:

    [SystemDependent(NativePosix)]
    public struct stat
    {
        public dev_t st_dev;
        public ino_t st_ino;
        public mode_t st_mode;
        public nlink_t st_nlink;
        public uid_t st_uid;
        public gid_t st_gid;
        public dev_t st_rdev;
        public off_t st_size;
        public blksize_t st_blksize;
        public time_t st_atime;
        public time_t st_ctime;
        public time_t st_mtime;
    };

If none of the types in the structure appear within the NativePosix
namespace, the compiler will ensure that all member types are defined and
lay the structure out in the way that is compatible with the "default
compiler on the OS". This default allows us to instantly have a conforming
implementation without any work. We can implement the "namespace"
customization feature at our leisure and begin using PInvoke functions as
soon as the default behavior is implemented.

If any of the types are defined in the NativePosix namespace, these types
override the corresponding types in this declaration. Any necessary
transformation of the member ordering or types is done. If no
transformation is possible, an error is flagged. Except for type
conversions, this transformation process should be done at compile-time,
not run-time. There's no reason to introduce inefficiency into any platform
unless it's absolutely necessary.

The NativePosix namespace should be provided by each compiler implementation
(something like SWIG would go a long way towards generating a big collection
for any platform), but any programmer can add something to this namespace to
extend it.

So, if the above structure had no NativePosix implementation on Linux, it
would be assumed that the above implementation was compiled with the
default compiler of the OS. If you wanted to port this code to SomeBSD, the
order may be different, so the NativePosix namespace would provide the
following implementation:

    namespace NativePosix
    {
        public struct stat
        {
            public ino_t st_ino; // Order is different
            public dev_t st_dev; // Order is different
            public mode_t st_mode;
            public nlink_t st_nlink;
            public strange_t st_strange; // New field
            public uid_t st_uid;
            private long st_gid_impl; // Type is different
            public dev_t st_rdev;
            public off_t st_size;
            public blksize_t st_blksize;
            public time_t st_atime;
            public time_t st_ctime;
            public time_t st_mtime;

            //- Property to change the type of st_gid_impl to the
            //- expected type. This must be a property (not a conversion)
            //- if we want to be able to use it as an out parameter
            public gid_t st_gid
            {
                get { return (gid_t)st_gid_impl; }
                set { st_gid_impl = (long)value; }
            }
        }
    }

The actual implementation details of the stat struct are invisible outside
the NativePosix namespace. As far as the outside code is concerned, the
type flagged by the [SystemDependent(NativePosix)] attribute is the actual
structure (though not necessarily with the same order or offsets) of the
stat struct. So, outside the namespace:
* the member st_strange is invisible.
* the offset of st_uid is different than it would be if stat weren't
  defined in NativePosix.
* the order of the members st_ino and st_dev is different than it would be
  if stat weren't defined in NativePosix.
* the type of the st_gid member is different than it would be
  if stat weren't defined in NativePosix. Whenever a value is assigned to
  or read from this member, the necessary get and set accessors are called
  to ensure an accurate conversion. (If one of the conversions would
  cause a loss of data (e.g. int to long is okay, but not the reverse), only
  one of the accessors would be defined, ensuring type safety.)
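
To make the last point concrete, here is a small sketch of the one-way case,
assuming the native field is a long and the portable declaration expects an
int. The struct name and the RawGid helper are hypothetical and exist only
so the example can be checked.

```csharp
using System;

// Hypothetical fragment: only the st_gid handling from the struct is shown.
public struct NativeStatFragment
{
    private long st_gid_impl;   // native representation (assumed to be wider)

    // Write-only: int -> long never loses data, so only the setter exists.
    // A getter would need long -> int, which can truncate, so it is left out.
    public int st_gid
    {
        set { st_gid_impl = value; }
    }

    // Helper for inspection only; not part of the proposal.
    public long RawGid
    {
        get { return st_gid_impl; }
    }
}
```

Any code that tries to read st_gid then fails to compile, so the lossy
direction is caught up front rather than at run-time.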

If the stat structure or one of its member types was not defined in the
NativePosix implementation, the programmer would have to write his/her own
implementation. Of course, there's a wrinkle in this if portability is
desired, because other implementations may provide this type. Since a
namespace can't contain two types with the same name, conditional code
needs to be generated. The programmer would need to write:

    namespace NativePosix
    {
        [conditional ( "OS_IS_SomeBSD" )]
        public struct stat
        {
            // ....
        }
    }

So besides implementing the SystemDependent attribute and NativePosix
namespace, a conforming compiler would have to define a macro specifying
which platform it targets. Alternatively, you may wish to define a default
type to be used if no definition were provided in the NativePosix layer, so
a conforming implementation would define the [default_implementation]
attribute to mark the definition that is used if no other implementation
provides this type.

Yes, conditional compilation is yucky, but the good news is that it only
needs to be done if:
* You care about portability now (you can always define the alternate
  implementations later)
* The type isn't provided by the conforming compiler
* You want to override the structure or field types of the type being
  defined.

>
>I am becoming a little concerned though that the PInvoke
>mechanism will involve a huge amount of code to
>implement in any given VM.  This increases the chance
>of error, and hence interoperability will be affected.
>

What do you think?

>Cheers,
>
>Rhys.
>
>

Take care,
	Robert

