[Mono-list] need some help with PInvoke..
Jonathan Pryor
jonpryor@vt.edu
10 Jul 2003 14:13:33 -0400
Comments inline...
On Thu, 2003-07-10 at 13:13, David Jeske wrote:
> Thanks for the help Jonathan, it's just what I needed!
>
> On Thu, Jul 10, 2003 at 10:58:16AM -0400, Jonathan Pryor wrote:
> > First of all, IntPtrs shouldn't be exposed to client code, if at all
> > possible. Granted, this isn't always possible (S.W.F exposes them
> > everywhere so you can manually call Win32 functions, and the Gtk#
> > wrapper also exposes them), but ideally you could provide a complete
> > wrapper around a type, and not need to expose an IntPtr.
>
> We can all see that the reality is that they need to be exposed in
> many places.
>
> IMHO, DllImport should always be "unsafe", and HWND handles should be
> unsafe struct pointers. That way any code that wanted to load some new
> function and call it directly would have had to be marked unsafe to
> use both the struct pointer and DllImport. That seems to mirror the
> real world since that code will in fact be pretty unsafe.
I think the real problem is that "unsafe" is an overloaded term. It can
refer to the C# "unsafe" keyword, or it can mean "anything that isn't
safe", which, as you note, doesn't require the "unsafe" keyword.
> I'm not sure what benefit we get by letting "safe" code mess around
> with IntPtr, or call DllImported functions with "allegedly correct"
> marshaling options.
>
> It seems like currently the unsafe definition means "may violate the
> type system", which makes it pretty odd that IntPtrs can be touched by
> "safe" code. If I had my way (fat chance), I would change that
> definition to "safe code should never cause a segfault". Anywhere that
> DllImport is being used can easily cause a segfault, and anywhere
> IntPtrs are passed to the wrong place can also cause a segfault
> (although it will occur elsewhere), thus they are pretty "unsafe" in
> my book. :)
>
> However, we're not redesigning .NET here, so none of that matters too
> much. Back to the regularly scheduled programming...
Well, to speak on .NET's behalf, .NET has a highly flexible security
system. You can't invoke DllImported functions unless your app has the
appropriate security rights -- generally, that the app is running on the
local machine. If you're running it from a network share, or from a web
site (similar to Java Applets), then your app will get a
SecurityException.
You can get lots of security exceptions for various things, actually.
Opening files can generate a security exception, for example.
So, "unsafe" can mean (a) C# keyword; (b) violates .NET type system
(similar to (a)); (c) may be insecure (reading files from a web client);
(d) capable of causing a segfault. There are likely other meanings
people can dream up as well. Note that (d) doesn't imply (b), as far as
.NET is concerned. The runtime could itself have a bug that generates a
segfault, but this doesn't violate the type system.
IntPtr doesn't require a violation of the type system: you can't get
the address of a .NET object (unless you "pin" it, which would require
the appropriate security rights), so IntPtr is principally useful for
interacting with unmanaged code, which exists outside of the .NET type
system.
Granted, this is pure semantics, but I can see the designers' perspective.
> > Alternatively, creating a new struct that just has an IntPtr member
> > should be an equivalent, which would allow some type safety. I'm
> > surprised I don't see this more often.
>
> That's what I tried to do initially in my code, but since all the
> marshal examples I had used classes, I was making the mistake of using
> classes also. My takeaway is this:
>
> - If I want to copy the data into managed memory by marshaling, I use
> a class.
Structures can also be used, as passing a structure by value results in
a copy, which must be marshaled.
> - If I want to reference the data in-place in unmanaged memory, I use
> an unsafe struct and a struct pointer.
>
> - Since an IntPtr is basically a void*, I don't see why I would ever
> use it, unless the external call actually takes a void*.
You would use it if you need to expose the member to languages other
than C#/C++. For example, Visual Basic has no syntax for "unsafe" code,
and thus you couldn't use Visual Basic to perform your 2nd option. If
you want your code to be usable by other languages, you'll need to
provide an appropriate wrapper.
Now, whether IntPtr can be considered an appropriate wrapper or not is
outside the bounds of this discussion. :-) The answer likely depends
upon expected "real-world" usage. Lots of S.W.F. programs are in VB, so
IntPtr is useful for that, as an example.
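As a rough illustration of the "struct that just has an IntPtr member" idea mentioned earlier, here is a minimal sketch. The type names WindowHandle and FontHandle are hypothetical and chosen only for this example; nothing here is taken from S.W.F or Gtk#. It needs no "unsafe" code, so it should also be usable from languages such as Visual Basic.

```csharp
using System;

// Hypothetical handle type; the name is illustrative only.
public struct WindowHandle
{
    private readonly IntPtr handle;

    public WindowHandle (IntPtr handle)
    {
        this.handle = handle;
    }

    // The raw pointer is exposed only for code that must hand it to
    // an unmanaged function.
    public IntPtr Value {
        get { return handle; }
    }

    public bool IsNull {
        get { return handle == IntPtr.Zero; }
    }
}

// A second hypothetical handle type. Both wrap an IntPtr, but the
// compiler treats them as distinct, unrelated types.
public struct FontHandle
{
    private readonly IntPtr handle;

    public FontHandle (IntPtr handle)
    {
        this.handle = handle;
    }

    public IntPtr Value {
        get { return handle; }
    }
}
```

With this in place, a method declared to take a WindowHandle will not accept a FontHandle or a bare IntPtr without an explicit conversion, which is the type safety a plain IntPtr parameter cannot give you.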
> > The `char' type is an unsigned 16-bit type. Your other functions
> > specify that string marshaling should be done as LPStrs (an 8 bit
> > type). Which means there's a mismatch between your structure and method
> > signatures.
>
> Actually, I used:
>
> Marshal.PtrToStringAnsi((IntPtr)p->name);
>
> Which did exactly the right thing even though you are correct about my
> mis-use of char *. I'll change it to "byte*".
>
> > You can convert it into a System.String by using the
> > System.String.String(sbyte*) constructor
>
> Oohh! That's exactly what I was looking for. My strings are actually
> in UTF, so I can do:
>
> string name = new String(p->name, 0, strlen(p->name), Encoding.UTF8);
>
> I was worried that I was going to have to marshal the byte* into a
> managed byte[], use convert to go from UTF8 to UCS2 in a byte[], and
> then convert to a String. Too many copies. Using String.String() is
> much better, thanks! (although the underlying implementation might
> still do the copies, it theoretically can be optimized someday)
>
> > Going from System.String to a sbyte* would likely require that you
> > P/Invoke to malloc/free (or whatever memory management functions your C
> > code uses), allocate unmanaged memory, and do the copy yourself (or use
> > System.Runtime.InteropServices.Marshal.Copy(byte[], int, IntPtr, int)).
> > You'll have to convert the System.String to a byte[] first, though,
> > which will likely require using the System.Text.Encoding class (making
> > sure that you use the same encoding as your C code does).
>
> It looks like I can write a Custom Marshaler which handles UTF8:
>
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfsystemruntimeinteropservicesicustommarshalerclasstopic.asp
>
> However, based on what I can find in Convert, it looks like I'll have
> to do the copying I talked about above (in reverse). My strings are
> pretty small, so this shouldn't be too big a deal. However, I'm going
> to be pushing strings out a lot more often than I'm pulling strings
> back in, so this is unfortunate.
>
> If anyone knows of a way to marshal a .NET string into a UTF8 encoded
> sbyte* in a single copy, speak up. :)
Haven't tried compiling, but this might work:
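    // Assumes malloc() is a P/Invoke declaration for the C library's malloc.
    UTF8Encoding enc = new UTF8Encoding (false, true);
    string input = GetSomeString ();
    byte[] marshalled = enc.GetBytes (input);
    // Reserve one extra byte for the NUL terminator the C code expects.
    IntPtr dest = malloc (marshalled.Length + 1);
    Marshal.Copy (marshalled, 0, dest, marshalled.Length);
    Marshal.WriteByte (dest, marshalled.Length, 0);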
Not ideal, by any means. GetBytes() calls String.ToCharArray(), which
allocates a copy, and then allocates the byte[] array itself; after that
you have to call malloc and copy again. You wind up with 4 different
copies of the string in memory (the original System.String, the char[]
array from GetBytes(), the byte[] array, and the unmanaged copy). Yech.
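For reference, the same approach can be packaged as a small helper. This is only a sketch under a few assumptions: the class and method names (Utf8Marshal, StringToHGlobalUtf8, HGlobalUtf8ToString) are made up for this example, and it allocates with Marshal.AllocHGlobal rather than a P/Invoked malloc so that it is self-contained. If the C code takes ownership of the buffer and releases it with free(), you would need to allocate with the matching allocator instead. It does not remove any of the copies described above.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class Utf8Marshal
{
    // Copies a managed string into a newly allocated, NUL-terminated
    // UTF-8 buffer. The caller releases it with Marshal.FreeHGlobal.
    public static IntPtr StringToHGlobalUtf8 (string input)
    {
        UTF8Encoding enc = new UTF8Encoding (false, true);
        byte[] bytes = enc.GetBytes (input);
        // One extra byte for the terminating NUL.
        IntPtr dest = Marshal.AllocHGlobal (bytes.Length + 1);
        Marshal.Copy (bytes, 0, dest, bytes.Length);
        Marshal.WriteByte (dest, bytes.Length, 0);
        return dest;
    }

    // Reads a NUL-terminated UTF-8 buffer back into a managed string.
    public static string HGlobalUtf8ToString (IntPtr src)
    {
        // Find the terminator, i.e. a managed strlen().
        int length = 0;
        while (Marshal.ReadByte (src, length) != 0)
            length++;
        byte[] bytes = new byte [length];
        Marshal.Copy (src, bytes, 0, length);
        return new UTF8Encoding (false, true).GetString (bytes);
    }
}
```

A caller would pass the returned IntPtr to the unmanaged function and then call Marshal.FreeHGlobal on it, ideally in a finally block so the buffer is released even if the call throws.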
Even worse, it looks like this is the best that a Custom Marshaller
could do, assuming a C# custom marshaller.
In theory, the class library could optimize this so that the char[] copy
can be removed (probably through the introduction of some internal
calls), but I don't see any portable way of removing the other extra
copies.
Well, I suppose Marshal.StringToCoTaskMemAnsi() would work, except (1)
it's unimplemented in Mono, and (2) it assumes a Unicode->Ansi
conversion, not Unicode->UTF-8 conversion.
Marshal.StringToCoTaskMemUni() would output UTF-16 characters, so this
isn't appropriate for you either.
- Jon