[Mono-devel-list] Interop with unmanaged code without copying or memory allocation?

Jonathan Pryor jonpryor at vt.edu
Mon Jan 12 22:20:08 EST 2004


It sounds like you're trying to wrap a difficult API.  Good luck.

Regarding Problem 1 (interning strings), is there any particular reason
you want the strings interned?  The only time it's useful is if you want
to compare strings by reference (pointer comparison) instead of by
contents, which would require that users do this:

	((object) String1) == ((object) String2)

I suspect most people will stick with the typical:

	String1 == String2

which calls String.Equals and compares the strings' contents, so there's
no reason to intern the string (unless you want to require that your
users write the first form).
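
To make the difference between the two comparisons concrete, here is a
small sketch.  The class and method names (EqualityDemo, Compare) are
made up for illustration, and it assumes a C# compiler recent enough to
support tuples:

```csharp
using System;

// Hypothetical demo class; not part of any library.
public static class EqualityDemo
{
    public static (bool sameContents, bool sameReference, bool sameAfterIntern) Compare()
    {
        string a = "element";
        // Built at run time, so it is a separate object from the literal above.
        string b = new string(new[] { 'e', 'l', 'e', 'm', 'e', 'n', 't' });

        // String's == operator compares the characters.
        bool sameContents = a == b;

        // Casting to object forces a reference (pointer) comparison.
        bool sameReference = (object) a == (object) b;

        // After interning, both variables refer to the single pooled instance.
        bool sameAfterIntern =
            (object) string.Intern(a) == (object) string.Intern(b);

        return (sameContents, sameReference, sameAfterIntern);
    }
}
```

Only callers who write the cast form get anything out of interning; for
everyone else the two strings already compare equal.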

Assuming you do want to intern the string, you could create a hashtable,
manually hash the "const XML_Char *name", and use this hash value to
look up the interned string.  This would likely require writing your own
hash function (so it can operate on a "const XML_Char*"), and you'd have
to handle hash collisions, but this could be made to work.
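
A rough sketch of that idea follows.  Everything here is an assumption
made for illustration: the NameTable class and its Get/Hash/Matches
methods are hypothetical, the hash is an FNV-1a style function applied
per 16-bit code unit (any reasonable hash would do), XML_Char is taken
to be a 16-bit UTF-16 unit as described in the original question, and
the code must be compiled with unsafe code enabled.  It is not tuned or
thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not part of Expat or the class library.
public sealed unsafe class NameTable
{
    // Hash value -> every interned name with that hash (collision list).
    private readonly Dictionary<uint, List<string>> buckets =
        new Dictionary<uint, List<string>>();

    // FNV-1a style hash over the code units of a 0-terminated name.
    // Also reports the length so callers need not rescan the buffer.
    private static uint Hash(char* name, out int length)
    {
        uint h = 2166136261;
        int n = 0;
        for (; name[n] != '\0'; n++)
        {
            h = unchecked((h ^ name[n]) * 16777619);
        }
        length = n;
        return h;
    }

    // Compares a managed string with the unmanaged buffer, no allocation.
    private static bool Matches(string s, char* name, int length)
    {
        if (s.Length != length) return false;
        for (int i = 0; i < length; i++)
        {
            if (s[i] != name[i]) return false;
        }
        return true;
    }

    public string Get(char* name)
    {
        uint h = Hash(name, out int length);
        if (!buckets.TryGetValue(h, out List<string> bucket))
        {
            bucket = new List<string>();
            buckets.Add(h, bucket);
        }
        foreach (string s in bucket)
        {
            if (Matches(s, name, length)) return s;
        }
        // A string is allocated only the first time a name is seen.
        string fresh = new string(name, 0, length);
        bucket.Add(fresh);
        return fresh;
    }

    // Convenience overload for callbacks that receive the name as IntPtr.
    public string Get(IntPtr name)
    {
        return Get((char*) name.ToPointer());
    }
}
```

The unsafe code stays inside the wrapper; the StartElement callback
would still receive an ordinary string, and repeated names would map to
the same instance.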

Personally, I wouldn't worry about it until you've done the performance
profiling (mono --profile is your friend!) and determined that string
interning would actually be a benefit.

Regarding Problem 2, I can't think of any good way to avoid the
marshaling/copying overhead.  Managed and unmanaged memory must be kept
separate (to permit the use of non-conservative garbage collectors). 
You could employ C# "unsafe" code in the callback methods, but this
would prevent non-C# languages (VB, JavaScript, etc.) from being used as
callbacks...
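
For what it's worth, the copy itself can at least be done without
"unsafe" and without a per-call allocation by reusing one buffer, along
the lines already suggested in the original question.  The sketch below
is only an illustration: CharacterDataBridge, OnCharacterData and the
delegate type are made-up names, XML_Char is assumed to be 16 bits wide,
and the calling convention that XMLCALL expands to depends on how Expat
was built and is not expressed here.

```csharp
using System;
using System.Runtime.InteropServices;

// Managed shape of XML_CharacterDataHandler; s is not 0-terminated.
// The calling convention (XMLCALL) is build dependent and omitted here.
public delegate void CharacterDataHandler(IntPtr userData, IntPtr s, int len);

// Hypothetical bridge; not part of Expat or the class library.
public sealed class CharacterDataBridge
{
    // Reused across calls so no array is allocated per callback.
    private char[] buffer = new char[1024];
    private readonly Action<char[], int, int> characters;

    public CharacterDataBridge(Action<char[], int, int> characters)
    {
        this.characters = characters;
    }

    // A CharacterDataHandler delegate built from this method is what
    // would be handed to the native library.
    public void OnCharacterData(IntPtr userData, IntPtr s, int len)
    {
        if (len > buffer.Length)
        {
            // Grow only when a chunk does not fit.
            buffer = new char[Math.Max(len, buffer.Length * 2)];
        }
        // The one unavoidable copy from unmanaged to managed memory.
        Marshal.Copy(s, buffer, 0, len);
        characters(buffer, 0, len);
    }
}
```

The receiver must not hold on to the array after the call returns, since
the next callback overwrites it.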

This is really the problem behind Problem 1 -- memory must be kept
separate, and coming up with efficient ways to bridge the barrier is
difficult, hence the marshaling overhead...

 - Jon

On Sun, 2004-01-11 at 20:53, Karl Waclawek wrote:
> I am fairly new to C#/.NET/Mono, so this may be a trivial problem
> (however, I did read http://www.jprl.com/~jon/interop.html):
> 
> My task is to write a C# wrapper for the Expat XML parser,
> which is available as a dynamic library (.dll, .so).
> I want this wrapper to have as little overhead as possible,
> as the main "raison d'etre" for Expat is its speed.
> 
> The Expat API is mostly character-pointer based. Two of the
> most commonly used call-back functions have these signatures
> (i.e. call-back from the DLL back to the application):
> 
> /* atts is array of name/value pairs, terminated by 0;
>    names and values are 0 terminated.
> */
> typedef void (XMLCALL *XML_StartElementHandler) (void *userData,
>                                                  const XML_Char *name,
>                                                  const XML_Char **atts);
> 
> 
> /* s is not 0 terminated. */
> typedef void (XMLCALL *XML_CharacterDataHandler) (void *userData,
>                                                   const XML_Char *s, 
>                                                   int len);
> 
> where XML_Char is a two byte entity (ushort or wchar_t).
> 
> What I would like to do is wrap these call-backs in a way
> that the actual call-backs look like this in C#:
> 
> void StartElement(string Name, IAttributes atts);
>   and
> void Characters(char[] ch, int start, int length);
> 
> (The wrapper routines themselves will have to be delegates, 
> as far as I could tell).
> Ignoring the "IAttributes atts" argument, I have two problems:
> 
> 1) I would like to "intern" the first occurrence of a given name passed to
>    XML_StartElementHandler() and let subsequent occurrences of the same name
>    pass the same interned string object to the StartElement function in C#.
>    However, it seems I cannot look up an instance in the string pool
>    by passing some array or pointer; I have to allocate a new string
>    instance based on the contents of name just to call String.Intern.
>    This defeats the purpose.
> 
> 2) It would be nice if I could somehow cast the "const XML_Char *s"
>    argument passed to XML_CharacterDataHandler() to an array.
>    It seems that the best I can do in C# is to re-use the same array
>    instance (for the ch argument) and copy the characters from s into it.
> 
> So, for case 1) I have the problem of new string allocation
> for each call, and for case 2) I have the problem of copying.
> 
> I guess going from unmanaged to managed will always require
> copying, so I won't be able to improve on 2), but case 1) really
> bothers me, as it should be possible to look up an interned string
> using something other than another string.
> 
> Any advice?
> 
> Karl