[Mono-winforms-list] swf

stephen@covidimus.net stephen@covidimus.net
Wed, 27 Oct 2004 10:56:51 -0500 (CDT)


(This email contains an answer to a specific question someone asked, but also a general question for those in the know regarding mono's native library loader, further down).

I've seen this "can't find gdiplus.dll" error in several places, and I myself had the problem last week.  I found several other posts from people who have had the same trouble, and all the suggestions I could find through googling and asking around were to check LD_LIBRARY_PATH, etc., but I actually found an entirely different cause for the problem.  Here's a step-by-step, for all the people like me who have hit this.

My first attempt was to try different permutations of moving and renaming libgdiplus.so and modifying the DLL mappings in /etc/mono/config to let mono see the library.  Eventually, I managed to get trace output so I could watch the loader go through its search process, and discovered that the library was being searched for in the right places, but dlopen was failing on it.  The error: unresolved symbol cairo_set_target_drawable.  Of course, anything named cairo* probably isn't supposed to be in libgdiplus.so anyway.  libgdiplus.so depends on libcairo, however, and running nm on libcairo showed that the function was indeed missing.
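
The nm check described above can also be done from a script.  Here is a minimal sketch in Python, assuming a POSIX system where ctypes sits on top of dlopen/dlsym.  The helper name has_symbol is made up for illustration, and none of this is part of mono.

```python
import ctypes


def has_symbol(lib_path, symbol):
    """Return True if the library at lib_path exports `symbol`.

    lib_path=None asks dlopen for the running process itself (POSIX only).
    If dlopen fails, CDLL raises OSError carrying the loader's own message.
    """
    lib = ctypes.CDLL(lib_path)   # raises OSError if the library cannot be loaded
    try:
        getattr(lib, symbol)      # ctypes looks the name up with dlsym
        return True
    except AttributeError:        # raised when the symbol is not exported
        return False
```

Called as has_symbol("/usr/lib/libcairo.so", "cairo_set_target_drawable") (the path is only an assumption; use wherever libcairo lives on your system), it should return False on a libcairo built the way mine was.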

I downloaded the cairo source from CVS late PM CST on Thursday, 10/21 (so unless it's been fixed in the past week, the problem's still there), and hacked around in it until I found the problem.  Even though cairo's configure found Xlib, it wasn't building cairo_xlib_surface.c, so several functions weren't making it into libcairo.  cairo_xlib_surface.c _did_ appear in the source file list in the makefile, but not in the object list.  Once I added cairo_xlib_surface.lo to the am_libcairo_la_OBJECTS statement in the makefile by hand, make picked up cairo_xlib_surface.c correctly, and the rebuilt library worked fine.  The "can't find gdiplus.dll" problem in mono went away and I was able to show my first empty form using the new SWF library!

I'm not on any cairo mailing lists (and I kind of forgot about it...) so I haven't reported this apparent ./configure bug to them yet.  If anybody on this list is also a cairo developer and wants to cross-post for me, I'd appreciate it; if I don't hear anything I'll go enter a bug report or something.

So that solves the problem, but the experience of figuring it out got me to thinking.  When dlopen failed, mono didn't distinguish between a missing DLL and one that couldn't be opened for other reasons.  It kept trying the lib in other places (a few of which were symlinks, so it actually hit the same dlopen failure multiple times), then eventually threw DllNotFoundException.  I suppose that the thrown exception has to be DllNotFoundException for compatibility with MS, but is there any way that additional information could be included as well?  For instance, is there any reason that mono couldn't include an InnerException that included extra information about the specific reason that the DLL couldn't be loaded in the case where the file exists but dlopen/LoadLibrary fails anyway?  Really, I think it's important for the text description of the system error that was encountered to be available somehow without having to learn enough mono hackery to turn on those debug traces.  (I actually never figured out how to turn them on; I finally modded the code to use g_print b/c I couldn't find any documentation on how to see the output from the existing mono_trace statements in that code; I'm not (as yet) a real mono hacker :-) ).
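
To make the InnerException idea concrete, here is a rough sketch in Python rather than in mono's own code.  It keeps searching every location, as mono does now, but remembers the loader's message for each attempt and chains the last one onto the final exception.  LibraryLoadError and load_with_reasons are hypothetical names of mine, and the sketch assumes a POSIX system where the OSError raised by ctypes carries the dlerror() text.

```python
import ctypes
import os


class LibraryLoadError(Exception):
    """Hypothetical error: names the library and why each attempt failed."""


def load_with_reasons(name, search_dirs):
    """Try `name` in each directory; on total failure, report every reason."""
    failures = []
    last_error = None
    for directory in search_dirs:
        path = os.path.join(directory, name)
        try:
            return ctypes.CDLL(path)
        except OSError as exc:            # message comes from the system loader
            failures.append(f"{path}: {exc}")
            last_error = exc
    # Chain the last loader error, much as an InnerException would.
    raise LibraryLoadError(
        f"could not load {name}; tried:\n  " + "\n  ".join(failures)
    ) from last_error
```

The outer exception type stays the same for every failure, so callers that only catch it are unaffected, while the text of the system error is still there for anyone who looks.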
 
A related question: if mono tries to load a library that really does exist, but dlopen/LoadLibrary fails, should mono really continue to try additional places?  Failure to load a library that exists is an entirely different problem than failure to find a library when doing a multiple-path library search.
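
One possible answer, sketched in Python under the same POSIX/ctypes assumption: treat "file is absent" and "file exists but will not load" as separate outcomes, and stop searching on the second.  LibraryBrokenError and load_first_existing are hypothetical names, and this is only one policy, not a claim about what mono should do.

```python
import ctypes
import os


class LibraryBrokenError(Exception):
    """Hypothetical error: the file exists but the loader rejected it."""


def load_first_existing(name, search_dirs):
    """Load `name` from the first directory that actually contains it.

    A file that exists but fails to load ends the search immediately.
    """
    for directory in search_dirs:
        path = os.path.join(directory, name)
        if not os.path.exists(path):
            continue                      # genuinely absent here: keep looking
        try:
            return ctypes.CDLL(path)
        except OSError as exc:
            raise LibraryBrokenError(
                f"{path} exists but failed to load: {exc}"
            ) from exc
    raise FileNotFoundError(f"{name} not found in {list(search_dirs)}")
```

With this policy the symlink case above would report the dlopen failure once, at the first hit, instead of repeating it for every alias of the same file.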

A related observation: the same problem appears if you try to load an SWF form when X isn't started.  The failure gets reported as a NullReferenceException, but if you turn on the lib loader trace output, you'll see that it's caused by a failure to load libX11.so.  It seems like there ought to be _some_ way, even if non-standard, to get that very important information back out to the user.  Most apps say something like "Cannot connect to display..." when that happens.

I'll be happy to try my hand at making any modifications that come about from this discussion, but maybe I'm just off in left field and things should stay exactly as they are anyway.  Any takers?