[Mono-list] Help debugging program failing randomly

Danny dgortonii at gmail.com
Mon Apr 8 17:14:14 UTC 2013


Thanks to Ian and Alan for the replies.  I have done some further 
elimination (by removing runtime components) and I don't think the new 
board interface is causing this.  I think it is another component, one 
that isn't quite as new, but that I had forgotten is new in this context 
(Ubuntu server).  This component periodically uses a graphing library (ZedGraph) 
to generate line graphs from the data collected from the input boards. 
I have included the entire capture of the stack trace that mono sends to 
stdout.  Note that this is a capture of the console, not a log file, so 
it includes system status messages from my code as well - I left them in 
for context, whether it matters or not.

http://pastebin.com/kQFF4TUB

I currently have a test running that eliminates this graphing component, 
but includes the new board component, and it seems promising so far. 
I'll feel better after it runs for a week though, since it has 
previously run for almost 5 days before crashing.

At any rate, if it is this new component and the graphing library 
causing this issue, I need to figure out how to fix it.  Also, I have used 
ZedGraph for a very long time to generate images like this, but the 
frequency used to be limited to once per day.  Now it can be once per 
minute.  The once/day generation is done in yet another component, so it 
could be the two 'walking' on each other if the underlying code isn't 
thread-safe.  I would expect some kind of time correlation if that was 
the case, and I just don't see that.  I have some ideas on how to 
serialize all of these operations to a single thread, but I'd need to be 
fairly sure of the problem before I went to the effort to implement that.
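
The single-thread serialization idea could look roughly like the sketch
below.  It is untested against ZedGraph or libgdiplus; RenderQueue and
Invoke are hypothetical names, and it only uses BCL types from the
.NET 4.0 profile.  The assumption is that every component that draws a
graph hands its work to this one queue instead of calling ZedGraph
directly.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of the application described above):
// funnels every graph-rendering job onto one dedicated thread so that
// calls into System.Drawing / libgdiplus never overlap.
public sealed class RenderQueue : IDisposable
{
    private readonly BlockingCollection<Action> _jobs =
        new BlockingCollection<Action>();
    private readonly Thread _worker;

    public RenderQueue()
    {
        _worker = new Thread(Run) { IsBackground = true, Name = "render-queue" };
        _worker.Start();
    }

    // Callable from any plugin thread.  Blocks until the job has run on
    // the worker thread.  A failure inside the job is rethrown to the
    // caller wrapped in an AggregateException.
    // Must not be called from inside a job, or it will deadlock.
    public void Invoke(Action job)
    {
        var tcs = new TaskCompletionSource<bool>();
        _jobs.Add(() =>
        {
            try { job(); tcs.SetResult(true); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        tcs.Task.Wait();
    }

    // Worker loop: runs queued jobs one at a time, in arrival order.
    private void Run()
    {
        foreach (var job in _jobs.GetConsumingEnumerable())
            job();
    }

    // Stops accepting jobs, drains what is queued, then joins the worker.
    public void Dispose()
    {
        _jobs.CompleteAdding();
        _worker.Join();
        _jobs.Dispose();
    }
}
```

A caller would wrap its existing graph code, e.g.
queue.Invoke(() => GenerateGraph(path)), where GenerateGraph is a
placeholder for whatever currently calls ZedGraph.  Given Alan's point
about the finalizer thread, the job would presumably also need to
Dispose() any Font and Bitmap objects it creates before returning, so
their native handles are released on the worker thread rather than by
the finalizer.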

If I could get a good bead on what I'm doing that causes this error I 
can work around it.

Thanks again for the help,
Danny


On 04/08/2013 06:21 AM, Alan wrote:
> I'm not sure if fontconfig is threadsafe and the finalizer thread is
> directly unreffing some fontconfig objects. This could easily be causing
> the corruption you're seeing if that's the case. Can you paste the full
> stacktrace of your crash (including all threads!) in a pastebin, or
> attach it to your email in some way?
>
> Alan
>
>
> On 8 April 2013 08:42, Ian Norton
> <ian.norton-badrul at thales-esecurity.com> wrote:
>
>     I'd be sure to check your struct packing and call conventions
>     properly. And perhaps be sure that you aren't passing in any
>     "ref System.String" instead of StringBuilders
>
>     Ian
>
>     On Mon, Apr 08, 2013 at 04:21:32AM +0100, Danny wrote:
>      > Hello,
>      >
>      > I'm having a difficult time with an application I have written.  I
>      > recently made some changes and I'm having a problem with it failing
>      > at seemingly random times and locations (within the code), with
>      > sigsegv errors.  This is a multithreaded plugin-style daemon/service
>      > (can be launched from CLI) and I recently added a new component to
>      > it to poll a data acquisition board via USB using FTDI.
>      >
>      > Almost all of our integrations like this use a shared library (or
>      > DLL on Windows) and p/invoke to access hardware.  I have done dozens
>      > of these integrations over USB without a persistent issue like
>      > this.  But still at first I suspected this new component, as I had
>      > initially thought it was trashing RAM because of the problems I had
>      > developing the shared library
>      >
>      > However, at the same time as I made this addition, I was also
>      > (somewhat) forced to upgrade our base OS to the latest LTS Ubuntu
>      > 12.04 (was on 10.04).  So unfortunately, I have more than one
>      > variable changing at a time.  So I confirmed, with a configuration
>      > that eliminates the newly developed component, that this problem
>      > occurs without that running.
>      >
>      > That's good and bad, since now it seems likely that the offending
>      > code is out of my control.  I am hoping to get some information on
>      > the error(s) I was able to capture, or some advice on how to debug
>      > the root cause of this problem.
>      >
>      > I have a couple of stack traces captured and I'll include what I
>      > believe is the crucial part of one here.  It's worth noting that not
>      > all of the stack traces are the same.  It's also worth noting that I
>      > have seen libgdiplus.so in other traces that I didn't get captured.
>      >
>      > I tried setting up a 10.04 machine to test with, but one of our newer
>      > dependencies (ServiceStack) introduced a class that is not in the
>      > default mono on that platform, giving a startup error trying to
>      > resolve the IgnoreDataMemberAttribute class.  So I then got the
>      > latest mono set up on that machine now, but fear that this will
>      > result in the same error I am reporting (ie: I believe this to be a
>      > mono problem), since it should be the same mono framework running
>      > there.
>      >
>      > Any help is greatly appreciated.
>      >
>      >
>      >
>      > <snip - a bunch of standard output msgs from the service />
>      >
>      > Stacktrace:
>      >
>      >    at (wrapper managed-to-native) System.Drawing.GDIPlus.GdipDeleteFont (intptr) <0xffffffff>
>      >    at System.Drawing.Font.Dispose () <0x0002b>
>      >    at (wrapper remoting-invoke-with-check) System.Drawing.Font.Dispose () <0xffffffff>
>      >    at System.Drawing.Font.Finalize () <0x00013>
>      >    at (wrapper runtime-invoke) object.runtime_invoke_virtual_void__this__ (object,intptr,intptr,intpt$
>      >
>      > Native stacktrace:
>      >
>      >          mono() [0x80e16fc]
>      >          mono() [0x81209fc]
>      >          mono() [0x806094d]
>      >          [0xb770240c]
>      >          /usr/lib/i386-linux-gnu/libfontconfig.so.1(FcCharSetDestroy+0x15) [0xb4b1b9b5]
>      >          /usr/lib/i386-linux-gnu/libfontconfig.so.1(+0x17b43) [0xb4b29b43]
>      >          /usr/lib/i386-linux-gnu/libfontconfig.so.1(FcPatternDestroy+0x82) [0xb4b29e12]
>      >          /usr/lib/libgdiplus.so.0(GdipDeleteFontFamily+0x132) [0xb5004642]
>      >          /usr/lib/libgdiplus.so.0(GdipDeleteFont+0x2c) [0xb500510c]
>      >          [0xaf711940]
>      >          [0xaf7118cc]
>      >          [0xaf711870]
>      >          [0xaf7117ec]
>      >          [0xb5cddf41]
>      >          mono() [0x8150107]
>      >
>      > <snip - 42 thread stack details>
>      >
>      > =================================================================
>      > Got a SIGSEGV while executing native code. This usually indicates
>      > a fatal error in the mono runtime or one of the native libraries
>      > used by your application.
>      > =================================================================
>      >
>      >
>      >
>      >
>      > Danny
>      > _______________________________________________
>      > Mono-list maillist  - Mono-list at lists.ximian.com
>     <mailto:Mono-list at lists.ximian.com>
>      > http://lists.ximian.com/mailman/listinfo/mono-list
>
>

