[Mono-list] Help debugging program failing randomly
Danny
dgortonii at gmail.com
Tue Apr 9 15:34:17 UTC 2013
(Sent this earlier, but it didn't post to the list)
Thanks to Ian and Alan for the replies. I have done some further
elimination (by removing runtime components) and I don't think the
new board interface is causing this. I think it is another component
that isn't quite as new, but that I had forgotten is new in this context
(Ubuntu server). This component periodically uses a graphing library
(ZedGraph) to generate line graphs from the data collected from the
input boards.
I have included the entire capture of the stack trace that mono sends to
stdout. Note that this is a capture of the console, not a log file or
core dump (which I'd like to know how to get from monoservice2), so it
includes system status messages from my code as well - I left them in
for context, whether it matters or not.
http://pastebin.com/kQFF4TUB
I currently have a test running that eliminates this graphing component,
but includes the new board component, and it seems promising so far.
I'll feel better after it runs for a week though, since I've had it run
for almost 5 days before it crashed.
At any rate, if it is this new component and the graphing library
causing this issue, I need to figure out how to fix it. Also, I have used
ZedGraph for a very long time to generate images like this, but the
frequency used to be limited to once per day. Now it can be once per
minute. The once/day generation is done in yet another component, so it
could be the two 'walking' on each other if the underlying code isn't
thread-safe. I would expect some kind of time correlation if that was
the case, and I just don't see that. I have some ideas on how to
serialize all of these operations to a single thread, but I'd need to be
fairly sure of the problem before I went to the effort to implement that.
If I could get a good bead on what I'm doing that causes this error I
can work around it.
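One way the single-thread idea could be sketched is shown here. This is
only a minimal sketch under stated assumptions, not code from the
service: RenderQueue and Invoke are hypothetical names, and it relies
only on BlockingCollection and TaskCompletionSource from .NET 4.0. Every
component would hand its graph generation to the queue instead of
calling ZedGraph directly, so only one thread ever runs the drawing
code. Since the trace shows the font being freed from Font.Finalize, the
queued work would presumably also have to Dispose its fonts and bitmaps
explicitly; otherwise the finalizer thread would still free them outside
the queue.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of Mono or ZedGraph): runs every queued
// action on one dedicated thread, so callers never overlap.
public sealed class RenderQueue : IDisposable
{
    private readonly BlockingCollection<Action> jobs = new BlockingCollection<Action>();
    private readonly Thread worker;

    public RenderQueue()
    {
        worker = new Thread(Run);
        worker.IsBackground = true;
        worker.Name = "render-queue";
        worker.Start();
    }

    // Queue a job and block the caller until the worker thread has run it.
    // An exception thrown by the job surfaces here as an AggregateException.
    public void Invoke(Action job)
    {
        var done = new TaskCompletionSource<bool>();
        jobs.Add(() =>
        {
            try { job(); done.SetResult(true); }
            catch (Exception ex) { done.SetException(ex); }
        });
        done.Task.Wait();
    }

    private void Run()
    {
        // Blocks while the queue is empty; ends after CompleteAdding().
        foreach (Action job in jobs.GetConsumingEnumerable())
            job();
    }

    public void Dispose()
    {
        jobs.CompleteAdding();
        worker.Join();
        jobs.Dispose();
    }
}
```

Both the once/minute and the once/day components would share one
RenderQueue instance and wrap their existing rendering call in
Invoke(...), creating and disposing the graph objects inside the queued
action.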
Thanks again for the help,
Danny
On 04/08/2013 06:21 AM, Alan wrote:
> I'm not sure if fontconfig is threadsafe and the finalizer thread is
> directly unreffing some fontconfig objects. This could easily be causing
> the corruption you're seeing if that's the case. Can you paste the full
> stacktrace of your crash (including all threads!) in a pastebin, or
> attach it to your email in some way?
>
> Alan
>
>
> On 8 April 2013 08:42, Ian Norton
> <ian.norton-badrul at thales-esecurity.com> wrote:
>
> I'd be sure to check your struct packing and call conventions properly.
> And perhaps be sure that you aren't passing in any "ref System.String"
> instead of StringBuilders
>
> Ian
>
> On Mon, Apr 08, 2013 at 04:21:32AM +0100, Danny wrote:
> > Hello,
> >
> > I'm having a difficult time with an application I have written. I
> > recently made some changes and I'm having a problem with it failing at
> > seemingly random times and locations (within the code), with SIGSEGV
> > errors. This is a multithreaded plugin-style daemon/service (can be
> > launched from CLI) and I recently added a new component to it to poll a
> > data acquisition board via USB using FTDI.
> >
> > Almost all of our integrations like this use a shared library (or DLL on
> > Windows) and P/Invoke to access hardware. I have done dozens of these
> > integrations over USB without a persistent issue like this. But still,
> > at first I suspected this new component, as I had initially thought it
> > was trashing RAM because of the problems I had developing the shared
> > library.
> >
> > However, at the same time as I made this addition, I was also (somewhat)
> > forced to upgrade our base OS to the latest LTS Ubuntu 12.04 (was on
> > 10.04). So unfortunately, I have more than one variable changing at a
> > time. So I confirmed, with a configuration that eliminates the newly
> > developed component, that this problem occurs without that running.
> >
> > That's good and bad, since now it seems likely that the offending code
> > is out of my control. I am hoping to get some information on the
> > error(s) I was able to capture, or some advice on how to debug the root
> > cause of this problem.
> >
> > I have a couple of stack traces captured and I'll include what I believe
> > is the crucial part of one here. It's worth noting that not all of the
> > stack traces are the same. It's also worth noting that I have seen
> > libgdiplus.so in other traces that I didn't get captured.
> >
> > I tried setting up a 10.04 machine to test with, but one of our newer
> > dependencies (ServiceStack) introduced a class that is not in the
> > default mono on that platform, giving a startup error trying to resolve
> > the IgnoreDataMemberAttribute class. So I have now set up the latest
> > mono on that machine, but fear that this will result in the same error
> > I am reporting (i.e. I believe this to be a mono problem), since it
> > should be the same mono framework running there.
> >
> > Any help is greatly appreciated.
> >
> >
> >
> > <snip - a bunch of standard output msgs from the service />
> >
> > Stacktrace:
> >
> > at (wrapper managed-to-native) System.Drawing.GDIPlus.GdipDeleteFont (intptr) <0xffffffff>
> > at System.Drawing.Font.Dispose () <0x0002b>
> > at (wrapper remoting-invoke-with-check) System.Drawing.Font.Dispose () <0xffffffff>
> > at System.Drawing.Font.Finalize () <0x00013>
> > at (wrapper runtime-invoke) object.runtime_invoke_virtual_void__this__ (object,intptr,intptr,intpt$
> >
> > Native stacktrace:
> >
> > mono() [0x80e16fc]
> > mono() [0x81209fc]
> > mono() [0x806094d]
> > [0xb770240c]
> > /usr/lib/i386-linux-gnu/libfontconfig.so.1(FcCharSetDestroy+0x15) [0xb4b1b9b5]
> > /usr/lib/i386-linux-gnu/libfontconfig.so.1(+0x17b43) [0xb4b29b43]
> > /usr/lib/i386-linux-gnu/libfontconfig.so.1(FcPatternDestroy+0x82) [0xb4b29e12]
> > /usr/lib/libgdiplus.so.0(GdipDeleteFontFamily+0x132) [0xb5004642]
> > /usr/lib/libgdiplus.so.0(GdipDeleteFont+0x2c) [0xb500510c]
> > [0xaf711940]
> > [0xaf7118cc]
> > [0xaf711870]
> > [0xaf7117ec]
> > [0xb5cddf41]
> > mono() [0x8150107]
> >
> > <snip - 42 thread stack details>
> >
> > =================================================================
> > Got a SIGSEGV while executing native code. This usually indicates
> > a fatal error in the mono runtime or one of the native libraries
> > used by your application.
> > =================================================================
> >
> >
> >
> >
> > Danny
> > _______________________________________________
> > Mono-list maillist - Mono-list at lists.ximian.com
> > http://lists.ximian.com/mailman/listinfo/mono-list