[Mono-dev] Possible deadlock in sgen garbage collector

Rodrigo Kumpera kumpera at gmail.com
Wed May 26 09:13:14 EDT 2010


On Wed, May 26, 2010 at 9:39 AM, Burkhard Linke <
blinke at cebitec.uni-bielefeld.de> wrote:

> Hi,
>
> I stumbled over a possible deadlock in the Boehm GC some time ago. Since the
> sgen GC uses the same mechanism for stopping the world, it may also be a
> problem in that implementation.
>
> Thread termination is signalled to the GC by means of a thread exit handler
> (Boehm) or a thread data key destructor (sgen). The function in question
> removes the thread from the internal management tables and does the
> necessary cleanup.
>
> POSIX does not specify the state of the thread's signal mask during exit
> handlers or data key destructors. Linux pthreads, afaik, leaves signals
> enabled, so the signal-based suspend/restart mechanism is OK. But
> Solaris/x86 blocks signals during these handlers. From the pthread_exit(3)
> manpage:
>
>     An exiting thread runs with all signals blocked. All  thread
>     termination   functions,   including   cancellation  cleanup
>     handlers and thread-specific data destructor functions,  are
>     called with all signals blocked.
>
> And at this point an (unlikely, but possible) race condition occurs. If
> thread A stops the world, it examines the thread table for active threads
> and sends a suspend signal to each of them. If this happens while thread B
> is terminating and executing its termination handlers, the signal will be
> blocked (and probably lost, since the manpage does not mention unblocking
> the signals again). Each suspend handler posts to a semaphore that thread A
> is waiting on. The post from thread B never happens and thread A blocks
> forever. This error is not reliably reproducible, so no test case can be
> provided.
>
> The patch for the Boehm GC requires adding another mutex for thread
> termination/garbage collection and explicitly checking for pending signals
> in the termination handler. I'll try to port this patch to the sgen GC,
> unless someone else has a better solution.

Sounds great to me.
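
The "check for pending signals in the termination handler" idea from the
quoted message can be sketched roughly as below. This is only an
illustration under stated assumptions, not the Boehm patch and not sgen
code: SUSPEND_SIGNAL, suspend_ack, thread_exit_cleanup and
simulate_exit_race are hypothetical names, and the worker blocks the signal
itself to imitate the Solaris behaviour described in the manpage excerpt.
The extra mutex mentioned in the message is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stddef.h>

/* All names here are hypothetical, chosen for illustration only. */
#define SUSPEND_SIGNAL SIGUSR1

static sem_t suspend_ack;   /* the "collector" expects one post per thread */
static sem_t worker_ready;  /* worker -> collector: signal is now blocked */
static sem_t signal_sent;   /* collector -> worker: suspend signal was sent */
static pthread_key_t thread_key;

/* Stand-in for the real suspend handler; never runs in this demo because
 * the worker keeps the signal blocked until it exits. */
static void noop_handler(int sig) { (void)sig; }

/* Thread-specific data destructor, run while the thread is exiting.
 * If a suspend request is pending but blocked, acknowledge it here so the
 * stopping thread does not wait forever. */
static void thread_exit_cleanup(void *unused)
{
    sigset_t pending;
    (void)unused;
    sigemptyset(&pending);
    if (sigpending(&pending) == 0 &&
        sigismember(&pending, SUSPEND_SIGNAL) == 1)
        sem_post(&suspend_ack);
}

static void *worker(void *arg)
{
    sigset_t block;
    (void)arg;
    /* Assumption: imitate a platform that blocks signals during exit. */
    sigemptyset(&block);
    sigaddset(&block, SUSPEND_SIGNAL);
    pthread_sigmask(SIG_BLOCK, &block, NULL);
    /* A non-NULL value is required for the destructor to be called. */
    pthread_setspecific(thread_key, &thread_key);
    sem_post(&worker_ready);
    sem_wait(&signal_sent);
    return NULL; /* destructor runs next, signal still blocked */
}

/* Returns 1 if the exiting thread acknowledged the blocked suspend signal. */
int simulate_exit_race(void)
{
    pthread_t tid;
    struct sigaction sa;
    int rc, acked;

    sa.sa_handler = noop_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    sigaction(SUSPEND_SIGNAL, &sa, NULL);

    sem_init(&suspend_ack, 0, 0);
    sem_init(&worker_ready, 0, 0);
    sem_init(&signal_sent, 0, 0);
    rc = pthread_key_create(&thread_key, thread_exit_cleanup);
    assert(rc == 0);
    rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);

    sem_wait(&worker_ready);            /* worker has blocked the signal */
    pthread_kill(tid, SUSPEND_SIGNAL);  /* "stop the world" request */
    sem_post(&signal_sent);             /* let the worker start exiting */
    pthread_join(tid, NULL);

    /* Without the check in the destructor this post would never arrive. */
    acked = (sem_trywait(&suspend_ack) == 0);

    pthread_key_delete(thread_key);
    sem_destroy(&suspend_ack);
    sem_destroy(&worker_ready);
    sem_destroy(&signal_sent);
    return acked;
}
```

The demo uses sem_trywait after pthread_join so that a missed
acknowledgement shows up as a return value of 0 rather than as a hang. A
real collector would presumably also have to serialize this check against
the thread-table update, which is what the additional mutex in the message
is for.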