[Mono-devel-list] [PATCH] Race condition when restarting threads
Ben Maurer
bmaurer at ximian.com
Sun Jul 3 11:29:19 EDT 2005
Hey,
In a Mono bug report, we noticed a very rare race in the GC when
restarting the world. GC_restart_handler states:
/* Let the GC_suspend_handler() know that we got a SIG_THR_RESTART. */
/* The lookup here is safe, since I'm doing this on behalf */
/* of a thread which holds the allocation lock in order */
/* to stop the world. Thus concurrent modification of the */
/* data structure is impossible. */
However, this comment is not always true. When restarting the world, the
thread doing the restart does *not* wait for all threads to get past the
point where they need the structures used by the lookup before it
releases the GC_lock.
So the sequence of events looked something like:
* T1 signals T2 to restart the world
* T1 releases the GC_lock
* T3 is a newborn thread and adds itself to the table
* T2 gets the signal and sees a corrupt table because T3 is
concurrently modifying it.
What would end up happening when we experienced the race was either a
deadlock or a SIGSEGV.
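To make the ordering concrete, here is a minimal sketch of one way to close the window: the restarting thread waits for an acknowledgement from every restarted thread before it drops the lock. This is not necessarily what the attached patch does, and none of it is the collector's real code. The names alloc_lock, thread_table, worker and restart_world are all made up for illustration, and the restart "signal" is modelled with a condition variable on ordinary threads (a real signal handler could not use one, since pthread mutexes and condition variables are not async-signal-safe).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 16

/* Hypothetical stand-ins, not the collector's real data structures. */
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER; /* plays the role of GC_lock */
static pthread_mutex_t sync_lock  = PTHREAD_MUTEX_INITIALIZER; /* protects the two counters below */
static pthread_cond_t  sync_cond  = PTHREAD_COND_INITIALIZER;
static int restart_requested;          /* models delivery of the restart signal */
static int acks;                       /* number of threads past the table lookup */
static int thread_table[MAX_WORKERS];  /* models the thread table */
static int seen[MAX_WORKERS];          /* 1 if a worker read a consistent entry */

static void *worker(void *arg)
{
    int slot = *(int *)arg;

    /* Wait for the "restart signal". */
    pthread_mutex_lock(&sync_lock);
    while (!restart_requested)
        pthread_cond_wait(&sync_cond, &sync_lock);
    pthread_mutex_unlock(&sync_lock);

    /* The lookup: safe only while the restarter still holds alloc_lock. */
    seen[slot] = (thread_table[slot] == slot + 1);

    /* Acknowledge: this thread no longer needs the table. */
    pthread_mutex_lock(&sync_lock);
    acks++;
    pthread_cond_broadcast(&sync_cond);
    pthread_mutex_unlock(&sync_lock);
    return NULL;
}

/* Restart n stopped workers; returns how many saw a consistent table,
 * or -1 if n is out of range. */
int restart_world(int n)
{
    pthread_t tids[MAX_WORKERS];
    int slots[MAX_WORKERS];
    int i, ok = 0;

    if (n < 1 || n > MAX_WORKERS)
        return -1;

    pthread_mutex_lock(&alloc_lock);            /* the world is "stopped" */
    restart_requested = 0;
    acks = 0;
    for (i = 0; i < n; i++) {
        int rc;
        slots[i] = i;
        thread_table[i] = i + 1;
        seen[i] = 0;
        rc = pthread_create(&tids[i], NULL, worker, &slots[i]);
        assert(rc == 0);
        (void)rc;
    }

    pthread_mutex_lock(&sync_lock);
    restart_requested = 1;                      /* "signal" every thread */
    pthread_cond_broadcast(&sync_cond);
    while (acks < n)                            /* wait for every acknowledgement */
        pthread_cond_wait(&sync_cond, &sync_lock);
    pthread_mutex_unlock(&sync_lock);

    pthread_mutex_unlock(&alloc_lock);          /* only now may the table change */

    for (i = 0; i < n; i++) {
        pthread_join(tids[i], NULL);
        ok += seen[i];
    }
    return ok;
}
```

In this model, a newborn thread that takes alloc_lock before touching thread_table cannot get in until every restarted thread has finished its lookup. Releasing alloc_lock right after the broadcast, without the wait on acks, would reproduce the ordering in the list above.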
The race was extremely rare. It took 1-2 hours to reproduce on an SMP
machine. With the attached patch, it has not segfaulted or hung for 21
hrs.
-- Ben
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gc.patch
Type: text/x-patch
Size: 1309 bytes
Desc: not available
Url : http://lists.ximian.com/pipermail/mono-devel-list/attachments/20050703/e09d1119/attachment.bin