[Mono-bugs] [Bug 77470][Nor] Changed -
mono_thread_attach/mono_thread_detach can cause
deadlock/segfault on OS X
bugzilla-daemon at bugzilla.ximian.com
Mon Mar 20 20:41:09 EST 2006
Please do not reply to this email. If you want to comment on the bug, go to the
URL shown below and enter your comments there.
Changed by bryan at imeem.com.
http://bugzilla.ximian.com/show_bug.cgi?id=77470
--- shadow/77470 2006-02-09 19:38:03.000000000 -0500
+++ shadow/77470.tmp.619 2006-03-20 20:41:09.000000000 -0500
@@ -101,6 +101,79 @@
Program will always segfault or deadlock.
Additional Information:
On both ia32 Linux and OS X 10.4.4, the sample code causes a lot of warning output as
described in Bug #77468.
+
+------- Additional Comments From bryan at imeem.com 2006-03-20 20:41 -------
+Allan and I revisited this today with mono HEAD (r58196), and here is our assessment
+of what's happening. On both OS X on Intel and OS X on PPC, racy still crashes while
+writing to an element of the GC_mach_threads array. Here is the backtrace (on PPC,
+THREAD_COUNT = 256):
+
+Program received signal EXC_BAD_ACCESS, Could not access memory.
+Reason: KERN_PROTECTION_FAILURE at address: 0x01278e58
+[Switching to process 11906 thread 0x5803]
+0x011b5fa0 in GC_suspend_thread_list (act_list=0xa4000, count=247, old_list=0x0, old_count=0) at darwin_stop_world.c:316
+316 GC_mach_threads[GC_mach_threads_count].already_suspended = 0;
+(gdb) bt
+#0 0x011b5fa0 in GC_suspend_thread_list (act_list=0xa4000, count=247, old_list=0x0, old_count=0) at darwin_stop_world.c:316
+#1 0x011b6228 in GC_stop_world () at darwin_stop_world.c:409
+#2 0x0119ddf4 in GC_stopped_mark (stop_func=0x119d020 <GC_never_stop_func>) at alloc.c:504
+#3 0x0119da1c in GC_try_to_collect_inner (stop_func=0x119d020 <GC_never_stop_func>) at alloc.c:386
+#4 0x0119f1c4 in GC_collect_or_expand (needed_blocks=1, ignore_off_page=0) at alloc.c:1046
+#5 0x0119f568 in GC_allocobj (sz=60, kind=1) at alloc.c:1126
+#6 0x011a60d8 in GC_generic_malloc_inner (lb=176, k=1) at malloc.c:136
+#7 0x011a62b0 in GC_generic_malloc (lb=176, k=1) at malloc.c:192
+#8 0x011a669c in GC_malloc (lb=176) at malloc.c:297
+#9 0x010d2668 in mono_object_allocate (size=176, vtable=0x1803e2c) at object.c:2301
+#10 0x010d25c8 in mono_object_new_alloc_specific (vtable=0x1803e2c) at object.c:2398
+#11 0x010d2514 in mono_object_new_specific (vtable=0x1803e2c) at object.c:2384
+#12 0x010d239c in mono_object_new (domain=0x5cf00, klass=0x160d7a0) at object.c:2345
+#13 0x0110d3cc in mono_thread_attach (domain=0x5cf00) at threads.c:408
+#14 0x000028ac in thread_function ()
+#15 0x9002b1e0 in _pthread_body ()
+(gdb)
+
+GC_mach_threads_count is:
+(gdb) p GC_mach_threads_count
+$1 = 44035
+...which is obviously wrong (you can see that count = 247 above).
+
+In the source, GC_mach_threads_count is statically defined right below GC_mach_threads,
+so my guess is that once GC_mach_threads_count reaches THREAD_TABLE_SZ, writes into the
+GC_mach_threads array run past its end and overwrite the GC_mach_threads_count variable
+itself, and things go wrong from there. THREAD_TABLE_SZ is #define'd to be 128 elsewhere
+in the source.
+
+As a workaround, in our local tree, we've defined THREAD_TABLE_SZ to be 2048, and this
+particular crash appears to be at least mitigated. However, we then eventually get (at
+least on Intel):
+
+warning: Error 6 getting port names from mach_port_names
+warning: Error 6 getting port names from mach_port_names
+warning: Error 6 getting port names from mach_port_names
+warning: Error 6 getting port names from mach_port_names
+warning: Error 6 getting port names from mach_port_names
+[...]
+warning: Error 6 getting port names from mach_port_names
+warning: Error 6 getting port names from mach_port_names
+warning: Error 6 getting port names from mach_port_names
+warning: Error 6 getting port names from mach_port_names
+warning: Error 6 getting port names from mach_port_names
+
+Program received signal EXC_BAD_ACCESS, Could not access memory.
+Reason: KERN_INVALID_ADDRESS at address: 0x00374084
+[Switching to process 25049 local thread 0x180f]
+Cannot remove breakpoints because program is no longer writable.
+It might be running in another process.
+Further execution is probably impossible.
+0x00374084 in __i686.get_pc_thunk.bx ()
+
+If not running under GDB, you get an Illegal instruction error and then the program exits.
+
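To illustrate the overflow guessed at in the comment above, here is a minimal sketch of a fixed-size table whose append refuses to write past the end instead of running into whatever is stored next to it. This is not the libgc code: struct entry, table, table_count, table_add and fill_table are hypothetical names made up for the example. Only the value 128 for THREAD_TABLE_SZ and the already_suspended field name come from the report.

```c
#define THREAD_TABLE_SZ 128  /* value quoted in the report */

/* Hypothetical stand-in for one slot of a fixed-size thread table. */
struct entry {
    unsigned int thread;
    int already_suspended;
};

static struct entry table[THREAD_TABLE_SZ];
static int table_count = 0;

/* Append one entry. Returns 0 on success, -1 if the table is full.
 * Without the bounds check, the write below would land past the end
 * of the array once table_count reaches THREAD_TABLE_SZ. */
static int table_add(unsigned int thread)
{
    if (table_count >= THREAD_TABLE_SZ)
        return -1;
    table[table_count].thread = thread;
    table[table_count].already_suspended = 0;
    table_count++;
    return 0;
}

/* Try to add n entries and return how many were rejected. */
static int fill_table(int n)
{
    int rejected = 0;
    for (int i = 0; i < n; i++) {
        if (table_add((unsigned int)i) != 0)
            rejected++;
    }
    return rejected;
}
```

With the count of 247 seen in the backtrace, fill_table(247) stores 128 entries and rejects the remaining 119, so the counter is never overwritten. Whether rejecting, growing the table, or aborting is the right response in the collector is a separate design question that this sketch does not answer.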