[Mono-bugs] [Bug 56599][Nor] Changed - mcs compile hangs
bugzilla-daemon@bugzilla.ximian.com
Sat, 17 Apr 2004 02:04:10 -0400 (EDT)
Please do not reply to this email- if you want to comment on the bug, go to the
URL shown below and enter your comments there.
Changed by joshhelmer@cox.net.
http://bugzilla.ximian.com/show_bug.cgi?id=56599
--- shadow/56599 2004-04-12 00:52:44.000000000 -0400
+++ shadow/56599.tmp.11546 2004-04-17 02:04:10.000000000 -0400
@@ -312,6 +312,244 @@
If there is any other info that I can provide you with let me know.
I can reproduce this at will... Exactly which library it will hang
on seems to be fairly random, but I have yet to get beyond the I18N
libs before the system hangs.
+
+------- Additional Comments From joshhelmer@cox.net 2004-04-17 02:04 -------
+I have been playing with this a little hoping to get you something a
+little more useful than what is in my previous post. I tweaked the
+EnterCriticalSection() code a little to try and get a better feel
+for where the hang-up was and to allow me to set a breakpoint in the
+code somewhere when I detected potential lock contention. In the
+end, that was a bust (running in gdb slows the process down enough
+that the hang never actually occurs), but it MIGHT explain the
+printf() in the stack... At least that's the best explanation I can
+come up with, although it eludes me how it got into the pthreads
+code from inside printf? Anyhow... here are my latest efforts to be
+helpful :-)
+
+Note: For some reason the 'thread apply all bt' command never gives
+me the stack for all my threads... That's why I just keep listing
+the damn things manually. I don't have a lot of experience with
+pthreads in C, so I don't know if this is normal or not... My gdb
+is misbehaving a lot lately.
+
+(gdb) info threads
+ 3 Thread 1096051632 (LWP 7671) 0xffffe410 in ?? ()
+ 2 Thread 1104087984 (LWP 7672) 0xffffe410 in ?? ()
+* 1 Thread 1088326624 (LWP 7657) 0xffffe410 in ?? ()
+(gdb) thread 1
+[Switching to thread 1 (Thread 1088326624 (LWP 7657))]#0 0xffffe410
+in ?? ()
+(gdb) bt
+#0 0xffffe410 in ?? ()
+#1 0xbfffdd78 in ?? ()
+#2 0x0000067a in ?? ()
+#3 0x00000000 in ?? ()
+#4 0x40bca260 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
+ from /lib/libpthread.so.0
+#5 0x400e377b in _wapi_handle_wait_signal_handle
+(handle=0xfffffffc)
+ at handles-private.h:87
+#6 0x400ed749 in WaitForSingleObject (handle=0x8,
+timeout=4294967295)
+ at wait.c:95
+#7 0x400edcd2 in WaitForMultipleObjects (numobjects=1,
+handles=0x813cd98,
+ waitall=1, timeout=4294967295) at wait.c:325
+#8 0x400b1d72 in wait_for_tids (wait=0x813cd98, timeout=4294967292)
+ at threads.c:1097
+#9 0x400b1ed3 in mono_thread_manage () at threads.c:1193
+#10 0x4009152e in mono_runtime_exec_managed_code (domain=0xfffffffc,
+ main_func=0xfffffffc, main_args=0xfffffffc) at object.c:1314
+#11 0x4006b0e3 in mono_main (argc=12, argv=0xbfffe034) at
+driver.c:788
+#12 0x08048f5f in main (argc=-4, argv=0xfffffffc) at main.c:6
+(gdb) thread 2
+[Switching to thread 2 (Thread 1104087984 (LWP 7672))]#0 0xffffe410
+in ?? ()
+(gdb) bt
+#0 0xffffe410 in ?? ()
+#1 0x41cf03ec in ?? ()
+#2 0x00000002 in ?? ()
+#3 0x00000000 in ?? ()
+#4 0x40bcc3ab in __lll_mutex_lock_wait () from /lib/libpthread.so.0
+#5 0x40bc9717 in _L_mutex_lock_75 () from /lib/libpthread.so.0
+#6 0x41cf03ec in ?? ()
+#7 0x40c4c05a in printf () from /lib/libc.so.6
+#8 0x400ded98 in EnterCriticalSection (section=0x40bcc3ab)
+ at critical-sections.c:151
+#9 0x40048529 in mono_create_jump_trampoline (domain=0x8093ed8,
+ method=0x82191d0, add_sync_wrapper=1) at mini.c:6284
+#10 0x40032872 in mono_ldftn (method=0x40bd0a48) at jit-icalls.c:20
+#11 0x4133e024 in ?? ()
+#12 0x082191d0 in ?? ()
+#13 0x41cf0650 in ?? ()
+#14 0x0813c8b8 in ?? ()
+#15 0x08102ad0 in ?? ()
+#16 0x41cf0498 in ?? ()
+#17 0x00000000 in ?? ()
+#18 0x4166e157 in ?? ()
+#19 0x4015fc50 in __JCR_LIST__ ()
+ from /var/tmp/portage/mono-0.31/work/mono-0.31/mono/mini/.libs/libmono.so.0
+#20 0x4133e004 in ?? ()
+#21 0x41cf04c8 in ?? ()
+#22 0x41cfe2d0 in ?? ()
+#23 0x082191d0 in ?? ()
+#24 0x4166e157 in ?? ()
+#25 0x41cf04c8 in ?? ()
+#26 0x4004bc6e in mono_jit_compile_method (method=0xfffffffc) at
+mini.c:8016
+Previous frame inner to this frame (corrupt stack?)
+(gdb) thread 3
+[Switching to thread 3 (Thread 1096051632 (LWP 7671))]#0 0xffffe410
+in ?? ()
+(gdb) bt
+#0 0xffffe410 in ?? ()
+#1 0x41546818 in ?? ()
+#2 0x00000002 in ?? ()
+#3 0x00000000 in ?? ()
+#4 0x40bcc3ab in __lll_mutex_lock_wait () from /lib/libpthread.so.0
+#5 0x40bc9717 in _L_mutex_lock_75 () from /lib/libpthread.so.0
+#6 0x41546818 in ?? ()
+#7 0x40c4c05a in printf () from /lib/libc.so.6
+#8 0x400ded98 in EnterCriticalSection (section=0x40bcc3ab)
+ at critical-sections.c:151
+#9 0x400c68c7 in mono_jit_info_table_add (domain=0x8093edc,
+ji=0x8224170)
+ at domain.c:112
+#10 0x40048572 in mono_create_jump_trampoline (domain=0x8224170,
+ method=0x8201738, add_sync_wrapper=1) at mini.c:6301
+#11 0x40032872 in mono_ldftn (method=0x40bd0a48) at jit-icalls.c:20
+#12 0x4133e024 in ?? ()
+#13 0x08201738 in ?? ()
+#14 0x08106dc8 in ?? ()
+#15 0x08106db0 in ?? ()
+#16 0x08102ad0 in ?? ()
+#17 0x415468f4 in ?? ()
+#18 0x080e9640 in ?? ()
+#19 0x40133122 in ToUpperDataHigh ()
+ from /var/tmp/portage/mono-0.31/work/mono-0.31/mono/mini/.libs/libmono.so.0
+#20 0x4015fc50 in __JCR_LIST__ ()
+ from /var/tmp/portage/mono-0.31/work/mono-0.31/mono/mini/.libs/libmono.so.0
+#21 0x4133e004 in ?? ()
+#22 0x41546928 in ?? ()
+#23 0x41cfe246 in ?? ()
+#24 0x08201738 in ?? ()
+#25 0x080e9640 in ?? ()
+#26 0x40133122 in ToUpperDataHigh ()
+ from /var/tmp/portage/mono-0.31/work/mono-0.31/mono/mini/.libs/libmono.so.0
+#27 0x41546928 in ?? ()
+#28 0x4004bc6e in mono_jit_compile_method (method=0xfffffffc) at
+mini.c:8016
+Previous frame inner to this frame (corrupt stack?)
+
+
+
+And then, for the big finale:
+
+
+
+
+(gdb) thread 3
+[Switching to thread 3 (Thread 1096051632 (LWP 7671))]#0 0xffffe410
+in ?? ()
+(gdb) f 9
+#9 0x400c68c7 in mono_jit_info_table_add (domain=0x8093edc,
+ji=0x8224170)
+ at domain.c:112
+112 mono_domain_lock (domain);
+(gdb) p domain
+$8 = (MonoDomain *) 0x8093edc
+(gdb) p *domain
+$9 = {
+ domain = 0x0,
+ lock = {
+ depth = 2,
+ mutex = {
+ __data = {
+ __lock = 1,
+ __count = 7672,
+ __owner = 1,
+ __kind = 1,
+ __nusers = 0
+ },
+ __size =
+"\001\000\000\000ø\035\000\000\001\000\000\000\001\000\000\000\000\000\000\000
+\006\006\b",
+ __align = 1
+ }
+ },
+ mp = 0x8062628,
+ code_mp = 0x8090f80,
+ env = 0x8062638,
+ assemblies = 0x8128eb0,
+ entry_assembly = 0x809ffc0,
+ setup = 0x8063428,
+ friendly_name = 0x0,
+ state = 134811392,
+ ldstr_table = 0x8090f60,
+ class_vtable_hash = 0x8090f40,
+ proxy_vtable_hash = 0x8090f20,
+ static_data_hash = 0x8062688,
+ jit_code_hash = 0x804f5d4,
+ jit_info_table = 0x8090ea0,
+ type_hash = 0x8090d20,
+ refobject_hash = 0x1,
+ domain_id = 0,
+ search_path = 0x0,
+ create_proxy_for_type_method = 0x0,
+ private_invoke_method = 0x80e5fc0,
+ default_context = 0x80e6fc0,
+ out_of_memory_ex = 0x80e6f90,
+ null_reference_ex = 0x80e6f60,
+ stack_overflow_ex = 0x0,
+ special_static_fields = 0x0,
+ jump_target_hash = 0x8090ee0,
+ class_init_trampoline_hash = 0x80626d8,
+ finalizable_objects_hash = 0x0
+}
+(gdb) thread 2
+[Switching to thread 2 (Thread 1104087984 (LWP 7672))]#0 0xffffe410
+in ?? ()
+(gdb) f 9
+#9 0x40048529 in mono_create_jump_trampoline (domain=0x8093ed8,
+ method=0x82191d0, add_sync_wrapper=1) at mini.c:6284
+6284 EnterCriticalSection (&trampoline_hash_mutex);
+(gdb) p trampoline_hash_mutex
+$10 = {
+ depth = 0,
+ mutex = {
+ __data = {
+ __lock = 2,
+ __count = 1,
+ __owner = 7671,
+ __kind = 1,
+ __nusers = 1
+ },
+ __size =
+"\002\000\000\000\001\000\000\000÷\035\000\000\001\000\000\000\001\000\000\000\000\000\000",
+ __align = 2
+ }
+}
+(gdb) info threads
+ 3 Thread 1096051632 (LWP 7671) 0xffffe410 in ?? ()
+* 2 Thread 1104087984 (LWP 7672) 0xffffe410 in ?? ()
+ 1 Thread 1088326624 (LWP 7657) 0xffffe410 in ?? ()
+
+
+I know that's a lot of crap to read through, but if you examine it
+closely, I THINK that it shows a deadlock (there is some
+weirdness in the output for thread 3 where everything seems to be 4
+bytes off - the __count field holds what I suspect is supposed to be
+the __owner field - and the stack seems a little screwed up past
+frame 9). Thread 2 is trying to acquire a lock on
+trampoline_hash_mutex in mono_create_jump_trampoline(), but that lock
+is currently held by thread 7671 (thread 3), and thread 3 is trying to
+acquire the domain lock in mono_jit_info_table_add(), but that lock is
+currently held by thread 7672 (thread 2).
+
+I will keep looking into this for a while, but figured that I would
+post it here so that the experts could look it over. Who knows, it may
+be nothing, but it might be helpful. Good luck!