[Mono-bugs] [Bug 56599][Nor] Changed - mcs compile hangs

bugzilla-daemon@bugzilla.ximian.com
Sat, 17 Apr 2004 02:04:10 -0400 (EDT)


Please do not reply to this email- if you want to comment on the bug, go to the
URL shown below and enter your comments there.

Changed by joshhelmer@cox.net.

http://bugzilla.ximian.com/show_bug.cgi?id=56599

--- shadow/56599	2004-04-12 00:52:44.000000000 -0400
+++ shadow/56599.tmp.11546	2004-04-17 02:04:10.000000000 -0400
@@ -312,6 +312,244 @@
  
  
 If there is any other info that I can provide you with let me know.  
 I can reproduce this at will...  Exactly which library it will hang 
 on seems to be fairly random, but I have yet to get beyond the I18N 
 libs before the system hangs. 
+
+------- Additional Comments From joshhelmer@cox.net  2004-04-17 02:04 -------
+I have been playing with this a little, hoping to get you something a  
+little more useful than what is in my previous post.  I tweaked the  
+EnterCriticalSection() code a little to try and get a better feel  
+for where the hang-up was and to allow me to set a breakpoint in the  
+code somewhere when I detected potential lock contention.  In the  
+end, that was a bust (running in gdb slows the process down enough  
+that the hang never actually occurs), but it MIGHT explain the  
+printf() in the stack... At least that's the best explanation I can  
+come up with, although it eludes me how it got into the pthreads  
+code from inside printf?  Anyhow... here are my latest efforts to be  
+helpful :-)  
+  
+Note: For some reason the 'thread apply all bt' command never gives  
+me the stack for all my threads...  That's why I just keep listing  
+the damn things manually.  I don't have a lot of experience with  
+pthreads in C, so I don't know if this is normal or not...  My gdb  
+is misbehaving a lot lately.  
+  
+(gdb) info threads  
+  3 Thread 1096051632 (LWP 7671)  0xffffe410 in ?? ()  
+  2 Thread 1104087984 (LWP 7672)  0xffffe410 in ?? ()  
+* 1 Thread 1088326624 (LWP 7657)  0xffffe410 in ?? ()  
+(gdb) thread 1  
+[Switching to thread 1 (Thread 1088326624 (LWP 7657))]#0  0xffffe410  
+in ?? ()  
+(gdb) bt  
+#0  0xffffe410 in ?? ()  
+#1  0xbfffdd78 in ?? ()  
+#2  0x0000067a in ?? ()  
+#3  0x00000000 in ?? ()  
+#4  0x40bca260 in pthread_cond_timedwait@@GLIBC_2.3.2 ()  
+   from /lib/libpthread.so.0  
+#5  0x400e377b in _wapi_handle_wait_signal_handle  
+(handle=0xfffffffc)  
+    at handles-private.h:87  
+#6  0x400ed749 in WaitForSingleObject (handle=0x8,  
+timeout=4294967295)  
+    at wait.c:95  
+#7  0x400edcd2 in WaitForMultipleObjects (numobjects=1,  
+handles=0x813cd98,  
+    waitall=1, timeout=4294967295) at wait.c:325  
+#8  0x400b1d72 in wait_for_tids (wait=0x813cd98, timeout=4294967292)  
+    at threads.c:1097  
+#9  0x400b1ed3 in mono_thread_manage () at threads.c:1193  
+#10 0x4009152e in mono_runtime_exec_managed_code (domain=0xfffffffc,  
+    main_func=0xfffffffc, main_args=0xfffffffc) at object.c:1314  
+#11 0x4006b0e3 in mono_main (argc=12, argv=0xbfffe034) at  
+driver.c:788  
+#12 0x08048f5f in main (argc=-4, argv=0xfffffffc) at main.c:6  
+(gdb) thread 2  
+[Switching to thread 2 (Thread 1104087984 (LWP 7672))]#0  0xffffe410  
+in ?? ()  
+(gdb) bt  
+#0  0xffffe410 in ?? ()  
+#1  0x41cf03ec in ?? ()  
+#2  0x00000002 in ?? ()  
+#3  0x00000000 in ?? ()  
+#4  0x40bcc3ab in __lll_mutex_lock_wait () from /lib/libpthread.so.0  
+#5  0x40bc9717 in _L_mutex_lock_75 () from /lib/libpthread.so.0  
+#6  0x41cf03ec in ?? ()  
+#7  0x40c4c05a in printf () from /lib/libc.so.6  
+#8  0x400ded98 in EnterCriticalSection (section=0x40bcc3ab)  
+    at critical-sections.c:151  
+#9  0x40048529 in mono_create_jump_trampoline (domain=0x8093ed8,  
+    method=0x82191d0, add_sync_wrapper=1) at mini.c:6284  
+#10 0x40032872 in mono_ldftn (method=0x40bd0a48) at jit-icalls.c:20  
+#11 0x4133e024 in ?? ()  
+#12 0x082191d0 in ?? ()  
+#13 0x41cf0650 in ?? ()  
+#14 0x0813c8b8 in ?? ()  
+#15 0x08102ad0 in ?? ()  
+#16 0x41cf0498 in ?? ()  
+#17 0x00000000 in ?? ()  
+#18 0x4166e157 in ?? ()  
+#19 0x4015fc50 in __JCR_LIST__ ()  
+   from /var/tmp/portage/mono-0.31/work/mono-0.31/mono/mini/.libs/libmono.so.0  
+#20 0x4133e004 in ?? ()  
+#21 0x41cf04c8 in ?? ()  
+#22 0x41cfe2d0 in ?? ()  
+#23 0x082191d0 in ?? ()  
+#24 0x4166e157 in ?? ()  
+#25 0x41cf04c8 in ?? ()  
+#26 0x4004bc6e in mono_jit_compile_method (method=0xfffffffc) at  
+mini.c:8016  
+Previous frame inner to this frame (corrupt stack?)  
+(gdb) thread 3  
+[Switching to thread 3 (Thread 1096051632 (LWP 7671))]#0  0xffffe410  
+in ?? ()  
+(gdb) bt  
+#0  0xffffe410 in ?? ()  
+#1  0x41546818 in ?? ()  
+#2  0x00000002 in ?? ()  
+#3  0x00000000 in ?? ()  
+#4  0x40bcc3ab in __lll_mutex_lock_wait () from /lib/libpthread.so.0  
+#5  0x40bc9717 in _L_mutex_lock_75 () from /lib/libpthread.so.0  
+#6  0x41546818 in ?? ()  
+#7  0x40c4c05a in printf () from /lib/libc.so.6  
+#8  0x400ded98 in EnterCriticalSection (section=0x40bcc3ab)  
+    at critical-sections.c:151  
+#9  0x400c68c7 in mono_jit_info_table_add (domain=0x8093edc,  
+ji=0x8224170)  
+    at domain.c:112  
+#10 0x40048572 in mono_create_jump_trampoline (domain=0x8224170,  
+    method=0x8201738, add_sync_wrapper=1) at mini.c:6301  
+#11 0x40032872 in mono_ldftn (method=0x40bd0a48) at jit-icalls.c:20  
+#12 0x4133e024 in ?? ()  
+#13 0x08201738 in ?? ()  
+#14 0x08106dc8 in ?? ()  
+#15 0x08106db0 in ?? ()  
+#16 0x08102ad0 in ?? ()  
+#17 0x415468f4 in ?? ()  
+#18 0x080e9640 in ?? ()  
+#19 0x40133122 in ToUpperDataHigh ()  
+   from /var/tmp/portage/mono-0.31/work/mono-0.31/mono/mini/.libs/libmono.so.0  
+#20 0x4015fc50 in __JCR_LIST__ ()  
+   from /var/tmp/portage/mono-0.31/work/mono-0.31/mono/mini/.libs/libmono.so.0  
+#21 0x4133e004 in ?? ()  
+#22 0x41546928 in ?? ()  
+#23 0x41cfe246 in ?? ()  
+#24 0x08201738 in ?? ()  
+#25 0x080e9640 in ?? ()  
+#26 0x40133122 in ToUpperDataHigh ()  
+   from /var/tmp/portage/mono-0.31/work/mono-0.31/mono/mini/.libs/libmono.so.0  
+#27 0x41546928 in ?? ()  
+#28 0x4004bc6e in mono_jit_compile_method (method=0xfffffffc) at  
+mini.c:8016  
+Previous frame inner to this frame (corrupt stack?)  
+  
+  
+  
+And then, for the big finale:  
+  
+  
+  
+  
+(gdb) thread 3  
+[Switching to thread 3 (Thread 1096051632 (LWP 7671))]#0  0xffffe410  
+in ?? ()  
+(gdb) f 9  
+#9  0x400c68c7 in mono_jit_info_table_add (domain=0x8093edc,  
+ji=0x8224170)  
+    at domain.c:112  
+112             mono_domain_lock (domain);  
+(gdb) p domain  
+$8 = (MonoDomain *) 0x8093edc  
+(gdb) p *domain  
+$9 = {  
+  domain = 0x0,  
+  lock = {  
+    depth = 2,  
+    mutex = {  
+      __data = {  
+        __lock = 1,  
+        __count = 7672,  
+        __owner = 1,  
+        __kind = 1,  
+        __nusers = 0  
+      },  
+      __size =  
+"\001\000\000\000\035\000\000\001\000\000\000\001\000\000\000\000\000\000\000  
+\006\006\b",  
+      __align = 1  
+    }  
+  },  
+  mp = 0x8062628,  
+  code_mp = 0x8090f80,  
+  env = 0x8062638,  
+  assemblies = 0x8128eb0,  
+  entry_assembly = 0x809ffc0,  
+  setup = 0x8063428,  
+  friendly_name = 0x0,  
+  state = 134811392,  
+  ldstr_table = 0x8090f60,  
+  class_vtable_hash = 0x8090f40,  
+  proxy_vtable_hash = 0x8090f20,  
+  static_data_hash = 0x8062688,  
+  jit_code_hash = 0x804f5d4,  
+  jit_info_table = 0x8090ea0,  
+  type_hash = 0x8090d20,  
+  refobject_hash = 0x1,  
+  domain_id = 0,  
+  search_path = 0x0,  
+  create_proxy_for_type_method = 0x0,  
+  private_invoke_method = 0x80e5fc0,  
+  default_context = 0x80e6fc0,  
+  out_of_memory_ex = 0x80e6f90,  
+  null_reference_ex = 0x80e6f60,  
+  stack_overflow_ex = 0x0,  
+  special_static_fields = 0x0,  
+  jump_target_hash = 0x8090ee0,  
+  class_init_trampoline_hash = 0x80626d8,  
+  finalizable_objects_hash = 0x0  
+}  
+(gdb) thread 2  
+[Switching to thread 2 (Thread 1104087984 (LWP 7672))]#0  0xffffe410  
+in ?? ()  
+(gdb) f 9  
+#9  0x40048529 in mono_create_jump_trampoline (domain=0x8093ed8,  
+    method=0x82191d0, add_sync_wrapper=1) at mini.c:6284  
+6284            EnterCriticalSection (&trampoline_hash_mutex);  
+(gdb) p trampoline_hash_mutex  
+$10 = {  
+  depth = 0,  
+  mutex = {  
+    __data = {  
+      __lock = 2,  
+      __count = 1,  
+      __owner = 7671,  
+      __kind = 1,  
+      __nusers = 1  
+    },  
+    __size =  
+"\002\000\000\000\001\000\000\000\035\000\000\001\000\000\000\001\000\000\000\000\000\000",  
+    __align = 2  
+  }  
+}  
+(gdb) info threads  
+  3 Thread 1096051632 (LWP 7671)  0xffffe410 in ?? ()  
+* 2 Thread 1104087984 (LWP 7672)  0xffffe410 in ?? ()  
+  1 Thread 1088326624 (LWP 7657)  0xffffe410 in ?? ()  
+  
+  
+I know that's a lot of crap to read through, but if you examine it  
+closely, I THINK it shows a deadlock.  (There is some weirdness in  
+the output for thread 3, where everything seems to be 4 bytes off:  
+the __count field holds what I suspect is supposed to be the  
+__owner field, and the stack seems a little screwed up past  
+frame 9.)  Thread 2 is trying to acquire a lock on  
+trampoline_hash_mutex in mono_create_jump_trampoline(), but that  
+lock is currently held by LWP 7671 (thread 3), and thread 3 is  
+trying to acquire the domain lock in mono_jit_info_table_add(), but  
+that lock is currently held by LWP 7672 (thread 2).  
+  
+I will keep looking into this for a while, but figured that I would  
+post it here so that the experts could look it over.  Who knows, it  
+may be nothing, but it might be helpful.  Good luck!
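
For anyone reading along, the pattern suspected above (two threads taking the same two locks in opposite order) and one common way to avoid it can be sketched in plain pthreads C. This is only an illustration under assumptions, not the Mono code and not a proposed patch: domain_lock_demo, trampoline_lock_demo, lock_pair(), unlock_pair() and run_ordered_demo() are hypothetical names made up for the example. The idea shown is a fixed acquisition order: whichever lock a caller asks for first, both are always taken in ascending address order, so the two threads can never each hold one lock while waiting for the other.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two locks seen in the traces above. */
static pthread_mutex_t domain_lock_demo = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t trampoline_lock_demo = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter = 0;

enum { ITERATIONS = 100000 };

/* Take both mutexes in ascending address order, whatever order the
 * caller names them in. This removes the A-then-B / B-then-A inversion. */
static void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    int rc;
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    rc = pthread_mutex_lock(a);
    assert(rc == 0);
    rc = pthread_mutex_lock(b);
    assert(rc == 0);
    (void)rc;
}

/* Release both mutexes; release order does not matter for correctness. */
static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

/* Mimics a thread that asks for the domain lock first. */
static void *domain_first(void *unused)
{
    (void)unused;
    for (int i = 0; i < ITERATIONS; i++) {
        lock_pair(&domain_lock_demo, &trampoline_lock_demo);
        shared_counter++;
        unlock_pair(&domain_lock_demo, &trampoline_lock_demo);
    }
    return NULL;
}

/* Mimics a thread that asks for the trampoline lock first. */
static void *trampoline_first(void *unused)
{
    (void)unused;
    for (int i = 0; i < ITERATIONS; i++) {
        lock_pair(&trampoline_lock_demo, &domain_lock_demo);
        shared_counter++;
        unlock_pair(&trampoline_lock_demo, &domain_lock_demo);
    }
    return NULL;
}

/* Runs both threads to completion and returns the final counter,
 * or -1 if a thread could not be created. */
long run_ordered_demo(void)
{
    pthread_t first, second;
    shared_counter = 0;
    if (pthread_create(&first, NULL, domain_first, NULL) != 0)
        return -1;
    if (pthread_create(&second, NULL, trampoline_first, NULL) != 0) {
        pthread_join(first, NULL);
        return -1;
    }
    pthread_join(first, NULL);
    pthread_join(second, NULL);
    return shared_counter;
}
```

Calling run_ordered_demo() from a small main() (built with -pthread) should return rather than hang, even though the two threads name the locks in opposite order. If lock_pair() simply locked its arguments in the order given, the same program could stall the way the traces above suggest.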