[Mono-dev] Deadlock in mono (2.0 branch)

Casey Marshall casey.s.marshall at gmail.com
Thu Aug 28 20:53:37 EDT 2008


I'm seeing a deadlock in the mono runtime, in particular while running 
the NUnit add-in for MonoDevelop -- the 'ExternalTestRunner.exe' program 
will start hanging after it's been running for a few hours or so.

I'm running this under revision 110293, in the mono-2-0 branch. This is 
a couple of weeks old now, so I'm also going to try this with a newer 
snapshot. Let me know if you think this is already fixed.

But here's the info I could get with GDB:

The locks in contention are the `loader_mutex', a global, and the domain 
lock of the main assembly (mdhost.exe):

> (gdb) thread 2
> [Switching to thread 2 (Thread 0x41d50950 (LWP 30666))]#0  0x00007fec003ee174 in __lll_lock_wait () from /lib/libpthread.so.0
> (gdb) print loader_mutex
> $16 = {depth = 0, mutex = {__data = {__lock = 2, __count = 1, __owner = 30662, __nusers = 1, __kind = 1, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
>     __size = "\002\000\000\000\001\000\000\000\306w\000\000\001\000\000\000\001", '\0' <repeats 22 times>, __align = 4294967298}}
> (gdb) thread 3
> [Switching to thread 3 (Thread 0x41f66950 (LWP 30662))]#0  0x00007fec003ee174 in __lll_lock_wait () from /lib/libpthread.so.0
> (gdb) frame 3
> #3  0x00000000004b5f01 in mono_assembly_get_object (domain=0x7fec00f84e00, assembly=0x80) at reflection.c:5596
> 5596            CHECK_OBJECT (MonoReflectionAssembly *, assembly, NULL);
> (gdb) print domain->lock
> $18 = {depth = 0, mutex = {__data = {__lock = 2, __count = 1, __owner = 30666, __nusers = 1, __kind = 1, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
>     __size = "\002\000\000\000\001\000\000\000\312w\000\000\001\000\000\000\001", '\0' <repeats 22 times>, __align = 4294967298}}

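So, these two threads are acquiring these two locks, but in opposite 
order: thread 2 (LWP 30666) owns the domain lock and is waiting on 
`loader_mutex', while thread 3 (LWP 30662) owns `loader_mutex' and is 
waiting on the domain lock. It looks like thread 3 only wants the domain 
lock to access a cache, which I imagine normally runs through very 
quickly, but it can apparently deadlock.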
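
To illustrate the general pattern (this is not Mono's code, and I don't 
know which order the runtime actually intends), here is a minimal sketch 
of the usual way to avoid this kind of inversion: every code path that 
needs both locks takes them in one fixed order. The names `loader_lock', 
`domain_lock', `lock_both' and `run_ordered_demo' are hypothetical 
stand-ins made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two locks in the traces; not Mono's objects. */
static pthread_mutex_t loader_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t domain_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

/* Abort in debug builds if a pthread call reports an error. */
static void must(int rc) {
    assert(rc == 0);
    (void)rc;
}

/* The single rule every caller follows: loader lock first, then domain lock. */
static void lock_both(void) {
    must(pthread_mutex_lock(&loader_lock));
    must(pthread_mutex_lock(&domain_lock));
}

/* Release in the reverse order of acquisition. */
static void unlock_both(void) {
    must(pthread_mutex_unlock(&domain_lock));
    must(pthread_mutex_unlock(&loader_lock));
}

/* Each worker repeatedly takes both locks and updates shared state. */
static void *worker(void *arg) {
    long iterations = *(const long *)arg;
    for (long i = 0; i < iterations; i++) {
        lock_both();
        shared_counter++;
        unlock_both();
    }
    return NULL;
}

/* Runs two workers to completion and returns the final counter value. */
long run_ordered_demo(long iterations) {
    pthread_t a, b;
    shared_counter = 0;
    must(pthread_create(&a, NULL, worker, &iterations));
    must(pthread_create(&b, NULL, worker, &iterations));
    must(pthread_join(a, NULL));
    must(pthread_join(b, NULL));
    return shared_counter;
}
```

Calling run_ordered_demo() from a small main() and building with 
-pthread should always finish, because neither worker can hold one lock 
while waiting for the other in the opposite order. If one of the workers 
instead took `domain_lock' first, the two could end up blocked on each 
other, which is the situation in the traces here.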

> (gdb) thread 2
> [Switching to thread 2 (Thread 0x41d50950 (LWP 30666))]#0  0x00007fec003ee174 in __lll_lock_wait () from /lib/libpthread.so.0
> (gdb) bt
> #0  0x00007fec003ee174 in __lll_lock_wait () from /lib/libpthread.so.0
> #1  0x00007fec003e9b23 in _L_lock_261 () from /lib/libpthread.so.0
> #2  0x00007fec003e94e8 in pthread_mutex_lock () from /lib/libpthread.so.0
> #3  0x0000000000480a8e in mono_loader_lock () at loader.c:1867
> #4  0x00000000004ab0f9 in mono_class_setup_fields_locking (class=0x85fd28) at class.c:1123
> #5  0x00000000004ab179 in mono_class_get_fields (klass=0xd1d760, iter=0x80) at class.c:6366
> #6  0x00000000004c5d7d in compute_class_bitmap (class=0xd1d760, bitmap=0x41d4f7e0, size=256, offset=0, max_set=0x41d4f80c, static_fields=0) at object.c:608
> #7  0x00000000004c5fb0 in mono_class_compute_gc_descriptor (class=0xd1d760) at object.c:908
> #8  0x00000000004c69db in mono_class_vtable (domain=0x7fec00f84e00, class=0xd1d760) at object.c:1392
> #9  0x000000000055966f in mono_jit_compile_method (method=<value optimized out>) at mini.c:12973
> #10 0x000000000042c5e3 in mono_magic_trampoline (regs=0x41d4fbb0, code=0x41b41b48 "H\215u�H\213E�H\213@\030H\213�\2038", m=0xd1d730, tramp=<value optimized out>) at mini-trampolines.c:249
> #11 0x00000000413f7165 in ?? ()
> #12 0x00007febfe1eeaf8 in ?? ()
> #13 0x000000000044db26 in mono_arch_nullify_class_init_trampoline (code=0x7febfe187ab0 "\220&�", regs=0x41d4fc50) at tramp-amd64.c:143
> #14 0x00000000413f7170 in ?? ()
> #15 0x0000000000000000 in ?? ()
> (gdb) thread 3
> [Switching to thread 3 (Thread 0x41f66950 (LWP 30662))]#0  0x00007fec003ee174 in __lll_lock_wait () from /lib/libpthread.so.0
> (gdb) bt
> #0  0x00007fec003ee174 in __lll_lock_wait () from /lib/libpthread.so.0
> #1  0x00007fec003e9b23 in _L_lock_261 () from /lib/libpthread.so.0
> #2  0x00007fec003e94e8 in pthread_mutex_lock () from /lib/libpthread.so.0
> #3  0x00000000004b5f01 in mono_assembly_get_object (domain=0x7fec00f84e00, assembly=0x80) at reflection.c:5596
> #4  0x000000000049ddd0 in mono_domain_fire_assembly_load (assembly=0xd13ed0, user_data=<value optimized out>) at appdomain.c:840
> #5  0x00000000004e711f in mono_assembly_invoke_load_hook (ass=0xd13ed0) at assembly.c:923
> #6  0x00000000004e8d27 in mono_assembly_load_from_full (image=0xd08000, fname=<value optimized out>, status=0x41f64d6c, refonly=0) at assembly.c:1482
> #7  0x00000000004e90f1 in mono_assembly_open_full (filename=0xd80970 "/usr/mono-2.0/lib/monodevelop/AddIns/NUnit/nunit.core.dll", status=0x41f64d6c, refonly=0) at assembly.c:1298
> #8  0x00000000004ea415 in mono_assembly_load_full_nosearch (aname=0x41f64d70, basedir=0xd8ea70 "/usr/mono-2.0/lib/monodevelop/AddIns/NUnit", status=0x41f64d6c, refonly=0) at assembly.c:2256
> #9  0x00000000004ea788 in mono_assembly_load_full (aname=0x7fec00f84e08, basedir=0x80 <Address 0x80 out of bounds>, status=0x7fec003f15f0, refonly=-1) at assembly.c:2295
> #10 0x00000000004ea8ef in mono_assembly_load_reference (image=0xd81000, index=13) at assembly.c:848
> #11 0x00000000004a98ed in mono_class_from_typeref (image=0xd81000, type_token=<value optimized out>) at class.c:144
> #12 0x00000000004a9a85 in mono_class_get_full (image=0xd81000, type_token=128, context=0x7fec003f15f0) at class.c:5271
> #13 0x00000000004975ba in mono_metadata_interfaces_from_typedef_full (meta=0xd81000, index=<value optimized out>, interfaces=0x41f65020, count=0x41f6502c, context=0x0) at metadata.c:3405
> #14 0x00000000004a9377 in mono_class_create_from_typedef (image=0xd81000, type_token=33554466) at class.c:4194
> #15 0x00000000004a9a4c in mono_class_get_full (image=0xd81000, type_token=128, context=0x7fec003f15f0) at class.c:5268
> #16 0x00000000004a9be7 in mono_class_from_name (image=0xd81000, name_space=0xd11510 "MonoDevelop.NUnit", name=0xd11522 "LocalTestMonitor") at class.c:5688
> #17 0x00000000004b6773 in mono_reflection_get_type_internal (rootimage=0xd81000, image=0x7fec00f84e08, info=0x41f655c0, ignorecase=-1) at reflection.c:6779
> #18 0x00000000004b69b7 in mono_reflection_get_type_with_rootimage (rootimage=0xd81000, image=0x80, info=0x41f655c0, ignorecase=0, type_resolve=0x41f655bc) at reflection.c:6912
> #19 0x00000000004d6a82 in ves_icall_type_from_name (name=0x7febfe16dc00, throwOnError=0 '\0', ignoreCase=<value optimized out>) at icall.c:1210
> #20 0x000000004154b7fc in ?? ()
> #21 0x0000000000beea00 in ?? ()
> #22 0x0000000041f65770 in ?? ()
> #23 0x00007febfe797500 in ?? ()
> #24 0x0000000041f65770 in ?? ()
> #25 0x0000000000000000 in ?? ()

I have some more info about the hung process from gdb, but I thought the 
above was the most relevant.

Thanks.

