[Mono-dev] WebRequest timeouts after ThreadPool exhaustion

Seif Attar iam at seifattar.net
Fri Feb 12 10:42:38 UTC 2016


Hi,

We are running Mono 4.2.2 in production. The VM hosting the process had a
SAN failure, and after storage recovered, all outgoing requests from the
process timed out, even though requests made with curl worked fine.

Our theory is that the thread pool was starved and somehow did not recover
properly.

I managed to reproduce it with:

https://gist.githubusercontent.com/seif/ae2defbfa5afa26fa8f6/raw/bef351eded56c882658a4bad60fa4818ad32d629/timeout.cs

Even after the ThreadPool finishes its tasks and has threads available
again, it never recovers, and all subsequent web requests time out.
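
For readers who cannot reach the gist, here is a minimal sketch of this kind
of repro. It is not the gist's actual contents: the URL, the pool size, the
sleep duration, and the 5 second timeout are all assumptions chosen for
illustration. It starves the pool with blocking work items, waits for them
to finish, then issues a request once threads are available again.

```csharp
using System;
using System.Net;
using System.Threading;

class TimeoutSketch
{
    // Hypothetical endpoint; any reachable HTTP URL would do.
    const string Url = "http://example.com/";

    static void TryRequest(string label)
    {
        try
        {
            var request = (HttpWebRequest)WebRequest.Create(Url);
            request.Timeout = 5000; // milliseconds, assumed value
            using (var response = (HttpWebResponse)request.GetResponse())
                Console.WriteLine("{0}: {1}", label, response.StatusCode);
        }
        catch (WebException e)
        {
            // e.Status is WebExceptionStatus.Timeout when the request times out.
            Console.WriteLine("{0}: {1}", label, e.Status);
        }
    }

    static void Main()
    {
        // Cap the pool so it is quick to exhaust (assumed size).
        int max = Environment.ProcessorCount;
        ThreadPool.SetMaxThreads(max, max);

        // Queue twice as many blocking items as there are worker threads.
        int pending = max * 2;
        using (var done = new CountdownEvent(pending))
        {
            for (int i = 0; i < pending; i++)
                ThreadPool.QueueUserWorkItem(_ => { Thread.Sleep(10000); done.Signal(); });

            TryRequest("while starved");
            done.Wait(); // all queued work items have finished
        }

        int freeWorkers, freeIo;
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);
        Console.WriteLine("available workers: {0} of {1}", freeWorkers, max);

        // On affected versions this is the request expected to keep timing out.
        TryRequest("after recovery");
    }
}
```

Something like this can be built with mcs and run with mono; whether it
triggers the problem may depend on the Mono version and the pool limits used.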

This is on Mono 4.2.2, Linux kernel 4.2.0-27, and libc 2.21.

Output from gdb is:

(gdb) info threads
  Id   Target Id         Frame
  13   Thread 0x7fca903ff700 (LWP 27944) "cli" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  12   Thread 0x7fca90b34700 (LWP 27945) "Finalizer" 0x00007fca911d70c9 in futex_abstimed_wait (cancel=true, private=<optimised out>, abstime=0x0, expected=0, futex=0x948ae0) at sem_waitcommon.c:42
  11   Thread 0x7fca8dfff700 (LWP 27946) "Timer-Scheduler" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  10   Thread 0x7fca91ba1700 (LWP 27947) "cli" __clock_nanosleep (clock_id=1, flags=1, req=0x7fca91ba0dc0, rem=0x7fca90f134aa <__clock_nanosleep+138>) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
  9    Thread 0x7fca8ddfe700 (LWP 27948) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  8    Thread 0x7fca8dbfd700 (LWP 27949) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  7    Thread 0x7fca8d7fb700 (LWP 27951) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  6    Thread 0x7fca8d3f9700 (LWP 27953) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  5    Thread 0x7fca8d1f8700 (LWP 27954) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  4    Thread 0x7fca8cff7700 (LWP 27955) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  3    Thread 0x7fca8cdf6700 (LWP 27956) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  2    Thread 0x7fca8cbf5700 (LWP 27957) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
* 1    Thread 0x7fca91cf0780 (LWP 27942) "cli" 0x00007fca911d7e0d in read () at ../sysdeps/unix/syscall-template.S:81

I could not reproduce this on Mono 3.12, but it happens on 4.0.3 and 4.2.2.

Is this a known issue? Are there any workarounds? I tried setting MONO_DNS=1
to use the CLR DNS, but that didn't help.

Let me know if there is any more info I need to provide.

Thanks,
Seif