[Mono-dev] WebRequest timeouts after ThreadPool exhaustion

Seif Attar iam at seifattar.net
Fri Feb 12 20:07:24 UTC 2016


I like those long, meaningful commit messages :) It seems related, but then
again it is threading.

Thanks for opening the bug report Alexander, I'll keep an eye on it.

On Fri, 12 Feb 2016 15:59 Alan <alan.mcgovern at gmail.com> wrote:

> Ah, sorry, I meant this commit
> https://github.com/mono/mono/commit/578a2327b8a216a2b59e9fc430ae5d77af2616bd
>
> On 12 February 2016 at 14:58, Alexander Köplinger <
> alexander.koeplinger at xamarin.com> wrote:
>
>> Sorry, turns out I made an error when testing on master. I can actually
>> see the request timeout there too, so it's not fixed on master.
>>
>> I filed a bug with your repro code:
>> https://bugzilla.xamarin.com/show_bug.cgi?id=38715
>>
>> - Alex
>>
>>
>>
>> 2016-02-12 15:13 GMT+01:00 Alexander Köplinger <
>> alexander.koeplinger at xamarin.com>:
>>
>>> Happens on mono-4.3.2-branch/9f44a62 as well...
>>>
>>> Alan: the PR you linked doesn't seem to be related, did you have another
>>> PR in mind?
>>>
>>> - Alex
>>>
>>> 2016-02-12 15:07 GMT+01:00 Alexander Köplinger <
>>> alexander.koeplinger at xamarin.com>:
>>>
>>>> I tried the testcase on master and couldn't reproduce there. I could,
>>>> however, reproduce it on the 4.3.2 build I had installed
>>>> (mono-4.3.2-branch/0df254d). I'm downloading a later 4.3.2 build right now
>>>> to see if it still happens there. If it does, then we need to backport
>>>> something from master.
>>>>
>>>> - Alex
>>>>
>>>> 2016-02-12 15:04 GMT+01:00 Seif Attar <iam at seifattar.net>:
>>>>
>>>>> Great, I'll try it out. Is the console app in that gist enough for a
>>>>> test case?
>>>>>
>>>>> @Mike @Jonathan we've faced bugs with previous versions of libc and
>>>>> networking before, and also some kernel issues. Update to the latest if
>>>>> you can. I can't reproduce with 3.12: I get timeouts, but then it recovers
>>>>> once threads are available, unlike with 4.x.
>>>>>
>>>>> On Fri, 12 Feb 2016 13:50 Alan <alan.mcgovern at gmail.com> wrote:
>>>>>
>>>>>> It's also worth pointing out that the threadpool implementation has
>>>>>> changed completely since mono 4.0. I believe the new threadpool
>>>>>> implementation shipped as the default starting with mono 4.2 (or
>>>>>> thereabouts). If you're on older Monos the odds are high whatever issue you
>>>>>> have has been fixed already.
>>>>>>
>>>>>> Alan
>>>>>>
>>>>>> On 12 February 2016 at 13:48, Alan <alan.mcgovern at gmail.com> wrote:
>>>>>>
>>>>>>> Hey,
>>>>>>>
>>>>>>> We have just fixed some issues in that area. They are expected to
>>>>>>> ship as part of the next mono 4.3+ release. If you want to test them out
>>>>>>> in the meantime you could try building mono with this PR [0] and see if it
>>>>>>> resolves all your issues. If it doesn't then a testcase and bug report on
>>>>>>> http://bugzilla.xamarin.com would be awesome!
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> [0] https://github.com/mono/mono/pull/2576
>>>>>>>
>>>>>>> On 12 February 2016 at 12:33, Mike Horsley <mhorsley at vqcomms.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> we’ve also seen instances of webrequest timeouts that don’t recover
>>>>>>>> (while curl still worked), but haven’t got to the bottom of it yet.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> we ran your test app and see the same issue with mono 3.12 on
>>>>>>>> OpenSUSE 13.2 (kernel 3.16.7, libc 2.19).
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> we’ll add the diagnostics from your test app into ours and this
>>>>>>>> will tell us whether we are seeing the same issue with the threadpool.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> regards
>>>>>>>>
>>>>>>>> Mike
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *From:* mono-devel-list-bounces at lists.ximian.com [mailto:
>>>>>>>> mono-devel-list-bounces at lists.ximian.com] *On Behalf Of *Seif Attar
>>>>>>>> *Sent:* Friday, February 12, 2016 10:43 AM
>>>>>>>> *To:* mono-devel-list at lists.ximian.com
>>>>>>>> *Subject:* [Mono-dev] WebRequest timeouts after ThreadPool
>>>>>>>> exhaustion
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> We are running mono 4.2.2 in prod. The VM that the process was
>>>>>>>> running on had a SAN failure, and after storage recovered, all outgoing
>>>>>>>> requests were timing out, even though curl was working fine.
>>>>>>>>
>>>>>>>> Our theory was that the thread pool was starved and somehow didn't
>>>>>>>> recover properly.
>>>>>>>>
>>>>>>>> Managed to reproduce with:
>>>>>>>>
>>>>>>>>
>>>>>>>> https://gist.githubusercontent.com/seif/ae2defbfa5afa26fa8f6/raw/bef351eded56c882658a4bad60fa4818ad32d629/timeout.cs
>>>>>>>>
>>>>>>>> Even after the ThreadPool finishes the tasks and has available threads,
>>>>>>>> it never recovers and all subsequent webrequests time out.
>>>>>>>>
>>>>>>>> Running on mono 4.2.2, linux kernel 4.2.0-27 and libc 2.21.
>>>>>>>>
>>>>>>>> Output from gdb is:
>>>>>>>>
>>>>>>>> (gdb) info threads
>>>>>>>>   Id   Target Id         Frame
>>>>>>>>   13   Thread 0x7fca903ff700 (LWP 27944) "cli" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
>>>>>>>>   12   Thread 0x7fca90b34700 (LWP 27945) "Finalizer" 0x00007fca911d70c9 in futex_abstimed_wait (cancel=true, private=<optimised out>, abstime=0x0, expected=0, futex=0x948ae0) at sem_waitcommon.c:42
>>>>>>>>   11   Thread 0x7fca8dfff700 (LWP 27946) "Timer-Scheduler" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>>>>>>>>   10   Thread 0x7fca91ba1700 (LWP 27947) "cli" __clock_nanosleep (clock_id=1, flags=1, req=0x7fca91ba0dc0, rem=0x7fca90f134aa <__clock_nanosleep+138>) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
>>>>>>>>   9    Thread 0x7fca8ddfe700 (LWP 27948) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>>>>>>>>   8    Thread 0x7fca8dbfd700 (LWP 27949) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>>>>>>>>   7    Thread 0x7fca8d7fb700 (LWP 27951) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>>>>>>>>   6    Thread 0x7fca8d3f9700 (LWP 27953) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>>>>>>>>   5    Thread 0x7fca8d1f8700 (LWP 27954) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>>>>>>>>   4    Thread 0x7fca8cff7700 (LWP 27955) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>>>>>>>>   3    Thread 0x7fca8cdf6700 (LWP 27956) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>>>>>>>>   2    Thread 0x7fca8cbf5700 (LWP 27957) "Threadpool work" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
>>>>>>>> * 1    Thread 0x7fca91cf0780 (LWP 27942) "cli" 0x00007fca911d7e0d in read () at ../sysdeps/unix/syscall-template.S:81
>>>>>>>>
>>>>>>>> Could not reproduce on mono 3.12, but it happens on 4.0.3 and 4.2.2
>>>>>>>>
>>>>>>>> Is this a known issue? Are there any workarounds? I tried setting
>>>>>>>> MONO_DNS=1 to use the clr dns, but that didn't help.
>>>>>>>>
>>>>>>>> Let me know if there is any more info I need to provide.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Seif
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Mono-devel-list mailing list
>>>>>>>> Mono-devel-list at lists.ximian.com
>>>>>>>> http://lists.ximian.com/mailman/listinfo/mono-devel-list
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>