[Mono-bugs] [Bug 690474] New: mod_mono autorestart under load drops connections and may corrupt the command stream
bugzilla_noreply at novell.com
bugzilla_noreply at novell.com
Thu Apr 28 03:37:31 EDT 2011
https://bugzilla.novell.com/show_bug.cgi?id=690474
https://bugzilla.novell.com/show_bug.cgi?id=690474#c0
Summary: mod_mono autorestart under load drops connections and
may corrupt the command stream
Classification: Mono
Product: Mono: Runtime
Version: 2.10.x
Platform: x86-64
OS/Version: Other
Status: NEW
Severity: Major
Priority: P5 - None
Component: misc
AssignedTo: mono-bugs at lists.ximian.com
ReportedBy: ben.last at nearmap.com
QAContact: mono-bugs at lists.ximian.com
Found By: ---
Blocker: ---
Created an attachment (id=427000)
--> (http://bugzilla.novell.com/attachment.cgi?id=427000)
Contains the aspx page and associated code
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.13
(KHTML, like Gecko) Chrome/9.0.597.45 Safari/534.13
Ubuntu 9.04, Apache 2.2.11-2ubuntu2.3, mod_mono compiled from 2.10.1 source.
Running a site with the following mod_mono setup:
MonoAutoRestartMode appsite Requests
MonoAutoRestartRequests appsite 10000
MonoMaxActiveRequests appsite 32
MonoMaxWaitingRequests appsite 1024
The mod-mono executable is /usr/local/mono-2.10.1/bin/mod-mono-server4
Created a minimal site containing a single aspx page with code (see
attachments). All the page does is return a fixed string.
Using an inhouse load-generator, we caused 128 workers to repeatedly load
/alive.aspx as fast as possible (so there would be a maximum of 128 requests in
parallel at any one time, and a maximum of 128 requests queued if the
mod-mono-server is busy). The load generator typically runs at around 500
requests/second when exercising the mono back-end. Every 10000 requests, the
mod-mono-server is restarted.
Reproducible: Sometimes
Steps to Reproduce:
1. See details
Actual Results:
We see the following behaviour:
1. In the Apache error log:
[Thu Apr 28 15:07:22 2011] [error] (70014)End of file found: read_data failed
[Thu Apr 28 15:07:22 2011] [error] (70014)End of file found: read_data failed
[Thu Apr 28 15:07:22 2011] [error] Command stream corrupted, last command was 1
[Thu Apr 28 15:07:22 2011] [error] (70014)End of file found: read_data failed
[Thu Apr 28 15:07:22 2011] [error] (70014)End of file found: read_data failed
(many instances of this).
2. In the load-tester clients, at the point that the mod-mono-server is
restarted, we see many connection refused errors and 500 server errors. We
believe that these are due to the restart since they happen simultaneously with
the errors shown above.
3. The above does not always happen. For example, in one test run, when 10000
requests were reached, all requests waited until the restart and then
continued; though the responses were delayed, no errors occurred. We therefore
suspect this is a race condition that occurs when a certain pattern of parallel
requests is handled.
Expected Results:
We would expect that when the mod-mono-server restarts, pending requests are
queued until it is available again.
At present, we believe that the auto restart feature is necessary because (a)
there are slow memory leaks in the mod-mono-server process (probably due to
base heap fragmentation) and there are a number of conditions which cause the
mod-mono-server to consume 100% CPU*. However, unless the server can restart
cleanly, it is not usable in production under load.
* If we can get a test case for any of these, we'll submit bug reports. The
bug relating to caching, fixed in 2.10.2 is not the cause here, since we have
worked around that (and also reported that bug).
--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
You are the assignee for the bug.
More information about the mono-bugs
mailing list