[Mono-dev] Fundamental performance problems with Mono

damageboy dans at houmus.org
Thu Jan 7 17:14:28 EST 2010


Hi Zvika,
I'll start by saying that I've been there: I've also seen abysmal
performance with the Mono async socket implementation.
If you dig into the code (I last did so around Mono 2.2) you
will see that there is no true async socket implementation in Mono on Linux
at all.
By this I mean that a fundamental difference between the Linux world
and the Windows world is that there is no async socket API for Linux. This
is a "limitation" (if you want to call it that) of the Linux kernel,
and is in no way related to Mono.

While calling BeginSend/BeginReceive on Windows + MS.NET is implemented by
means of true async sockets, which ultimately are a Winsock / Windows
NT kernel feature, calling BeginSend on Mono simply queues a work item into
the thread pool, and that work item calls the normal socket APIs.
This is a fundamental difference in how Mono and MS.NET work.
Feel free to look at the code in
"mcs/class/System/System.Net.Sockets/Socket.cs" and see this for yourself...
While the Mono people could write two implementations of BeginXXX (one for
Windows + async sockets, one for Linux), I don't really blame them for
implementing the BeginXXX APIs the way they did.
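
To make the "queue a work item that calls the normal socket APIs" idea
concrete, here is a rough C sketch of that pattern. It is not Mono's code:
the names emulated_begin_send, send_request and store_result are made up
for illustration, and one pthread per request stands in for a real thread
pool.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical completion callback, playing the role of an AsyncCallback. */
typedef void (*send_done_fn)(ssize_t result, void *state);

/* Hypothetical work item: everything the worker needs for one send. */
struct send_request {
    int fd;
    void *buf;          /* private copy of the caller's data */
    size_t len;
    send_done_fn done;
    void *state;
};

static void *send_worker(void *arg)
{
    struct send_request *req = arg;
    /* The "async" operation is an ordinary send() run on another thread. */
    ssize_t n = send(req->fd, req->buf, req->len, MSG_NOSIGNAL);
    req->done(n, req->state);
    free(req->buf);
    free(req);
    return NULL;
}

/* Example callback: stores the send() result where the caller can see it. */
void store_result(ssize_t result, void *state)
{
    *(ssize_t *)state = result;
}

/* Returns 0 if the work item was started, -1 otherwise.
 * The caller joins *handle to wait for completion. */
int emulated_begin_send(int fd, const void *data, size_t len,
                        send_done_fn done, void *state, pthread_t *handle)
{
    assert(data != NULL && done != NULL && handle != NULL);

    struct send_request *req = malloc(sizeof *req);
    if (req == NULL)
        return -1;
    req->buf = malloc(len ? len : 1);
    if (req->buf == NULL) {
        free(req);
        return -1;
    }
    memcpy(req->buf, data, len);
    req->fd = fd;
    req->len = len;
    req->done = done;
    req->state = state;

    if (pthread_create(handle, NULL, send_worker, req) != 0) {
        free(req->buf);
        free(req);
        return -1;
    }
    return 0;
}
```

Note what each call costs in this sketch: an allocation for the work item,
a copy of the buffer, and a hop to another thread, all wrapped around the
same send() you could have called directly.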

In a way, using the BeginXXX APIs for sockets on Mono generally gives worse
performance (in terms of overhead and latency for packet send/receive) under
heavy load than using the regular non-async APIs.
This should pretty much leave you asking yourself why you would ever want to
use the so-called more advanced "XXXAsync Socket API" (which was your
original intent, as far as I can tell).
I personally see very little benefit even if it were implemented in Mono.

This definitely does not mean that all is lost. On the contrary, you can
achieve much higher throughput / lower latency with Mono + Linux, but
achieving this with the Microsoft-centric APIs / paradigms (as
System.Net.Sockets is) is highly unlikely IMO (again, I would like to stress
that this is really not Mono's fault).

I suggest you read up on the C10K problem either on Wikipedia or Dan Kegel's
site:
http://www.kegel.com/c10k.html
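
As a taste of the readiness-based style the C10K page discusses, here is a
minimal single-threaded sketch using Linux epoll to forward data from one
socket to another, which is roughly what a proxy does. The relay struct and
the relay_add / relay_poll functions are hypothetical names for this
example, and the sketch deliberately ignores partial sends and connection
teardown.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper type: forward whatever arrives on `from` to `to`. */
struct relay {
    int from;
    int to;
};

/* Register r->from for readability notifications. Returns 0 or -1. */
int relay_add(int epfd, struct relay *r)
{
    assert(r != NULL);
    struct epoll_event ev = { .events = EPOLLIN, .data.ptr = r };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, r->from, &ev);
}

/* Wait up to timeout_ms for readable sockets and forward their data.
 * Returns the number of bytes forwarded in this pass, or -1 if
 * epoll_wait itself fails. */
ssize_t relay_poll(int epfd, int timeout_ms)
{
    struct epoll_event events[64];
    char buf[4096];
    ssize_t total = 0;

    int n = epoll_wait(epfd, events, 64, timeout_ms);
    if (n < 0)
        return errno == EINTR ? 0 : -1;

    for (int i = 0; i < n; i++) {
        struct relay *r = events[i].data.ptr;
        ssize_t got = recv(r->from, buf, sizeof buf, MSG_DONTWAIT);
        if (got <= 0)
            continue;   /* EOF or error: real code would close and deregister */
        /* A real proxy must also handle short writes and EAGAIN here. */
        ssize_t sent = send(r->to, buf, (size_t)got, MSG_NOSIGNAL);
        if (sent > 0)
            total += sent;
    }
    return total;
}
```

A server built this way calls relay_poll in a loop from one thread (or one
thread per core), so there is no per-operation work item or callback object
to allocate.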

There are many possible solutions, including some that are not mentioned on
the C10K page, such as using P/Invoke to call vmsplice/splice for
sending/receiving data with zero-copy networking or, as I've done in the
past, wrapping Evgeniy Polyakov's netchannels and userspace network stack:
http://www.ioremap.net/projects/unetstack
http://www.ioremap.net/projects/netchannels
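
For the splice route, the native side looks roughly like the sketch below:
data is moved from one descriptor to another through a pipe, without
passing through a user-space buffer. The function name splice_forward is
hypothetical, and this only shows the underlying Linux calls in C; from
Mono you would reach splice through your own P/Invoke declaration, which
is not shown here. Whether the kernel truly avoids copies depends on the
descriptor types involved.

```c
#define _GNU_SOURCE     /* splice() is a Linux/GNU extension */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Move up to `len` bytes from in_fd to out_fd via the caller-supplied
 * pipe `pipefd` (as returned by pipe()). At least one side of every
 * splice() call must be a pipe, hence the intermediate hop.
 * Returns bytes forwarded, 0 on end of input, -1 on error. */
ssize_t splice_forward(int in_fd, int out_fd, int pipefd[2], size_t len)
{
    assert(pipefd != NULL);

    /* Step 1: source descriptor -> write end of the pipe. */
    ssize_t in = splice(in_fd, NULL, pipefd[1], NULL, len, SPLICE_F_MOVE);
    if (in <= 0)
        return in;

    /* Step 2: read end of the pipe -> destination, until drained. */
    ssize_t left = in;
    while (left > 0) {
        ssize_t out = splice(pipefd[0], NULL, out_fd, NULL,
                             (size_t)left, SPLICE_F_MOVE);
        if (out <= 0)
            return -1;
        left -= out;
    }
    return in;
}
```

In a proxy you would typically keep one pipe per connection pair and call
this whenever the source socket becomes readable.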

Although this means getting down and dirty, often using unsafe code and
pointers and whatnot, let me assure you that you will be able to make a very
modest server/desktop machine blow away anything you've ever seen with
Windows before.

In short, I think you're looking at the wrong problem.

Hope this helps.



zvikag wrote:
> 
> Hello all,
> The bottom line of this message is that I don't see how can one write a
> high-performance socket server in Mono...
> Here is the story:
> I am writing a proxy server using .NET Socket API. This proxy does almost
> entirely I/O work - copying buffers from one socket to another. Now, Mono
> doesn't implement the newer XXXAsync Socket API
> (http://msdn.microsoft.com/en-us/library/system.net.sockets.socketasynceventargs.aspx)
> that was introduced in .NET 2.0 SP1 (or more accurately, implements it
> perfunctorily, see
> http://www.mail-archive.com/mono-list@lists.ximian.com/msg28621.html).
> So I was left to use the APM Socket API which produces
> tons of garbage objects under heavy load.
> When testing the server on Linux under load we saw very frequent CPU
> bursts that crippled the throughput of the server. After profiling with
> the mono built-in profiler I confirmed that the reason for the high CPU
> usage was the GC collections that got more and more frequent and took more
> and more time. I then read a little bit and realized that the Mono GC is
> non-generational which might explain the long GC cycles (if it was
> generational it could have collected the garbage objects that were created
> during async socket operations in generation 0 and probably stop there,
> but it has to traverse the entire managed heap).
> So the combination of the non-generational GC and the unimplemented
> XXXAsync Socket API result in very poor performance of the Mono server.
> The maximum throughput of the server with Mono on Linux is about half of
> that on Windows using .NET.
> 
> I attached the GC stats and profiling results of a 15 minute run.
>  http://old.nabble.com/file/p27026906/profile_alloc.log profile_alloc.log 
>  http://old.nabble.com/file/p27026906/gc_stats.log gc_stats.log 
> Can you help me out here?
> 

-- 
View this message in context: http://old.nabble.com/Fundamental-performance-problems-with-Mono-tp27026906p27067974.html
Sent from the Mono - Dev mailing list archive at Nabble.com.
