[Mono-dev] TCP Async

Brett Ernst brett.e.ernst at gmail.com
Wed Jul 18 21:08:58 UTC 2012


I've had some strangeness with the thread pool in the past, but never
enough to get a solid, consistent repro that I could file a bug for. I
don't know if this is related or not, but I've actually seen a simple Timer
fail to generate callbacks under very high load (and on old hardware).
Again, not reproducible enough to file a bug for, but enough to make me
nervous.

When I run your mono-socket-problem code, I start seeing the "No completion
callback" messages within 5 seconds and then regularly every 5-10 seconds
or so. I can't say for sure if the issues are related, but if they are,
this is the best repro I've seen.

As you can imagine, I've grown to distrust the thread pool, and thus async
socket operations. I put some effort into digging through the Mono
internals, but without a solid repro and lacking a good understanding of
the thread pool implementation, my ultimate solution was to give up and
stop using async sockets altogether.

I took a different approach: I wrapped libev and POSIX sockets. Manos de
Mono is an excellent example of this approach. So far, this has been rock
solid and performs extremely well. Of course, the major downsides are: 1)
it's platform-specific, and 2) it's totally single-threaded. I get around
#2 by simply running multiple load-balanced nodes, one for each core. I still
make light use of the thread pool for long-running operations that
shouldn't block the message loop.
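
For anyone unfamiliar with the event-loop style described above, here is a minimal single-threaded sketch of the idea. It is not the libev wrapper mentioned in this message: it uses Python's standard `selectors` module as a stand-in, and the function name `serve_echo`, the echo behaviour, and the `stop` flag are assumptions made purely for illustration.

```python
import selectors
import socket
import threading


def serve_echo(listener: socket.socket, stop: threading.Event) -> None:
    """Run a single-threaded readiness loop until `stop` is set (illustrative)."""
    sel = selectors.DefaultSelector()  # picks epoll/kqueue/select as available
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    try:
        while not stop.is_set():
            # Short timeout so the loop notices `stop` promptly.
            for key, _mask in sel.select(timeout=0.1):
                sock = key.fileobj
                if sock is listener:
                    # New connection: register it with the same loop.
                    conn, _addr = listener.accept()
                    conn.setblocking(False)
                    sel.register(conn, selectors.EVENT_READ)
                    continue
                try:
                    data = sock.recv(4096)
                except ConnectionError:
                    data = b""
                if data:
                    # Sketch only: a real loop would queue unsent bytes and
                    # wait for EVENT_WRITE rather than assume send() takes all.
                    sock.send(data)
                else:
                    # Peer closed (or reset) the connection.
                    sel.unregister(sock)
                    sock.close()
    finally:
        # Close everything still registered, including the listener.
        for key in list(sel.get_map().values()):
            sel.unregister(key.fileobj)
            key.fileobj.close()
        sel.close()
```

All socket I/O happens on the one thread that calls `serve_echo`, so no completion callbacks from a thread pool are involved. Anything long-running would still need to be handed off elsewhere so that it does not block the loop.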

I only throw this out there as a possible alternative if you don't have any
success resolving this issue. Our architecture fit very well into the event
loop paradigm, but that may not work for everyone.

On Tue, Jul 17, 2012 at 7:47 AM, Greg Young <gregoryyoung1 at gmail.com> wrote:

> Btw, to avoid confusion and duplicated work, if someone starts on this
> could we just sync a bit in this thread?
>
> On Tuesday, July 17, 2012, Greg Young wrote:
>
>> Hey all.
>>
>> This is a big issue for us and, I feel, a huge problem for Mono in
>> general at this point: it means sockets basically don't work, and sockets
>> are a strong point of Unix environments. So I would like to propose
>> something I have done in the past. I am willing to offer a bounty
>> (personally) of $500 USD (more if done quickly) for a working fix to this
>> section of code.
>>
>> The acceptance criterion is the included test running stably on Linux
>> and BSD, though just Linux is acceptable as well.
>>
>> I honestly wish more people would do this kind of thing with OSS projects.
>>
>> Cheers,
>>
>> Greg
>>
>> On Saturday, July 7, 2012, Yuriy Solodkyy wrote:
>>
>>> Hi Rodrigo,
>>>
>>> please find a small sample app at
>>> https://github.com/ysw/mono-socket-problem
>>>
>>> This app can start in either server or client mode.  These modes only
>>> differ in whether it listens for connections on multiple ports or
>>> connects to the server on multiple ports. Upon connecting or accepting
>>> a connection, it immediately sends some data, and then sends the next
>>> chunk of data in response to any data received from the other side.
>>> There are some random delays in the code, and we limit outgoing traffic
>>> on each connection so that it is not significantly higher than inbound.
>>>
>>> There is also a separate thread which regularly checks the status of
>>> every connection and reports any connections that are awaiting a
>>> callback although their status obtained with socket.poll is already
>>> READY.  (A delay of several seconds is still allowed.)
>>>
>>> See also the README file.
>>>
>>>
>>> Also, it seems that constantly changing min/max threads in ThreadPool
>>> increases the probability of the problem. See the code.
>>> Please let me know if this sample app works for you.
>>>
>>> Hope it helps
>>>
>>>
>>> Thank you,
>>> Yuriy
>>>
>>>
>>> We've been aware of such issues, could you file a bug and attach a test
>>> case with it please?
>>>
>>> This would really really help us fix it.
>>>
>>> On Wed, Jun 27, 2012 at 4:08 AM, Greg Young <gregoryyoung1 at gmail.com>
>>> wrote:
>>>
>>> > We are experiencing an issue with async behaviours in sockets (with
>>> > SendAsync/callback, not Begin/End).
>>> >
>>> > Our visible issue is that when in a send loop we will fail on our
>>> > heartbeats. After debugging and counting calls into/out of
>>> > SendAsync/callback we see that we are inside of a call to SendAsync
>>> > (i.e. it never returns, in our case for 10 seconds before we declare
>>> > the socket dead). We can duplicate this fairly regularly on
>>> > Mac/BSD/Linux, though it's inconsistent (sometimes it may happen
>>> > repeatedly, other times it works fine). The code does not have such
>>> > issues on MS CLR. We are also running on loopback, so it's unlikely that
>>> > an underlying network problem is causing the hang-up. The code itself
>>> > is fairly straightforward (not that different from the MS example of
>>> > the API, except that it's fully async: separate send/receive loops,
>>> > while the example is request/response).
>>> >
>>> > I am pulling sources now to build latest but does anyone happen to
>>> > know of known issues with this sort of thing?
>>> >
>>> > Cheers,
>>> >
>>> > Greg
>>> >
>>> > --
>>> > Le doute n'est pas une condition agréable, mais la certitude est absurde.
>>> > _______________________________________________
>>> > Mono-devel-list mailing list
>>> > Mono-devel-list at lists.ximian.com
>>> > http://lists.ximian.com/mailman/listinfo/mono-devel-list
>>> >
>>>
>>> --
>>> Yuriy Solodkyy
>>> (y.solodkyy at gmail.com)
>>>
>>