[Mono-dev] Timing/race conditions

Rodrigo Kumpera kumpera at gmail.com
Thu May 28 18:51:46 UTC 2015


I'm not keen on introducing yield calls all over the place in the runtime
to work around bad test-environment combinations.

Adding them to the test suite is fine though.

Maybe the 200ms timeout is too low to deal with overloaded systems and must
be increased. The goal of the timeout is to detect bugs in the suspend code.
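
To make the trade-off concrete, here is a minimal sketch of a deadline-bounded
wait of the kind such a timeout implies. It is not the runtime's actual suspend
code: wait_with_deadline, now_ms, and the two example predicates are
hypothetical names used only for illustration. It assumes a POSIX system with
clock_gettime and sched_yield.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in milliseconds; unaffected by wall-clock adjustments. */
static int64_t now_ms (void)
{
	struct timespec ts;
	clock_gettime (CLOCK_MONOTONIC, &ts);
	return (int64_t) ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Hypothetical helper: polls is_done (arg) until it returns true or more
 * than timeout_ms has elapsed. Returns the elapsed time in ms on success
 * and -1 on timeout. sched_yield only gives up the CPU; it cannot hand
 * the quantum to the specific thread being waited on.
 */
static int64_t wait_with_deadline (bool (*is_done) (void *), void *arg, int64_t timeout_ms)
{
	int64_t start = now_ms ();

	assert (is_done != NULL);
	while (!is_done (arg)) {
		if (now_ms () - start > timeout_ms)
			return -1;
		sched_yield ();
	}
	return now_ms () - start;
}

/* Example predicate: reports completion once the counter reaches zero. */
static bool countdown_done (void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return false;
	}
	return true;
}

/* Example predicate: never completes, so the wait always times out. */
static bool never_done (void *arg)
{
	(void) arg;
	return false;
}
```

On an overloaded single-core machine the waited-on thread may simply not be
scheduled before the deadline, so a -1 here would not necessarily indicate a
bug in the code being waited on.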

It would be much easier if Unix had a way to transfer the quantum in a
yield call instead of just giving up on it.
We can definitely increase the timeout if that would help, or make it
optional, guarded behind an environment variable.
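
As a rough sketch of the environment variable idea, something along these
lines could work. The variable name MONO_SUSPEND_TIMEOUT_MS, the "infinite"
keyword, and both function names are assumptions made up for this example;
none of them is an existing Mono setting or API.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Sentinel meaning "never abort on a slow suspend" (hypothetical). */
#define SUSPEND_TIMEOUT_INFINITE (-1L)

/*
 * Hypothetical helper: interprets the text of an environment variable as
 * a timeout in milliseconds. NULL, empty, or unparsable text falls back
 * to default_ms; the word "infinite" disables the timeout.
 */
static long parse_suspend_timeout_ms (const char *text, long default_ms)
{
	char *end = NULL;
	long value;

	assert (default_ms > 0);
	if (text == NULL || *text == '\0')
		return default_ms;
	if (strcmp (text, "infinite") == 0)
		return SUSPEND_TIMEOUT_INFINITE;

	errno = 0;
	value = strtol (text, &end, 10);
	/* Reject overflow, trailing junk, and non-positive values. */
	if (errno != 0 || end == text || *end != '\0' || value <= 0)
		return default_ms;
	return value;
}

/* Reads the hypothetical variable, keeping the current 200 ms default. */
static long suspend_timeout_from_env (void)
{
	return parse_suspend_timeout_ms (getenv ("MONO_SUSPEND_TIMEOUT_MS"), 200);
}
```

Keeping 200 ms as the default would preserve the bug-detection behaviour on
normal machines, while heavily loaded test environments could opt into a
larger value or no timeout at all.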

Does changing the timeout to infinite fix those crashes?


On Thu, May 21, 2015 at 4:20 PM, Neale Ferguson <neale at sinenomine.net>
wrote:

> Hi,
> I have been experiencing some failures with the tests in mono/tests,
> particularly in a single core configuration.
>
> Firstly, the sleep test: when the delegated thread is started, the main
> thread goes on to call the Stopwatch Start method, which requires JITting.
> This involves GC interaction as objects are allocated. However, the
> delegated thread starts up and begins issuing GC.Collect() calls, which
> end up occurring every 50 microseconds. This means the main thread never
> gets a chance to get out of the allocation phase, so it never gets to
> execute the stopwatch start, thread sleep, etc., and the thread never ends.
> In a multi-core configuration this is not a problem and the test passes. I
> found that inserting a Thread.Yield() as the first method called in the
> delegated thread eliminates the problem [1].
>
> Secondly, the xxxxx-exit (e.g. thread-exit) tests will occasionally fail
> with an abort due to "suspend_thread suspend took xxx ms, which is more
> than the allowed 200 ms" where xxx exceeds 200. This seems to be due to
> the exiting thread sometimes not getting to the stage of setting the
> thread->state to ThreadState_Stopped in the
> ves_icall_System_Environment_Exit() processing within the 200ms time
> period. Again, with multiple cores this is not a problem (or the problem
> is much rarer). I found that inserting a mono_thread_info_yield() prior to
> the suspend_thread_internal() in mono_thread_suspend_all_other_threads()
> fixes the problem [2]. I am not sure this is the best option, and it's
> still theoretically possible for the problem to occur depending on
> how heavily the system is loaded. I was wondering if the setting of the
> state to ThreadState_Stopped could be moved earlier in the process rather
> than in thread_cleanup(), or if there's another alternative.
>
> While the occasional failure has been experienced with some of the more
> pathological tests, the trouble is that they happen nearly 100% of the
> time on a single-core virtual machine and less often on a 2-core one. In a
> virtual machine environment where there may be hundreds of virtual machines
> competing for the real cores, the probability of failure increases. In
> addition, tests in the main test suite have also failed for the same reason
> as described in the second case.
>
> Neale
>
> [1] Circumvention for case 1 -
>
> --- a/mono/tests/sleep.cs
> +++ b/mono/tests/sleep.cs
> @@ -13,6 +13,7 @@ public class Tests
>         public static int test_0_time_drift () {
>                 // Test the Thread.Sleep () is able to deal with time drifting due to interrupts
>                 Thread t = new Thread (delegate () {
> +                               Thread.Yield();
>                                 while (!finished)
>                                         GC.Collect ();
>                         });
>
> [2] Circumvention for case 2 -
>
> --- a/mono/metadata/threads.c
> +++ b/mono/metadata/threads.c
>
> @@ -3132,6 +3147,8 @@ void mono_thread_suspend_all_other_threads (void)
>
>                         UNLOCK_THREAD (thread);
>
> +                       mono_thread_info_yield ();
> +
>                         /* Signal the thread to suspend */
>                         suspend_thread_internal (thre
>