[Mono-dev] Mono CI weather report 1/19

Andi McClure anmccl at microsoft.com
Thu Jan 19 21:33:03 UTC 2017


What this is: The Mono team has a CI (continuous integration) system which builds and runs automated tests on every commit checked in to git (specifically the master branch). We have a test log viewer <https://jenkins.mono-project.com/view/All/job/jenkins-testresult-viewer/Test_Result_View/> on Jenkins that tracks the results (currently only accessible to GitHub project admins, sorry). Once a week I sweep through and write an email with a list of the most frequently failing automated tests.
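
The weekly sweep described above amounts to counting failures per test across recent builds. As a rough illustration only, here is a minimal Python sketch, assuming the failure records have already been exported as (test name, build label) pairs. The record shape and the function name `top_failures` are hypothetical and are not part of the Jenkins viewer:

```python
from collections import Counter

def top_failures(records, limit=10):
    """Return the `limit` most frequently failing tests as (name, count) pairs.

    `records` is an iterable of (test_name, build_label) tuples. This shape is
    an assumption made for illustration, not the viewer's real export format.
    """
    counts = Counter(test_name for test_name, _label in records)
    return counts.most_common(limit)

# Hypothetical sample data, using test names that appear in this report.
sample = [
    ("appdomain-threadpool-unload.exe", "osx-amd64"),
    ("appdomain-threadpool-unload.exe", "osx-i386"),
    ("SocketTest.SendAsyncFile", "ubuntu-1404-amd64"),
]
print(top_failures(sample))
```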

I haven't sent one of these out for a few weeks. Our most frequent failures have not changed over that time, but we have some new, unusual failures. Items 5, 6, and 8 are worrying new thread issues, and items 7 and 10 are new issues with sockets.

Here are the top failures currently making Jenkins builds fail:

0. Disabled tests

The following Bugzillas represent tests that have been temporarily disabled because they would otherwise fail every time:

https://bugzilla.xamarin.com/show_bug.cgi?id=47053
https://bugzilla.xamarin.com/show_bug.cgi?id=47054

1. "sgen new threads dont join stw exe" timeout [Existing]

The various variants of sgen-new-threads-dont-join-stw.exe have started frequently hanging with a stack implicating System.Threading.WaitHandle.WaitOne_internal. Ludovic is looking at this currently but nothing is filed.

Example:
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/5692/testReport/MonoTests/sgen-regular-tests-ms-split/sgen_new_threads_dont_join_stw_exe_timedout/

The same stack has also been seen in appdomain-unload.exe, for example:
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-arm64/1623/testReport/MonoTests/runtime/appdomain_unload_exe_timedout/

2. Timeouts in System.AppDomain.InternalUnload on mac [Existing]

This failure has been happening for a while. It happens very frequently on mac and occasionally on the Linux platforms.

Recent tests this has failed in include:

MonoTests.runtime.appdomain-threadpool-unload.exe_timedout
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/5697/testReport/MonoTests/runtime/appdomain_threadpool_unload_exe_timedout/
This single test, with the InternalUnload timeout, is all by itself our most common failure.

MonoTests.sgen-regular-tests-ms-conc-split.sgen-domain-unload.exe_timedout
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/5692/testReport/MonoTests/sgen-regular-tests-ms-conc-split/sgen_domain_unload_exe_timedout/

3. ThreadAbortException in System.Threading.Timer+Scheduler.SchedulerThread (the "List`1 issue") [Existing]

Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=43320 , currently assigned to Rodrigo. This has persistently been one of our heaviest crash contributors for months.

This occurs in many different places but the crash message always looks the same.

Unhandled Exception:
System.TypeInitializationException: The type initializer for 'System.Collections.Generic.List`1' threw an exception. ---> System.Threading.ThreadAbortException
   --- End of inner exception stack trace ---
  at System.Threading.Timer+Scheduler.SchedulerThread () [0x0000f] in <filename unknown>:0
  at System.Threading.ThreadHelper.ThreadStart_Context (System.Object state) [0x00017] in <filename unknown>:0
  at System.Threading.ExecutionContext.RunInternal (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, System.Boolean preserveSyncCtx) [0x0008d] in <filename unknown>:0
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, System.Boolean preserveSyncCtx) [0x00000] in <filename unknown>:0
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state) [0x00031] in <filename unknown>:0
  at System.Threading.ThreadHelper.ThreadStart () [0x0000b] in <filename unknown>:0
[MVID] 0deb57f9de664ff681556c641423618d 0,1,2,3,4,5
[ERROR] FATAL UNHANDLED EXCEPTION: Nested exception trying to figure out what went wrong

Some places this failure has been seen include MonoTests.runtime.gsharing-valuetype-layout.exe, MonoTests.gshared.generic-marshalbyref.2.exe, MonoTests.runtime.bug-415577.exe, and as an unknown-test failure when a test suite (such as mcs/class/corlib) is shutting down.
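
Because the crash message always looks the same, a console log can be checked for it mechanically. The following is a minimal Python sketch, assuming the log is available as a plain string. The names `LIST1_SIGNATURE` and `is_list1_crash` are hypothetical, and the pattern covers only the first line of the message quoted above:

```python
import re

# First line of the "List`1 issue" crash message, as quoted in this report.
LIST1_SIGNATURE = re.compile(
    r"The type initializer for 'System\.Collections\.Generic\.List`1' "
    r"threw an exception\. ---> System\.Threading\.ThreadAbortException"
)

def is_list1_crash(log_text):
    """Return True if the log text contains the List`1 crash signature."""
    return LIST1_SIGNATURE.search(log_text) is not None
```

This matches on the exception text only, so it would not distinguish between the different tests in which the crash shows up.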

Recent example:

https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-arm64/1625/testReport/MonoTests/runtime/appdomain_threadpool_unload_exe/

Older examples:

https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=ubuntu-1404-amd64/1613/testReport/MonoTests/runtime/appdomain1_exe/
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=ubuntu-1404-amd64/1039/testReport/MonoTests/gshared/generic_marshalbyref_2_exe/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4606/testReport/MonoTests/gshared/generic_marshalbyref_2_exe_3/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-amd64/4607/testReport/MonoTests/gshared/generic_marshalbyref_2_exe/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4608/testReport/MonoTests/runtime/bug_415577_exe/
https://jenkins.mono-project.com/job/test-mono-mainline/label=osx-i386/4656/parsed_console/log_content.html#WARNING1 (test shutdown)

4. MonoTests.System.Net.Sockets.SocketTest.SendAsyncFile [Existing]

Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=43172 , currently unassigned.

This has been failing for a very long time. It only occurs on Linux but on Linux it fails over 20% of the time. (It has also been seen on Android.) It is possible this is only an issue in CI (see akoeplinger note in bug).

The failure is consistent and looks like:

MESSAGE:
System.Exception : Could not abort registered blocking threads before closing socket.
Thread StackTrace:
  at System.Net.Sockets.SafeSocketHandle.RegisterForBlockingSyscall () [0x00057] in /mnt/jenkins/workspace/test-mono-mainline-linux/label/ubuntu-1404-amd64/mcs/class/System/System.Net.Sockets/SafeSocketHandle.cs:114
  at System.Net.Sockets.Socket.SendFile_internal (System.Net.Sockets.SafeSocketHandle safeHandle, System.String filename, System.Byte[] pre_buffer, System.Byte[] post_buffer, System.Net.Sockets.TransmitFileOptions flags) [0x00000] in /mnt/jenkins/workspace/test-mono-mainline-linux/label/ubuntu-1404-amd64/mcs/class/System/System.Net.Sockets/Socket.cs:2944
  at System.Net.Sockets.Socket.SendFile (System.String fileName, System.Byte[] preBuffer, System.Byte[] postBuffer, System.Net.Sockets.TransmitFileOptions flags) [0x00028] in /mnt/jenkins/workspace/test-mono-mainline-linux/label/ubuntu-1404-amd64/mcs/class/System/System.Net.Sockets/Socket.cs:2893
[snip]

Examples:

https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=ubuntu-1404-amd64/556/testReport/MonoTests.System.Net.Sockets/SocketTest/SendAsyncFile/
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=ubuntu-1404-i386/558/testReport/MonoTests.System.Net.Sockets/SocketTest/SendAsyncFile/

5. Crash or assert during System.Threading.ThreadPool.RequestWorkerThread [New]

Several times we crashed with stacks like:

  at (wrapper managed-to-native) System.Threading.ThreadPool.RequestWorkerThread () <0x00033>
  at System.Threading.ThreadPoolWorkQueue.EnsureThreadRequested () [0x0001f] in <660ce50ed79647bca25d254afb2313da>:0
  at System.Threading.ThreadPoolWorkQueue.Enqueue (System.Threading.IThreadPoolWorkItem,bool) [0x0006c] in <660ce50ed79647bca25d254afb2313da>:0
  at System.Threading.ThreadPool.QueueUserWorkItemHelper (System.Threading.WaitCallback,object,System.Threading.StackCrawlMark&,bool) [0x00016] in <660ce50ed79647bca25d254afb2313da>:0
  at System.Threading.ThreadPool.UnsafeQueueUserWorkItem (System.Threading.WaitCallback,object) [0x00002] in <660ce50ed79647bca25d254afb2313da>:0

In one case, https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armel/1625/testReport/MonoTests/runtime/marshal_valuetypes_exe/ , an assertion was seen:
* Assertion at ../../mono/utils/refcount.h:41, condition `refcount' not met

In another, https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armel/1621/parsed_console/log_content.html#WARNING1 , a different assertion was seen:
mono_os_mutex_trylock: pthread_mutex_trylock failed with "Invalid argument" (22)

Examples:

MonoTests.runtime.appdomain1.exe
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armel/1623/testReport/MonoTests/runtime/appdomain1_exe/
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=ubuntu-1404-i386/1612/testReport/MonoTests/runtime/appdomain1_exe/
Some native stacks survived in the first link; it looks like a bad semaphore or mutex

MonoTests.gshared.generic-marshalbyref.2.exe
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armel/1622/testReport/MonoTests/gshared/generic_marshalbyref_2_exe/
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armel/1610/testReport/MonoTests/gshared/generic_marshalbyref_2_exe/

MonoTests.runtime.marshal-valuetypes.exe
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armel/1625/testReport/MonoTests/runtime/marshal_valuetypes_exe/
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armhf/1618/testReport/MonoTests/runtime/marshal_valuetypes_exe/

Failure while running the System.Runtime.Remoting tests
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armel/1621/parsed_console/log_content.html#WARNING1

Tests that failed here also frequently failed with the List`1 issue.

6. "thread->suspended not met" assertion [New]

In mono/tests, running different tests (delegate6.cs and pinvoke-2.2.cs), we halted multiple times with this assertion:

* Assertion at threads.c:1082, condition `thread->suspended' not met

Examples:
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=ubuntu-1404-i386/1625/parsed_console/log_content.html#WARNING1
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=ubuntu-1404-i386/1622/parsed_console/log_content.html#WARNING1

Both examples seen were on 32-bit Intel Linux (ubuntu-1404-i386).
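
The platform and build number of each example can be read from the `label=` segment of its Jenkins URL. A minimal Python sketch of that extraction follows, assuming the URL layout used by the links in this report. The helper name `parse_jenkins_url` is hypothetical:

```python
import re

# Matches ".../label=<platform>/<build number>/..." as seen in the links here.
LABEL_RE = re.compile(r"/label=([^/]+)/(\d+)/")

def parse_jenkins_url(url):
    """Return (platform_label, build_number), or None if the URL has no label."""
    match = LABEL_RE.search(url)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

url = ("https://jenkins.mono-project.com/job/test-mono-mainline-linux/"
       "label=ubuntu-1404-i386/1625/parsed_console/log_content.html#WARNING1")
print(parse_jenkins_url(url))  # ('ubuntu-1404-i386', 1625)
```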

7. Test failure in ClientWebSocketTest.CloseAsyncTest [New]

The test failed three times on Linux ARM with:

MESSAGE:
  Expected: True
  But was:  False

Examples:
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armel/1618/testReport/(root)/ClientWebSocketTest/CloseAsyncTest/
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armhf/1617/testReport/(root)/ClientWebSocketTest/CloseAsyncTest/
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armhf/1618/testReport/(root)/ClientWebSocketTest/CloseAsyncTest/

8. "Invoked mono_thread_state_init_from_sigctx from non-Mono thread" assertion [New]

MonoTests.runtime.pinvoke3.exe failed three times with this assertion (and also hung once) on 32-bit ARM Linux.

Examples:
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armel/1597/testReport/MonoTests/runtime/pinvoke3_exe/
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armel/1606/testReport/MonoTests/runtime/pinvoke3_exe/
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armhf/1620/testReport/MonoTests/runtime/pinvoke3_exe/

9. System.Xaml hangs [Existing]

Filed as https://bugzilla.xamarin.com/show_bug.cgi?id=46683 , not assigned. This has been seen persistently a few times a week for a while now; examples are in the bug. A set of Xaml tests is hanging in XamlBackgroundReader.Read (), waiting on a ManualResetEvent that never triggers. This appears to be a class library issue.
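
The shape of this hang (a reader blocked on an event that is never signalled) can be illustrated outside of .NET. The sketch below uses Python's `threading.Event` as a stand-in for ManualResetEvent. It is an analogy only, not the System.Xaml code, and the function name `wait_for_signal` is hypothetical. Waiting with a timeout turns an indefinite hang into a detectable failure:

```python
import threading

def wait_for_signal(ready, timeout_seconds):
    """Wait for `ready` to be set.

    Returns True if the event was signalled, or False if the timeout expired
    first. Waiting without a timeout would block forever on an event that is
    never set, which is the failure shape described above.
    """
    return ready.wait(timeout=timeout_seconds)

never_set = threading.Event()   # nothing ever calls never_set.set()
print(wait_for_signal(never_set, 0.1))
```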

10. ServiceModel crash, probably in System.Net.Sockets.Socket.Receive_internal [New]

While running the System.ServiceModel tests, we saw two crashes. One did not print anything useful; the other failed during MonoTests.System.ServiceModel.Channels.TcpTransportBindingElementTest.SimpleDuplexBuffered with this stack:

  at (wrapper managed-to-native) System.Net.Sockets.Socket.Receive_internal (intptr,byte[],int,int,System.Net.Sockets.SocketFlags,int&,bool) <0x00057>
  at System.Net.Sockets.Socket.Receive_internal (System.Net.Sockets.SafeSocketHandle,byte[],int,int,System.Net.Sockets.SocketFlags,int&,bool) [0x00006] in /media/ssd/jenkins/workspace/test-mono-mainline-linux/label/debian-8-armel/mcs/class/System/System.Net.Sockets/Socket.cs:1528
  at System.Net.Sockets.Socket.Receive (byte[],int,int,System.Net.Sockets.SocketFlags,System.Net.Sockets.SocketError&) [0x00016] in /media/ssd/jenkins/workspace/test-mono-mainline-linux/label/debian-8-armel/mcs/class/System/System.Net.Sockets/Socket.cs:1313
  at System.Net.Sockets.Socket.Receive (byte[],int,int,System.Net.Sockets.SocketFlags) [0x00000] in /media/ssd/jenkins/workspace/test-mono-mainline-linux/label/debian-8-armel/mcs/class/referencesource/System/net/System/Net/Sockets/Socket.cs:1769
/media/ssd/jenkins/workspace/test-mono-mainline-linux/label/debian-8-armel/scripts/ci/babysitter: Command `make` timed out
  at System.Net.Sockets.NetworkStream.Read (byte[],int,int) [0x0009b] in /media/ssd/jenkins/workspace/test-mono-mainline-linux/label/debian-8-armel/mcs/class/referencesource/System/net/System/Net/Sockets/NetworkStream.cs:513
  at System.IO.Stream.ReadByte () [0x00007] in /media/ssd/jenkins/workspace/test-mono-mainline-linux/label/debian-8-armel/mcs/class/referencesource/mscorlib/system/io/stream.cs:753
  at System.ServiceModel.Channels.NetTcp.TcpBinaryFrameManager.ProcessPreambleAckInitiator () [0x00000] in /media/ssd/jenkins/workspace/test-mono-mainline-linux/label/debian-8-armel/mcs/class/System.ServiceModel/System.ServiceModel.Channels.NetTcp/TcpBinaryFrameManager.cs:237

Examples:
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-armel/1625/parsed_console/log_content.html#WARNING2
https://jenkins.mono-project.com/job/test-mono-mainline-linux/label=debian-8-arm64/1624/parsed_console/log_content.html#WARNING2
