[Mono-dev] Increasing stack size? [was: Re: xbuild crash with mono4.6.1?]

Jon Purdy jopur at microsoft.com
Wed Nov 2 00:47:24 UTC 2016


Yup, I’ll look into it.

From: Mono-devel-list <mono-devel-list-bounces at lists.dot.net> on behalf of Rodrigo Kumpera <kumpera at gmail.com>
Date: Tuesday, November 1, 2016 at 5:46 PM
To: Burkhard Linke <blinke at cebitec.uni-bielefeld.de>
Cc: "mono-devel-list at lists.dot.net" <mono-devel-list at lists.dot.net>
Subject: Re: [Mono-dev] Increasing stack size? [was: Re: xbuild crash with mono4.6.1?]

This looks like a bug in mono's qsort.

It should not need more than 18-36 levels of recursion.
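
For reference, a minimal sketch of one common way to keep quicksort's stack depth logarithmic: recurse only into the smaller partition and loop over the larger one. This is not the code in sgen-qsort.c; the names qsort_bounded, partition_sketch, swap_bytes and cmp_int are hypothetical, and the Lomuto partition with the last element as pivot is chosen only for brevity. With this scheme the nesting below the top-level call is at most floor(log2(nel)), which is 17 for the 229138-entry array in the backtrace below, even when the input is already sorted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: swap two elements of `width` bytes each. */
static void swap_bytes(char *a, char *b, size_t width)
{
    while (width--) {
        char t = *a;
        *a++ = *b;
        *b++ = t;
    }
}

/* Lomuto partition with the last element as pivot.
 * Returns the final index of the pivot. Requires nel >= 2. */
static size_t partition_sketch(char *base, size_t nel, size_t width,
                               int (*compar)(const void *, const void *))
{
    char *pivot = base + (nel - 1) * width;
    size_t store = 0;

    assert(nel >= 2 && width > 0);
    for (size_t i = 0; i + 1 < nel; i++) {
        if (compar(base + i * width, pivot) < 0) {
            swap_bytes(base + i * width, base + store * width, width);
            store++;
        }
    }
    swap_bytes(base + store * width, pivot, width);
    return store;
}

/* Sorts in place. Recurses only into the smaller side and iterates on the
 * larger one, so each nested call sees at most half of its caller's range.
 * Returns the deepest nesting level reached (the initial call passes 0). */
static size_t qsort_bounded(char *base, size_t nel, size_t width,
                            int (*compar)(const void *, const void *),
                            size_t depth)
{
    size_t deepest = depth;

    while (nel > 1) {
        size_t p = partition_sketch(base, nel, width, compar);
        size_t left = p;             /* elements before the pivot */
        size_t right = nel - p - 1;  /* elements after the pivot */
        size_t d;

        if (left < right) {
            d = qsort_bounded(base, left, width, compar, depth + 1);
            base += (p + 1) * width; /* continue with the larger right side */
            nel = right;
        } else {
            d = qsort_bounded(base + (p + 1) * width, right, width, compar,
                              depth + 1);
            nel = left;              /* continue with the larger left side */
        }
        if (d > deepest)
            deepest = d;
    }
    return deepest;
}

/* Example comparator for int elements. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}
```

A call such as qsort_bounded((char *)array, nel, sizeof array[0], cmp_int, 0) sorts the array and reports the depth it reached. The running time can still degrade to quadratic on unlucky pivots; only the stack usage is bounded.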

Vlad/John, could you look at this issue?

--
Rodrigo

On Tue, Nov 1, 2016 at 9:35 AM, Burkhard Linke <blinke at cebitec.uni-bielefeld.de> wrote:
Hi,


The allocation failure is indeed caused by mmap being unable to create additional mappings.


With more mappings the application is able to continue, but it runs into another problem:


Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f5afe3b8700 (LWP 55986)]
0x000000000061cd37 in memcpy (__src=0x7f5ab2e147f8, __dest=0x7f5afe3b6c30,
    __len=8) at /usr/include/x86_64-linux-gnu/bits/string3.h:52
52    }
(gdb) bt
#0  0x000000000061cd37 in memcpy (__src=0x7f5ab2e147f8, __dest=0x7f5afe3b6c30,
    __len=8) at /usr/include/x86_64-linux-gnu/bits/string3.h:52
#1  partition (swap_tmp=0x7f5afe3b6c20 "", pivot_tmp=0x7f5afe3b6c30 "", compar=
    0x60ae60 <block_usage_comparer>, width=8, nel=4517, base=0x7f5ab2e10168)
    at sgen-qsort.c:31
#2  qsort_rec (base=base at entry=0x7f5ab2e10168, nel=nel at entry=4517,
    width=width at entry=8, compar=compar at entry=0x60ae60 <block_usage_comparer>,
    pivot_tmp=pivot_tmp at entry=0x7f5afe3b6c30 "",
    swap_tmp=swap_tmp at entry=0x7f5afe3b6c20 "") at sgen-qsort.c:52
#3  0x000000000061ce7b in qsort_rec (base=base at entry=0x7f5ab2e10168,
    nel=nel at entry=4518, width=width at entry=8, compar=compar at entry=
    0x60ae60 <block_usage_comparer>,
    pivot_tmp=pivot_tmp at entry=0x7f5afe3b6c30 "",
    swap_tmp=swap_tmp at entry=0x7f5afe3b6c20 "") at sgen-qsort.c:53
#4  0x000000000061ce7b in qsort_rec (base=base at entry=0x7f5ab2e10168,
    nel=nel at entry=4519, width=width at entry=8, compar=compar at entry=
    0x60ae60 <block_usage_comparer>,
    pivot_tmp=pivot_tmp at entry=0x7f5afe3b6c30 "",
    swap_tmp=swap_tmp at entry=0x7f5afe3b6c20 "") at sgen-qsort.c:53
...

(gdb) bt -20
#18349 0x000000000061ce7b in qsort_rec (base=0x7f5ab2dbc030,
    base at entry=0x7f5ab2dbc000, nel=184426, nel at entry=184432,
    width=width at entry=8, compar=compar at entry=0x60ae60 <block_usage_comparer>,
    pivot_tmp=pivot_tmp at entry=0x7f5afe3b6c30 "",
    swap_tmp=swap_tmp at entry=0x7f5afe3b6c20 "") at sgen-qsort.c:53
#18350 0x000000000061ce7b in qsort_rec (base=base at entry=0x7f5ab2dbc000,
    nel=nel at entry=184433, width=width at entry=8, compar=compar at entry=
    0x60ae60 <block_usage_comparer>,
    pivot_tmp=pivot_tmp at entry=0x7f5afe3b6c30 "",
    swap_tmp=swap_tmp at entry=0x7f5afe3b6c20 "") at sgen-qsort.c:53
#18351 0x000000000061ce7b in qsort_rec (base=base at entry=0x7f5ab2dbc000,
    nel=nel at entry=229138, width=width at entry=8, compar=compar at entry=
    0x60ae60 <block_usage_comparer>,
    pivot_tmp=pivot_tmp at entry=0x7f5afe3b6c30 "",
    swap_tmp=swap_tmp at entry=0x7f5afe3b6c20 "") at sgen-qsort.c:53
#18352 0x000000000061cedd in sgen_qsort (base=base at entry=0x7f5ab2dbc000,
    nel=nel at entry=229138, width=width at entry=8, compar=compar at entry=
    0x60ae60 <block_usage_comparer>) at sgen-qsort.c:69
#18353 0x000000000060b7df in sgen_evacuation_freelist_blocks (
    block_list=0x7f5b8576b300, size_index=10) at sgen-marksweep.c:1860
#18354 0x000000000060d319 in major_start_major_collection ()
    at sgen-marksweep.c:1898
#18355 0x0000000000604f59 in major_start_collection (
    reason=reason at entry=0x702fb1 "LOS overflow",
    concurrent=concurrent at entry=0,
    old_next_pin_slot=old_next_pin_slot at entry=0x7f5afe3b6d28) at sgen-gc.c:1923
#18356 0x0000000000607678 in major_do_collection (forced=0, is_overflow=0,
    reason=0x702fb1 "LOS overflow") at sgen-gc.c:2082
#18357 major_do_collection (reason=0x702fb1 "LOS overflow", is_overflow=0,
    forced=0) at sgen-gc.c:2065
#18358 0x0000000000607d44 in sgen_perform_collection (requested_size=43344,
    generation_to_collect=1, reason=0x702fb1 "LOS overflow", wait_to_finish=0,
    stw=1) at sgen-gc.c:2279
#18359 0x000000000060823c in sgen_ensure_free_space (size=<optimized out>,
    generation=<optimized out>) at sgen-gc.c:2232
#18360 0x000000000060a259 in sgen_los_alloc_large_inner (
    vtable=vtable at entry=0xe004a8, size=size at entry=43344) at sgen-los.c:379
#18361 0x00000000005fb580 in sgen_alloc_obj_nolock (
    vtable=vtable at entry=0xe004a8, size=size at entry=43344) at sgen-alloc.c:175
#18362 0x00000000005e8da1 in mono_gc_alloc_string (vtable=vtable("string"),
    size=size at entry=43344, len=len at entry=21661) at sgen-mono.c:1833
#18363 0x00000000005c5025 in mono_string_new_size_checked (domain=0xdd2fe0,
    len=len at entry=21661, error=error at entry=0x7f5afe3b6eb0) at object.c:6074
#18364 0x0000000000597899 in ves_icall_System_String_InternalAllocateStr (
    length=21661) at string-icalls.c:41
#18365 0x00000000405fbed2 in ?? ()
#18366 0x00007f5b016fdd78 in ?? ()
#18367 0x00007f5aaa5c6930 in ?? ()
#18368 0x0000000000000000 in ?? ()


The stack overflows because of the 18368 stack frames, almost all of them created by the recursive quicksort implementation in sgen-qsort.c. The application creates a large number of short-lived objects, and the memory is badly fragmented (229138 entries in the freelist). The stack size has already been increased to 16M, and the GC nursery size is set to 2G to cope with the high number of temporary objects, which keeps the number of mmap'ed fragments lower (~60,000 instead of ~120,000).

Does Mono honor the system stack size limit (and thus allow larger stacks for larger values of ulimit -s)?
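
As a side note, the limit that ulimit -s sets can be read from inside a process with the POSIX getrlimit call. The following is a minimal sketch; the helper name soft_stack_limit is hypothetical, and it only shows what limit the process sees. Whether Mono applies that limit to the threads it creates is exactly the open question above and is not answered by this snippet.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: stores the soft RLIMIT_STACK value (in bytes) in *out.
 * Returns 0 on success and -1 if getrlimit fails.
 * A value of RLIM_INFINITY means the stack size is unlimited. */
static int soft_stack_limit(rlim_t *out)
{
    struct rlimit rl;

    assert(out != NULL);
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *out = rl.rlim_cur;
    return 0;
}
```

Calling soft_stack_limit(&limit) from a small test program started under different ulimit -s settings shows the value a child process inherits from the shell.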

Regards,
Burkhard
_______________________________________________
Mono-devel-list mailing list
Mono-devel-list at lists.dot.net
http://lists.dot.net/mailman/listinfo/mono-devel-list
