[Mono-dev] malloc error executing OBS-built mono

Miguel de Icaza miguel at xamarin.com
Fri Oct 23 14:31:37 UTC 2015


Well, the good news is that this should be trivial to reproduce under gdb.
Are you able to do that?

The Valgrind output does not include line numbers. Could you perhaps
compile with debug symbols, or skip the step where you strip those symbols
out?

As for the bug, I cannot figure out what it is. The issue Valgrind reports
is a buffer allocation that is later passed to strlen, but the function at
that revision does not have any buffers allocated from dllmap_start that
are later passed to strlen.
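
For readers unfamiliar with this class of Memcheck report, here is a minimal
sketch of the pattern it describes. It is not the code in dllmap_start (which,
as noted above, does not appear to contain this pattern at that revision);
copy_prefix is a hypothetical helper written only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not Mono code: copy the first n bytes of src into a
 * fresh heap buffer. malloc() does not initialise memory, so without the
 * explicit terminator below, strlen() would scan uninitialised bytes. That
 * is what Memcheck reports as "Conditional jump or move depends on
 * uninitialised value(s)". */
char *copy_prefix(const char *src, size_t n)
{
    assert(src != NULL);

    size_t len = strlen(src);
    if (n > len)
        n = len;                 /* never read past the source string */

    char *buf = malloc(n + 1);   /* +1 for the NUL terminator */
    if (buf == NULL)
        return NULL;

    memcpy(buf, src, n);
    buf[n] = '\0';               /* omitting this line is the bug class */
    return buf;
}
```

If the terminator were left out, strlen on the result could also run past the
end of the block, which is one plausible way for a later malloc call to find
the heap in a bad state.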

On Fri, Oct 23, 2015 at 3:28 AM, Miguel González <
mgonzalez at codicesoftware.com> wrote:

> Hi,
>
>
>
> I’ve run mono inside valgrind. I had already done that when I first ran
> into this issue, but I didn’t save the results :-(  This is the summarized
> output:
>
>
>
> $ valgrind -v --track-origins=yes --leak-check=full
> /opt/plasticscm5/mono/bin/mono
> /opt/plasticscm5/mono/lib/mono/4.5/gacutil.exe -l
>
>
>
> ==4421== Memcheck, a memory error detector
>
> ==4421== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
>
> ==4421== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
>
> ==4421== Command: /opt/plasticscm5/mono/bin/mono
> /opt/plasticscm5/mono/lib/mono/4.5/gacutil.exe -l
>
> ==4421==
>
> --4421-- Valgrind options:
>
> --4421--    -v
>
> --4421--    --track-origins=yes
>
> --4421--    --leak-check=full
>
> --4421-- Contents of /proc/version:
>
> --4421--   Linux version 3.4.33-2.24-desktop (geeko at buildhost) (gcc
> version 4.7.1 20120723 [gcc-4_7-branch revision 189773] (SUSE Linux) ) #1
> SMP PREEMPT Tue Feb 26 03:34:33 UTC 2013 (5f00a32)
>
> --4421-- Arch and hwcaps: X86, x86-sse1-sse2
>
> --4421-- Page sizes: currently 4096, max supported 4096
>
> --4421-- Valgrind library directory: /usr/lib/valgrind
>
> ==4421==
>
> ### ... more command output (see attached file) ... ###
>
> ==4421==
>
> ==4421== HEAP SUMMARY:
>
> ==4421==     in use at exit: 19,546 bytes in 643 blocks
>
> ==4421==   total heap usage: 81,884 allocs, 81,241 frees, 36,260,501 bytes
> allocated
>
> ==4421==
>
> ==4421== Searching for pointers to 643 not-freed blocks
>
> ==4421== Checked 23,326,752 bytes
>
> ==4421==
>
> ==4421== LEAK SUMMARY:
>
> ==4421==    definitely lost: 5,175 bytes in 326 blocks
>
> ==4421==    indirectly lost: 64 bytes in 8 blocks
>
> ==4421==      possibly lost: 288 bytes in 2 blocks
>
> ==4421==    still reachable: 14,019 bytes in 307 blocks
>
> ==4421==         suppressed: 0 bytes in 0 blocks
>
> ==4421== Reachable blocks (those to which a pointer was found) are not
> shown.
>
> ==4421== To see them, rerun with: --leak-check=full --show-reachable=yes
>
> ==4421==
>
> ### ... Detailed info about leaked heap blocks (see attached file) ... ###
>
> ==4421==
>
> ==4421== ERROR SUMMARY: 130 errors from 130 contexts (suppressed: 0 from 0)
>
> ==4421==
>
> ==4421== 1 errors in context 1 of 130:
>
> ==4421== Conditional jump or move depends on uninitialised value(s)
>
> ==4421==    at 0x402C2D9: strlen (in
> /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
>
> ==4421==    by 0x81B749E: dllmap_start (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x8283F41: monoeg_g_markup_parse_context_parse (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x81B6E37: mono_config_parse_xml_with_context (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x81B6F6C: mono_config_parse_file_with_context (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x81B6FD6: mono_config_parse_file (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x81B808B: mono_config_parse (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x80C1448: mono_main (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x8065E51: main (in /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==  Uninitialised value was created by a heap allocation
>
> ==4421==    at 0x402B9FD: malloc (in
> /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
>
> ==4421==    by 0x827C84F: monoeg_malloc (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x81B7476: dllmap_start (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x8283F41: monoeg_g_markup_parse_context_parse (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x81B6E37: mono_config_parse_xml_with_context (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x81B6F6C: mono_config_parse_file_with_context (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x81B6FD6: mono_config_parse_file (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x81B808B: mono_config_parse (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x80C1448: mono_main (in
> /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==    by 0x8065E51: main (in /opt/plasticscm5/mono/bin/mono-sgen)
>
> ==4421==
>
> ==4421== ERROR SUMMARY: 130 errors from 130 contexts (suppressed: 0 from 0)
>
>
>
> I’ve attached the full report as a text file to this mail.
>
>
>
> Thanks!
>
>
>
> Miguel
>
>
>
> *From:* Miguel de Icaza [mailto:miguel at xamarin.com]
> *Sent:* 22 October 2015 18:09
>
> *To:* Miguel González <mgonzalez at codicesoftware.com>
> *Cc:* mono-devel-list at lists.ximian.com
> *Subject:* Re: [Mono-dev] malloc error executing OBS-built mono
>
>
>
> Hello,
>
>
>
> Well, the great news is that this happens without involving the JIT or GC
> - it happens just at startup, so this is a plain-old C bug.
>
>
>
> Your best bet is now to run this with Valgrind and see what it tells you.
>
>
>
> On Thu, Oct 22, 2015 at 12:04 PM, Miguel González <
> mgonzalez at codicesoftware.com> wrote:
>
> I executed a sample command (gacutil.exe -l) under gdb to see the trace
> at the time of the crash.
>
>
>
> This is the backtrace as returned by gdb:
>
>
>
> Program received signal SIGABRT, Aborted.
>
> 0xf7e1b245 in raise () from /lib/libc.so.6
>
> #0  0xf7e1b245 in raise () from /lib/libc.so.6
>
> #1  0xf7e1cac3 in abort () from /lib/libc.so.6
>
> #2  0xf7e635cb in malloc_printerr () from /lib/libc.so.6
>
> #3  0xf7e63aec in top_check () from /lib/libc.so.6
>
> #4  0xf7e65b13 in malloc_check () from /lib/libc.so.6
>
> #5  0xf7e66c45 in malloc () from /lib/libc.so.6
>
> #6  0xf7e6b071 in strdup () from /lib/libc.so.6
>
> #7  0x0827d54c in monoeg_g_strsplit ()
>
> #8  0x081b70a5 in arch_matches ()
>
> #9  0x081b70fa in arch_matches ()
>
> #10 0x081b7770 in dllmap_start ()
>
> #11 0x08283f42 in monoeg_g_markup_parse_context_parse ()
>
> #12 0x081b6e38 in mono_config_parse_xml_with_context ()
>
> #13 0x081b6f6d in mono_config_parse_file_with_context ()
>
> #14 0x081b6fd7 in mono_config_parse_file ()
>
> #15 0x081b808c in mono_config_parse ()
>
> #16 0x080c1449 in mono_main ()
>
> #17 0x08065e52 in main ()
>
>
>
> It seems to be failing when the /etc/mono/config file is loaded. These are
> the config file contents. The zlib line has been manually added to the code
> retrieved from GitHub.
>
> <configuration>
>
>    <dllmap dll="i:cygwin1.dll" target="libc.so.6" os="!windows" />
>
>    <dllmap dll="libc" target="libc.so.6" os="!windows"/>
>
>    <dllmap dll="intl" target="libc.so.6" os="!windows"/>
>
>    <dllmap dll="intl" name="bind_textdomain_codeset" target="libc.so.6"
> os="solaris"/>
>
>    <dllmap dll="libintl" name="bind_textdomain_codeset" target="libc.so.6"
> os="solaris"/>
>
>    <dllmap dll="libintl" target="libc.so.6" os="!windows"/>
>
>    <dllmap dll="i:libxslt.dll" target="libxslt.so" os="!windows"/>
>
>    <dllmap dll="i:odbc32.dll" target="libodbc.so" os="!windows"/>
>
>    <dllmap dll="i:odbc32.dll" target="libiodbc.dylib" os="osx"/>
>
>    <dllmap dll="oci" target="libclntsh.so" os="!windows"/>
>
>    <dllmap dll="db2cli" target="libdb2_36.so" os="!windows"/>
>
>    <dllmap dll="MonoPosixHelper"
> target="$mono_libdir/libMonoPosixHelper.so" os="!windows" />
>
>    <dllmap dll="i:msvcrt" target="libc.so.6" os="!windows"/>
>
>    <dllmap dll="i:msvcrt.dll" target="libc.so.6" os="!windows"/>
>
>    <dllmap dll="sqlite" target="libsqlite.so.0" os="!windows"/>
>
>    <dllmap dll="sqlite3" target="libsqlite3.so.0" os="!windows"/>
>
>    <dllmap dll="libX11" target="libX11.so.6" os="!windows" />
>
>    <dllmap dll="libgdk-x11-2.0" target="libgdk-x11-2.0.so.0"
> os="!windows"/>
>
>    <dllmap dll="libgtk-x11-2.0" target="libgtk-x11-2.0.so.0"
> os="!windows"/>
>
>    <dllmap dll="libXinerama" target="libXinerama.so.1" os="!windows" />
>
>    <dllmap dll="libcairo-2.dll" target="libcairo.so.2" os="!windows"/>
>
>    <dllmap dll="libcairo-2.dll" target="libcairo.2.dylib" os="osx"/>
>
>    <dllmap dll="libcups" target="libcups.so.2" os="!windows"/>
>
>    <dllmap dll="libcups" target="libcups.dylib" os="osx"/>
>
>    <dllmap dll="i:kernel32.dll">
>
>        <dllentry dll="__Internal" name="CopyMemory"
> target="mono_win32_compat_CopyMemory"/>
>
>        <dllentry dll="__Internal" name="FillMemory"
> target="mono_win32_compat_FillMemory"/>
>
>        <dllentry dll="__Internal" name="MoveMemory"
> target="mono_win32_compat_MoveMemory"/>
>
>        <dllentry dll="__Internal" name="ZeroMemory"
> target="mono_win32_compat_ZeroMemory"/>
>
>    </dllmap>
>
>    <dllmap dll="gdiplus" target="libgdiplus.so" os="!windows"/>
>
>    <dllmap dll="gdiplus.dll" target="libgdiplus.so"  os="!windows"/>
>
>    <dllmap dll="gdi32" target="libgdiplus.so" os="!windows"/>
>
>    <dllmap dll="gdi32.dll" target="libgdiplus.so" os="!windows"/>
>
>    <dllmap dll="z" target="libz.so.1" os="!windows" />
>
> </configuration>
>
>
>
> Anyway, as I mentioned, we’re putting the openSUSE 12.3 build aside. We’ve
> noticed that openSUSE 12.3, 13.1 and 13.2 can work with our openSUSE 12.2
> repository.
>
>
>
> Miguel
>
>
>
> *From:* mono-devel-list-bounces at lists.ximian.com [mailto:
> mono-devel-list-bounces at lists.ximian.com] *On Behalf Of *Miguel González
> *Sent:* 22 October 2015 10:49
> *To:* Miguel de Icaza <miguel at xamarin.com>
>
>
> *Cc:* mono-devel-list at lists.ximian.com
> *Subject:* Re: [Mono-dev] malloc error executing OBS-built mono
>
>
>
> Hi Miguel,
>
>
>
> Thanks for the update :-) I also had an e-mail from Zoltan Varga yesterday
> telling me that the change had been applied. I updated my code accordingly
> to see if that could be the fix for my issue, too… But it didn’t :-(
>
>
>
> I’ll try using gdb as you suggested. I hadn’t considered it before since
> this crash only happens inside the automated build of the OBS worker
> virtual machine and I’ve not been able to reproduce it in any other
> environment. However, any additional info is always appreciated! There is
> also a chance we will ditch this build and rely on other distros to build
> our packages, since our OpenSUSE packages seem to be highly compatible
> between versions.
>
>
>
> Thank you,
>
>
>
> Miguel
>
>
>
> *From:* Miguel de Icaza [mailto:miguel at xamarin.com]
> *Sent:* 21 October 2015 20:44
> *To:* Miguel González <mgonzalez at codicesoftware.com>
> *Cc:* mono-devel-list at lists.ximian.com
> *Subject:* Re: [Mono-dev] malloc error executing OBS-built mono
>
>
>
> Hello Miguel,
>
>
>
> I also had to apply this change in order to avoid an unallowed warning
> message:
>
> I: Statement might be overflowing a buffer in strncat. Common mistake:
>
>    BAD: strncat(buffer,charptr,sizeof(buffer)) is wrong, it takes the
>
>    left over size as 3rd argument
>
>    GOOD: strncat(buffer,charptr,sizeof(buffer)-strlen(buffer)-1)
>
>
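
To make the BAD/GOOD distinction in that warning concrete, here is a minimal
sketch. append_bounded is a hypothetical helper, not the Mono code that was
changed (that code was replaced with the glib string operations, as mentioned
below).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not Mono code: append src to the NUL-terminated
 * string in dst, whose total capacity is dst_size bytes. strncat's third
 * argument is the maximum number of characters to append, excluding the
 * terminator, so the caller must pass the space left over rather than the
 * size of the whole buffer. */
void append_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t used = strlen(dst);
    if (used + 1 >= dst_size)
        return;                               /* no room left */

    strncat(dst, src, dst_size - used - 1);   /* always NUL-terminates */
}
```

With an 8-byte buffer already holding "ab", this appends at most five more
characters, so the terminator still fits inside the buffer.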
>
> We replaced that code with the glib string operations just yesterday:
>
>
>
> 042ddd504c09977682bb48010c5642390826d1da
>
>
>
> But thanks for sharing.
>
>
>
> At this point I’m able to build mono RPM packages, and they work when I
> install them on a test OpenSUSE 12.3 virtual machine. However, when the
> GTK# builds (which use the mono packages as a build requirement) are
> started, the worker is unable to run the mono executable: apparently, the
> heap is being corrupted. This is a sample execution as extracted
> from the OBS build logs:
>
> [  101s] + /opt/plasticscm5/mono/bin/mono /opt/plasticscm5/mono/lib/mono/4.5/gacutil.exe -l
>
> [  101s] *** Error in `/opt/plasticscm5/mono/bin/mono': malloc: top chunk is corrupt: 0x08ab9230 ***
>
>
>
> What you want to do at this point is run the process under gdb, as this
> will show where malloc detected the error. You should then get both the
> unmanaged stack trace and, if possible, the managed one (with the
> mono_stack gdb macro).
>
>
>
> Miguel.
>
>
>