[Mono-dev] tuning sgen performance & bug

Jonathan Shore jonathan.shore at gmail.com
Fri Aug 31 22:26:41 UTC 2012


With this specific application (which is single-threaded), I have a "volatile" working set of ~2GB. By volatile I mean that these are not application-lifetime objects; rather, they will be disposed of at some point during evaluation.

More specifically, I read 1.6TB of data incrementally into 1600 timeseries (each basically an array of event objects). Each timeseries only holds a window of data (in my case, half of them hold 25K items and half hold 5K items). Once a timeseries has overrun its window by, say, 1024 elements, the 1024 oldest elements are shifted off and become eligible for GC.

So the pattern is that there are always ~2GB of referenced objects, and periodically 1600 x 1024 old objects to be disposed of. Because the working set is so large, it would seem that these older objects get promoted to the main heap before they die. Reclaiming them then (presumably) requires a much more expensive major collection.
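
As a rough check of the figures above, the sizes can be worked out directly. This is only a sketch: it assumes the 88-byte object size given in the original message quoted further down, and it ignores array storage and any per-object GC overhead. The constant and function names are made up for illustration.

```python
# Rough sizing of the workload described above.
# Assumption: 88 bytes per event object (from the quoted message);
# array storage and GC bookkeeping are ignored.
OBJECT_BYTES = 88
SERIES = 1600
LARGE_WINDOW = 25_000   # items held by half of the series
SMALL_WINDOW = 5_000    # items held by the other half
TRIM_BATCH = 1024       # oldest elements dropped per series on overrun


def live_objects():
    """Number of objects referenced when every window is full."""
    half = SERIES // 2
    return half * LARGE_WINDOW + half * SMALL_WINDOW


def live_bytes():
    """Approximate size of the referenced working set in bytes."""
    return live_objects() * OBJECT_BYTES


def garbage_bytes_per_round():
    """Bytes released if every series trims one batch (an upper bound per round)."""
    return SERIES * TRIM_BATCH * OBJECT_BYTES


if __name__ == "__main__":
    print(f"live objects: {live_objects():,}")
    print(f"live set: {live_bytes() / 1e9:.2f} GB")
    print(f"garbage per trim round: {garbage_bytes_per_round() / 1e6:.0f} MB")
```

Under these assumptions the live set comes to about 2.1 GB (24 million objects), which is consistent with the ~2GB figure, and one trim round across all series releases roughly 144 MB.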

If I understood the sgen algorithm correctly, no matter what the size of the nursery (unless it was 1.6TB), my working set is going to land in the main heap with my object garbage pattern. I believe this is because if the nursery fills, any objects that are still referenced, regardless of age, will be moved to the main heap. Once GC completes, the nursery is empty (except perhaps for pinned objects)?

My objects become garbage in a FIFO pattern rather than anything LIFO-like. The garbage "pipeline" is 2GB deep, so the nursery does not help for this app.
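
The FIFO argument can be illustrated with a toy model. This is not sgen's actual algorithm. It assumes a single allocation stream, a nursery measured in object counts, and that every still-referenced nursery object is promoted at each minor collection (as supposed above), and it ignores pinning. The function name and parameters are hypothetical.

```python
from collections import deque


def promoted_fraction(window, nursery_capacity, total_allocations):
    """Toy model: fraction of allocated objects promoted out of the nursery.

    window            -- number of most recent objects that stay referenced (FIFO)
    nursery_capacity  -- objects allocated between two minor collections
    total_allocations -- how many objects to allocate in the simulation
    """
    live = deque()      # ids of referenced objects, oldest first
    nursery_start = 0   # id of the first object allocated since the last collection
    promoted = 0
    for obj_id in range(total_allocations):
        live.append(obj_id)
        if len(live) > window:
            live.popleft()  # FIFO: the oldest object becomes garbage
        if obj_id + 1 - nursery_start == nursery_capacity:
            # Nursery is full: every nursery object that is still
            # referenced survives and is promoted to the main heap.
            first_survivor = max(live[0], nursery_start)
            promoted += obj_id + 1 - first_survivor
            nursery_start = obj_id + 1
    return promoted / total_allocations


if __name__ == "__main__":
    # Window ten times larger than the nursery: everything is promoted.
    print(promoted_fraction(window=1000, nursery_capacity=100, total_allocations=10_000))
    # Nursery ten times larger than the window: only a tenth is promoted.
    print(promoted_fraction(window=100, nursery_capacity=1000, total_allocations=10_000))
```

In this model the promoted fraction is min(1, window / nursery_capacity), so any window at least as large as the nursery sends every object to the main heap before it dies.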

Assuming Boehm is my only choice, if I expand the series window or the number of series, I quickly run into the maximum heap problem with Boehm.

Ideas?


On Aug 31, 2012, at 5:29 PM, Rodrigo Kumpera wrote:

> There are two situations that make sgen slower than boehm. 
> 
> The first is a non-generational workload. If your survivor rate is too high, a generational collector
> can't compete with a single-space one like boehm.
> 

To some extent this is defined by the size of the nursery, no? 


> The second is when too much of the old generation points to young objects, causing minor collections
> to scan far too much memory to be profitable.
> 
> The nursery size should usually be a not-so-small fraction of the total heap you expect. As a rough guide, you can use
> 1/10 to 1/20 of it.
> 
> Are you expecting to have a heap of multiple gigabytes? Because a 2GB nursery will need at least 8GB of major memory.
> 
> About your crash: I just noticed a very silly thing. We have never tried sgen with huge nurseries because there's an
> implicit 128MB limit due to some internal sizes.
> 
> Jonathan, for such huge heaps, sgen will need the parallel collector to compete with boehm on linux, and that collector
> is not a very mature piece of code in either stability or performance.
> 
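
The sizing rule of thumb quoted above can be written out as a small calculation. This is a sketch only: it reads the 128 limit mentioned in the reply as 128 MiB, which is an assumption, and the function names are hypothetical. It says nothing about which values mono itself accepts for nursery-size.

```python
MIB = 2**20
GIB = 2**30

# Assumption: the implicit limit mentioned in the reply is taken as 128 MiB.
IMPLICIT_NURSERY_LIMIT = 128 * MIB


def nursery_range(expected_heap_bytes):
    """Return (low, high) nursery sizes for the 1/20 to 1/10 rule of thumb."""
    return expected_heap_bytes // 20, expected_heap_bytes // 10


def fits_implicit_limit(nursery_bytes):
    """True if the nursery size does not exceed the assumed implicit limit."""
    return nursery_bytes <= IMPLICIT_NURSERY_LIMIT


if __name__ == "__main__":
    for heap_gib in (1, 2, 8):
        low, high = nursery_range(heap_gib * GIB)
        print(
            f"{heap_gib} GiB heap: nursery {low / MIB:.0f} to {high / MIB:.0f} MiB, "
            f"low end fits limit: {fits_implicit_limit(low)}, "
            f"high end fits limit: {fits_implicit_limit(high)}"
        )
```

For a heap of about 2 GiB the rule gives a nursery somewhere between roughly 102 and 205 MiB, so the assumed 128 MiB limit sits inside the suggested range, while a 2 GiB nursery is far beyond it.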
> 
> On Fri, Aug 31, 2012 at 2:03 PM, Jonathan Shore <jonathan.shore at gmail.com> wrote:
> Hi,
> 
> sgen is now working for me (thanks to a subtle bug fix for thread-local-storage by Zoltan). However, for one application, sgen is 25% slower than the same run with the boehm collector. I am processing some GBs of timeseries data, though only evaluating a window at a time. As the window reaches some size, older objects in the timeseries are dereferenced. The object size is 88 bytes, but I generate many millions of them over the course of a run.
> 
> I suspect that the nursery is too small, so that the objects I want to collect are now in the main heap.    Towards that end I wanted to extend the nursery, and attempted this:
> 
> export MONO_GC_PARAMS="nursery-size=2g"
> 
> This causes mono to crash immediately, with:
> 
> 	* Assertion at sgen-gc.c:1206, condition `idx < section->num_scan_start' not met
> 	...
> 
> (this is on linux with the latest code on master, roughly 2.11.3+)
> 
> I took a look at the code, but it requires too much context for me to understand the real cause of the issue. I am guessing that there is some assumption re: the size of the nursery, block size, etc.
> 
> Finally, I am interested in trying the "copying collector" as discussed in this blog entry: 
> 
> http://schani.wordpress.com/2011/01/10/sgen-the-major-collectors/
> 
> I'm wondering if I will get some performance advantage with this approach, given that the nursery may be too small for my garbage working set.
> 
> Ideas?
> 
> Thanks
> Jonathan
