[Mono-list] Parallel Task Library - Frustrating Behavior Leads to OOM Exception

Nigel Delaney nigel.delaney at outlook.com
Sun Dec 15 23:53:42 UTC 2013


Hi,

I had a question/improvement suggestion regarding how the Parallel Task
Library partitions data.  I recently found that when processing large
amounts of data in parallel, Mono would run out of memory for no good
reason.  As an example, consider this code, which performs parallel
calculations over a large amount of data produced by an enumerator:

var myEnumerator = ThingThatProducesChunksOfData.DataGetter();

Parallel.ForEach(myEnumerator, x =>
{
    CalculateSomething(x);
});

If the dataset is large enough, the Mono implementation is effectively
guaranteed to run out of memory on this code, while the Microsoft
implementation is not.

This happened to me recently (essentially I was running the above loop
and found that memory usage increased linearly until I got an OOM
exception).  This was frustrating, because each individual item from the
enumerator easily fit in memory, so I never expected the loop to exhaust
it, and it took me some time to trace why.

After tracing the issue, I found that by default Mono divides up the
enumerator using an EnumerablePartitioner class.  This class has a very
strange behavior: every time it hands data out to a task, it "chunks"
the data by an ever-increasing (and unchangeable) factor of two.  So the
first time a task asks for data it gets a chunk of size 1, the next time
one of size 2*1 = 2, the next time 2*2 = 4, then 2*4 = 8, and so on.
The result is that the amount of data handed to a task, and therefore
held in memory simultaneously, grows with the length of the task, and if
enough data is being processed, an out-of-memory exception inevitably
occurs.  Presumably the original reason for this behavior is to avoid
having each thread return many times to get more data, but it seems to
rest on the assumption that all the data being processed fits into
memory (not the case when reading from large files).
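
To make the pattern concrete, here is a minimal sketch of that doubling
strategy (illustrative only, written from my reading of the behavior,
not Mono's actual EnumerablePartitioner source):

using System.Collections.Generic;

// Each request from a worker doubles the chunk size, so the number of
// items buffered at once grows without bound: 1, 2, 4, 8, ...
static IEnumerable<List<T>> DoublingChunks<T>(IEnumerator<T> source)
{
    int chunkSize = 1;
    while (true)
    {
        var chunk = new List<T>(chunkSize);
        while (chunk.Count < chunkSize && source.MoveNext())
            chunk.Add(source.Current);
        if (chunk.Count == 0)
            yield break;        // source exhausted
        yield return chunk;     // caller now holds chunkSize items at once
        chunkSize *= 2;         // the problematic unbounded doubling
    }
}

After its k-th request a worker is holding 2^(k-1) items, and since each
chunk also takes proportionally longer to process, this shows up as the
roughly linear memory growth over time that I observed.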


After tracing this issue, I was able to avoid it by creating a custom
data partitioner, but I basically lost an afternoon figuring out what
was going on.  I don't know how the Microsoft team wrote their data
partitioners to avoid the above problem, and I understand that Xamarin's
focus is on environments where this is not likely to occur.  However, if
any experts out there have good suggestions for quick fixes to avoid the
OOM error, or for raising the error in a way that alerts users that they
should implement a custom partitioner, I thought that might be useful
(and a good improvement).  The only thing I could think of immediately
was adding a try/catch block around lines 125-131 of the
EnumerablePartitioner.cs file to give a more informative error message
with the OOM exception, but I figured the community/Xamarin could beat
that, so I just wanted to alert people to it.
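
For anyone who hits the same problem before it is fixed, the workaround
that worked for me was to supply my own partitioner.  If your Mono
profile includes the .NET 4.5 overload of Partitioner.Create (I have not
checked which versions do), you can also simply disable chunk buffering;
the names below are the placeholders from the example above:

using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hand items to workers one at a time instead of letting the default
// partitioner buffer ever-larger chunks.
var partitioner = Partitioner.Create(
    ThingThatProducesChunksOfData.DataGetter(),
    EnumerablePartitionerOptions.NoBuffering);

Parallel.ForEach(partitioner, x => { CalculateSomething(x); });

This keeps at most one item per worker in flight, so memory use stays
bounded no matter how long the loop runs.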


Warm wishes,

N

