[Mono-bugs] [Bug 565666] New: High memory usage in collections

bugzilla_noreply at novell.com bugzilla_noreply at novell.com
Thu Dec 17 12:17:45 EST 2009


http://bugzilla.novell.com/show_bug.cgi?id=565666

http://bugzilla.novell.com/show_bug.cgi?id=565666#c0


           Summary: High memory usage in collections
    Classification: Mono
           Product: Mono: Runtime
           Version: 2.6.x
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: GC
        AssignedTo: lupus at novell.com
        ReportedBy: goldywhite at gmail.com
         QAContact: mono-bugs at lists.ximian.com
          Found By: ---
           Blocker: ---


User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US)
AppleWebKit/532.0 (KHTML, like Gecko) Chrome/3.0.195.38 Safari/532.0

1) My class SegmentedCache<T> loads data in array of arrays[1028*1024] by
rowid. T is class {uint,ulong,ulong,string[2]} rowid is sequence of longs
started with 0 (uint field in T). I make it to avoid memory reallocation on
data add. Each array constructed once with predefined size.

2) Indexing of data based on my class:
BalancedSortedList<TKey, TValue, TBasis, TShortKey> : IDictionary<TKey, TValue>
TKey (ulong fields from T) is semisequenced. In that case i use this
combination of sorted lists:
SortedList<TBasis, SortedList<TShortKey, TValue>> m_cache;
To avoid realocations on data insert I collected count of records in
SortedDictionary<TBasis, int> m_stat; at first
and than constructed each inner SortedList by this constructor: new
SortedList<TShortKey, TValue>(count);
TKey - ulong
TValue - uint
TBasis - ulong
TShortKey - ushort
on 5 mln. of data records there is for about 1000 SortedList<TShortKey, TValue>
In Windows x86 .NET 1 mln. of records takes 119296Kb of Memory.
In Windows x86 Mono it takes twice more - 215864Kb of Memory.

On x64 RHEL5.3 I load 27mln of records and built one unique index on them.
usefull data:
27M * 22 bytes
reference and other data:
 28M(T References)*8 + 27M*12 (string pointers+length) + 28*8 (array
references)
Index useful data:
27M*8
Index additional data:
30K * (8(key)+8(SortedListPointer)+20(arrays in sortedList and size))

~1.5Gb
But in real it takes 7.1 Gb.
For example Oracle TimesTen with same data and T-Tree index takes 3.4 Gb.

Reproducible: Always

Steps to Reproduce:
1.
2.
3.



public class SegmentedCache<T>
    {
        private readonly long m_segmentMask;
        private readonly long m_segmentSize;
        private readonly int m_segmentSizeMod;
        private T[][] datas = new T[1][];
        public virtual T this[long rowid]
        {
            get
            {
                var seg = rowid >> m_segmentSizeMod;
                if (datas.Length <= seg || datas[seg] == null)
                    return default(T);
                var idx = rowid & m_segmentMask;
                return datas[seg][idx];
            }
            set
            {
                var seg = rowid >> m_segmentSizeMod;
                if (datas.Length <= seg)
                {
                    Array.Resize(ref datas, (int)seg + 1);
                    datas[seg] = new T[m_segmentSize];
                }
                else if (datas[seg] == null)
                    datas[seg] = new T[m_segmentSize];
                var idx = rowid & m_segmentMask;
                datas[seg][idx] = value;
            }
        }
}


public sealed class BalancedSortedList<TKey, TValue, TBasis, TShortKey> :
IDictionary<TKey, TValue>
    {
        private readonly SortedList<TBasis, SortedList<TShortKey, TValue>>
m_cache;
        private readonly SplitAction<TKey, TBasis, TShortKey> m_split;
        private readonly JoinAction<TKey, TBasis, TShortKey> m_join;
        private int m_count;
        private SortedDictionary<TBasis, int> m_stat;
        public BalancedSortedList(SplitAction<TKey, TBasis, TShortKey> split,
JoinAction<TKey, TBasis, TShortKey> join)
        {
            m_cache = new SortedList<TBasis, SortedList<TShortKey, TValue>>();
            m_stat = new SortedDictionary<TBasis, int>();
            m_split = split;
            m_join = join;
        }
        private void Add(TKey key, TValue value, bool overRide)
        {
            TShortKey shortKey;
            TBasis basis;
            m_split(key, out basis, out shortKey);
            SortedList<TShortKey, TValue> values;
            lock (m_cache)
            {
                if(!m_cache.TryGetValue(basis, out values))
                {
                    int count;
                    if(m_stat.TryGetValue(basis, out count))
                        m_stat.Remove(basis);
                    values = new SortedList<TShortKey, TValue>(count);
                    m_cache.Add(basis, values);
                }
            }
            lock (values)
            {
                if (!overRide && values.ContainsKey(shortKey))
                    throw new Exception(string.Format("key {0} already
exists",key));
                var cnt = values.Count;
                values[shortKey] = value;
                m_count += values.Count - cnt;
            }
        }
       public TValue this[TKey key]
        {
            get {
                TValue value;
                if (!TryGetValue(key, out value)) return default(TValue);
                return value;
            }
            set { Add(key, value, true);}
        }
        public void Collect(TKey key)
        {
            TShortKey shortKey;
            TBasis basis;
            m_split(key, out basis, out shortKey);
            int count;
            if (m_stat.TryGetValue(basis, out count))
                count++;
            else
                count = 1;
            m_stat[basis] = count;
        }
    }

-- 
Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.


More information about the mono-bugs mailing list