[Mono-dev] Performance problem with System.Data

senganal thirunavakarasu senganal at gmail.com
Tue Feb 5 12:27:24 EST 2008


Hi

As I said before, my whole reason for trying out the RBTree implementation
was to compare with .NET performance for the
use cases mentioned before (some kind of sorted order for the data). I was
trying to see if using a balanced tree would give me
comparable results. On the other hand, there is always a tradeoff with
memory. I don't remember seeing any major difference in memory
consumption between the basic implementation and .NET (I mean in orders of
magnitude) for large datasets. In fact, if memory consumption is a big issue,
then you could have a mixed implementation (messy to maintain, of course) and
use the tree-based storage purely for sorted data and the basic array
implementation for the rest.
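
As a rough back-of-envelope for the memory side, here is a small Python sketch. Every number in it is an assumption for illustration (row count, node size, number of indexes), not a measurement from Mono or .NET, and index_bytes is a made-up helper name.

```python
def index_bytes(records, bytes_per_entry, indexes=1):
    """Total bytes used by `indexes` indexes with one entry per record."""
    return records * bytes_per_entry * indexes

RECORDS = 300_000   # assumed: "several hundred thousand" rows
ARRAY_ENTRY = 4     # assumed: one 32-bit record number per row
TREE_NODE = 32      # assumed: object header + 3 references + color + record number
INDEXES = 5         # assumed: constraints and views on one table

array_mb = index_bytes(RECORDS, ARRAY_ENTRY, INDEXES) / 1e6
tree_mb = index_bytes(RECORDS, TREE_NODE, INDEXES) / 1e6
print(f"array: {array_mb:.1f} MB, tree: {tree_mb:.1f} MB")
```

Under these assumed sizes the tree costs several times more than the array but stays within one order of magnitude; real node sizes depend on the runtime and would have to be measured.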

I guess this is something that needs to be tested and figured out for different
use cases and compared with .NET.
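
One way to see the loading cost being discussed is the minimal Python sketch below. It is not the System.Data code: a plain sorted list stands in for an array-based index, and both function names are hypothetical. It contrasts keeping the index sorted on every insert with loading everything first and sorting once. A balanced tree would avoid the element shifting in the first case; it is not shown because Python's standard library has no red-black tree.

```python
import bisect
import random

def load_with_index_maintained(keys):
    """Keep a sorted array index valid after every row (constraints enabled)."""
    index = []
    for key in keys:
        # Binary search finds the slot in O(log n), but the list insert
        # shifts every later element, so each row costs O(n) overall.
        bisect.insort(index, key)
    return index

def load_then_build_index(keys):
    """Append rows unsorted and sort once at the end (bulk load)."""
    index = list(keys)
    index.sort()  # one O(n log n) pass instead of n shifting inserts
    return index

random.seed(0)
keys = [random.randrange(1_000_000) for _ in range(20_000)]
incremental = load_with_index_maintained(keys)
bulk = load_then_build_index(keys)
print(incremental == bulk)  # both strategies end with the same index
```

Wrapping each call with time.perf_counter() and raising the row count is a simple way to compare the two strategies on a given machine.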

cheers
senganal

On Feb 5, 2008 5:19 AM, Konstantin Triger <kostat at mainsoft.com> wrote:

>  Hello,
>
>
>
> My major concern with an RBTree implementation is memory consumption.
> Since you must create a node for each record, for a DataTable with several
> hundred thousand records the memory size of an RBTree may reach several MB.
> Since it's common for a DataTable to have several constraints/views, memory
> consumption may easily reach a hundred MB or more. In addition, this will put
> additional pressure on the GC.
>
>
>
> Note that one of the design goals behind the array-based architecture of both
> DataContainers and indices was to minimize memory consumption, because of the
> calculations above.
>
>
>
> I think that we must run many different tests/scenarios before we go with
> an index data structure redesign.
>
>
>
> Regards,
>
> Konstantin Triger
>
> *From:* senganal thirunavakarasu [mailto:senganal at gmail.com]
> *Sent:* Sunday, February 03, 2008 6:29 PM
> *To:* Nagappan A
> *Cc:* Konstantin Triger; Hubert FONGARNAND;
> mono-devel-list at lists.ximian.com
>
> *Subject:* Re: [Mono-dev] Performance problem with System.Data
>
>
>
> Hi
>
> It's been quite a while (~1.5 years) and I don't remember all the details, but
> I will try to explain whatever I remember.
>
> Well, the difference shows up when you have constraints on the DataTable.
> Loading lots of data is inherently slower in array-based storage, as adding
> new data means inserting it in place and not just appending it to the end.
> Of course, if you disable all constraints, load all the data and then enable
> the constraints, things should be fine. But on testing the .NET
> implementation, both use cases (with and without the constraints) were
> fast compared to the Mono implementation.
>
> With an RBTree, inserting and searching are not expensive operations and hence
> perform reasonably well in both cases. In fact, it's probably a little bit
> slower than the array implementation when the constraints are not enforced,
> but the difference is negligibly small.
>
> The RBTree implementation that I had was pretty much a basic
> implementation straight out of the Cormen book. The idea was to check whether
> there would be any significant effect. It definitely could be better
> implemented, which is why I had not checked it in at that time. Anyway, an
> RBTree or any balanced-tree-based implementation would definitely be faster
> than the array-based implementation when it comes to loading/modifying the
> data in a DataTable.
>
> hope that helps..
>
> cheers
> senganal
>
> On Feb 3, 2008 4:03 AM, Nagappan A <nagappan at gmail.com> wrote:
>
> Hi Kosta,
>
> FYI, I haven't executed / compiled any of the programs :)
>
> Attaching whatever I have.
>
> Thanks
> Nagappan
>
>
>
> On Feb 3, 2008 12:02 AM, Konstantin Triger <kostat at mainsoft.com> wrote:
>
> Hi Nagappan,
>
>
>
> As far as I know, when adding many records, the suggested usage of
> DataTable is [BeginLoadData -> add records -> EndLoadData]. In this case the
> performance of both implementations should be roughly similar, but the memory
> footprint of RBTree will be much higher.
>
>
>
> Can you please post Senganal's test code?
>
>
>
> Regards,
>
> Konstantin Triger
>
> *From:* Nagappan A [mailto:nagappan at gmail.com]
> *Sent:* Saturday, February 02, 2008 11:39 PM
> *To:* Konstantin Triger
> *Cc:* Hubert FONGARNAND; mono-devel-list at lists.ximian.com;
> senganal at gmail.com
> *Subject:* Re: [Mono-dev] Performance problem with System.Data
>
>
>
> Hi Kosta,
>
> The RBTree implementation is not directly related to this bug; I was
> speaking about the performance of System.Data in general.
>
> In general, RBTree performance is much better than array-based. As per
> Senganal's test result, adding 1 million records took 40 minutes with the
> array-based implementation. With the RBTree implementation, he was able to
> do it in seconds.
>
> Adding senganal in CC.
>
> Thanks
> Nagappan
>
> 2008/2/2 Konstantin Triger <kostat at mainsoft.com>:
>
> Hey Nagappan,
>
> Can you please explain:
> 1. How will the RBTree implementation solve the issue in the bug?
> 2. Why do you think the RBTree implementation will be superior to Array in
> performance?
>
>
>
> Regards,
> Konstantin Triger
>
>
> --
> Linux Desktop Testing Project - http://ldtp.freedesktop.org
> http://nagappanal.blogspot.com
>
>
>
>