[Mono-dev] Performance issue with DataTable.Load on "large" data sets

Nicklas Overgaard nicklas at isharp.dk
Tue Apr 12 06:09:08 EDT 2011


Hey Alan,

Thanks for picking it up :)

> Firstly the simple change of moving the BeginLoad/EndLoad out of the
> loop could easily be committed as a separate patch. If it's possible
> to verify this change with an additional unit test, all the better! It
> means it can never regress again.

Well, the thing is that simply moving the Begin/End load calls actually
breaks four of the tests. However, after reviewing the test code, I
seriously doubt that those tests are correct - hence the question about
having verified them on Windows :)

The patch along with a little graph showing the performance improvement
has been attached.
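
For readers without the attachment, the idea can be sketched roughly as
below. This is not the attached patch and not the code in
DbDataAdapter.cs; the class and method names (LoadSketch, FillPerRow,
FillBatched) are made up for illustration, and only public
System.Data calls (BeginLoadData, EndLoadData, LoadDataRow,
IDataReader.GetValues) are used.

```csharp
using System;
using System.Data;

// Hypothetical helper for illustration only; not the Mono implementation.
public static class LoadSketch
{
    // Per-row pattern: BeginLoadData/EndLoadData wrap every single row,
    // so EndLoadData (and its constraint handling) runs once per row.
    public static int FillPerRow(DataTable table, IDataReader reader)
    {
        var values = new object[reader.FieldCount];
        int rows = 0;
        while (reader.Read())
        {
            reader.GetValues(values);
            table.BeginLoadData();
            table.LoadDataRow(values, true);
            table.EndLoadData();
            rows++;
        }
        return rows;
    }

    // Batched pattern: one BeginLoadData before the loop and one
    // EndLoadData after it, so EndLoadData runs once for the whole load.
    public static int FillBatched(DataTable table, IDataReader reader)
    {
        var values = new object[reader.FieldCount];
        int rows = 0;
        table.BeginLoadData();
        try
        {
            while (reader.Read())
            {
                reader.GetValues(values);
                table.LoadDataRow(values, true);
                rows++;
            }
        }
        finally
        {
            // Always leave load mode, even if a row fails to load.
            table.EndLoadData();
        }
        return rows;
    }
}
```

Both variants should end up with the same rows when the data violates no
constraints. The observable difference is that the batched variant only
surfaces constraint violations at the single EndLoadData call after the
loop. Whether that is what the failing tests depend on is not
established here.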

I hope that someone with more insight into System.Data can shed some light
on the now-broken unit tests.

I will get back once I have "fixed" the remaining issues; those changes
also give a further performance gain.

And thanks for the tips about testing it on Windows. I will figure
something out.

Best regards,

Nicklas

On Tue, 2011-04-12 at 10:38 +0100, Alan wrote:
> Hey,
> 
> Firstly the simple change of moving the BeginLoad/EndLoad out of the
> loop could easily be committed as a separate patch. If it's possible
> to verify this change with an additional unit test, all the better! It
> means it can never regress again.
> 
> As for the failing tests, the simplest thing to do would be to
> copy/paste the test assembly from linux to windows and execute it
> there to see if all the tests pass. If that doesn't work you could try
> copying/pasting the individual tests you want to verify, compiling
> them on windows and executing that. The complicated way of testing
> would be to check out mono from git, build it on windows and then run
> the tests. Either way, a commit which regresses tests can't be
> accepted unless those tests can be proven to be incorrect (i.e. they
> fail under MS .NET). It's also possible that these are behavioural
> differences between .NET 3 and .NET 4, in which case the modifications
> would have to be conditionally built.
> 
> Alan
> 
> On Tue, Apr 12, 2011 at 9:41 AM, Nicklas Overgaard <nicklas at isharp.dk> wrote:
> > Hi again,
> >
> > I have now made further optimizations, which bring the Load method up
> > to speed with the .NET implementation. However, 5 of the
> > regression tests are now failing.
> >
> > Have all these System.Data regression tests been verified on a Windows
> > machine with .NET? I just don't want to chase bugs / regressions that
> > do not exist or are not valid :)
> >
> > Best regards,
> >
> > Nicklas
> >
> > On Thu, 2011-04-07 at 20:13 +0200, Nicklas Overgaard wrote:
> >> Hi again,
> >>
> >> Sorry for the spamming.
> >>
> >> Moving out the "Begin" and "End" load methods reduced DataTable.Load
> >> time to 1.7 seconds on my test machine, so we are getting there!
> >>
> >> /Nicklas
> >>
> >> On Thu, 2011-04-07 at 19:29 +0200, Nicklas Overgaard wrote:
> >> > Hi again,
> >> >
> >> > I now have a profile log, created with the new mono profiler. It shows
> >> > that the method "EndLoadData" is using up almost all of the time (16
> >> > minutes of the 17 minutes it took to create the dump).
> >> >
> >> > In the file "DbDataAdapter.cs", at line 355 in current git head, the
> >> > "BeginLoadData" and "EndLoadData" methods are called once for each
> >> > row read from the DataReader.
> >> >
> >> > This means that for each row we add to the DataTable, the DataSet is
> >> > being asked to enforce constraints and do other work on the DataTable.
> >> >
> >> > According to MSDN:
> >> > http://msdn.microsoft.com/en-us/library/system.data.datatable.beginloaddata.aspx
> >> >
> >> > "BeginLoadData Turns off notifications, index maintenance, and
> >> > constraints while loading data."
> >> >
> >> > So wouldn't it make sense to move "BeginLoad.." and "EndLoad.." out of
> >> > the loop?
> >> >
> >> > Well, I'm trying it out :)
> >> >
> >> > Best regards,
> >> >
> >> > Nicklas Overgaard
> >> >
> >> > On Thu, 2011-04-07 at 14:58 +0200, Nicklas Overgaard wrote:
> >> > > Hi mono-devers!
> >> > >
> >> > > I'm currently working on a rather large web project, where we are using a
> >> > > combination of mono 2.10.1 and MySQL.
> >> > >
> >> > > Over the past week, I have observed that loading "large" datasets (5000+
> >> > > rows) from MySQL into a DataTable takes a very long time.
> >> > >
> >> > > It's done somewhat like this:
> >> > > <code>
> >> > >
> >> > > comm.CommandText = query;
> >> > > comm.CommandTimeout = MySQLConnection.timeout;
> >> > > MySqlDataReader reader = (MySqlDataReader)comm.ExecuteReader();
> >> > > DataTable dt = new DataTable();
> >> > > dt.Load(reader); // <- this is killing mono
> >> > > reader.Close();
> >> > >
> >> > > </code>
> >> > >
> >> > > I have created a small test program, compiled it on my Linux machine and
> >> > > executed it.
> >> > >
> >> > > It takes 15 seconds to do such an operation under mono - but on Windows it
> >> > > takes only 0.4 seconds (with the same executable, fetching the same
> >> > > data). I have profiled the application on Windows, and it seems that
> >> > > the .NET framework is using specialized methods for loading data from a
> >> > > DataReader.
> >> > >
> >> > > I have been looking through the implementation in mono, in regard to
> >> > > DataTable.Load, and I can see that a lot of validation and other stuff
> >> > > is going on, which could explain the huge difference. I'm also working
> >> > > on a mono log profile trace, to dig a little deeper.
> >> > >
> >> > > Would it be OK if I tried to patch the current mono implementation to
> >> > > reach the same speed as .NET? The reason for asking is that I know
> >> > > I cannot contribute to Mono if I have seen the actual code in .NET (but
> >> > > does a profile result count as "seeing the code"?)
> >> > >
> >> > > Best regards,
> >> > >
> >> > > Nicklas Overgaard
> >> > >
> >> > > _______________________________________________
> >> > > Mono-devel-list mailing list
> >> > > Mono-devel-list at lists.ximian.com
> >> > > http://lists.ximian.com/mailman/listinfo/mono-devel-list
-------------- next part --------------
A non-text attachment was scrubbed...
Name: runtime.png
Type: image/png
Size: 8753 bytes
Desc: not available
Url : http://lists.ximian.com/pipermail/mono-devel-list/attachments/20110412/ae97b936/attachment-0001.png 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: begin-end-opt.patch
Type: text/x-patch
Size: 1463 bytes
Desc: not available
Url : http://lists.ximian.com/pipermail/mono-devel-list/attachments/20110412/ae97b936/attachment-0001.bin 

