[Mono-dev] Performance issue with DataTable.Load on "large" data sets

Alan alan.mcgovern at gmail.com
Tue Apr 12 05:38:58 EDT 2011


Hey,

Firstly, the simple change of moving the BeginLoadData/EndLoadData
calls out of the loop could easily be committed as a separate patch.
If it's possible to verify this change with an additional unit test,
all the better! It means it can never regress again.
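
For reference, here is a minimal sketch of the pattern being discussed:
BeginLoadData/EndLoadData called once around the whole load rather than
once per row. This is not the actual DbDataAdapter code or the patch.
The class name, table name, columns and row contents are made up for
illustration, and the rows come from a plain loop instead of a
DataReader.

```csharp
using System;
using System.Data;

public static class LoadSketch
{
    // Hypothetical helper: fills a table with `rows` generated rows.
    public static DataTable BuildTable(int rows)
    {
        var table = new DataTable("items");
        table.Columns.Add("id", typeof(int));
        table.Columns.Add("name", typeof(string));
        // A primary key gives EndLoadData a constraint to re-check.
        table.PrimaryKey = new[] { table.Columns["id"] };

        // Turn off notifications, index maintenance and constraints
        // once, before the loop, instead of once per row.
        table.BeginLoadData();
        try
        {
            for (int i = 0; i < rows; i++)
            {
                // true = accept changes for the loaded row
                table.LoadDataRow(new object[] { i, "row " + i }, true);
            }
        }
        finally
        {
            // Re-enable everything and validate constraints a single time.
            table.EndLoadData();
        }
        return table;
    }

    public static void Main()
    {
        DataTable t = BuildTable(5000);
        Console.WriteLine(t.Rows.Count); // prints 5000
    }
}
```

A regression test could be built on something like this, e.g. by
checking that the loaded table ends up with the expected rows and
constraints after a single Begin/End pair.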

As for the failing tests, the simplest thing to do would be to
copy/paste the test assembly from linux to windows and execute it
there to see if all the tests pass. If that doesn't work you could try
copying/pasting the individual tests you want to verify, compiling
them on windows and executing that. The complicated way of testing
would be to check out mono from git, build it on windows and then run
the tests. Either way, a commit which regresses tests can't be
accepted unless those tests can be proven to be incorrect (i.e. they
fail under MS .NET). It's also possible that these are behavioural
differences between .NET 3 and .NET 4, in which case the modifications
would have to be conditionally built.

Alan

On Tue, Apr 12, 2011 at 9:41 AM, Nicklas Overgaard <nicklas at isharp.dk> wrote:
> Hi again,
>
> I have now made further optimizations, which bring the Load method up
> to speed with the .NET implementation. However, 5 of the
> regression tests are now failing.
>
> Have all these System.Data regression tests been verified on a Windows
> machine with .NET? I just don't want to chase bugs / regressions that
> do not exist or are not valid :)
>
> Best regards,
>
> Nicklas
>
> On Thu, 2011-04-07 at 20:13 +0200, Nicklas Overgaard wrote:
>> Hi again,
>>
>> Sorry for the spamming.
>>
>> Moving out the "Begin" and "End" load methods reduced DataTable.Load
>> time to 1.7 seconds on my test machine, so we are getting there!
>>
>> /Nicklas
>>
>> On Thu, 2011-04-07 at 19:29 +0200, Nicklas Overgaard wrote:
>> > Hi again,
>> >
>> > I now have a profile log, created with the new mono profiler. It shows
>> > that the method "EndLoadData" is using up almost all of the time (16
>> > minutes of the 17 minutes it took to create the dump).
>> >
>> > When looking in the file "DbDataAdapter.cs" line 355 in current GIT
>> > head, the "BeginLoadData" and "EndLoadData" methods are called for each
>> > iteration in the DataReader's data.
>> >
>> > This means that for each row we add to the DataTable, the DataSet is
>> > being asked to enforce constraints and other stuff in the DataTable.
>> >
>> > According to MSDN:
>> > http://msdn.microsoft.com/en-us/library/system.data.datatable.beginloaddata.aspx
>> >
>> > "BeginLoadData Turns off notifications, index maintenance, and
>> > constraints while loading data."
>> >
>> > So wouldn't it make sense to move "BeginLoad.." and "EndLoad.." out of
>> > the loop?
>> >
>> > Well, I'm trying it out :)
>> >
>> > Best regards,
>> >
>> > Nicklas Overgaard
>> >
>> > On Thu, 2011-04-07 at 14:58 +0200, Nicklas Overgaard wrote:
>> > > Hi mono-devers!
>> > >
>> > > I'm currently working on a rather large web project, where we are using a
>> > > combination of mono 2.10.1 and MySQL.
>> > >
>> > > Over the past week, I have observed that loading "large" datasets (5000+
>> > > rows) from mysql into a DataTable takes a very long time.
>> > >
>> > > It's done somewhat like this:
>> > > <code>
>> > >
>> > > comm.CommandText = query;
>> > > comm.CommandTimeout = MySQLConnection.timeout;
>> > > MySqlDataReader reader = (MySqlDataReader)comm.ExecuteReader();
>> > > DataTable dt = new DataTable();
>> > > dt.Load(reader); // <- this is killing mono
>> > > reader.Close();
>> > >
>> > > </code>
>> > >
>> > > I have created a small test program, compiled it on my linux machine and
>> > > executed it.
>> > >
>> > > It takes 15 seconds to do such an operation under mono - but on windows it
>> > > takes only 0.4 seconds (with the same executable, fetching the same
>> > > data). I have profiled the application on windows, and it seems that
>> > > the .net framework is using specialized methods for loading data from a
>> > > datareader.
>> > >
>> > > I have been looking through the implementation in mono, in regard to
>> > > DataTable.Load, and I can see that a lot of validation and other stuff
>> > > is going on, which could explain the huge difference. I'm also working
>> > > on a mono log profile trace, to dig a little deeper.
>> > >
>> > > Would it be OK if I tried to patch the current mono implementation to
>> > > gain the same speeds as .net? The reason for asking is that I know that
>> > > I cannot contribute to Mono if I have seen the actual code in .NET (but
>> > > does a profile result count as "seeing the code"?)
>> > >
>> > > Best regards,
>> > >
>> > > Nicklas Overgaard
>> > >
>> > > _______________________________________________
>> > > Mono-devel-list mailing list
>> > > Mono-devel-list at lists.ximian.com
>> > > http://lists.ximian.com/mailman/listinfo/mono-devel-list

