[Mono-devel-list] Re: [Fwd: Re: Review of system.data]

Wed Apr 27 13:21:22 EDT 2005

Hi Uma,

I finished the failures analysis.

mono testsuite regressions:
1. 2 of 4 regressions are fixed.
2.  MonoTests.System.Data.DataSetTest:WriteDifferentNamespaceSchema: 
this test makes binary comparison of the xml. When I looked into it, our 
output holds the same data, but differently formatted. So our 
implementation is correct and in fact there is no regression. Since we 
did not made any serious changes in this area, this is probably a merge 
problem, which I will find and resolve. From the other side the test 
should be changed to compare xml correctly.
3. MonoTests.System.Data.DataViewTest.ComplexEventSequence1: this test 
is very strict and actually tests for both the documented and 
undocumented behaviour (UpdateIndex). We fixed several bugs in the 
DataView area (can be seen from our testsuite), which are documented and 
under some conditions require additional calls to UpdateIndex. I agree 
that more work should be done to minimize UpdateIndex calls as this 
affects performance. But currently I would propose to make this test 
less strict.

Mainsoft testsuite failures in Mainsoft branch:
All those failures are known, they are common to mono HEAD and us. Their 
impact is small and fixing them is not a trivial task.

If you want to move forward, we branched at r43344, you can use it as a 
base for merge.

Regards,
Konstantin Triger

Konstantin Triger wrote:

>
> Hello Uma,
>
> I'm CCing mono-devel-list to this email, so anyone who wants to review 
> the code can find it at 
> svn://mono.myrealbox.com/source/branches/Mainsoft.System.Data/mcs/class/System.Data 
>
>
> 1. There are 2 efforts made in the code: design improvements related 
> to indices and bug fixes. Attached a brief design document of the 
> indices redesign (questions are welcome). Regarding the bug fixes, I 
> will create a relevant changelog.
> 2. The testsuite we have sent is nunit compatible. To run it, one need 
> to enter the System.Data/Test/System.Data.Tests.Mainsoft directory and 
> issue "make" command. After this, the following command should be 
> issued (from the System.Data root):
> MONO_PATH="../../class/lib/default;;$MONO_PATH" mono  --debug 
> ../../class/lib/default/nunit-console.exe   
> /output:TestResult-Mainsoft.log 
> /exclude:NotWorking,ValueAdd,CAS,InetAccess 
> /xml:TestResult-Mainsoft.xml  
> ../../class/lib/default/System.Data.Tests.Mainsoft.dll
>
> Currently there are 36 failures in the HEAD branch and 7 failures in 
> the Mainsoft branch.
> From the other side there are 4 regressions running the standard mono 
> testsuite in the Mainsoft branch.
>
> I'm analysing all this now and will provide resolution/explanation for 
> each one of the above.
>
> 3. Will send later today.
>
> Regards,
> Konstantin Triger
>
>
>
> S Umadevi wrote:
>
>> Hi  We see that you have checked in the code in SVN under a different
>> branch. You had agreed to send us the following (around 2 weeks back)so
>> that the code can be reviewed effectively.
>> 1. Design document explaining the changes and why the changes are
>> done.
>> 2. Testcases that are failing and what are the testcases that pass with
>> the new code. The testcases that you have sent us are the complete set
>> and as I had mentioned in my previous mails, the testcases that are new
>> in your set needs to be identified to us.(since many  are already
>> existing in nunit style)
>> 3. Since we have implemented the indices using arrays there would be
>> some amount of performance degradation. In your mail, you had agreed to
>> share the performance data/tests with us.. Please can you send them..
>>
>> We have been spending lot of time trying to review/understand the new
>> code, since we will have to take it forward from here for the .net 2.0
>> features, the above would be very useful..
>>
>> Also there have been code fixes after you had branched out, we would
>> need to merge them too..along with merging the testcases to nunit style
>> testing
>>
>>
>> Thanks & Regards
>> Uma
>>
>>
>>
>>
>>  
>>
>
>
>------------------------------------------------------------------------
>
>
>Goal:
>Use indices in all the functionality related to Sort, Select and Filter operations in the implementation of constraints, DataTable, DataView and other classes.
>
>This implies the following requirements to the indices:
>1.	Have description consisting of columns, sort direction per column, DataViewRowState, filter expression object and provide indexing over this description.
>2.	Able to store unique values.
>3.	Preserve order inside non-unique values (need shown by tests).
>4.	Provide access by index (needed for DataView.Find and DataView indexer).
>
>In addition, the following features greatly improve usability/performance:
>1.	Common Index storage and factory to enable index reuse and update (implemented at DataTable level).
>2.	Split the index implementation to Descriptor and Data. The descriptor to be reused in other areas, i.e. constraints (yet to be done).
>3.	Index to be DataContainer friendly, i.e. store indices into DataContainer instead of DataRows
>4.	Index to have a cheap Duplicates property for easy uniqueness ensuring.
>5.	Minimize memory footprint.
>
>To meet all the requirements and provide the features above, a lot of implementation needed to be added. Moreover, the requirements 2, 3, 4 were very hard to implement using Balanced Tree data structure and a way it was used till now.
>
>We decided to choose another data structure, which will fit by design for all the above. The chosen data structure is based on a plain int[], where the stored integers are the indices into the DataContainer.
>
>The solution for each requirement:
>1.	There is a special class called "Key", which is responsible to hold the Index description. It is able to Filter, Compare and feed the Index with the relevant indices to the DataContainer.
>2.	Plain array with Merge sort is used.
>3.	Merge sort is used.
>4.	Plain array.
>
>And solution for each feature:
>1.	The DataTable holds the Index collection. Each Key class has a Compare method and an Index has a reference counting. Thus indices are reused and dropped when the reference falls to 0.
>2.	There are Key and Index classes.
>3.	Implied.
>4.	During index sort or update, a special flag is set if a duplicate is found.
>5.	Implied.
>
>  
>