[Mono-list] XmlDataDocument

scottdw2@uwm.edu scottdw2@uwm.edu
Thu, 30 May 2002 10:52:55 -0500 (CDT)


This is kind of long, but it’s my $0.02 on the XmlDataDocument issue....

I'm going to the ACM SIGMOD / PODS conference next week, and one of the 
scheduled presentations is on mechanisms for indexing XML, so what I have to 
say might change in a week, but this is what I see as the best way to go right 
now:


The most important thing you need to do is create an isomorphism (a 1:1 
mapping) between nodes in a general tree and rows in a relation. In particular, 
this means that two indexes are required: one to map an XmlElement instance to 
a DataRow instance and one to map a DataRow instance to an XmlElement 
instance. Because an XmlDataDocument can be created from a DataSet that does 
not contain any DataRelation or Constraint instances, the comparison used in 
the "indexes" between the relational view of the data and the XML view of the 
data cannot be based on column values, because they aren't guaranteed to be 
unique. If this were C++ I would say that pointer values could be used as keys 
into whatever type of associative container you wanted to use as your index. 
However, this isn't C++, and C# reference comparison semantics are limited to 
ReferenceEquals, which is not sufficient for ordering objects (or for hashing 
them).

DataRow instances can't be ordered based on column values unless an enforced 
unique constraint is in place in the DataSet, and that won't always be the 
case, so something else needs to be used. I would suggest adding a private 
integer value to every DataRow instance that is set to a table-unique value 
when the DataRow is inserted into the table. This can be accomplished by 
having a private integer in the DataRowCollection used by a DataTable instance 
that is incremented via a call to Interlocked.Increment. Many database systems 
use such a "private sequence" scheme in the absence of a primary key on a 
table. Their use of that value, however, is mostly concerned with the query 
rewrite optimizer, which we don't need to worry about here.
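
Here is a rough sketch of the private sequence idea. Everything in it is 
made up for illustration: SketchRow, SketchElement and SketchRowCollection 
are stand-ins, not the real DataRow, XmlElement or DataRowCollection, and 
letting the element remember its partner row's id is just one assumed way to 
get the element-to-row direction. The only real API call is 
Interlocked.Increment.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for DataRow. RowId plays the role of the
// private, table-unique integer.
class SketchRow
{
    internal int RowId;              // set when the row is inserted
    public readonly object[] Values; // column values, may repeat across rows

    public SketchRow(params object[] values) { Values = values; }
}

// Hypothetical stand-in for XmlElement. It remembers the id of the
// row it mirrors (an assumption of this sketch).
class SketchElement
{
    internal int RowId;
    public readonly string Name;

    public SketchElement(string name) { Name = name; }
}

// Hypothetical stand-in for DataRowCollection.
class SketchRowCollection
{
    // The private sequence. Only the id allocation is atomic here;
    // the two dictionaries are not synchronized in this sketch.
    private int lastRowId;

    private readonly Dictionary<int, SketchRow> rowsById =
        new Dictionary<int, SketchRow>();
    private readonly Dictionary<int, SketchElement> elementsByRowId =
        new Dictionary<int, SketchElement>();

    public void Add(SketchRow row, SketchElement element)
    {
        // Interlocked.Increment returns the incremented value.
        int id = Interlocked.Increment(ref lastRowId);
        row.RowId = id;
        element.RowId = id;
        rowsById[id] = row;
        elementsByRowId[id] = element;
    }

    // Row -> element index.
    public SketchElement ElementFor(SketchRow row)
    {
        return elementsByRowId[row.RowId];
    }

    // Element -> row index.
    public SketchRow RowFor(SketchElement element)
    {
        return rowsById[element.RowId];
    }
}
```

Two rows holding identical column values still get different ids, so both 
lookups stay unambiguous without any constraint on the table.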

This unique integer will not suffice for use as a key into an index that is 
used to implement a UniqueConstraint, ForeignKeyConstraint, or DataRelation 
instance. These should use collections of instances of the Tuple class I sent 
to the group last week, or some similar mechanism, because they are completely 
dependent on value semantics.
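
To make "value semantics" concrete, here is a minimal sketch of a key that 
compares and hashes by column values. ValueKey and UniqueIndex are 
hypothetical names invented for this example. ValueKey is not the Tuple 
class mentioned above, it only illustrates the kind of "similar mechanism" a 
constraint index would need.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key with value semantics: two keys are equal when all
// of their column values are equal, regardless of object identity.
sealed class ValueKey : IEquatable<ValueKey>
{
    private readonly object[] values;

    public ValueKey(params object[] values)
    {
        // Copy so later changes to the caller's array can't alter the key.
        this.values = (object[])values.Clone();
    }

    public bool Equals(ValueKey other)
    {
        if (other == null || other.values.Length != values.Length)
            return false;
        for (int i = 0; i < values.Length; i++)
            if (!object.Equals(values[i], other.values[i]))
                return false;
        return true;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as ValueKey);
    }

    public override int GetHashCode()
    {
        // Combine the hash codes of the individual column values.
        int hash = 17;
        foreach (object v in values)
            hash = unchecked(hash * 31 + (v == null ? 0 : v.GetHashCode()));
        return hash;
    }
}

// Hypothetical index backing a unique constraint: it maps a value key
// to the table-unique row id of the row that holds those values.
class UniqueIndex
{
    private readonly Dictionary<ValueKey, int> rowIdByKey =
        new Dictionary<ValueKey, int>();

    // Returns false when another row already holds the same values.
    public bool TryAdd(ValueKey key, int rowId)
    {
        if (rowIdByKey.ContainsKey(key))
            return false;
        rowIdByKey.Add(key, rowId);
        return true;
    }
}
```

A second row with the same column values is rejected even though its row id 
differs, which is exactly what the integer key alone cannot detect.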



Quoting Daniel Morgan <danmorg@sc.rr.com>:

> Jason,
> 
> Since you are very acquainted with XML in System.Xml, I thought I would
> ask you how would class System.Xml.XmlDataDocument in assembly
> System.Data.dll be created?  
> 
> Currently, it is only stubbed.  XmlDataDocument inherits from
> XmlDocument.  It is meant to provide relational data in an XML document
> and interact with a DataSet.
> 
> Any ideas?
> 
> Thanks,
> Daniel