[Mono-list] XmlTextReader: MS compatibility, or W3C conformance?

Atsushi Eno atsushi@ximian.com
Fri, 09 Jul 2004 14:13:49 +0900


Ian MacLean wrote:

> Atsushi Eno wrote:
> 
>> Ian MacLean wrote:
>>
>>> Atsushi Eno wrote:
>>>
>>>> Hello,
>>>>
>>>> On bugzilla we (Ian and I) were discussing XmlTextReader's conformance
>>>> to the XML specification. MS XmlTextReader is buggy since it accepts
>>>> an XML declaration as element content (which violates the W3C XML
>>>> specification, section 3 Logical Structures).
>>>> http://bugzilla.ximian.com/show_bug.cgi?id=61274
>>>>
>>>> However, there is also the argument that it is useful for new
>>>> XmlTextReader (xmlText, XmlNodeType.Element, null) to accept an XML
>>>> declaration.
>>>>
>>>> Well, I agree that
>>>>
>>>>     - an error-tolerant XmlTextReader might be useful (especially
>>>>       for people who already depend on that behavior).
>>>>     - we did not always reject Microsoft badness; for example
>>>>       we are copying System.Xml.XmlCDataSection, which violates
>>>>       the W3C DOM interface hierarchy (!)
>>>>
>>>> So it is case by case. I believe we should not allow such use of
>>>> XmlTextReader, but I understand what Ian wants me to do. The "fix"
>>>> can be very easily done.
>>>>
>>>> I don't think it is a major problem. Users can easily work around it
>>>> by calling MoveToContent(), or by skipping the XmlDeclaration node with
>>>> the Read() method (well, to call Read() safely, users have to check
>>>> whether the reader state is Initial or not).
>>>>  
>>>>
>>> It's not a major problem, but your workaround above only works if every
>>> fragment you want to parse follows Document constraints, e.g. a single
>>> root node. What I have done now is check the incoming xml fragment
>>> for an xml decl and, if present, use XmlNodeType.Document; otherwise
>>> I use XmlNodeType.Element.
>>>
>> If an XmlTextReader created with XmlNodeType.Element does not accept
>> multiple top-level elements, that is a bug (if so, please create
>> another bugzilla entry). If you want xml that has "an XML
>> declaration and multiple top-level elements", that sounds like a
>> curious need, and I wonder what kind of use case would benefit from
>> such a fix :-?
>>
> 
> You misunderstand. XmlNodeType.Element does accept multiple top-level
> elements fine. And I have no need of documents with an xml decl and
> multiple top-level elements.
> What I have is some code that parses xml text fragments and those 
> fragments can be either:
> - a complete document ( with or without xml decl )
> or
> - an element or nodelist ( of course without xml decl ).
> 
> Since my parsing method just takes a string argument, I need to determine
> which of the above it is so that I know the appropriate XmlNodeType
> argument to pass to XmlValidatingReader.

Ah, OK. I noticed that MoveToContent() won't work without the
XmlTextReader "fix" we're talking about. So users still need to
identify the first node and decide which to use: XmlNodeType.Document
or XmlNodeType.Element.
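
For illustration, the kind of check Ian describes could look roughly like
the sketch below. This is only a sketch under assumptions: the class and
method names (FragmentSniffer, StartsWithXmlDeclaration, ChooseFragmentType,
CreateReader) are made up for this example and are not part of System.Xml,
and the test only looks for "<?xml" followed by whitespace at the very
start of the string. It is not Ian's actual code.

```csharp
using System;
using System.Xml;

// Hypothetical helper, not part of System.Xml.
static class FragmentSniffer
{
    // True when the text begins with an XML declaration: "<?xml" followed
    // by an XML whitespace character. A processing instruction such as
    // "<?xml-stylesheet ...?>" is deliberately not treated as a declaration.
    public static bool StartsWithXmlDeclaration(string xml)
    {
        return xml.Length > 5
            && xml.StartsWith("<?xml", StringComparison.Ordinal)
            && (xml[5] == ' ' || xml[5] == '\t' || xml[5] == '\r' || xml[5] == '\n');
    }

    // Document when a declaration is present, Element otherwise.
    public static XmlNodeType ChooseFragmentType(string xml)
    {
        return StartsWithXmlDeclaration(xml)
            ? XmlNodeType.Document
            : XmlNodeType.Element;
    }

    // Creates the reader with the fragment type chosen above.
    public static XmlTextReader CreateReader(string xml)
    {
        return new XmlTextReader(xml, ChooseFragmentType(xml), null);
    }
}
```

With this, FragmentSniffer.CreateReader ("<a/><b/>") would use
XmlNodeType.Element, while a string starting with <?xml version="1.0"?>
would use XmlNodeType.Document. The same choice could be passed to the
XmlValidatingReader constructor that takes a fragment type.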

Then to fix or not to fix - that is the question.

Atsushi Eno