[Mono-list] XmlTextReader: MS compatibility, or W3C conformance?
Ian MacLean
ianm@ActiveState.com
Fri, 09 Jul 2004 14:29:18 +0900
Atsushi Eno wrote:
> Ian MacLean wrote:
>
>> Atsushi Eno wrote:
>>
>>> Ian MacLean wrote:
>>>
>>>> Atsushi Eno wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> On bugzilla we (Ian and I) were discussing XmlTextReader
>>>>> conformance to the XML specification. MS XmlTextReader is buggy
>>>>> since it accepts an XML declaration as element content (which
>>>>> violates W3C XML specification section 3, Logical Structures).
>>>>> http://bugzilla.ximian.com/show_bug.cgi?id=61274
>>>>>
>>>>> However, there is also the argument that it is useful for new
>>>>> XmlTextReader (xmlText, XmlNodeType.Element, null) to accept an
>>>>> XML declaration.
>>>>>
>>>>> Well, I agree that
>>>>>
>>>>> - such an error-prone XmlTextReader might be useful (especially
>>>>> for people who already depend on that behavior).
>>>>> - we did not always reject Microsoft badness; for example
>>>>> we are copying System.Xml.XmlCDataSection that violates
>>>>> W3C DOM interface hierarchy (!)
>>>>>
>>>>> So it is case by case. I believe we should not allow such use of
>>>>> XmlTextReader, but I understand what Ian wants me to do. The "fix"
>>>>> can be very easily done.
>>>>>
>>>>> I don't think it is a major problem. Users can easily fix this problem
>>>>> by calling MoveToContent(), or by skipping XmlDeclaration node with
>>>>> Read() method (well, to call Read() safely, users have to check if
>>>>> the reader state is Initial or not).
>>>>>
>>>>>
>>>> It's not a major problem, but your workaround above only works if
>>>> every fragment you want to parse follows Document constraints, e.g.
>>>> a single root node. What I have done now is check the incoming xml
>>>> fragment for an xml decl and, if present, use XmlNodeType.Document;
>>>> otherwise use XmlNodeType.Element.
>>>>
>>> If XmlTextReader created with XmlNodeType.Element does not accept
>>> multiple top-level elements, that is a bug (if so, please create
>>> another bugzilla entry). If you want xml that has "XML
>>> declaration and multiple top-level elements", that sounds like a
>>> curious need and I wonder what kind of use case would benefit from
>>> such a fix :-?
>>>
>>
>> You misunderstand. XmlNodeType.Element does accept multiple top-level
>> elements fine. And I have no need of documents with xml decl and
>> multiple top level elements.
>> What I have is some code that parses xml text fragments and those
>> fragments can be either:
>> - a complete document ( with or without xml decl )
>> or
>> - an element or nodelist ( of course without xml decl ).
>>
>> Since my parsing method just takes a string argument I need to
>> determine which of the above it is so that I know the appropriate
>> XmlNodeType argument to pass to XmlValidatingReader.
>
>
> Ah, OK. I noticed that MoveToContent() won't work without the
> XmlTextReader "fix" that we're talking about. So users still need
> to identify the first node and decide which to use: XmlNodeType.Document
> or XmlNodeType.Element.
>
> Then to fix or not to fix - that is the question.
>
At the very least it should be added to an "Incompatibilities with
MS.net" document if such a beast exists. If it doesn't, I think it would
be a useful addition to the docs/website.
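
For anyone hitting the same issue, the check described earlier in this thread (look for an xml decl, then pick XmlNodeType.Document or XmlNodeType.Element) could be sketched roughly as below. This is only an illustration, not the actual code under discussion: the class name FragmentKind and the methods Detect and CreateReader are hypothetical, and the test for a declaration is a plain string check that assumes the declaration, if present, sits at the very start of the string.

```csharp
using System;
using System.Xml;

// Hypothetical helper, not part of System.Xml.
public static class FragmentKind
{
    // Returns XmlNodeType.Document when the text starts with an XML
    // declaration, otherwise XmlNodeType.Element.
    public static XmlNodeType Detect(string xml)
    {
        const string prefix = "<?xml";

        // Require whitespace after "<?xml" so that processing
        // instructions such as <?xml-stylesheet ...?> are not
        // mistaken for an XML declaration.
        bool hasDecl = xml.StartsWith(prefix, StringComparison.Ordinal)
            && xml.Length > prefix.Length
            && char.IsWhiteSpace(xml[prefix.Length]);

        return hasDecl ? XmlNodeType.Document : XmlNodeType.Element;
    }

    // Creates a reader using the fragment type chosen by Detect.
    // The null XmlParserContext mirrors the constructor call quoted
    // earlier in this thread.
    public static XmlTextReader CreateReader(string xml)
    {
        return new XmlTextReader(xml, Detect(xml), null);
    }
}
```

With this, a complete document that has no xml decl is still treated as XmlNodeType.Element, the same as an element or nodelist fragment; only text that begins with a declaration is parsed as XmlNodeType.Document.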
Ian
--
Ian MacLean, Developer,
ActiveState, a division of Sophos
http://www.ActiveState.com