[Mono-list] Problem with XmlTextReader

Atsushi Eno atsushi@ximian.com
Wed, 21 Jul 2004 13:09:14 +0900


Hello,

 > Thanks for answering my question. but the problem is that i
 > feel the XmlTextReader, just reads to the end of the stream
 > when it is instantiated (thats why it takes so long to instantiate)
 > and keeps this in memory. Then when the client asks for a Read()
 > he just gives the next element or whatever. Shouldn't the parser
 > be reading off the stream when the client gives a Read()?

In short, no. It is impossible.

I have one easy answer: for a TextReader whose CanSeek is false
(one that cannot Peek()), we have to cache the peeked character
anyway. The parser has to keep reading until it encounters '<', but
you will not have sent the next '<foo>' element yet. Thus, such a
"wait & see" approach won't work anyway.
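
As a rough illustration of the point above, here is a minimal sketch in
Python (used only for brevity). The names PeekableReader and
read_text_node are made up for this example and are not part of
XmlTextReader or any real API. It assumes a forward-only source, so the
only way to "peek" is to read one character and hold on to it:

```python
import io


class PeekableReader:
    """Hypothetical one-character lookahead over a forward-only source."""

    def __init__(self, source):
        self._source = source
        self._cached = None  # the peeked character, kept until consumed

    def peek(self):
        # The source cannot seek back, so the character must really be
        # read here and cached for the next read() call.
        if self._cached is None:
            self._cached = self._source.read(1)  # '' at end of input
        return self._cached

    def read(self):
        ch = self.peek()
        self._cached = None
        return ch


def read_text_node(reader):
    """Collect characters up to, but not including, the next '<'."""
    chars = []
    while reader.peek() not in ("<", ""):
        chars.append(reader.read())
    return "".join(chars)


reader = PeekableReader(io.StringIO("hello<foo>"))
text = read_text_node(reader)
```

The text node "hello" is only known to be complete once the '<' after it
has already been pulled from the source. If the sender has not written
that '<' yet, the peek simply blocks.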


Another, more complicated case: suppose you are going to read an XML
document like this:

	<!DOCTYPE foo SYSTEM "foo.dtd">
	<root>
	&amp; &quot; &apos; &lt; &gt; are character entity.
	external &ent; &amp; &not; are defined in foo.dtd.
	</root>

There, the general entities "ent" and "not" are defined in foo.dtd.

When you call Read() for the third time, XmlTextReader tries to
read the text node inside the root element. Read() returns true,
and the reader is then positioned on a Text node that ends with
"external " (immediately before &ent;).

When XmlTextReader finds '&', it MUST NOT stop parsing, since
it cannot tell whether the following markup represents a character
entity (&amp;, &quot;, &apos;, &lt; or &gt;) or a general entity
without reading further into the text stream.
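
To make that concrete, here is a small Python sketch (again only an
illustration; read_reference and PREDEFINED are hypothetical names, and
numeric character references such as &#38; are ignored for simplicity).
After '&' has been consumed, the name has to be read all the way to ';'
before the reference can be classified:

```python
import io

# The five entities that need no declaration in a DTD.
PREDEFINED = {"amp": "&", "quot": '"', "apos": "'", "lt": "<", "gt": ">"}


def read_reference(source):
    """Call right after '&' was consumed; returns (kind, name)."""
    name_chars = []
    while True:
        ch = source.read(1)
        if ch == "":
            raise ValueError("unterminated entity reference")
        if ch == ";":
            break
        name_chars.append(ch)
    name = "".join(name_chars)
    # Only now, with the whole name read, can the kind be decided.
    kind = "predefined" if name in PREDEFINED else "general"
    return kind, name


source = io.StringIO("external &ent; &amp; &not;")
found = []
while True:
    ch = source.read(1)
    if ch == "":
        break
    if ch == "&":
        found.append(read_reference(source))
```

Here found ends up listing "ent" and "not" as general entities and "amp"
as a predefined one, and in each case the decision was only possible
after reading past the '&' up to the closing ';'.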


Atsushi Eno