[Mono-list] UTF-16 and XmlTextReader questions

Brion Vibber brion at pobox.com
Fri Jul 29 18:26:33 EDT 2005


François Garillot wrote:
> OK. I work from this basefile :
> <test>á</test>
> hexdump:
> 0000000 743c 7365 3e74 3ce1 742f 7365 3e74

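This file does not appear to be UTF-16 at all. hexdump prints 16-bit
words in little-endian order here, so the bytes on disk are
3c 74 65 73 74 3e e1 3c 2f 74 65 73 74 3e: one byte per character, with
á stored as the single byte 0xe1. That is an 8-bit encoding, most likely
ISO 8859-1, and in that encoding it is meaningful XML.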

> I take the base file again and run 'iconv -f utf-16 -t utf-16' on it.
> I get :
> ÿþ<test>á</test>
>
> hexdump:
> 0000000 feff 743c 7365 3e74 3ce1 742f 7365 3e74

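iconv took the input to be UTF-16 already, so it only prepended a byte
order mark (ff fe) and left the remaining bytes unchanged. This file,
interpreted as UTF-16, reads as a series of Han characters:
"琼獥㹴㳡琯獥㹴". It doesn't contain any document element, so the text is
interpreted as a text node, which is illegal outside of an element: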

> Unhandled Exception: System.Xml.XmlException: Text node cannot appear in
> this state. file://test.xml Line 1, position 1.
> in <0x001ee> System.Xml.XmlTextReader:ReadText (Boolean notWhitespace)
> in <0x00186> System.Xml.XmlTextReader:ReadContent ()
> in <0x0011f> System.Xml.XmlTextReader:Read ()
> in <0x00071> test:Main ()

-- brion vibber (brion @ pobox.com)