[Mono-list] Character encoding problems with System.XML.XmlDocument

Jeroen Pulles jeroen.pulles at redslider.net
Fri Sep 23 11:30:37 EDT 2005


Hi,

I am new to Mono (and C#). I thought it was best to start out with
something I do quite often in my work: open some XML file in a DOM, read
and/or edit it, and save it to file again.

I am having some trouble, however, with a simple program that loads and
saves a small XML document with three common special characters: an
e-umlaut, non-breaking space and the euro monetary character. So far, I
haven't found a method to do this in a non-UTF-8 encoding, without
losing information :-( .
I have three problems:

1 - Special characters that don't fit in the output encoding are reduced
to ? instead of a numeric character reference like &#235;,

2 - The XML declaration is not followed by a newline when using
XmlTextWriter,

3 - The encoding specified in the XML declaration does not correspond
with the actual output. When the XML declaration has no encoding
specification, I get e-umlaut in Latin 1 (as per my system locale
setting, I'm guessing).
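
A rough sketch of the behaviour asked for in the three points above, not a
confirmed fix. It assumes the XmlWriterSettings / XmlWriter.Create API from
the .NET 2.0 class library; whether Mono 1.1.8 implements it the same way is
not established here. The names EntitySave and SaveWithEncoding are
hypothetical and do not appear in dom.cs.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper for illustration only.
static class EntitySave
{
    // Serializes the XML to bytes in the requested encoding and returns
    // those bytes decoded back into a string for inspection.
    public static string SaveWithEncoding(string xml, Encoding encoding)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = encoding; // the writer owns the output encoding
        settings.Indent = true;       // puts the root element on its own line

        using (MemoryStream buffer = new MemoryStream())
        {
            // Assumption: a writer created over a stream with a non-UTF-8
            // encoding escapes unencodable characters in text content as
            // character references instead of replacing them with '?'.
            using (XmlWriter writer = XmlWriter.Create(buffer, settings))
            {
                doc.Save(writer);
            }
            return encoding.GetString(buffer.ToArray());
        }
    }

    static void Main()
    {
        // e-umlaut, non-breaking space, euro sign
        Console.WriteLine(SaveWithEncoding("<text>\u00EB\u00A0\u20AC</text>", Encoding.ASCII));
    }
}
```

The point of the design is that the encoding is given to the XML writer
rather than to a plain StreamWriter underneath it, so the writer can decide
what to do with characters the encoding cannot represent.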

Attached you'll find the two sample documents (example.xml,
example_ascii.xml), the program (dom.cs) and the output (output.txt).
Only one of the 10 variations produces output that is valid XML and
contains the same information as the input document (example.xml, Case e).

I may be going about this the wrong way, but I can't find any samples
or tutorials on this subject anywhere. And Bugzilla doesn't seem to turn
up any bugs on the subject.

Anyone?

I'm running mono 1.1.8 (debian testing, powerpc).

regards,
jeroen

(the attached files can also be found on
http://www.redslider.net/2005/mono/)


-------------- next part --------------
using System.Xml;
using System.IO;
using System.Text;

/** Simple dom example */
class Dom 
{

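    // Loads the file into a DOM and saves it to stdout in five different ways.
    private static void load_and_print(string filename)
    {
        XmlDocument d = new XmlDocument();
        d.Load(filename);
        System.Console.Out.WriteLine(filename + ":");
        // Case a: Save(TextWriter) on the console's own writer.
        System.Console.Out.WriteLine("Case a:");
        d.Save(System.Console.Out);
        // Case b: Save(TextWriter) on an ASCII StreamWriter over raw stdout.
        System.Console.Out.WriteLine("\nCase b:");
        d.Save(new StreamWriter(System.Console.OpenStandardOutput(), Encoding.ASCII));
        // Case c: Save(XmlWriter) on an XmlTextWriter over the console's writer.
        System.Console.Out.WriteLine("\nCase c:");
        d.Save(new XmlTextWriter(System.Console.Out));
        // Case d: Save(XmlWriter) on an ASCII XmlTextWriter over raw stdout.
        System.Console.Out.WriteLine("\nCase d:");
        d.Save(new XmlTextWriter(System.Console.OpenStandardOutput(), Encoding.ASCII));
        // Case e: Save(Stream) directly on raw stdout.
        System.Console.Out.WriteLine("\nCase e:");
        d.Save(System.Console.OpenStandardOutput());
        System.Console.Out.WriteLine("");
    }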
    
    public static void Main(string[] args)
    {
        load_and_print("example.xml");
        load_and_print("example_ascii.xml");
    }
}

-------------- next part --------------
A non-text attachment was scrubbed...
Name: example.xml
Type: text/xml
Size: 57 bytes
Desc: not available
Url : http://lists.ximian.com/pipermail/mono-list/attachments/20050923/78e11d5b/example.xml
-------------- next part --------------
A non-text attachment was scrubbed...
Name: example_ascii.xml
Type: text/xml
Size: 78 bytes
Desc: not available
Url : http://lists.ximian.com/pipermail/mono-list/attachments/20050923/78e11d5b/example_ascii.xml
-------------- next part --------------
example.xml:
Case a:
<?xml version="1.0"?>
<text>ë ?</text>
Case b:
<?xml version="1.0"?>
<text>???</text>
Case c:
<?xml version="1.0"?><text>ë ?</text>
Case d:
<?xml version="1.0"?><text>???</text>
Case e:
<?xml version="1.0"?>
<text>ë €</text>
example_ascii.xml:
Case a:
<?xml version="1.0" encoding="US-ASCII"?>
<text>ë ?</text>
Case b:
<?xml version="1.0" encoding="US-ASCII"?>
<text>???</text>
Case c:
<?xml version="1.0" encoding="US-ASCII"?><text>ë ?</text>
Case d:
<?xml version="1.0" encoding="US-ASCII"?><text>???</text>
Case e:
<?xml version="1.0" encoding="US-ASCII"?>
<text>???</text>