[Mono-devel-list] [XSP] encoding bug reloaded
bzdurqa at wp.pl
Fri Dec 19 16:52:44 EST 2003
was closed due to same results appearing under IIS.
But after some research I found that the results are not exactly
the same, and also, that it's kind of .NET 'this-is-not-a-bug,
The Unicode standard defines the BOM (byte order mark), a signature at
the beginning of a data stream that helps to recognize "whether they
[files] are in big or
little endian format — it can also serve as a hint indicating that
the file is in Unicode".
This signature is not mandatory, though; even MSDN states: "It used
to be thought that placing a UTF-8 BOM at the beginning of a file
was undesirable, but should be respected if it's present. However,
this has been challenged recently [...] So, UTF-8 BOMs are acceptable,
but don't indicate byte-ordering."
Now back to the case:
- when an .aspx file contains national characters in its HTML part,
or a UTF string is generated by the Response.Write method, XSP
output is invalid: national chars are shown as if they were UTF-16(?),
even though the file is recognized as proper UTF-8.
Yes, I've tried setting Response.Charset, http content header and
- IIS acts the same way, but after you put a UTF-8 BOM at the beginning
of the file (for UTF-8 it's the three-byte sequence EF BB BF) IIS sends
the right chars to the browser. This does not work on XSP.
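
To make the signature concrete, here is a minimal sketch in Python (not
what IIS or XSP actually run) of checking for the three bytes and
prepending them when missing. The helper names has_utf8_bom and
add_utf8_bom are hypothetical, made up for this illustration.

```python
import codecs

# The UTF-8 signature discussed above: the three bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte stream starts with the UTF-8 BOM."""
    return data.startswith(UTF8_BOM)


def add_utf8_bom(data: bytes) -> bytes:
    """Prepend the UTF-8 BOM unless it is already present."""
    return data if has_utf8_bom(data) else UTF8_BOM + data
```

Running add_utf8_bom over the raw bytes of an .aspx file would give the
kind of file that, per the observation above, IIS serves correctly.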
Another thing that I found out (but did not verify) is that IIS
treats files without signatures as using the default encoding. The
same goes for Mono/XSP, but the default config settings:
seem to be ignored in this case.
- since the BOM is 'acceptable' (not mandatory), I think XSP should not
require it. Mono already works this way in other cases, e.g. when
parsing UTF-8 (no BOM) XML files, national characters are displayed
properly. The BOM should be recognized though; does the IBM ICU
library cover this?
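
The behaviour I am asking for could be sketched roughly like this
(Python, purely illustrative; it is not how XSP, Mono or ICU implement
it). The function name decode_page and the default_encoding parameter
are assumptions of this sketch, standing in for whatever the configured
default is.

```python
import codecs

# Signatures this sketch recognizes. UTF-32 is deliberately left out
# to keep the example short.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def decode_page(data: bytes, default_encoding: str = "utf-8") -> str:
    """Decode page bytes: honor a BOM if present, but do not require one."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            # Strip the signature so it does not leak into the output.
            return data[len(bom):].decode(encoding)
    # No signature: fall back to the configured default encoding.
    return data.decode(default_encoding)
```

With this, a UTF-8 file decodes to the same text with or without the
signature, and an unsigned file in another encoding still works as long
as the configured default matches it.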
Gonzalo, should I reopen the bug, or report a new one?