[Mono-bugs] [Bug 399959] BinaryReader : ReadChars problem

Fri Jun 13 17:27:45 EDT 2008

https://bugzilla.novell.com/show_bug.cgi?id=399959

User andyhume32 at yahoo.co.uk added comment
https://bugzilla.novell.com/show_bug.cgi?id=399959#c2

Andy Hume <andyhume32 at yahoo.co.uk> changed:

           What    |Removed                                         |Added
----------------------------------------------------------------------------
                 CC|                                                |andyhume32 at yahoo.co.uk

--- Comment #2 from Andy Hume <andyhume32 at yahoo.co.uk>  2008-06-13 15:27:44 MDT ---
The text in the file contains eight invalid bytes.  MSFT's UTF8Encoding, which
BinaryReader uses by default, converts each of them to U+FFFD, i.e. REPLACEMENT
CHARACTER, so it gets 64 chars from 64 bytes.  Mono (among others) just skips
them, so it needs to read onward to get eight more chars.  Hence the
difference.  The U+FFFD behaviour is new in .NET 2.0 SP1 [1], so if you run
your app on MSFT FX 1.1, or on the original FX2, you'll see the same problem.
I've tested on FX 1.1 at least!
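
The two decoder behaviours can be sketched outside .NET.  This is a Python
illustration only, not the Mono or MSFT code: the 64-byte buffer is made up,
0xFF stands in for the eight invalid bytes, and Python's "replace" and "ignore"
error handlers are assumed to be fair stand-ins for the U+FFFD and skipping
behaviours respectively.

```python
# Hypothetical 64-byte field: 56 valid ASCII bytes plus eight 0xFF bytes,
# which can never appear in well-formed UTF-8.
field = b"A" * 28 + b"\xff" * 8 + b"B" * 28
assert len(field) == 64

# Stand-in for the .NET 2.0 SP1 behaviour: each invalid byte becomes U+FFFD.
replaced = field.decode("utf-8", errors="replace")

# Stand-in for the skipping behaviour: invalid bytes are dropped.
skipped = field.decode("utf-8", errors="ignore")

print(len(replaced), replaced.count("\ufffd"))  # 64 8
print(len(skipped))                             # 56
```

The skipping decoder comes up eight chars short, which is why a reader asked
for 64 chars has to keep consuming bytes beyond the field.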

The text in the file appears to contain fifteen bytes of (ASCII/Latin) null
terminated text, with the remaining 48 bytes being uninitialised data, hence
the invalid bytes.  If the specification says 64 _bytes_ of text then the best
solution is to call ReadBytes(64) and pass the result to Encoding.GetString;
that'll work in all situations, because the number of bytes consumed no longer
depends on how the decoder treats invalid input.  Initialising BinaryReader
with Encoding.ASCII (or a Latin one) is another possibility.
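
The bytes-first approach can be sketched as follows.  Again this is a Python
analogue rather than .NET code: read_fixed_text is a hypothetical helper, the
sample field is made up, and cutting the text at the first null is an
assumption about the file format, not something the specification is known to
require.

```python
import io

def read_fixed_text(stream, size=64, encoding="latin-1"):
    """Read exactly `size` bytes and decode the text before the first null."""
    raw = stream.read(size)
    if len(raw) != size:
        raise EOFError("field is shorter than expected")
    # Assumption: the text is null terminated, so drop the terminator
    # and whatever uninitialised bytes follow it.
    text_bytes = raw.split(b"\x00", 1)[0]
    return text_bytes.decode(encoding)

# Made-up field: fifteen bytes of text, a null, then 48 bytes of junk.
field = b"fifteen bytes.." + b"\x00" + b"\xff\xfe" * 24
stream = io.BytesIO(field + b"next record")

print(read_fixed_text(stream))  # fifteen bytes..
print(stream.read())            # b'next record'
```

Because exactly 64 bytes are consumed whatever they contain, the stream is left
positioned at the start of the next field.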

If you're also writing the file then remember to zero-fill those buffers before
writing them out!
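
On the writing side, a minimal Python sketch of a null-padded fixed-width
field might look like this.  pack_fixed_text is a hypothetical helper, and the
64-byte width and ASCII encoding are assumptions carried over from the
discussion above.

```python
def pack_fixed_text(text, size=64, encoding="ascii"):
    """Encode `text` and pad it with nulls to exactly `size` bytes."""
    data = text.encode(encoding)
    if len(data) >= size:
        raise ValueError("text does not fit with its null terminator")
    # Every byte after the text is an explicit null, so no stale data is
    # ever written to the file.
    return data.ljust(size, b"\x00")

field = pack_fixed_text("fifteen bytes..")
print(len(field), field[15:] == b"\x00" * 49)  # 64 True
```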

[1] See 
http://support.microsoft.com/kb/940521/
http://blogs.msdn.com/michkap/archive/2007/09/17/4950277.aspx
http://blogs.msdn.com/shawnste/archive/2007/07/23/utf-16-utf-8-utf-32-update-to-conform-with-unicode-5-0-s-security-concerns.aspx
etc
