[Mono-list] Unhandled Exception: System.ArgumentException: Arg_InvalidUTF8

btouchet btouchet@drakonis.dyndns.org
28 Jan 2003 07:47:46 -0500



On Tue, 2003-01-28 at 06:26, A Rafael D Teixeira wrote:
> UTF8 is an encoding with very strict rules, it was made so to allow you to
> detect if you are trying to read text that perhaps is in another encoding,
> like the ISO8859-* or Windows125* families.
>
> I think the exception may be too harsh a measure, but surely you have to at
> least ignore those characters. To pass them along is to surely transfer the
> problem to the client code in a clueless way.
>

I kind of got that from the code :) The only issue I had was that it
didn't do this under .NET; there, the same code seems to just ignore the
extra character. From the look of it, it could also happen in
InternalGetCharCount.
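
To make the two behaviours concrete, here is a small Python sketch (not the Mono or .NET code; the sample bytes and function names are made up for illustration). A strict decoder raises on the stray byte, while a lenient one silently drops it:

```python
# Hypothetical sample: "caf" followed by a lone 0xE9 (Latin-1 'e acute').
# 0xE9 announces a 3-byte UTF-8 sequence, but no continuation bytes follow.
data = b"caf\xe9"


def decode_strict(raw: bytes) -> str:
    """Decode as UTF-8 and raise UnicodeDecodeError on any invalid byte."""
    return raw.decode("utf-8", errors="strict")


def decode_lenient(raw: bytes) -> str:
    """Decode as UTF-8 and silently skip bytes that are not valid UTF-8."""
    return raw.decode("utf-8", errors="ignore")


try:
    decode_strict(data)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

lenient_result = decode_lenient(data)
```

The lenient variant loses the offending character without telling the caller, which is the trade-off being discussed in this thread.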

> In summary:
>
> If you have characters (bytes, in truth) in your text that are greater than
> 0x7F and aren't valid start codes (the start code tells the count of bytes
> that will follow) followed by their proper number of complementary bytes,
> either these bytes ARE garbage (generated by a bad application) or the byte
> stream IS ENCODED with another encoding.
>
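
The rule quoted above can be sketched as a structural check. This is a Python illustration only, with hypothetical function names; it is not how Mono's decoder is written. The lead-byte ranges follow the present-day UTF-8 definition (RFC 3629), and the check covers only the start-byte/continuation-byte structure, so it does not reject overlong forms or encoded surrogates:

```python
def expected_length(lead: int) -> int:
    """Total sequence length announced by a lead byte, or 0 if it cannot start one."""
    if lead <= 0x7F:
        return 1          # plain ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0              # continuation byte (0x80-0xBF) or never-valid byte


def first_invalid_offset(raw: bytes) -> int:
    """Offset of the first byte that breaks the structure, or -1 if none does."""
    i = 0
    while i < len(raw):
        n = expected_length(raw[i])
        if n == 0:
            return i
        tail = raw[i + 1:i + n]
        # Every following byte must look like 10xxxxxx, and none may be missing.
        if len(tail) != n - 1 or any((b & 0xC0) != 0x80 for b in tail):
            return i
        i += n
    return -1
```

For example, `first_invalid_offset(b"caf\xe9")` points at the lone 0xE9, which is a typical sign that the data is really ISO8859-1 or Windows-1252.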

If it is done with another encoding, is there a better way to read it
so that this problem goes away?
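
One common approach, sketched here in Python as an assumption rather than anything Mono-specific, is to try strict UTF-8 first and fall back to a single-byte encoding only when that fails. The function name and the ISO8859-1 default are hypothetical choices; the right fallback depends on where the data actually comes from:

```python
def decode_with_fallback(raw: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode as strict UTF-8; if the bytes are not valid UTF-8, use `fallback`."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # ISO8859-1 maps every byte value to a character, so this cannot fail,
        # but the result is only meaningful if the guess about the encoding is right.
        return raw.decode(fallback)
```

If the encoding of the source is known up front, it is simpler and safer to decode with that encoding directly instead of guessing.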


-- 
btouchet <btouchet@drakonis.dyndns.org>
