[Mono-list] Just to clarify it: strings in .NET are in UTF-16 not UCS-2
A Rafael D Teixeira
rafaelteixeirabr@hotmail.com
Thu, 04 Oct 2001 08:30:45 -0300
>
> > C# code like this:
> >
> > string x = "\U00010001Test";
> > foreach(char c in x.ToCharArray())
> >     System.Console.Write(" " + ((int)c).ToString("X4"));
> >
> > When compiled with csc and run in MS runtime, will output:
> > D800 DC01 0054 0065 0073 0074
>
>Uh oh. I am starting to get confused.
>
>Maybe they do encode the \U00010001 as two characters in the stream?
>
>Miguel.
That is what UTF-16 means: any character in the 0x10000 to 0x10FFFF range is
coded as two 16-bit chars (the 'surrogate pair').
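
To make the arithmetic concrete, here is a minimal sketch of how such a
code point is split into its two 16-bit halves. The names SurrogateDemo
and ToSurrogatePair are made up for illustration; they are not part of
the class library, and this is not necessarily how the compiler or the
runtime does it internally.

```csharp
using System;

static class SurrogateDemo
{
    // Hypothetical helper: split a code point in 0x10000..0x10FFFF
    // into its UTF-16 surrogate pair (high surrogate first).
    public static char[] ToSurrogatePair(int codePoint)
    {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF)
            throw new ArgumentOutOfRangeException("codePoint");

        int v = codePoint - 0x10000;               // leaves a 20-bit value
        char high = (char)(0xD800 + (v >> 10));    // top 10 bits
        char low  = (char)(0xDC00 + (v & 0x3FF));  // bottom 10 bits
        return new char[] { high, low };
    }

    static void Main()
    {
        // Same code point as in the quoted example.
        foreach (char c in ToSurrogatePair(0x10001))
            Console.Write(" " + ((int)c).ToString("X4"));
        Console.WriteLine();
    }
}
```

For 0x10001 this prints " D800 DC01", which matches the first two values
in the output quoted above.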
Rafael Teixeira
Brazilian Developer