[Mono-list] Just to clarify it: strings in .NET are in UTF-16 not UCS-2
A Rafael D Teixeira
Thu, 04 Oct 2001 08:30:45 -0300
> > C# code like this:
> > string x = "\U00010001Test";
> > foreach(char c in x.ToCharArray())
> > System.Console.Write(" " + ((int)c).ToString("X4"));
> > When compiled with csc and run in MS runtime, will output:
> > D800 DC01 0054 0065 0073 0074
>Uh oh. I am starting to get confused.
>Maybe they do encode the \U00010001 as two characters in the stream?
That is what UTF-16 means: any character in the 0x10000 to 0x10FFFF range is
coded as two 16-bit chars (the 'surrogate pair').
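
For what it's worth, here is a rough sketch of the arithmetic behind the pair,
as I understand the UTF-16 definition. The class and method names
(SurrogateDemo, ToSurrogatePair) are made up for illustration and are not part
of the framework; the runtime does this for you when it compiles the string
literal.

```csharp
using System;

public class SurrogateDemo
{
    // Hypothetical helper: split a code point above 0xFFFF into the
    // two 16-bit code units UTF-16 uses to represent it.
    public static char[] ToSurrogatePair(int codePoint)
    {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF)
            throw new ArgumentOutOfRangeException("codePoint");

        int offset = codePoint - 0x10000;               // leaves a 20-bit value
        char high = (char)(0xD800 + (offset >> 10));    // top 10 bits
        char low  = (char)(0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new char[] { high, low };
    }

    public static void Main()
    {
        // Same code point as in the quoted example, \U00010001.
        foreach (char c in ToSurrogatePair(0x10001))
            Console.Write(" " + ((int)c).ToString("X4"));
        Console.WriteLine();
    }
}
```

For 0x10001 this gives D800 DC01, which matches the first two values in the
output quoted above.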