[Mono-dev] Error Normalizing Arabic Strings
Atsushi Eno
atsushieno at veritas-vos-liberabit.com
Fri Sep 11 21:08:59 EDT 2009
Hi Tom,
Thanks. I'll have a look next week.
I'm not keen to fix every Unicode normalization issue unless I can
fully commit to this area. .NET itself is not very good at
Unicode-compliant normalization (according to the normalization tests,
it fails far more cases than we do). Crashers like this are no good,
though.
Atsushi Eno
On 2009/09/12 7:36, Tom Philpot wrote:
> I just discovered more Unicode Normalization Bugs in Mono SVN.
>
>
> using System;
> using System.Text;
>
> namespace Test
> {
>     public class NormalizationTest_Arabic
>     {
>         public void TestNormalization()
>         {
>             char[] originalChars = new char[] {
>                 '\u064A', '\u064F', '\u0648', '\u0654', '\u0652',
>                 '\u064A', '\u064F', '\u0648', '\u0654' };
>
>             // Results from http://minaret.info/test/normalize.msp
>             // (reference values only; this repro does not compare them)
>             char[] formC = new char[] {
>                 '\u064A', '\u064F', '\u0624', '\u0652',
>                 '\u064A', '\u064F', '\u0624' };
>             char[] formD = new char[] {
>                 '\u064A', '\u064F', '\u0648', '\u0652', '\u0654',
>                 '\u064A', '\u064F', '\u0648', '\u0654' };
>             char[] formKC = new char[] {
>                 '\u064A', '\u064F', '\u0624', '\u0652',
>                 '\u064A', '\u064F', '\u0624' };
>             char[] formKD = new char[] {
>                 '\u064A', '\u064F', '\u0648', '\u0652', '\u0654',
>                 '\u064A', '\u064F', '\u0648', '\u0654' };
>
>             string str = new string(originalChars);
>
>             // Invoke every normalization form on the same input.
>             string strNormalizedC = str.Normalize(NormalizationForm.FormC);
>             string strNormalizedD = str.Normalize(NormalizationForm.FormD);
>             string strNormalizedKC = str.Normalize(NormalizationForm.FormKC);
>             string strNormalizedKD = str.Normalize(NormalizationForm.FormKD);
>         }
>
>         public static void Main()
>         {
>             NormalizationTest_Arabic nta = new NormalizationTest_Arabic();
>             nta.TestNormalization();
>         }
>     }
> }
>
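
The quoted test only calls Normalize() and never compares the results
with the expected arrays, so it can show a crash but not a wrong answer.
A minimal sketch of a version that also checks the output might look
like the following. The class name ArabicNormalizationCheck and the
helpers Dump and Check are hypothetical names introduced here for
illustration; the code points are the ones from the report, where formKC
equals formC and formKD equals formD.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of Mono or of the original report.
public static class ArabicNormalizationCheck
{
    // Input from the report: YEH, DAMMA, WAW, HAMZA ABOVE, SUKUN,
    // then YEH, DAMMA, WAW, HAMZA ABOVE again.
    public const string Original =
        "\u064A\u064F\u0648\u0654\u0652\u064A\u064F\u0648\u0654";

    // Expected values copied from the report's formC/formKC arrays.
    public const string ExpectedComposed =
        "\u064A\u064F\u0624\u0652\u064A\u064F\u0624";

    // Expected values copied from the report's formD/formKD arrays.
    public const string ExpectedDecomposed =
        "\u064A\u064F\u0648\u0652\u0654\u064A\u064F\u0648\u0654";

    // Renders each UTF-16 code unit as U+XXXX so mismatches are readable.
    public static string Dump(string s)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    // Normalizes Original and compares it code unit by code unit
    // (ordinal comparison) against the expected string.
    public static bool Check(NormalizationForm form, string expected)
    {
        string actual = Original.Normalize(form);
        bool ok = string.Equals(actual, expected, StringComparison.Ordinal);
        Console.WriteLine("{0}: {1}", form,
            ok ? "OK" : "MISMATCH, got " + Dump(actual));
        return ok;
    }

    public static void Main()
    {
        Check(NormalizationForm.FormC, ExpectedComposed);
        Check(NormalizationForm.FormD, ExpectedDecomposed);
        Check(NormalizationForm.FormKC, ExpectedComposed);
        Check(NormalizationForm.FormKD, ExpectedDecomposed);
    }
}
```

Each line of output names the form and reports either OK or the code
units actually produced, which makes it easier to tell a crash apart
from an incorrect result. The ordinal comparison matters here: a
culture-sensitive comparison could treat canonically equivalent strings
as equal and hide the difference.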