[Mono-dev] Error Normalizing Arabic Strings

Tom Philpot tom.philpot at logos.com
Fri Sep 11 18:51:18 EDT 2009


I forgot to mention what happens: you get an unhandled exception.

We're planning to write a process that goes through and finds more of these
cases, so we can submit one big bug report covering every case we know of
where Mono's normalization differs from, say, ICU's.
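
A minimal sketch of what such a process might look like, assuming the
expected results are collected by hand from a reference implementation such
as ICU. The class and method names (NormalizationCheck, ToHex, Check) are
hypothetical, and only the one Arabic case from this thread is filled in:

```csharp
using System;
using System.Text;

// Hypothetical harness; not part of the original ArabicBug.cs test case.
public static class NormalizationCheck
{
    // Render a string as space-separated UTF-16 code units, e.g. "064A 064F".
    public static string ToHex(string s)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    // Returns null when the runtime agrees with the expected result,
    // otherwise a one-line description of the exception or mismatch.
    public static string Check(string input, NormalizationForm form,
                               string expected)
    {
        string actual;
        try
        {
            actual = input.Normalize(form);
        }
        catch (Exception e)
        {
            return form + " threw " + e.GetType().Name
                + " for " + ToHex(input);
        }
        if (string.Equals(actual, expected, StringComparison.Ordinal))
            return null;
        return form + " mismatch for " + ToHex(input)
            + ": expected " + ToHex(expected) + ", got " + ToHex(actual);
    }

    public static void Main()
    {
        // Input and expected values copied from the test case in this thread.
        string input = "\u064A\u064F\u0648\u0654\u0652\u064A\u064F\u0648\u0654";
        string composed = "\u064A\u064F\u0624\u0652\u064A\u064F\u0624";
        string decomposed = "\u064A\u064F\u0648\u0652\u0654\u064A\u064F\u0648\u0654";

        NormalizationForm[] forms = {
            NormalizationForm.FormC, NormalizationForm.FormD,
            NormalizationForm.FormKC, NormalizationForm.FormKD };
        string[] expected = { composed, decomposed, composed, decomposed };

        int failures = 0;
        for (int i = 0; i < forms.Length; i++)
        {
            string problem = Check(input, forms[i], expected[i]);
            if (problem != null)
            {
                failures++;
                Console.WriteLine(problem);
            }
        }
        Console.WriteLine(failures + " of " + forms.Length + " checks failed");
    }
}
```

Catching the exception inside Check lets one bad string be reported without
stopping the rest of the run, which matters if the list of inputs grows large.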

ws1048:~ tom.philpot$ gmcs ArabicBug.cs -out:ArabicBug.exe

ws1048:~ tom.philpot$ mono --debug ./ArabicBug.exe

Unhandled Exception: System.SystemException: Internal error: should not happen.
  at Mono.Globalization.Unicode.Normalization.Combine (System.Text.StringBuilder sb, Int32 start, Int32 checkType) [0x00000]
  at Mono.Globalization.Unicode.Normalization.Compose (System.String source, Int32 checkType) [0x00000]
  at Mono.Globalization.Unicode.Normalization.Normalize (System.String source, Int32 type) [0x00000]
  at System.String.Normalize (NormalizationForm normalizationForm) [0x00000]
  at Test.NormalizationTest_Arabic.TestNormalization () [0x00000]
  at Test.NormalizationTest_Arabic.Main () [0x00000]



On 9/11/09 3:36 PM, "Tom Philpot" <tom.philpot at logos.com> wrote:

> I just discovered more Unicode Normalization Bugs in Mono SVN.
> 
> 
> using System;
> using System.Text;
> 
> namespace Test
> {
> 
>     public class NormalizationTest_Arabic {
> 
>         public void TestNormalization() {
>             char[] originalChars = new char [] {
>                 '\u064A', '\u064F', '\u0648', '\u0654', '\u0652',
>                 '\u064A', '\u064F', '\u0648', '\u0654' };
> 
>             // Results from http://minaret.info/test/normalize.msp
>             char[] formC = new char [] {
>                 '\u064A', '\u064F', '\u0624', '\u0652',
>                 '\u064a', '\u064f', '\u0624' };
>             char[] formD = new char [] {
>                 '\u064A', '\u064F', '\u0648', '\u0652', '\u0654',
>                 '\u064a', '\u064f', '\u0648', '\u0654' };
>             char[] formKC = new char [] {
>                 '\u064A', '\u064F', '\u0624', '\u0652',
>                 '\u064a', '\u064f', '\u0624' };
>             char[] formKD = new char [] {
>                 '\u064A', '\u064F', '\u0648', '\u0652', '\u0654',
>                 '\u064a', '\u064f', '\u0648', '\u0654' };
> 
>             string str = new string(originalChars);
> 
>             string strNormalizedC = str.Normalize(NormalizationForm.FormC);
>             string strNormalizedD = str.Normalize(NormalizationForm.FormD);
>             string strNormalizedKC =
>                 str.Normalize(NormalizationForm.FormKC);
>             string strNormalizedKD =
>                 str.Normalize(NormalizationForm.FormKD);
>         }
>     
>         public static void Main()
>         {
>             NormalizationTest_Arabic nta = new NormalizationTest_Arabic();
>             nta.TestNormalization();
>         }
>     }
> }
> 


