[Mono-dev] Unhandled Exception in Normalization.cs Combine()

Tom Philpot tom.philpot at logos.com
Thu Jun 18 16:51:06 EDT 2009


Here is a revision of the test case I sent to the list earlier. It doesn't
rely on any specific encoding; it uses only '\uXXXX' escapes.

Hopefully this will be helpful.
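
For anyone reading without the attachment, a minimal sketch of an
encoding-independent check might look like the following. It is not the
attached NormTest3.cs: the class name NormSketch and the Escape helper are
made up for illustration, and only String.Normalize and NormalizationForm
are real framework APIs.

```csharp
using System;
using System.Text;

public static class NormSketch
{
    // Hypothetical helper: prints every UTF-16 code unit as \uXXXX so the
    // output stays pure ASCII regardless of terminal or mail encoding.
    public static string Escape(string s)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in s)
            sb.Append("\\u").Append(((int)c).ToString("x4"));
        return sb.ToString();
    }

    public static void Main()
    {
        string composed = "\u00e1bc";    // U+00E1 (precomposed a-acute), then "bc"
        string decomposed = "a\u0301bc"; // U+0061 + U+0301 (combining acute), then "bc"

        NormalizationForm[] forms = {
            NormalizationForm.FormC, NormalizationForm.FormD,
            NormalizationForm.FormKC, NormalizationForm.FormKD
        };

        // Print both inputs under each form, escaped, for side-by-side comparison.
        foreach (NormalizationForm form in forms)
        {
            Console.WriteLine("{0}: {1} / {2}", form,
                Escape(composed.Normalize(form)),
                Escape(decomposed.Normalize(form)));
        }
    }
}
```

The two inputs are canonically equivalent, so for any given form both
columns should match; a mismatch would point at the runtime's normalization
rather than at the source file's encoding.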

Tom


On 6/18/09 1:49 PM, "Tom Hindle" <tom_hindle at sil.org> wrote:

> Hi Guys,
> 
> With regard to the recent Normalization changes, I have just run our test
> suite with recent mono r136422 and am getting a number of
> regressions.
> 
> 
> For example:
> 
> {
> string styleName = "\u00e1bc";
> StStyle style = new StStyle();
> Cache.LangProject.StylesOC.Add(style);
> style.Name = styleName;
> 
> FwStyleSheet.StyleInfoCollection styleCollection =
>     new FwStyleSheet.StyleInfoCollection();
> styleCollection.Add(new BaseStyleInfo(style));
> 
> Assert.IsTrue(styleCollection.Contains(styleName.Normalize(NormalizationForm.FormC)));
> Assert.IsTrue(styleCollection.Contains(styleName.Normalize(NormalizationForm.FormD)));
> Assert.IsTrue(styleCollection.Contains(styleName.Normalize(NormalizationForm.FormKC)));
> Assert.IsTrue(styleCollection.Contains(styleName.Normalize(NormalizationForm.FormKD)));
> }
> 
> is now failing, as well as other larger unit tests.
> 
> I will look into this further to try to produce an example test program
> that doesn't contain references to our code base.
> 
> Thanks
> Tom
> 
> On Thu, 2009-06-18 at 15:01 +0900, Atsushi Eno wrote:
>> Hi,
>> 
>> If you mean the test cases from the previous email, those are what
>> (as I said) include a raw native encoding from your locale (Latin1?),
>> which I cannot read. Replace them all with ASCII representation (\uxxxx).
>> 
>> Even if the attachment includes the encoding (you mean BOMs?), it is
>> not readable in some environments (like the text editor I use on
>> Windows). Let me repeat, Latin1 is not universal. Don't depend on it
>> (if you do).
>> 
>> Atsushi Eno
>> 
>> 
>> Tom Philpot wrote:
>>> Atsushi,
>>> 
>>> Thanks for the feedback. For some reason, the Mac always composes
>>> Unicode strings before displaying them. I'll look at the test
>>> case in corlib tomorrow when I get in to work. Would it be helpful for
>>> the test cases if I gave you both the formD bytes and the formC bytes
>>> that I think are correct for the test case I sent? Perhaps the encoding
>>> did not come across in the attachment.
>>> 
>>> We have a workaround for the Mac port of our app which would require
>>> overriding string.Normalize to p/invoke to Mac OS X's NSString library
>>> to do normalization. It would work, but we would prefer not to have to
>>> ship a custom build of Mono. The normalization on .NET appears to be
>>> "good enough" for our purposes and we'd just like our Mac version to be
>>> consistent.
>>> 
>>> Tom
>>> 
>>> -----Original Message-----
>>> From: Atsushi Eno [mailto:atsushieno at veritas-vos-liberabit.com]
>>> Sent: Wed 6/17/2009 7:51 PM
>>> To: Tom Philpot
>>> Cc: mono-devel-list at ximian.com
>>> Subject: Re: [Mono-dev] Unhandled Exception in Normalization.cs Combine()
>>> 
>>> You seem to have embedded a raw native encoding from your locale that
>>> is *not* readable in Japan. In any case, the input string you
>>> posted in the previous sample was already in FormC, so converting it
>>> to FormC will look like "doing nothing".
>>> 
>>> There is a standalone normalization test generated from the normalization
>>> conformance test in corlib/Mono.Globalization.Unicode. We fail
>>> about 26000. Far from good, but still better than the 35000 on .NET.
>>> 
>>> Atsushi Eno
>>> 
>>> Tom Philpot wrote:
>>>> Now, string.Normalize(NormalizationForm.FormC) doesn't do anything using
>>>> mono (r136228).
>>>> 
>>>> I've attached some test cases which will hopefully help in tracking down
>>>> what doesn't work.
>>>> 
>>>> On 6/15/09 1:58 AM, "Atsushi Eno" <atsushieno at veritas-vos-liberabit.com>
>>>> wrote:
>>>> 
>>>>> Hi again,
>>>>> 
>>>>> It should be now fixed in trunk.
>>>>> 
>>>>> Atsushi Eno
>>>>> 
>>>>> Atsushi Eno wrote:
>>>>>> I'll have a look. However, since 4 years have passed since I wrote it,
>>>>>> I'll have to revisit the spec, and that will take a fair amount of time.
>>>>>> 
>>>>>> Atsushi Eno
>>>>>> 
>>>> 
>>> 
>>> 
>> 
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: NormTest3.cs
Type: application/octet-stream
Size: 3201 bytes
Desc: not available
Url : http://lists.ximian.com/pipermail/mono-devel-list/attachments/20090618/f5930d60/attachment-0001.obj 

