Thu, 07 Oct 2004 19:35:32 -0400
At 06:57 AM 06/10/2004 -0400, Jonathan Pryor wrote:
>On Wed, 2004-10-06 at 03:59, Polton, Richard (IT) wrote:
>> If the char which is to be converted is 0661, say, then what will be the
>> value of the subtraction? Will it be 0661 - 0660 or will it be 0661 -
>> 0030? I assume that a literal '0' will always map to 0030 rather than
>> cleverly detect the range of digits that the char belongs to.
>Oh. Good point. (Why didn't I think of that?) The literal '0' is
>mapped to 0030, so you'd get U+0661 - U+0030, which is *way* too big.
>So I guess the code is broken. The question is, in what way? :-/
>Now the question is: what does Microsoft's implementation do? :-)
>Someone will have to throw U+0661 at Microsoft's
>Microsoft.VisualBasic.dll and see what the return value (or exception
>generated) is. They may require a value between '0' and '9', and all
>other "Nd" digits, such as U+0661, generate exceptions.
>Alternatively, Microsoft always subtracts from the proper value.
>We can do either of these, we just need to know which to do.
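To make the arithmetic in the quoted question concrete, here is a small sketch. It is written in Python purely for illustration and is not the Mono code under discussion; the variable names are made up for this example.

```python
# Illustration only: code point arithmetic for U+0661 (ARABIC-INDIC DIGIT ONE).
ch = "\u0661"

# Subtracting the literal '0' (U+0030), as the code under discussion does.
naive = ord(ch) - ord("0")

# Subtracting the zero of the digit's own range (U+0660) instead.
range_aware = ord(ch) - 0x0660

print(naive, range_aware)
```

The first value is far outside 0..9, which is the breakage described above; the second is the digit value you would get if the subtraction used the zero of the matching range.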
I just wrote a simple test program using some of the ranges listed two
posts back. I threw four strings at it: "375", "\u0663\u0667\u0665"
(Arabic-Indic), "\u09E9\u09ED\u09EB" (Bengali), and, for kicks, the native
Japanese representation, 5 kanji long,
"\u4E09\u767E\u4E03\u5341\u4E94" (sambyaku nanajuu go). Here are the results:
Arabic-Indic: ERROR: FormatException
Bengali: ERROR: FormatException
Japanese: ERROR: FormatException
When I append the Arabic-Indic digits to the ASCII digits (yielding
the string "375\u0663\u0667\u0665"), VB's Val() function returns 375. In
other words, int.Parse() should throw when it gets something that isn't in
['0', '9'] (or relevant punctuation), and
Microsoft.VisualBasic.Conversion.Val() should stop parsing when it reaches
the first such character.
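A rough sketch of the two behaviours described above, in Python for illustration only. The function names strict_parse and val_like are hypothetical stand-ins, not the real .NET APIs. Signs, whitespace, and decimal points are ignored here, and returning 0 when there are no leading digits is an assumption, not something I tested.

```python
ASCII_DIGITS = "0123456789"

def strict_parse(text):
    """Hypothetical stand-in for the int.Parse() behaviour: reject anything
    outside '0'..'9' (ValueError plays the role of FormatException)."""
    if not text or any(c not in ASCII_DIGITS for c in text):
        raise ValueError("not an ASCII digit string: " + repr(text))
    return int(text)

def val_like(text):
    """Hypothetical stand-in for the Val() behaviour: consume leading
    '0'..'9' characters and stop at the first character outside that range."""
    end = 0
    while end < len(text) and text[end] in ASCII_DIGITS:
        end += 1
    # Assumption: no leading digits yields 0.
    return int(text[:end]) if end else 0
```

With these definitions, strict_parse("\u0663\u0667\u0665") raises, while val_like("375\u0663\u0667\u0665") stops after the ASCII digits and returns 375.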
Hope this helps :-)