[Mono-list] conversions

Jonathan Gilbert 2a5gjx302@sneakemail.com
Thu, 07 Oct 2004 19:35:32 -0400


At 06:57 AM 06/10/2004 -0400, Jonathan Pryor wrote:
>On Wed, 2004-10-06 at 03:59, Polton, Richard (IT) wrote:
>> If the char which is to be converted is 0661, say, then what will be the
>> value of the subtraction? Will it be 0661 - 0660 or will it be 0661 -
>> 0030? I assume that a literal '0' will always map to 0030 rather than
>> cleverly detect the range of digits that the char belongs to.
>
>Oh.  Good point.  (Why didn't I think of that?)  The literal '0' is
>mapped to 0030, so  you'd get U+0661 - U+0030, which is *way* too big.
>
>So I guess the code is broken.  The question is, in what way? :-/
>
>Now the question is: what does Microsoft's implementation do? :-)
>
>Someone will have to throw U+0661 at Microsoft's
>Microsoft.VisualBasic.dll and see what the return value (or exception
>generated) is.  They may require a value between '0' and '9', and all
>other "Nd" digits, such as U+0661, generate exceptions.
>
>Alternatively, Microsoft always subtracts from the proper value.
>
>We can do either of these, we just need to know which to do.

I just wrote a simple test program using some of the digit ranges listed two
posts back. I threw four strings at it: "375" (plain ASCII digits, labelled
"Arabic" below), "\u0663\u0667\u0665" (Arabic-Indic digits),
"\u09E9\u09ED\u09EB" (Bengali digits), and, for kicks, the native Japanese
representation, 5 kanji long, "\u4E09\u767E\u4E03\u5341\u4E94" (sambyaku
nanajuu go). Here are the results:

int.Parse(..):
  Arabic: 375
  Arabic-Indic: ERROR: FormatException
  Bengali: ERROR: FormatException
  Japanese: ERROR: FormatException
VB.NET's Val(..):
  Arabic: 375
  Arabic-Indic: 0
  Bengali: 0
  Japanese: 0
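
A test program along these lines can be sketched as follows. This is only a
minimal sketch, not the exact program that produced the results above: the
names DigitProbe and TryIntParse are made up for illustration, and it assumes
the build references Microsoft.VisualBasic.dll.

```csharp
using System;
using Microsoft.VisualBasic;

static class DigitProbe
{
    // Labels follow the results listing: "Arabic" means plain ASCII digits.
    static readonly string[] Labels =
        { "Arabic", "Arabic-Indic", "Bengali", "Japanese" };

    static readonly string[] Inputs =
    {
        "375",
        "\u0663\u0667\u0665",
        "\u09E9\u09ED\u09EB",
        "\u4E09\u767E\u4E03\u5341\u4E94"
    };

    // Hypothetical helper: returns the parsed value as text, or
    // "ERROR: <exception type>" if int.Parse throws.
    public static string TryIntParse(string s)
    {
        try
        {
            return int.Parse(s).ToString();
        }
        catch (Exception e)
        {
            return "ERROR: " + e.GetType().Name;
        }
    }

    public static void Main()
    {
        Console.WriteLine("int.Parse(..):");
        for (int i = 0; i < Inputs.Length; i++)
            Console.WriteLine("  " + Labels[i] + ": " + TryIntParse(Inputs[i]));

        Console.WriteLine("VB.NET's Val(..):");
        for (int i = 0; i < Inputs.Length; i++)
            Console.WriteLine("  " + Labels[i] + ": " + Conversion.Val(Inputs[i]));
    }
}
```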

When I append the Arabic-Indic digits to the ASCII digits (yielding the
string "375\u0663\u0667\u0665"), VB's Val() function returns 375. In other
words, int.Parse() should throw when it gets a character outside the range
'0' to '9' (or relevant punctuation), and
Microsoft.VisualBasic.Conversion.Val() should stop parsing when it reaches
the first such character, returning 0 if there were no digits before it.
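
The two behaviours can be sketched like this. It is a simplified
illustration under stated assumptions, not Mono's or Microsoft's actual code:
AsciiDigits, ParseStrict and ValPrefix are hypothetical names, and signs,
whitespace and any other punctuation the real routines accept are ignored.
Note the explicit range check: char.IsDigit() would also accept other "Nd"
digits such as U+0661, and a bare c - '0' on U+0661 gives 0x0661 - 0x0030 =
1585 rather than 1.

```csharp
using System;

static class AsciiDigits
{
    // True only for U+0030..U+0039, unlike char.IsDigit, which accepts
    // every Unicode decimal digit ("Nd").
    static bool IsAsciiDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    // int.Parse-like: any character outside '0'..'9' is an error.
    public static int ParseStrict(string s)
    {
        if (s.Length == 0)
            throw new FormatException("Input string was empty.");

        int value = 0;
        foreach (char c in s)
        {
            if (!IsAsciiDigit(c))
                throw new FormatException("Unexpected character U+" + ((int)c).ToString("X4"));
            // checked: overflow raises OverflowException instead of wrapping.
            value = checked(value * 10 + (c - '0'));
        }
        return value;
    }

    // Val-like: consume leading ASCII digits and stop at the first other
    // character; with no leading digits the result is 0.
    public static double ValPrefix(string s)
    {
        double value = 0;
        foreach (char c in s)
        {
            if (!IsAsciiDigit(c))
                break;
            value = value * 10 + (c - '0');
        }
        return value;
    }
}
```

With these, ParseStrict("\u0663\u0667\u0665") throws a FormatException, while
ValPrefix("375\u0663\u0667\u0665") stops after the ASCII prefix.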

Hope this helps :-)

Jonathan Gilbert
