[Mono-dev] Regex and Unicode

Vladimir Giszpenc vgiszpenc at dsci.com
Wed Jul 9 10:02:51 EDT 2008

Hello list members,


Since .NET regular expressions do not support POSIX character classes, I
replace them with their Unicode equivalents.  See
http://www.regular-expressions.info/posixbrackets.html for the translation
table I used.
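
A minimal sketch of this kind of substitution, assuming a plain string
replacement is enough.  The PosixTranslator name and the three mappings
shown are illustrative assumptions, not the poster's actual code:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class PosixTranslator
{
    // Hypothetical subset of a POSIX-to-Unicode mapping; each value is a
    // Unicode general category escape that .NET regular expressions accept.
    static readonly Dictionary<string, string> Map = new Dictionary<string, string>
    {
        { "[:digit:]", @"\p{Nd}" },  // decimal digits
        { "[:upper:]", @"\p{Lu}" },  // uppercase letters
        { "[:lower:]", @"\p{Ll}" },  // lowercase letters
    };

    // Replaces every known POSIX class name in the pattern with its
    // Unicode escape. Escaped brackets are not handled in this sketch.
    public static string Translate(string pattern)
    {
        foreach (KeyValuePair<string, string> entry in Map)
        {
            pattern = pattern.Replace(entry.Key, entry.Value);
        }
        return pattern;
    }
}
```

With this sketch, PosixTranslator.Translate("[[:upper:]][[:lower:]]+")
yields [\p{Lu}][\p{Ll}]+, which can be passed to the Regex constructor.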


I ran into a problem with this entry:

POSIX       Description             ASCII       Unicode
[:alpha:]   Alphabetic characters   [a-zA-Z]    \p{L&}


The problem occurs in .NET too, so you might choose not to fix it, and I
would understand.  I hope, however, that you can fix it (and get MS to do
the same).


The regular expression \p{L&} throws an exception because the parser
expects either a letter or a closing brace where it finds the ampersand.


System.ArgumentException: parsing "\p{L&}" - Incomplete \p{X} character
Parameter name: \p{L&}
  at System.Text.RegularExpressions.Syntax.Parser.ParseUnicodeCategory () [0x000a8] in
>>> Rest of exception clipped <<<



using System;
using System.Text.RegularExpressions;

namespace test
{
      class MainClass
      {
            public static void Main(string[] args)
            {
                  // Running this code will throw an exception.
                  Regex r = new Regex(@"\p{L&}");
            }
      }
}


Since this might be by design, I have not filed a report in Bugzilla.  I
hope someone can first tell me how this is supposed to work.  Also, I am not
looking for a workaround, as I already have one.  This is more of an FYI.
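
For readers who hit the same exception, here is a minimal sketch of one
possible substitution.  It is not necessarily the workaround mentioned
above.  It assumes that \p{L&} is meant in the Perl sense of "cased letter"
(uppercase, lowercase, or titlecase), and the CasedLetters class name is
hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

static class CasedLetters
{
    // Assumed stand-in for \p{L&}: the union of the Lu, Ll and Lt
    // categories, each of which .NET accepts as a \p{...} escape.
    public const string Pattern = @"[\p{Lu}\p{Ll}\p{Lt}]";

    // Returns true when the whole input is exactly one cased letter.
    public static bool IsCasedLetter(string input)
    {
        return Regex.IsMatch(input, @"\A" + Pattern + @"\z");
    }
}
```

If any letter should match, including letters without case, \p{L} is also
accepted and may be the closer equivalent of [:alpha:].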





