[Mono-dev] Regex and Unicode
Vladimir Giszpenc
vgiszpenc at dsci.com
Wed Jul 9 10:02:51 EDT 2008
Hello list members,
Since .NET regular expressions don't support POSIX bracket expressions, I
replace POSIX character classes with their Unicode equivalents. See
http://www.regular-expressions.info/posixbrackets.html for the translation
table I used.
I ran into a problem with this entry:
POSIX      Description            ASCII     Unicode
[:alpha:]  Alphabetic characters  [a-zA-Z]  \p{L&}
The problem occurs in .NET too, so you might choose not to fix it, and I would
understand. I hope, however, that you can fix it (and get MS to do the same).
The \p{L&} regular expression throws an exception because the parser treats
the ampersand as neither a valid category-name character nor a closing brace.
System.ArgumentException: parsing "\p{L&}" - Incomplete \p{X} character escape.
Parameter name: \p{L&}
  at System.Text.RegularExpressions.Syntax.Parser.ParseUnicodeCategory () [0x000a8] in /tmp/monobuild/build/BUILD/mono-1.9.1/mcs/class/System/System.Text.RegularExpressions/parser.cs:796
>>> Rest of exception clipped <<<
using System;
using System.Text.RegularExpressions;

namespace test
{
    class MainClass
    {
        public static void Main(string[] args)
        {
            // Running this code throws System.ArgumentException.
            Regex r = new Regex(@"\p{L&}");
        }
    }
}
Since this might be by design, I have not filed it in Bugzilla. I hope someone
can tell me how this is supposed to work first. Also, I am not looking for a
workaround, as I have one. This is more of an FYI.
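For anyone hitting the same exception: on the translation table cited above, \p{L&} stands for a cased letter, that is, the union of the Lu, Ll and Lt categories, each of which .NET accepts on its own. The following is a minimal sketch of one possible substitution, not necessarily the workaround mentioned above. ExpandCasedLetter is a hypothetical helper name, and the plain string replacement is only safe when \p{L&} appears outside a bracket expression.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CasedLetterDemo
{
    // Hypothetical helper (not part of .NET or Mono): rewrites the
    // unsupported \p{L&} token as a class of the three cased-letter
    // categories. Naive by design: it does not handle \p{L&} that is
    // already inside [...] and does not handle the negated form \P{L&}.
    public static string ExpandCasedLetter(string pattern)
    {
        return pattern.Replace(@"\p{L&}", @"[\p{Lu}\p{Ll}\p{Lt}]");
    }

    public static void Main()
    {
        // Anchored so the whole input must consist of cased letters.
        Regex r = new Regex("^" + ExpandCasedLetter(@"\p{L&}") + "+$");

        Console.WriteLine(r.IsMatch("Straße"));  // True: every character is Lu or Ll
        Console.WriteLine(r.IsMatch("abc123"));  // False: digits are category Nd
    }
}
```

If uncased letters (categories Lo and Lm) should also match, \p{L} is the broader alternative; either way the result covers more than the ASCII-only [a-zA-Z] reading of [:alpha:].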
Thanks,
Vlad