[Mono-bugs] [Bug 480178] System.Globalization.CharUnicodeInfo.GetUnicodeCategory() does not handle surrogate characters appropriately.

bugzilla_noreply at novell.com
Mon May 17 14:16:17 EDT 2010


http://bugzilla.novell.com/show_bug.cgi?id=480178

http://bugzilla.novell.com/show_bug.cgi?id=480178#c37


Damien Diederen <dd at crosstwine.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #362382|0                           |1
        is obsolete|                            |

--- Comment #37 from Damien Diederen <dd at crosstwine.com> 2010-05-17 18:16:16 UTC ---
Created an attachment (id=362712)
 --> (http://bugzilla.novell.com/attachment.cgi?id=362712)
create-category-table: Utility to generate reasonably-packed Unicode tables

This program generates (partially) bi-level tables encoding the
contents of the Unicode character category database.

Mono embeds a linear table of category codes for the Unicode BMP
(the first 65536 code points) and has no information about characters
in the astral planes, which leads to reports such as bug 480178.
Extending the linear table to cover the full codespace is not ideal,
as that would grow the embedded "blob" by a factor of 17.

The new tables generated by this program support the full range of
characters.  An additional level of indirection, used for characters
outside the U+0000..U+FFFF range, enables "page" sharing, so that the
total amount of embedded data grows by only 13.5 kB.
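
As a rough illustration of the layout described above (not the attached
utility itself), the following Python sketch builds a linear table for
the BMP and a shared-page table for everything beyond it.  The page
size of 256, the function names, and the use of Python's unicodedata
module in place of the Unicode database file are all assumptions made
for this example; the attachment may choose differently.

```python
import sys
import unicodedata

PAGE_SIZE = 256      # assumed page size; the real utility may use another
BMP_LIMIT = 0x10000  # first code point outside the BMP


def build_tables():
    """Build a linear BMP table plus a two-level table for astral planes."""
    # Level 0: one category per BMP code point, indexed directly.
    bmp = [unicodedata.category(chr(cp)) for cp in range(BMP_LIMIT)]

    pages = []     # unique pages of PAGE_SIZE categories each
    page_ids = {}  # page contents -> position in `pages`
    index = []     # one entry per astral page, pointing into `pages`
    for start in range(BMP_LIMIT, sys.maxunicode + 1, PAGE_SIZE):
        page = tuple(unicodedata.category(chr(cp))
                     for cp in range(start, start + PAGE_SIZE))
        # Identical pages (e.g. large unassigned ranges) are stored once.
        if page not in page_ids:
            page_ids[page] = len(pages)
            pages.append(page)
        index.append(page_ids[page])
    return bmp, index, pages


def category(cp, bmp, index, pages):
    """Look up the general category of code point `cp`."""
    if cp < BMP_LIMIT:
        return bmp[cp]  # direct lookup, as in the existing linear table
    offset = cp - BMP_LIMIT
    # Extra indirection: index entry selects a (possibly shared) page.
    return pages[index[offset // PAGE_SIZE]][offset % PAGE_SIZE]
```

Because most astral pages are entirely unassigned or uniformly private
use, many index entries point at the same page, which is where the
space saving comes from in this sketch.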

See the in-file comments for usage instructions.
