[Mono-bugs] [Bug 480178] System.Globalization.CharUnicodeInfo.GetUnicodeCategory() does not handle surrogate characters appropriately.
bugzilla_noreply at novell.com
Wed Apr 21 15:40:49 EDT 2010
http://bugzilla.novell.com/show_bug.cgi?id=480178
http://bugzilla.novell.com/show_bug.cgi?id=480178#c17
--- Comment #17 from Damien Diederen <dd at crosstwine.com> 2010-04-21 19:40:47 UTC ---
Hi Paolo,
(In reply to comment #16)
> I was playing with a bi-level table compression myself a few months
> ago, so the general approach is fine by me.
Okay.
> On the specific implementation, I'm not sure some of the additional complexity
> in your changes is worth it. Let's consider a 256 byte page size. Any category
> lookup could be done with:
> char_data [char_start [val >> 8] + (val & 0xff)]
> this is more compact and in most cases should be better than the branchy code
> in your patch. Care to try that out or did you already test something similar?
Possibly. This implementation is a (more or less) direct port of
GLib's, which itself comes from libunicode, and I must admit I haven't
had time to research the history of that solution, nor to explore
alternative ones.
This is definitely worth trying and measuring, though. I will look
into it before submitting an updated series.
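For reference, the two-level lookup quoted above could be sketched in C roughly as follows. This is only a sketch under stated assumptions: the names char_data and char_start come from the quoted expression, but the table contents, the category codes, the 4-page range, and the init_tables() helper are hypothetical toy values for illustration, not Mono's or GLib's actual data.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BITS 8
#define PAGE_SIZE (1 << PAGE_BITS)  /* 256 entries per page */
#define PAGE_MASK (PAGE_SIZE - 1)   /* 0xff */
#define NUM_PAGES 4                 /* toy range: values 0x000..0x3ff */

/* Hypothetical category codes, not the real UnicodeCategory values. */
enum { CAT_OTHER = 0, CAT_UPPER = 1, CAT_LOWER = 2 };

/* Storage for the unique pages only: page 0 (ASCII) and one shared
 * all-CAT_OTHER page. Static storage is zero-initialized. */
static uint8_t char_data[2 * PAGE_SIZE];

/* For each page index (val >> 8), the offset of its data in char_data. */
static uint16_t char_start[NUM_PAGES];

/* Fill the toy tables. Real tables would be generated offline from the
 * Unicode data files and emitted as constant arrays. */
static void init_tables(void)
{
    int c, p;

    for (c = 'A'; c <= 'Z'; c++)
        char_data[c] = CAT_UPPER;
    for (c = 'a'; c <= 'z'; c++)
        char_data[c] = CAT_LOWER;

    char_start[0] = 0;              /* page 0 has its own data */
    for (p = 1; p < NUM_PAGES; p++)
        char_start[p] = PAGE_SIZE;  /* pages 1..3 share one page */
}

/* Branch-free lookup: one load for the page offset, one for the data. */
static uint8_t get_category(uint32_t val)
{
    assert(val < (uint32_t)NUM_PAGES * PAGE_SIZE);  /* toy range only */
    return char_data[char_start[val >> PAGE_BITS] + (val & PAGE_MASK)];
}
```

With these toy tables, and after calling init_tables(), get_category('A') yields CAT_UPPER, while get_category(0x250) resolves through the shared page and yields CAT_OTHER. The compression comes from pages with identical contents pointing at the same offset in char_data.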
> As for multiple versions of the data, we likely just want to use the
> latest, but once we have numbers about the cost of this we could
> reconsider.
Okay; I will focus on getting numbers for the simple lookup technique
first.