[Mono-devel-list] The first (attempt to checkin) managedcollation patch

Kornél Pál kornelpal at hotmail.com
Wed Jul 20 18:12:54 EDT 2005


> From: "Ben Maurer"
>      * There are extremely long runs of the same char in many instances
>      * The file seems to have tons of 0 bytes.
>      * There are some runs of sequences:
>
> 0002bfb0: 3c00 3d00 3e00 3f00 4000 4100 4200 4300  <.=.>.?.@.A.B.C.
> 0002bfc0: 4400 4500 4600 4700 4800 4900 4a00 4b00  D.E.F.G.H.I.J.K.
> 0002bfd0: 4c00 4d00 4e00 4f00 5000 5100 5200 5300  L.M.N.O.P.Q.R.S.
> 0002bfe0: 5400 5500 5600 5700 5800 5900 5a00 5b00  T.U.V.W.X.Y.Z.[.
>
>        though they are somewhat smaller than the runs of the same char.

I see the problem as follows: if the file contains uncompressed Unicode
characters, it eats disk space but is fast to read, so sorting is fast.
If it is compressed but unbuffered, sorting is slow and eats CPU.
If it is buffered, either because it is compressed or "just for fun", it eats
RAM.

So I think the decision should be made carefully.
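
To make the trade-off concrete, here is a minimal Python sketch of one way such a table could be compressed. It is not the format used by the patch; the function names and the (start, step, count) layout are assumptions made only for illustration. It collapses both kinds of runs Ben points out (repeated values and ascending sequences) and shows that reading the data back means expanding every run, which is where the extra CPU or RAM cost comes from.

```python
def encode_runs(units):
    """Collapse a list of 16-bit values into (start, step, count) triples.

    step is 0 for a run of identical values and 1 for an ascending
    sequence; an isolated value becomes (value, 0, 1).
    Hypothetical layout, chosen only for this sketch.
    """
    runs = []
    i = 0
    n = len(units)
    while i < n:
        start = units[i]
        step = 0
        count = 1
        # Look at the next value to decide which kind of run starts here.
        if i + 1 < n and units[i + 1] - start in (0, 1):
            step = units[i + 1] - start
            while i + count < n and units[i + count] == start + step * count:
                count += 1
        runs.append((start, step, count))
        i += count
    return runs


def decode_runs(runs):
    """Expand the triples back into the full list (the CPU/RAM cost)."""
    out = []
    for start, step, count in runs:
        out.extend(start + step * k for k in range(count))
    return out


# Made-up sample shaped like the dump above: a long run of zeros,
# the sequence 0x3C..0x5B, then a run of one repeated value.
sample = [0] * 1000 + list(range(0x3C, 0x5C)) + [0x7F] * 50
packed = encode_runs(sample)
print(len(sample), "values stored as", len(packed), "runs")
```

The uncompressed list can be indexed directly, while the packed form has to be either decoded on every lookup (CPU) or decoded once and kept around (RAM), which is exactly the choice described above.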

Kornél



