String performance boost (Re: [Mono-dev] Patch to boost speed ofUnicodeEncoding)
atsushi at ximian.com
Fri Mar 17 05:02:50 EST 2006
Your patch is based on pretty old code(!) ;-) I tried it for
The patch makes it faster. But I guess CopyChar() functionality
overlaps Paolo's memcpy() implementation in the latest code,
though it was not simply replaceable.
ditto. The improvement rate is like you showed, about 5%.
The patch resulted much better, as you showed. I think it is
the best improvement.
- GetHashCode(), Insert(), Remove(),
We already have managed implementation like yours.
- ToUpper(), ToLower():
Based on your patch, I rather made TextInfo.ToUpper() and
ToLower() based on pointer implementation, so the mind is already
checked in svn :-)
There was a one-liner bug in the implementation. Even after
fixing it, the result became much worse than existing one, about
1.5x slower :-(
It didn't make improvement - I think it is because we already
have managed implementation by now.
The patch made results much worse :-( It doubled execution time.
The patch was a bit worse, about 7-8% loss.
- PadLeft(), PadRight():
Almost no difference - and seems like it still has bugs (some
NUnit tests fail).
It looks pretty fast but it is too buggy to fix right now :(
I'm attaching a patch based on the original source, so anyone can
try them handy. Those which is buggy or makes performance worse
are commented out.
I think Trim() can be checked in, with a few changes
(changing CharCopy() to InternalStrcpy()). For CharCopy(),
I'd wait for further review. ToCharArray() is also nicer
to have, but it depends on CharCopy().
Thanks a bunch, Andreas.
Andreas Nahr wrote:
> It's just bad naming ;)
> The String.cs.old contains the managed implementations. Please note that
> there are usually multiple implementation-tries for each method and all
> but one is just commented out. Its extremely unlikely that the ones
> currently "active" are the best ones.
> Also I can remember that at least one of the stats is completely wrong
> because there was a bug in the implemenation that copied only half the
> If looking at the numbers I guess it is Replace because the difference
> is really huge
> I'll also attach the file I used for the microbenchmarks. But it is a
> complete mess and I doubt anyone could use anything from it.
>> Wow, the numbers are quite impressive. Can you please attach your
>> String.cs(.new) ? :-)
>> Atsushi Eno
>> Andreas Nahr wrote:
>>> Hi Zac, Hi Kornél,
>>> I've been working quite some time on improving the existing String
>>> class a
>>> long time ago (about 2-3 years), but as I got a new job back then
>>> never had
>>> the time to finish anything. My findings back then were that purely
>>> implementations basically always outperformed the internalcalls (And
>>> I guess
>>> the JIT is now even more evolved than it was 3 years ago).
>>> However as I sayed it was never finished and contains bugs. Moreover it
>>> doesn't care at all for alignment issues.
>>> If anyone want's to look at it - I attach my String-Testing class.
>>> find lots of different tries to optimize the Methods. But beware, the
>>> is in horrible shape, far from being usable.
>>> Some optimizations use specific string-domain knowlege (like "equals"
>>> testing first for the first char and after that from the end of the
>>> My conclusion was: We should have a few managed functions to do the work
>>> (MemoryCopy, MemoryCompare, possibly for Byte* and Char*). They
>>> should be
>>> managed so that optimizers and optimizing compilers are able to do
>>> optimizations even on IL level. Whenever possible the JIT should replace
>>> these at runtime (provided they aren't optimized away) with
>>> architecture-specific assembly versions with the managed version as
>>> Here are some findings from microbenchmarks made back then (the first
>>> is always the time in milliseconds for the existing unmanaged
>>> (internalcall), the second for a tested managed implementation, the
>>> in () is the length of the string tested):
>>> Some Methods still need internalcalls to create new Strings, but were
>>> faster than the native implementations (The optimum would be to
>>> InternalAllocateStr calls).
>>> CopyTo (000): 810 -> 381
>>> CopyTo (010): 832 -> 441
>>> CopyTo (100): 1352 -> 881
>>> CopyTo (512): 3395 -> 3014
>>> ToCharArray (000): 5067 -> 4466
>>> ToCharArray (002): 5317 -> 4857
>>> ToCharArray (015): 8041 -> 7691
>>> ToCharArray (960): 2915 -> 2894
>>> ToCharArray (with parameters): Similar to above
>>> Trim (): 6930 -> 6760
>>> Trim (custom search Chars): 10596 -> 9413
>>> Trim (default search Chars): 10455 -> 7210
>>> Trim (no trimmed chars, long string): 1893 -> 661
>>> Trim (no trimmed chars, short string): 1893 -> 631
>>> Replace (004 - one replace): 37264 -> 3135
>>> Replace (004 - nothing to replace): 3735 -> 310
>>> Replace (961 - everything replaced): 2584 -> 501
>>> Replace (961 - only last char replaced): 2463 -> 481
>>> Split (default split Chars, long string, lots splitted): 42421 -> 8523
>>> Split (custom split Chars, long string, non found): 2944 -> 2263
>>> Split (custom split Chars, long string, lots found): 22062 -> 7330
>>> Split (default split Chars, short string, non found): 2093 -> 761
>>> Split (default split Chars, short string, nearly only splitChars):
>>> 8002 ->
>>> IndexOf (17): 1132 -> 791
>>> IndexOf (2162): 10576 -> 7862
>>> LastIndexOf (similar to above)
>>> IndexOfAny (long string, nothing found): 25867 -> 2984 (Break even
>>> bei ca.
>>> LastIndexOfAny (similar to above)
>>> PadLeft/ PadRight: slightly slower that current (Should get faster once
>>> optimized CharCopy is available): 1012 -> 1031
>>> Remove: slightly slower that current (Should get faster once optimized
>>> CharCopy is available): 2153 -> 2283
>>> If somebody is interesting in picking this up I might be able to help a
>> Mono-devel-list mailing list
>> Mono-devel-list at lists.ximian.com
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
More information about the Mono-devel-list