String performance boost (Re: [Mono-dev] Patch to boost speed ofUnicodeEncoding)
Andreas Nahr
ClassDevelopment at A-SoftTech.com
Thu Mar 16 16:46:10 EST 2006
It's just bad naming ;)
The String.cs.old contains the managed implementations. Please note that
there are usually multiple implementation-tries for each method and all but
one is just commented out. Its extremely unlikely that the ones currently
"active" are the best ones.
Also I can remember that at least one of the stats is completely wrong
because there was a bug in the implemenation that copied only half the data.
If looking at the numbers I guess it is Replace because the difference is
really huge
I'll also attach the file I used for the microbenchmarks. But it is a
complete mess and I doubt anyone could use anything from it.
Andreas
> Wow, the numbers are quite impressive. Can you please attach your
> String.cs(.new) ? :-)
>
> Atsushi Eno
>
> Andreas Nahr wrote:
>> Hi Zac, Hi Kornél,
>>
>> I've been working quite some time on improving the existing String class
>> a
>> long time ago (about 2-3 years), but as I got a new job back then never
>> had
>> the time to finish anything. My findings back then were that purely
>> managed
>> implementations basically always outperformed the internalcalls (And I
>> guess
>> the JIT is now even more evolved than it was 3 years ago).
>> However as I sayed it was never finished and contains bugs. Moreover it
>> doesn't care at all for alignment issues.
>> If anyone want's to look at it - I attach my String-Testing class. You'll
>> find lots of different tries to optimize the Methods. But beware, the
>> code
>> is in horrible shape, far from being usable.
>> Some optimizations use specific string-domain knowlege (like "equals"
>> testing first for the first char and after that from the end of the
>> string)
>>
>> My conclusion was: We should have a few managed functions to do the work
>> (MemoryCopy, MemoryCompare, possibly for Byte* and Char*). They should be
>> managed so that optimizers and optimizing compilers are able to do
>> optimizations even on IL level. Whenever possible the JIT should replace
>> these at runtime (provided they aren't optimized away) with
>> architecture-specific assembly versions with the managed version as
>> fallback
>> mechanism.
>>
>>
>> Here are some findings from microbenchmarks made back then (the first
>> number
>> is always the time in milliseconds for the existing unmanaged
>> implementation
>> (internalcall), the second for a tested managed implementation, the
>> Number
>> in () is the length of the string tested):
>>
>> Some Methods still need internalcalls to create new Strings, but were
>> still
>> faster than the native implementations (The optimum would be to
>> internalize
>> InternalAllocateStr calls).
>>
>> CopyTo (000): 810 -> 381
>> CopyTo (010): 832 -> 441
>> CopyTo (100): 1352 -> 881
>> CopyTo (512): 3395 -> 3014
>>
>> ToCharArray (000): 5067 -> 4466
>> ToCharArray (002): 5317 -> 4857
>> ToCharArray (015): 8041 -> 7691
>> ToCharArray (960): 2915 -> 2894
>>
>> ToCharArray (with parameters): Similar to above
>>
>> Trim (): 6930 -> 6760
>> Trim (custom search Chars): 10596 -> 9413
>> Trim (default search Chars): 10455 -> 7210
>> Trim (no trimmed chars, long string): 1893 -> 661
>> Trim (no trimmed chars, short string): 1893 -> 631
>>
>> Replace (004 - one replace): 37264 -> 3135
>> Replace (004 - nothing to replace): 3735 -> 310
>> Replace (961 - everything replaced): 2584 -> 501
>> Replace (961 - only last char replaced): 2463 -> 481
>>
>> Split (default split Chars, long string, lots splitted): 42421 -> 8523
>> Split (custom split Chars, long string, non found): 2944 -> 2263
>> Split (custom split Chars, long string, lots found): 22062 -> 7330
>> Split (default split Chars, short string, non found): 2093 -> 761
>> Split (default split Chars, short string, nearly only splitChars):
>> 8002 ->
>> 5067
>>
>> IndexOf (17): 1132 -> 791
>> IndexOf (2162): 10576 -> 7862
>>
>> LastIndexOf (similar to above)
>>
>> IndexOfAny (long string, nothing found): 25867 -> 2984 (Break even bei
>> ca.
>> 75chars)
>>
>> LastIndexOfAny (similar to above)
>>
>> PadLeft/ PadRight: slightly slower that current (Should get faster once
>> optimized CharCopy is available): 1012 -> 1031
>>
>> Remove: slightly slower that current (Should get faster once optimized
>> CharCopy is available): 2153 -> 2283
>>
>>
>> If somebody is interesting in picking this up I might be able to help a
>> litte.
>> Andreas
>
> _______________________________________________
> Mono-devel-list mailing list
> Mono-devel-list at lists.ximian.com
> http://lists.ximian.com/mailman/listinfo/mono-devel-list
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: T.cs
Url: http://lists.ximian.com/pipermail/mono-devel-list/attachments/20060316/8b38fb5f/attachment.pl
More information about the Mono-devel-list
mailing list