[Mono-devel-list] [PATCH] String speedup

Paolo Molaro lupus at ximian.com
Tue Feb 24 12:12:06 EST 2004


On 02/23/04 Ben Maurer wrote:
> This tests the speed of copying strings of various lengths. On my box, the results were:
> Length | Before    | After    |
> -------+-----------+----------+
>    2   |   .388 s  |   .273 s |
>    5   |   .426 s  |   .436 s |
>    8   |   .419 s  |   .421 s |
[...]
>   47   |   .536 s  |   .937 s |
> -------+-----------+----------+
> 
> My implementation of memcpy/memmove is attached, with a little test driver.
> 
> So, it looks like right now, after len 2 strings, the cost of the icall
> becomes lower than the benefit of memcpy.

What about a different explanation? To me it looks like a crippled
managed memcpy implementation can give you results as bad as you want.
Attached is a first cut that doesn't try to optimize away the unaligned
accesses. On my system it beats the icall up to lengths of about 50-55
and is about 10% slower for lengths between 80 and 100 (and 10% is
definitely within the improvements we can gain in the jit). Also note
that it takes three 80-char copies with the icall to gain back the time
lost with a single 10-char copy. Note that it doesn't handle overlap, so
I'm not going to commit it: the few calls in StringBuilder that do need
overlap should be changed to call another function, so that the common
case is handled faster. I had hoped you would do that, but it looks like
you wanted to show how to write horrible and slow code instead.
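
To illustrate the split between the common case and the overlap case, here
is a minimal sketch in C. It is not the attached patch (which is managed
code); the names copy_forward and copy_overlapping are hypothetical, and the
4-byte word size is an assumption chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fast path: forward copy, one 32-bit word at a time.
 * It makes no attempt to align the accesses and does not handle the
 * case where dest starts inside the source range. */
static void copy_forward(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n >= sizeof(uint32_t)) {
        uint32_t w;
        /* Fixed-size memcpy is the portable way to express an
         * unaligned word load/store in C. */
        memcpy(&w, s, sizeof w);
        memcpy(d, &w, sizeof w);
        d += sizeof w;
        s += sizeof w;
        n -= sizeof w;
    }
    while (n--)              /* copy the remaining 0-3 bytes */
        *d++ = *s++;
}

/* Hypothetical slow path for the few callers that need overlap:
 * copy backward when dest starts inside [src, src + n), otherwise
 * fall through to the fast forward copy. */
static void copy_overlapping(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Unsigned subtraction wraps when dest < src, so this single
     * test is true only for src <= dest < src + n. */
    if ((uintptr_t)d - (uintptr_t)s < n) {
        while (n--)
            d[n] = s[n];
    } else {
        copy_forward(dest, src, n);
    }
}
```

Callers that know their buffers never overlap would use copy_forward
directly and skip the range test; only the overlap-prone call sites would
pay for copy_overlapping.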

> One other thought I had was somehow using the CPBLK instruction. We
> could make a method that was transformed into CPBLK by the jit. This way,
> we just have to optimize that opcode. Note that Mono runs the CPBLK
> benchmark 3x slower than MS does, so we may have to do some work. Also,

Trivially optimized with about 5 lines of C code. Anyone out there who
wants to start some jit hacking? No asm knowledge required, the results
should look something like:
Old:
$ mono -O=all,-intrins benchmark/bulkcpy.exe
Elapsed : 4046 ms.
New:
$ mono -O=all benchmark/bulkcpy.exe
Elapsed : 1359 ms.

On 02/23/04 Ben Maurer wrote:
> Some grepping shows that the old JIT had code generation for CPBLK, and
> it looked pretty fast. Maybe we can port that over?

Nope. Well, you're free to spend your time porting it, maybe you'll
learn something. Once you have ported it we'll show you why it is not
needed.

lupus

-- 
-----------------------------------------------------------------
lupus at debian.org                                     debian/rules
lupus at ximian.com                             Monkeys do it better


