[Mono-dev] [PATCH] Boost speed of UnicodeEncoding

Fri Mar 17 07:12:03 EST 2006

Oh.. :-) hehe..

I see at the bottom of your email that you ran into what I was afraid of.
That's why I added the 10-char limit for using the new method, so that
really small strings would use the old method. Hehe...
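
To make the cutoff idea concrete, here is a rough sketch in C rather than the C# of the actual patch. Only the 10-char limit comes from the text above; the function names, the little-endian target and the block-copy fast path are assumptions made for illustration, not the code in either patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SMALL_LIMIT 10  /* the 10-char cutoff mentioned above */

/* Hypothetical "old method": one code unit at a time, using shifts. */
static void encode_le_simple(const uint16_t *src, size_t n, uint8_t *dst)
{
    for (size_t i = 0; i < n; i++) {
        dst[2 * i]     = (uint8_t)(src[i] & 0xFF);  /* low byte first */
        dst[2 * i + 1] = (uint8_t)(src[i] >> 8);    /* then high byte */
    }
}

/* Returns 1 when the host stores the low byte of a 16-bit value first. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}

/* Hypothetical "new method": a block copy when the host byte order
 * already matches the target encoding, otherwise the simple loop. */
static void encode_le_bulk(const uint16_t *src, size_t n, uint8_t *dst)
{
    if (host_is_little_endian())
        memcpy(dst, src, n * sizeof(uint16_t));
    else
        encode_le_simple(src, n, dst);
}

/* Dispatch: short inputs skip the setup cost of the bulk path. */
void encode_utf16le(const uint16_t *src, size_t n, uint8_t *dst)
{
    if (n < SMALL_LIMIT)
        encode_le_simple(src, n, dst);
    else
        encode_le_bulk(src, n, dst);
}
```

Both paths must produce identical bytes; the limit only decides which one pays off for a given length, so the right value is something to measure rather than assume.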

-- 
Zac Bowling
http://zacbowling.com/

----- Message from kornelpal at hotmail.com ---------
    Date: Thu, 16 Mar 2006 23:59:53 +0100
    From: Kornél Pál <kornelpal at hotmail.com>
Reply-To: Kornél Pál <kornelpal at hotmail.com>
Subject: Re: [Mono-dev] [PATCH] Boost speed of UnicodeEncoding
      To: Atsushi Eno <atsushi at ximian.com>

> Hi,
>
> Originally I didn't plan to create a patch; I only made some suggestions. But
> then I realized that the current UnicodeEncoding is too inefficient.
>
> So I implemented my idea in UnicodeEncoding.
>
> UnicodeEncodingPerformance.cs is the test I used.
>
> Results:
> Before:
> 1, string to byte[], same: 265
> 1, char[] to byte[], same: 282
> 1, byte[] to char[], same: 453
> 1, string to byte[], diff: 265
> 1, char[] to byte[], diff: 266
> 1, byte[] to char[], diff: 453
> 4, string to byte[], same: 672
> 4, char[] to byte[], same: 703
> 4, byte[] to char[], same: 594
> 4, string to byte[], diff: 656
> 4, char[] to byte[], diff: 609
> 4, byte[] to char[], diff: 641
> 1024, string to byte[], same: 1406
> 1024, char[] to byte[], same: 1391
> 1024, byte[] to char[], same: 922
> 1024, string to byte[], diff: 1297
> 1024, char[] to byte[], diff: 1281
> 1024, byte[] to char[], diff: 1250
> 1048576, string to byte[], same: 3453
> 1048576, char[] to byte[], same: 2500
> 1048576, byte[] to char[], same: 1515
> 1048576, string to byte[], diff: 2734
> 1048576, char[] to byte[], diff: 1407
> 1048576, byte[] to char[], diff: 1312
>
>
> After:
> 1, string to byte[], same: 578
> 1, char[] to byte[], same: 563
> 1, byte[] to char[], same: 844
> 1, string to byte[], diff: 328
> 1, char[] to byte[], diff: 359
> 1, byte[] to char[], diff: 578
> 4, string to byte[], same: 578
> 4, char[] to byte[], same: 563
> 4, byte[] to char[], same: 812
> 4, string to byte[], diff: 391
> 4, char[] to byte[], diff: 406
> 4, byte[] to char[], diff: 594
> 1024, string to byte[], same: 47
> 1024, char[] to byte[], same: 47
> 1024, byte[] to char[], same: 62
> 1024, string to byte[], diff: 203
> 1024, char[] to byte[], diff: 204
> 1024, byte[] to char[], diff: 203
> 1048576, string to byte[], same: 391
> 1048576, char[] to byte[], same: 375
> 1048576, byte[] to char[], same: 375
> 1048576, string to byte[], diff: 984
> 1048576, char[] to byte[], diff: 391
> 1048576, byte[] to char[], diff: 375
>
> Note these are the results of two actual executions so they are not fully
> representative.
>
> As you can see, converting 1 character became slower, but longer strings are
> converted much faster (4 bytes, for example). Just to show how inefficient
> the old code was: converting 1024 characters is about 20-30 times faster than
> it was before.
>
> I think converting a single character should not be optimized, as doing so is
> already inefficient. It's much faster to convert it inline using shift
> operators.
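
For reference, the inline shift conversion for a single character could look roughly like this. It is sketched in C for illustration, and the helper names are made up here; they are not from the patch.

```c
#include <stdint.h>

/* Write one UTF-16 code unit as two bytes, high byte first. */
static inline void put_utf16be(uint16_t ch, uint8_t *dst)
{
    dst[0] = (uint8_t)(ch >> 8);
    dst[1] = (uint8_t)(ch & 0xFF);
}

/* Write one UTF-16 code unit as two bytes, low byte first. */
static inline void put_utf16le(uint16_t ch, uint8_t *dst)
{
    dst[0] = (uint8_t)(ch & 0xFF);
    dst[1] = (uint8_t)(ch >> 8);
}

/* Read the code unit back from two big-endian bytes. */
static inline uint16_t get_utf16be(const uint8_t *src)
{
    return (uint16_t)((src[0] << 8) | src[1]);
}

/* Read the code unit back from two little-endian bytes. */
static inline uint16_t get_utf16le(const uint8_t *src)
{
    return (uint16_t)((src[1] << 8) | src[0]);
}
```

No buffers are allocated and no encoder object is involved, which is why this beats any general-purpose call for one character.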
>
> Please review and approve the patch.
>
> Kornél
>
> ----- Original Message -----
> From: "Atsushi Eno" <atsushi at ximian.com>
> To: "Kornél Pál" <kornelpal at hotmail.com>
> Cc: <mono-devel-list at lists.ximian.com>; "Zac Bowling" <zac at zacbowling.com>
> Sent: Wednesday, March 15, 2006 11:10 PM
> Subject: Re: [Mono-dev] Patch to boost speed of UnicodeEncoding
>
>
>> Hi,
>>
>> It's always nice if encoding conversion stuff gets faster. Can you
>> also show how much faster it becomes when you finish writing the patch?
>>
>> Thx,
>> Atsushi Eno
>>
>>
>> Kornél Pál wrote:
>>> Hi,
>>>
>>> I think doing something like in the attached draft is faster. No new String
>>> object is created. Arrays are accessed using pointers. And I think there is
>>> no use in using a more complicated conversion method for short strings.
>>>
>>> This draft is very unsafe. It lacks any checks and does not perform any
>>> special character or byte sequence handling.
>>>
>>> Note that I haven't done any tests to determine whether using byte pointers
>>> or using int pointers and shift operations to swap bytes is faster. But
>>> mixing bytes and ints results in two different code paths for big and little
>>> endian encodings, while byte swapping can be performed using a single code
>>> path when using only bytes or only ints.
>>>
>>> Kornél
>> _______________________________________________
>> Mono-devel-list mailing list
>> Mono-devel-list at lists.ximian.com
>> http://lists.ximian.com/mailman/listinfo/mono-devel-list
>>
>

----- End message from kornelpal at hotmail.com -----
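
On the byte pointer versus int pointer question from the original draft, the two variants might be sketched as below. This is C rather than C#, the function names are hypothetical, and like the draft it does no validation or surrogate handling. Each variant is a single code path that gives the same result on big- and little-endian hosts, which is the property described in the quoted message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-only variant: swap the two bytes of every 16-bit unit. */
void swap16_bytes(const uint8_t *src, uint8_t *dst, size_t nbytes)
{
    for (size_t i = 0; i + 1 < nbytes; i += 2) {
        uint8_t first = src[i];   /* saved so that src == dst also works */
        dst[i]     = src[i + 1];
        dst[i + 1] = first;
    }
}

/* Int-only variant: swap two 16-bit units per step with masks and shifts. */
void swap16_words(const uint8_t *src, uint8_t *dst, size_t nbytes)
{
    size_t i = 0;
    for (; i + 4 <= nbytes; i += 4) {
        uint32_t w;
        memcpy(&w, src + i, 4);   /* avoids unaligned pointer casts */
        w = ((w & 0x00FF00FFu) << 8) | ((w >> 8) & 0x00FF00FFu);
        memcpy(dst + i, &w, 4);
    }
    if (i + 2 <= nbytes) {        /* one leftover 16-bit unit */
        uint8_t first = src[i];
        dst[i]     = src[i + 1];
        dst[i + 1] = first;
    }
}
```

Which of the two is faster depends on the compiler and CPU, so it would need the same kind of timing run as the results quoted above.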