[Mono-devel-list] First CIL Regex performance result

Eric Durand Tremblay eric.durand-tremblay.1 at ulaval.ca
Sun Feb 29 12:41:28 EST 2004


*First of all, great thanks to Miguel and Ben who took the time to look at our IL.  I will try to expose what limitation we have toward this problem.*

*From Miguel :*

	* In location 0x8c, you do a:

		blt IL_0096
		br  IL_0075

	IL_0096:
	  You can do a simple jump optimization, and change that to:

		bge IL_0075

*These two line of IL are independent (One is from emitBranch, the other by emitCharacter)  For exemple, a character matching is'nt necessary called by a branch.  It can be called directly by an anchor or anything else.*



	* In 0xba, you do something similar:

		IL_00ba:  beq IL_00c4
		IL_00bf:  br IL_0075
		IL_00c4:  br IL_012d

	  There is an obvious optimization there, but am puzzled by why do you
	  have the beq there in the first place, there is no compare there, and
	  various nops before, which makes me wonder about the intent.
*Here is the end of EmitCharacter it is dependent from the context ( what called it).  For the nop, I really don't understand why they are there.  Reflection.Emit put them automatically.  The beq compare the char loaded at IL_00a2 and the char loaded at IL_00b5(ldc.i4.s 0x61).  We will try to make it more simple.*

	* Loop inversion:
	  This way, you only branch one per test.
*Good, We will use it.*


	* Caching values?

	  Maybe you might want to copy at the function startup the scan_ptr value into
	  a local variable, and if you need the value afterwards, move the local to the
*	*  field later on.*
I tried this with no significant result.  I can put it back.*


See what mcs generates for:

	fixed (char c = &my_string){
	}
*Shure, the reason I did'nt use something like this it is that I did'nt know it was possible in C# ( no mutch experience in unmanaged code). Thanks.*

*From Ben :*

Ok, lemme add some of my own:

>> 	IL_000d:  ldc.i4 0
>> 	IL_0012:  add 
>
x + 0 = x

*The peace of code you this refer to resolve to (int anch_end = text_end - match_min + anch_offset)  
The 0 is anch_offset... You are right, I can test this (anch_offset is almost always 0)*


>Also, you have alot of code like:
>  
>
>> 	IL_0080:  ldarg.0 
>> 	IL_0081:  ldfld  int32 [System]System.Text.RegularExpressions.CILMachineBase::scan_ptr
>> 	IL_0086:  ldarg.0 
>> 	IL_0087:  ldfld  int32 [System]System.Text.RegularExpressions.CILMachineBase::text_end
>> 	IL_008c:  blt IL_0096
>  
>

I wonder if you should store these into locals, and then store them back
when you are done. Local storage should be much faster.

*Same remark than Miguel.*

>> 	IL_0069:  ret
>> 
>> //****  Branch Code 
>> 	IL_006a:  ldarg.0 
>> 	IL_006b:  callvirt instance void class [System]'System.Text.RegularExpressions.CILMachineBase'::'Checkpoint'()
>> 	IL_0070:  br IL_0080
>> 
>> 	IL_0075:  ldarg.0 
>> 	IL_0076:  callvirt instance void class [System]'System.Text.RegularExpressions.CILMachineBase'::'Backtrack'()
>> 	IL_007b:  br IL_00c9
>> 
>> //**** Eval Character 1
>> 	IL_0080:  ldarg.0 
>  
>

Ok, the arragnement here is really weird. If you move the block 6a-70 to
right above 80, you can avoid a branch.
*
Humm... the emiting of branch is somewhat iterative.  We don't know how many branch we will have and what code will be inside.  The only thing we can do is to put all control stuff on top and set a context stack to tell the compiler where to link back.  Maybe it is possible to Emit some BranchTail with the bactrack code.( save some Br )...  Here again, you are right.*

For debugging purposes, some of your code is really complex and hard to
understand. I realize it is not designed to be read, but you are going
ot make it hard on yourself if you emit code like:

*Yeah, we taught that Dup was more efficient than reloading the value.  
Note that in our code, eavry peace of IL is commented with it's conterpart in C#*

>> IL_00da:  br IL_0128
>> ...
>> IL_0128:  br IL_005f
>  
>

That can be simplified.

>> 	IL_002a:  blt IL_0068
>> 	IL_0063:  br IL_0028
>> 
>> 	IL_0068:  ldc.i4.0 
>> 	IL_0069:  ret
>  
>

The only branch to 68 is the one in 2a. So you can just move the code up there.

*Same commentary than above (same case).*

I think this shoudl get you on your way. Once you get these done, the
code should be alot cleaner and easier to optimize.

-- ben

*Yeah for shure.


We will make some try and come back to you with our results.

Thanks again.

Eric Durand Tremblay
TIP-MONO 
aeglos.dyndns.org/tip-mono
*

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.ximian.com/pipermail/mono-devel-list/attachments/20040229/e702b27d/attachment.html 


More information about the Mono-devel-list mailing list