[Mono-devel-list] First CIL Regex performance result

Eric Durand Tremblay eric.durand-tremblay.1 at ulaval.ca
Tue Mar 9 11:41:17 EST 2004


Thanks to Varga Zoltan and Ben Maurer,

I corrected our implementation of Checkpoint() and Bactrack() to use 
Ben's Stack and I changed the alternation construct who now clear the 
stack when no bactrack is necessary.

Results are very promising. "a|b" take now 70% of the time used by the 
interpreter compared to 150% before these changes.

We will now continue to add new construct before doing some more 
optimization.

Set ([a-z]) and repetition (*,+) are expected to work at the end of this 
week.
We are still looking at Right To Left bugs.

Eric Durand Tremblay
TIP-MONO (university Laval)
http://aeglos.dyndns.org/tip-mono


Ben Maurer a écrit :

>Hello,
>
>If you really cant make it into a stack, try doing the following:
>
>CheckpointNode [] cpstack;
>int cpstackc;
>
>void PushCp (CheckpointNode n)
>{
>    if (cpstack == null) n = new CheckpointNode [16];
>    if (cpstack.Length == cpstackc) 
>         // realloc, copy, etc
>    cpstack [cpstackc++] = n;
>}
>
>CheckpointNode PopCp ()
>{
>    return cpstack [cpstackc--];
>}
>
>that will avoid boxing. I did that alot in xslt, with much success.
>
>  
>
>>>>Eric Durand Tremblay <eric.durand-tremblay.1 at ulaval.ca> 03/05/04 13:01 PM >>>
>>>>        
>>>>
>Hello all
>
>Thank you for the hint.
>
>Varga Zoltan a écrit :
>
>  
>
>>                                                  Hi,
>>
>>  I think the main problem here is the Checkpoint and Backtrace
>>functions in CILCompiler.cs:
>>
>>
>>This function is called approx 6 million times during the
>>regex test you
>> 
>>
>>    
>>
>This explain a lot but there is still some case like "c" who do not call 
>checkpoint at all and that are still slow.
>
>  
>
>>Also, not all Checkpoint calls have a matching Backtrace
>>call, s
>>
>>    
>>
>You are right,  it's a big mistake.
>
>  
>
>>So my advice is: make the Checkpoint and Backtrace calls
>>similar to
>>the ones used in the interpreter. 
>>
>>    
>>
>This is not possible because the interpreter do recursive call ( 
>implicit stack).   Since the compiled regex are evalued iteratively, we 
>must keep the checkpoints in a kind of stack.
>
>Say  :   (foo| | [ab]* | c)
>
>This regex will contain nested checkpoint backtrack construct. So, it is 
>a stack case.
>
>We will try to find an alternative.
>
>Eric Durand Tremblay
>
>_______________________________________________
>Mono-devel-list mailing list
>Mono-devel-list at lists.ximian.com
>http://lists.ximian.com/mailman/listinfo/mono-devel-list
>
>_______________________________________________
>Mono-devel-list mailing list
>Mono-devel-list at lists.ximian.com
>http://lists.ximian.com/mailman/listinfo/mono-devel-list
>
>  
>




More information about the Mono-devel-list mailing list