[Mono-devel-list] First CIL Regex performance result
Eric Durand Tremblay
eric.durand-tremblay.1 at ulaval.ca
Tue Mar 9 11:41:17 EST 2004
Thanks to Varga Zoltan and Ben Maurer,
I corrected our implementation of Checkpoint() and Bactrack() to use
Ben's Stack and I changed the alternation construct who now clear the
stack when no bactrack is necessary.
Results are very promising. "a|b" take now 70% of the time used by the
interpreter compared to 150% before these changes.
We will now continue to add new construct before doing some more
optimization.
Set ([a-z]) and repetition (*,+) are expected to work at the end of this
week.
We are still looking at Right To Left bugs.
Eric Durand Tremblay
TIP-MONO (university Laval)
http://aeglos.dyndns.org/tip-mono
Ben Maurer a écrit :
>Hello,
>
>If you really cant make it into a stack, try doing the following:
>
>CheckpointNode [] cpstack;
>int cpstackc;
>
>void PushCp (CheckpointNode n)
>{
> if (cpstack == null) n = new CheckpointNode [16];
> if (cpstack.Length == cpstackc)
> // realloc, copy, etc
> cpstack [cpstackc++] = n;
>}
>
>CheckpointNode PopCp ()
>{
> return cpstack [cpstackc--];
>}
>
>that will avoid boxing. I did that alot in xslt, with much success.
>
>
>
>>>>Eric Durand Tremblay <eric.durand-tremblay.1 at ulaval.ca> 03/05/04 13:01 PM >>>
>>>>
>>>>
>Hello all
>
>Thank you for the hint.
>
>Varga Zoltan a écrit :
>
>
>
>> Hi,
>>
>> I think the main problem here is the Checkpoint and Backtrace
>>functions in CILCompiler.cs:
>>
>>
>>This function is called approx 6 million times during the
>>regex test you
>>
>>
>>
>>
>This explain a lot but there is still some case like "c" who do not call
>checkpoint at all and that are still slow.
>
>
>
>>Also, not all Checkpoint calls have a matching Backtrace
>>call, s
>>
>>
>>
>You are right, it's a big mistake.
>
>
>
>>So my advice is: make the Checkpoint and Backtrace calls
>>similar to
>>the ones used in the interpreter.
>>
>>
>>
>This is not possible because the interpreter do recursive call (
>implicit stack). Since the compiled regex are evalued iteratively, we
>must keep the checkpoints in a kind of stack.
>
>Say : (foo| | [ab]* | c)
>
>This regex will contain nested checkpoint backtrack construct. So, it is
>a stack case.
>
>We will try to find an alternative.
>
>Eric Durand Tremblay
>
>_______________________________________________
>Mono-devel-list mailing list
>Mono-devel-list at lists.ximian.com
>http://lists.ximian.com/mailman/listinfo/mono-devel-list
>
>_______________________________________________
>Mono-devel-list mailing list
>Mono-devel-list at lists.ximian.com
>http://lists.ximian.com/mailman/listinfo/mono-devel-list
>
>
>
More information about the Mono-devel-list
mailing list