[MonoDevelop] Code completion matching - input needed

Mike Krüger mkrueger at novell.com
Tue Aug 10 16:48:15 EDT 2010


Hi

Ok, we"ve a problem with the matching algorithm for the code completion. 
I think this is one of the most important parts in mono develop and I 
think we need opinions on that topic.

We've two places where we match with abbreviations:

1) The navigate to dialog
2) The code completion in the text editor

I know that both features are used very often (one by force - one by 
need :)). We had 2 algorithms for the matching.  They yield different 
results for the same input. We came to the conclusion that this >may< 
not the best thing to have. But that is even not decided to the end.

Therefore the 1st question is: Would you expect that both filters are 
the same - or not ?

Now to the algorithms - how they work. First the navigate to dialog 
algorithm. It matches a word, if the filter is a subsequence. For example:

The filter 'strm' would match 'Stream', 'stringMatch', 'StrongTypMatch' 
but also 'FirstStorm'.
It currently completly ignore scase - 'DBO' == 'dbo' and would match 
'dboField', 'DataBaseObject', 'DBOStuff' or 'OddNumberContainer'.

This makes sense for example when you have a term 'Autotools' and you 
search for 'tools' this is matched - the other algorithm won't match this.
The plus side of this approach is that it gives you >many< items in the 
case you're not really sure what to look for.

Ok now to the other in the code completion filter. This one does >not< 
do a full substring match instead I would call it subword prefix 
matcher. It breaks a word that is camel or pascal cased into subwords 
and it matches the filter at the word starts - it maes a difference if 
the filter is lower or upper case.

For example 'strm' would NOT match 'Stream' or 'FirstStorm' ,  but 
'StringMatch' or 'StrongTypeMatch'.
In the 'DBO' case it won't match for 'dboField' but 'DataBaseObject' or 
'DBOStuff' - because upper case letter enforces a word jump. 'dbo' would 
match the same but 'dboField' as well - but it would >never< match 
'OddNumberContainer'.

This one would miss 'Autotools' because it doesn't recognize the 'tools' 
part - but it would match 'AutoTools' on 'tools' ... but at the plus 
side it won't never give back 'TurboBooleans' if you search for 'tools'. 
Generally this approach will yield a subset of the results of the 1st 
approach.
The plus side of this approach is that you get very fast to the result 
you're looking for - for example if you hit 'g' you could end up with 
'string' in the other one - would never happen with this approach. And 
because it's just giving a subset this approach is faster.

Ok now we've the question: which algorithm to take. Do we need both ? 
When we take one for all occasions which one ? Are we completly idiots 
and another abbreviation approach would be better ?

Then another thing - I've attached both algorithms to play around with 
the approahes outside of monodevelop. The lane completion matcher is 
approach 1 and the BacktrackingCompletionMatcher is approach 2.
Currently in monodevelop the backtracking matcher is disabled by lluis 
and the lane completion matcher is taken - therfore it doesn't make 
sense to start monodevelop to test out the differences.

I've tried to present both approaches objectively - I hope I had success 
with that.

Regards
Mike
-------------- next part --------------
A non-text attachment was scrubbed...
Name: MatcherComparison.tar.bz2
Type: application/octet-stream
Size: 4403 bytes
Desc: not available
Url : http://lists.ximian.com/pipermail/monodevelop-list/attachments/20100810/23cf4eaa/attachment.obj 


More information about the Monodevelop-list mailing list