[MonoDevelop] Spell checking in monodevelop

Thu Apr 2 09:55:12 EDT 2009

I have a maths lecturer helping me with some of the algorithms (basically better ones than I had in the first place). I've also found a load of training data to start off with. It won't have the correct balance of spelling errors, but it should be able to train the checker reasonably well.

Anyhow, I'm going to write the core of it quite quickly, run a load of training data through it, and see what its performance is like compared to hunspell. If the performance is significantly better then I can just release it, get as many people using it as possible, and then maybe look at it every 9 months or so to see what could be tuned or improved based on real-world data. If it's not significantly better then I'll just use hunspell and go back to improving my spell checker (or rethinking it) in a few years' time.
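A minimal sketch of how that comparison could be scored, assuming the training data is a list of (misspelling, intended word) pairs. Everything named here is hypothetical: `baseline_suggest` is only a stand-in built on Python's `difflib`, not hunspell itself, and `candidate_suggest` is a toy version of a checker that has learned some common mistakes.

```python
from difflib import get_close_matches
from typing import Callable, Iterable, List, Tuple

Suggester = Callable[[str], List[str]]

def accuracy_at_k(suggest: Suggester,
                  pairs: Iterable[Tuple[str, str]],
                  k: int = 1) -> float:
    """Fraction of misspellings whose intended word is in the first k suggestions."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(1 for wrong, right in pairs if right in suggest(wrong)[:k])
    return hits / len(pairs)

# Hypothetical word list; a real dictionary would be far larger.
WORDS = ["their", "there", "receive", "pattern", "dictionary"]

def baseline_suggest(word: str) -> List[str]:
    """Stand-in baseline (NOT hunspell): rank dictionary words by string similarity."""
    return get_close_matches(word, WORDS, n=5, cutoff=0.0)

# Hypothetical table of mistakes learned from training data.
LEARNED = {"recieve": "receive", "thier": "their"}

def candidate_suggest(word: str) -> List[str]:
    """Toy candidate: put a learned correction first, then fall back to the baseline."""
    ranked = baseline_suggest(word)
    known = LEARNED.get(word)
    if known is None:
        return ranked
    return [known] + [w for w in ranked if w != known]

if __name__ == "__main__":
    test_pairs = [("recieve", "receive"), ("thier", "their"), ("patern", "pattern")]
    for name, fn in [("baseline", baseline_suggest), ("candidate", candidate_suggest)]:
        print(name, accuracy_at_k(fn, test_pairs, k=1))
```

Scoring both checkers on the same held-out pairs, at k=1 and at a larger k, gives a like-for-like number to decide whether the new checker is "significantly better".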

I usually have lots of projects on the go at once, and I'm quite good at getting them to a point where they're production ready, then either having someone else pick up from me, working on them a little less (just doing support and not really adding new features), or leaving them for a while and coming back to them.

I have high standards for what is production ready, so anything I work on in MD will be very good before I take a supporting role or switch to developing something else (in MD or in another open source project). I also intend to use MD at work, so making it good is very important to me.

-----Original Message-----
From: Michael Hutchinson [mailto:m.j.hutchinson at gmail.com] 
Sent: 02 April 2009 01:29
To: Oliver Stieber
Cc: monodevelop-list at lists.ximian.com
Subject: Re: [MonoDevelop] Spell checking in monodevelop

On Mon, Mar 30, 2009 at 8:05 AM, Oliver Stieber
<oliver.stieber at ukplc.net> wrote:
> The best bit is that I plan to have a centralized server as well as the client app.
> The server takes all the data from everyone's spelling mistakes (provided they don't
> turn data collection off, in which case they're not going to be much better off than
> running hunspell, because they would need to pull a partial snapshot of the spelling
> database down from the server on first use) and turns it into a huge knowledge
> base of spelling mistake patterns, words not in the dictionary, and user profiles
> that can be pulled down to any machine with the spell checker on it, plus group
> dictionaries so that users can share words that shouldn't be in the main
> dictionary with everyone in their office.

This sounds like a brilliant idea, but I can't help but worry that
it'll grow into a bigger and more time-consuming project than you
anticipate. There are many interesting things that such a service
could be extended to do, for example finding incorrect homophones
based on statistical analysis of context.
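
As a rough illustration of the homophone idea, here is a minimal sketch that flags a word when the preceding word is seen far more often with one of its homophones. The confusion sets, the bigram model, the add-one smoothing, and the `ratio` threshold are all assumptions chosen for illustration, not a description of any existing checker.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical confusion sets; a real list would be far larger.
HOMOPHONE_SETS = [{"their", "there"}, {"to", "too", "two"}]

def train_bigrams(sentences: Iterable[str]) -> Counter:
    """Count adjacent word pairs in a corpus of plain sentences."""
    counts: Counter = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts

def flag_homophones(sentence: str, counts: Counter,
                    ratio: float = 3.0) -> List[Tuple[int, str, str]]:
    """Return (index, word, suggested homophone) for suspicious words.

    A word is flagged when the best alternative follows the previous word
    at least `ratio` times as often (with add-one smoothing).
    """
    tokens = sentence.lower().split()
    flagged = []
    for i in range(1, len(tokens)):
        prev, word = tokens[i - 1], tokens[i]
        for group in HOMOPHONE_SETS:
            if word not in group:
                continue
            # sorted() makes tie-breaking between alternatives deterministic.
            best = max(sorted(group - {word}), key=lambda alt: counts[(prev, alt)])
            if (counts[(prev, best)] + 1) / (counts[(prev, word)] + 1) >= ratio:
                flagged.append((i, word, best))
    return flagged

if __name__ == "__main__":
    corpus = [
        "they went over there",
        "over there is the shop",
        "go over there now",
        "i like their dog",
    ]
    model = train_bigrams(corpus)
    print(flag_homophones("they went over their", model))
```

With only one word of left context this will miss plenty of cases; a larger context window and far more data would be needed in practice.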

-- 
Michael Hutchinson
http://mjhutchinson.com