[MonoDevelop] Code completion problem

Sat Jun 17 06:32:53 EDT 2006

Hi

I have stumbled upon a very disturbing problem. When I realized that 
IClass doesn't have a list of IReturnType for its base classes I thought 
'why not changing the string collection of base types into a list of 
IReturnType' - surely, it is needed if we want generics in MonoDevelop. 
Well, since this was the first compromising change of the current 
architecture (the serialization process of IClass instances changed with 
this), I had to change some things in the Persistent<Type> classes. 
Soon, after having to change the behaviour of dozens of other classes, I 
have realized that we have a very serious problem.

Explanation: Currently the architecture's deserialization process has to 
resolve EVERY language element it deserializes from the persistent 
storage. Obviously this would not be necessary if the persistent storage 
would be serialized/deserialized as a whole (atomically). This is not 
only an unnecessary overhead far exceeding everything I've ever seen, 
it's also a big problem for coders (like me), because adding only one 
piece of support for whatever new code completion feature (in my case 
generics) will result in inevitable neck breaking, endless coding and 
total breakage of interrelated code (code so interrelated that no 
foreign programmer will ever be capable having an oversight of it).

My arguments underpinning the statement 'the current code completion 
architecture is fundamentally flawed':
- we get duplicated entries in the persistent storage (fully qualified 
names of types being just one of these)
- we get multiple utility classes which exist only because we have such 
a curious way of (de)serializing type information (look at all those 
'Persistent<Type>' classes or the ITypeResolver) - making up the 
spaghetti I was talking about...
- many of these utility classes do per element array copies in 'foreach' 
loops and a lot of O(n^2) or even more heavy algorithms to do things 
that would otherwise never be needed <-- this is done in the resolving 
process
- we have a lot of statements like these: 'return new ArrayList()' where 
we create a lot of temporary arrays just because some utility classes 
don't check whether some properties are 'null' or not. Now tell me, 
honestly, which operation is more expensive (speed- and memory-wise): 
creating a new ArrayList, or checking that it is 'null'?

Errr, why did Mike Krüger decide to store things this way? Is .NET's 
serialization so ineffective (bloated, slow?) that he wanted to 
sacrifice speed and in-memory space in favour of on-disk space 
consumption? Well, if disk space is a problem, we can still compress the 
code completion database... But essentially, I doubt this is a 
problem... If there are some insiders among you guys, please, tell me if 
and why I'm wrong - what have I overlooked.

Do I have a go for designing a new architecture, or do you guys have any 
warnings or objections for me. I'm aware that I might be wrong with some 
(or all) of my above assumptions.

If there will be no comments from you, I'll take it as a 'go' ;-)

Errr, yeah, I'm aware that it will take a long time before I rework the 
code completion system... :-)

Enjoy
---
Matej