[Mono-dev] Generic sharing: Good news, bad news, how to win big

Mon Apr 14 10:58:04 EDT 2008

Hey Rodrigo!

Thanks for the feedback!

> Isn't it possible or better to do RGCTX freeing at GC time? It would be
> simpler; the hardest part would be guarding against parking threads
> inside RGCTX-related code, which can be done with some link-time
> trickery and a bit of changes to the stack scanning code.

I'm not sure it would be simpler (to be honest, I don't know how much
work it would involve).  It would also add the MonoObject header
overhead, which is 8 or 16 bytes per RGCTX (on 32-bit and 64-bit
platforms, respectively).

> In Madrid we discussed using segfaults to trigger lazy filling of
> the rgctx; have you thought about using that?

Not seriously.  My first concern was getting everything to work
correctly, which it now does.  I'll concentrate on saving memory next,
so that sharing generic code actually makes sense.  Performance was
never an issue in my tests.

> I remember that a major issue with the rgctx layout was that you need to
> coordinate slot filling between a type and all its parents to avoid
> collisions. How would that work on your proposed schema?

I would still do the bookkeeping for collision avoidance and then use
the resulting slot number to uniquely identify the type information.
Just think of it as a sparse array.
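
To illustrate the sparse-array idea, here is a minimal sketch in C.  It
assumes that the slot numbers have already been assigned by the
collision-avoidance bookkeeping, and it stores only the slots that are
actually in use.  All names (SparseEntry, SparseRgctx,
sparse_rgctx_lookup, sparse_rgctx_set) and the fixed capacity are
hypothetical and chosen for illustration; this is not the actual Mono
RGCTX layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capacity, small on purpose: the sketch models a "small RGCTX". */
#define SPARSE_RGCTX_CAPACITY 8

/* One used slot: the slot number acts as the key into the sparse array. */
typedef struct {
    int slot;    /* slot number assigned by the collision-avoidance bookkeeping */
    void *info;  /* type information stored for that slot */
} SparseEntry;

/* Hypothetical sparse RGCTX: only the slots that were filled take up space. */
typedef struct {
    size_t count;                               /* number of entries in use */
    SparseEntry entries[SPARSE_RGCTX_CAPACITY]; /* unordered (slot, info) pairs */
} SparseRgctx;

/* Return the info stored for `slot`, or NULL if the slot was never filled. */
static void *sparse_rgctx_lookup(const SparseRgctx *ctx, int slot)
{
    for (size_t i = 0; i < ctx->count; i++) {
        if (ctx->entries[i].slot == slot)
            return ctx->entries[i].info;
    }
    return NULL;
}

/* Store `info` under `slot`. Returns 0 on success, -1 if the array is full. */
static int sparse_rgctx_set(SparseRgctx *ctx, int slot, void *info)
{
    /* Overwrite an existing entry for the same slot, if there is one. */
    for (size_t i = 0; i < ctx->count; i++) {
        if (ctx->entries[i].slot == slot) {
            ctx->entries[i].info = info;
            return 0;
        }
    }
    if (ctx->count == SPARSE_RGCTX_CAPACITY)
        return -1;
    ctx->entries[ctx->count].slot = slot;
    ctx->entries[ctx->count].info = info;
    ctx->count++;
    return 0;
}
```

With this layout, a context that uses only slots 3 and 117 occupies two
entries instead of 118, and a lookup is a short linear scan, which is
the kind of small lookup code that could plausibly live in a
trampoline.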

> How about using a
> pointer to the parent context? This would eliminate the whole issue, could
> save some bytes for parents with a fat rgctx and make it even less likely to have
> a large rgctx.

No, that doesn't work in the general case, because the type arguments
of the parent class might be different:

class B<T> : C<X<T>>

It would probably work in the special case where they are the same,
but I don't know if it's worth doing that kind of optimization,
especially since it makes the lookup code more complicated.  I'd like
to keep the lookup code for the small RGCTX small so that we can do it
in managed code (not inline, but in a trampoline).

> One more thing: your stats are missing something I think is important.
> How many generic sharing failures does each test suite have? This is
> important to see how much further this could be improved if constrained
> and mixed reference/valuetype sharing gets done.

Yes, I don't have those stats yet, but they're on my TODO list.

> It might be too early to think about this, but do you have some speed results
> for these tests?

Yes.  At least on x86 there is no noticeable speed difference between
sharing and not sharing.  I also did mini-benchmarks for List`1 and
Dictionary`2 and there was no speed difference either, in runs that
lasted about 30 seconds.

Mark