[Mono-list] C#/.NET Generics update and summary

Stefan Matthias Aust sma@3plus4.de
Sun, 30 Mar 2003 19:08:54 +0200


David Jeske wrote:

> I have another proposal to handle checked exceptions without the
> annoyance of Java. However, this would have to be in a C# superset
> like the Java superset you referred to.

Basically you say: add "throws" only to public methods, as these are
probably the interface methods other users care about, and where they
and their compilers should know about possible exceptions.

That might be a good compromise. But actually, I never found Java's
checked exceptions a problem.  Well, okay, there's one exception ;-)
Let's say you want to create an iterator that implements the
java.util.Iterator interface. Let's say that iterator should
iterate over some database stuff.  Every database operation might throw
an SQLException (a checked exception). But the Iterator interface
doesn't allow you to throw checked exceptions, as potential users of
iterators might not deal with them.  So you have to wrap your checked
exceptions in unchecked runtime exceptions, working around the too
strict default mechanism.  That's annoying.
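
A minimal sketch of that workaround, to make the annoyance concrete.
RowSource is a hypothetical stand-in for whatever database cursor you
have (it is not a JDBC type), and RowIterator is just an illustrative
name; only Iterator and SQLException are real library types here.

```java
import java.sql.SQLException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical row source: every call may fail with a checked SQLException. */
interface RowSource {
    boolean advance() throws SQLException;  // moves to the next row; false when exhausted
    String current() throws SQLException;   // value of the current row
}

/** Adapts a RowSource to java.util.Iterator, whose methods declare no checked exceptions. */
final class RowIterator implements Iterator<String> {
    private final RowSource source;
    private boolean fetched = false;  // true once we have looked ahead one row
    private boolean hasRow = false;

    RowIterator(RowSource source) { this.source = source; }

    @Override
    public boolean hasNext() {
        if (!fetched) {
            try {
                hasRow = source.advance();
            } catch (SQLException e) {
                // Iterator.hasNext() cannot declare SQLException, so wrap it.
                throw new RuntimeException(e);
            }
            fetched = true;
        }
        return hasRow;
    }

    @Override
    public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        fetched = false;
        try {
            return source.current();
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        final String[] rows = {"alice", "bob"};
        RowSource source = new RowSource() {   // in-memory source, never fails
            private int index = -1;
            public boolean advance() { return ++index < rows.length; }
            public String current() { return rows[index]; }
        };
        Iterator<String> it = new RowIterator(source);
        while (it.hasNext()) System.out.println(it.next());
    }
}
```

The caller can still get at the original failure through getCause(),
but the compiler no longer tells anybody that it might happen.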

Otherwise I don't buy your argument that adding "throws" declarations
is bad.  If you're working on something like, for example, a parser
which reads from an InputStream (an operation that might throw an
IOException), I've absolutely no problem with the idea that nearly every
parser method might have a "throws" declaration.  I wouldn't mind not
having to add declarations to non-public methods though.  However, as
protected methods especially are meant to be overridden by other users
in their subclasses, these methods might also need a throws declaration.
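
For illustration, here is roughly what such a parser looks like.  This
is a toy sketch, not a real parser: NumberListParser and its methods are
made-up names, and the grammar (decimal numbers separated by spaces) is
assumed just to keep it short.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Toy parser: sums space-separated decimal numbers read from a stream. */
final class NumberListParser {
    private final InputStream in;
    private int current;  // last byte read, or -1 at end of stream

    NumberListParser(InputStream in) throws IOException {
        this.in = in;
        advance();
    }

    // Every method that touches the stream, directly or not, declares IOException.
    private void advance() throws IOException {
        current = in.read();
    }

    private void skipSpaces() throws IOException {
        while (current == ' ') advance();
    }

    private int parseNumber() throws IOException {
        if (current < '0' || current > '9') {
            throw new IllegalArgumentException("unexpected character code: " + current);
        }
        int value = 0;
        while (current >= '0' && current <= '9') {
            value = value * 10 + (current - '0');
            advance();
        }
        return value;
    }

    int parseSum() throws IOException {
        int sum = 0;
        skipSpaces();
        while (current != -1) {
            sum += parseNumber();
            skipSpaces();
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        byte[] input = "1 22 3".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new NumberListParser(new ByteArrayInputStream(input)).parseSum());
    }
}
```

Only parseSum() would need the declaration under the "public methods
only" rule; the three private helpers are where the repetition is.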

I agree with you that the whole matter deserves some more thought.  Do
you already know Bruce Eckel's position paper against using checked
exceptions?

> The conciseness of my review obviously could not cover the detailed
> issues. 
> 
> In a dynamic typed system like Smalltalk or Python, there are many
> possible sources of overhead in using collections such as a hashtable:
>   a) dynamic method lookup to call method on key object to 
>      get its hash code

I don't consider dynamic method lookup an overhead as it must occur
everywhere in these languages, not only in collections.

>   c) overhead of handling basic datatypes such as integers or simple
>      records in a manner which is similar to "objects" for the runtime
>      (i.e.  usually involving memory allocations)

As you already wrote, if you had a cleverer unification of so-called
primitive data and objects than Java and C# have, this isn't
really a problem.  Smalltalk's small integers are typically represented
as encoded pointers, a very efficient trick used for decades in Lisp
systems.
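
A rough model of the trick, using a Java long as a stand-in for a
machine word.  The encoding shown (lowest bit set means "immediate
integer") is only one assumed convention; real systems differ in which
bits they use, and TaggedWord is an illustrative name.

```java
/** Models a machine word that holds either a small integer or an object reference. */
final class TaggedWord {
    // Assumed encoding: lowest bit 1 marks an immediate integer, 0 would mark a pointer.
    static long encode(int value) { return ((long) value << 1) | 1L; }

    static boolean isSmallInt(long word) { return (word & 1L) == 1L; }

    // The arithmetic shift keeps the sign of negative values.
    static int decode(long word) { return (int) (word >> 1); }

    /** Adds two tagged integers without decoding them: (2x+1) + (2y+1) - 1 = 2(x+y) + 1. */
    static long add(long a, long b) { return a + b - 1L; }

    public static void main(String[] args) {
        long sum = add(encode(20), encode(-3));
        System.out.println(isSmallInt(sum) + " " + decode(sum));  // prints "true 17"
    }
}
```

No memory is allocated for the integer at all; the value lives in the
word that would otherwise hold the pointer.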

> C# is faster at (a) and (b) by using C++ style static vtable lookups
> for methods. "really fancy" Smalltalk runtimes like the SELF/Smalltalk
> runtime which eventually became hotspot can sometimes optimize out
> this overhead at runtime if a single type appears in the
> hashtable.

What you call "really fancy" is actually the norm - not considering
simple interpreters like Squeak or Dolphin Smalltalk.  This kind of
lookup can actually be faster than vtable-style lookup.  If you use
vtables, then simple cases where you can statically determine the method
at compile time are faster.  The general call compiles to

  gosub obj->vtable[FOO_METHOD_INDEX];

and without further optimizations it is slower, because you always
need this indirect call, which is very harmful to branch prediction and
instruction caches.  A more dynamic approach is to use polymorphic
inline caches (PICs), which basically result in code like this

  c = getClassOf(obj);
  if (c == C1) gosub FOO_METHOD_ADDRESS;
  else if (c == C2) ...
  else do_general_call(FOO_METHOD);

which can be faster - especially if combined with inlining. This also
automatically devirtualizes methods - no need to distinguish virtual and
non-virtual methods at language level or even make non-virtual methods
the default (I really dislike this with C# as I always forget the
modifier and it results in method modifier clutter).
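
To show the mechanism in something you can run, here is a small
simulation of one such call site.  It is only a sketch: a real PIC is
machine code patched by the VM, whereas here the "general call" is
modelled by a Map lookup, and PicCallSite, MAX_ENTRIES and slowLookups
are names I made up for the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Simulates a single call site "obj.foo()" guarded by a polymorphic inline cache. */
final class PicCallSite {
    private static final class Entry {
        final Class<?> type;
        final Function<Object, String> target;
        Entry(Class<?> type, Function<Object, String> target) {
            this.type = type;
            this.target = target;
        }
    }

    private static final int MAX_ENTRIES = 2;                 // assumed cache size
    private final List<Entry> cache = new ArrayList<>();
    private final Map<Class<?>, Function<Object, String>> methods;  // stands in for the general lookup
    int slowLookups = 0;                                      // counts calls that missed the cache

    PicCallSite(Map<Class<?>, Function<Object, String>> methods) {
        this.methods = methods;
    }

    String call(Object receiver) {
        Class<?> c = receiver.getClass();
        for (Entry e : cache) {
            if (e.type == c) return e.target.apply(receiver);  // hit: plain class comparison
        }
        slowLookups++;                                         // miss: do the general call
        Function<Object, String> target = methods.get(c);
        if (target == null) {
            throw new IllegalArgumentException("no method foo for " + c.getName());
        }
        if (cache.size() < MAX_ENTRIES) cache.add(new Entry(c, target));
        return target.apply(receiver);
    }

    public static void main(String[] args) {
        Map<Class<?>, Function<Object, String>> methods = new HashMap<>();
        methods.put(String.class, o -> "string:" + o);
        methods.put(Integer.class, o -> "int:" + o);
        PicCallSite site = new PicCallSite(methods);
        System.out.println(site.call("a"));
        System.out.println(site.call("b"));   // same class as before: served from the cache
        System.out.println(site.call(1));
        System.out.println("slow lookups: " + site.slowLookups);  // prints "slow lookups: 2"
    }
}
```

In a real VM the cached target would be a direct (and possibly inlined)
call, which is where the speedup over the indirect vtable jump comes
from.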

> Java Hotspot JITs try to do the same thing. I'm not sure if
> the MS JIT does this, but it could.

I think they don't, as IIRC the .NET specification requires the vtable
architecture for objects.  They still could, and I think Intel's
research implementation uses PICs.

Another problem with simple vtables occurs with interfaces, which IMHO
should be the norm (or at least not be avoided because of performance
fears).  In this chunk of code

  I i = ...;
  i.m();

where

  interface I { void m(); }
  class A : I {
    virtual void n() ...
    virtual void m() ...
  }
  class B : I {
    virtual void m() ...
  }

you can't do an efficient dispatch based on vtables, because m() sits in
a different vtable slot in A (after n()) than in B, so a call through I
cannot use one fixed index.  For this reason, early Java VMs were much
slower on method calls via interface types than on ordinary method
calls.  I strongly hope that modern VMs don't have any problems anymore.
The more dynamic PICs have no penalty here.
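
A tiny model of the slot problem, assuming vtable slots are simply
assigned in declaration order (real layouts are up to the VM).  The
lists below stand in for the vtables of A and B from the example above;
VtableSlots and slotOf are illustrative names.

```java
import java.util.Arrays;
import java.util.List;

/** Models vtables as lists of method names, one slot per declared virtual method. */
final class VtableSlots {
    static final List<String> VTABLE_A = Arrays.asList("n", "m");  // class A declares n, then m
    static final List<String> VTABLE_B = Arrays.asList("m");       // class B declares only m

    /** The per-class search an interface call needs when no single index fits all classes. */
    static int slotOf(List<String> vtable, String method) {
        return vtable.indexOf(method);
    }

    public static void main(String[] args) {
        System.out.println("m in A: slot " + slotOf(VTABLE_A, "m"));  // prints "m in A: slot 1"
        System.out.println("m in B: slot " + slotOf(VTABLE_B, "m"));  // prints "m in B: slot 0"
    }
}
```

Since the index differs per class, "i.m()" needs some extra lookup step
keyed on the receiver's class before it can jump.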

> Generic C# improves (c). Most dynamic languages have some kind of
> hacks to handle this well, so the object allocation does not happen at
> the hashtable insertion.

Actually, using inlining the SELF way, you'd get the same advantage
without generics and everywhere, not only in collections.  Some help
from the user can of course simplify and speed up the JIT compiler.

> values. This is the biggest reason that integer keyed hash tables are
> FASTER on Python than C# today. Generics will fix this.

Only if they act like C++-style code generating templates and not like
ML-style parametric types... but at least, there's that opportunity.
And of course, I'd like a generic Array<String> much better than special
StringCollection classes as they exist now.
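
To make the distinction concrete: a code-generating template
instantiated for int keys could expand to something like the
hand-written table below, where keys and values sit in primitive arrays
and nothing is boxed.  This is only a sketch of the idea, not of how any
generics implementation actually specializes; IntIntTable is a made-up
name, and the table has a fixed capacity and no removal to keep it
short.

```java
/** Hand-specialized int-to-int hash table: no per-key or per-value objects. */
final class IntIntTable {
    private final int[] keys;
    private final int[] values;
    private final boolean[] used;
    private int size = 0;

    IntIntTable(int capacity) {  // capacity must be at least 2; the table never grows
        keys = new int[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
    }

    /** Finds the slot holding key, or the free slot where it would go (linear probing). */
    private int slotFor(int key) {
        int i = Math.floorMod(key, keys.length);
        // Terminates because put() always leaves at least one slot free.
        while (used[i] && keys[i] != key) {
            i = (i + 1) % keys.length;
        }
        return i;
    }

    void put(int key, int value) {
        int i = slotFor(key);
        if (!used[i]) {
            if (size == keys.length - 1) throw new IllegalStateException("table full");
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    int get(int key, int missing) {
        int i = slotFor(key);
        return used[i] ? values[i] : missing;
    }

    public static void main(String[] args) {
        IntIntTable table = new IntIntTable(8);
        table.put(7, 70);
        table.put(15, 150);  // collides with 7 and is probed into the next free slot
        System.out.println(table.get(7, -1) + " " + table.get(15, -1) + " " + table.get(99, -1));
        // prints "70 150 -1"
    }
}
```

With ML-style parametric types you would share one compiled body for
all element types, which typically means the ints get boxed again.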

> It's pretty much the same. It's the memory allocation and value
> indirection that kills. A C# could have an "integer pool" similar to
> Python for small integers. That would likely help things, but Generics
> are a better solution.

But that was my point.  In Java you need to do it yourself.  I never use
"new Integer()" in my projects if performance matters, but something like

   // ints is a preallocated Integer[12] holding the values -1..10
   static Integer make(int i) {
     return i >= -1 && i <= 10 ? ints[i + 1] : new Integer(i);
   }

I'd strongly hope that the .NET VES would internally do the same.  I'm
afraid it doesn't but there's still hope :-)
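
Spelled out in full, the pool could look like the sketch below.  The
class name SmallIntegers and the box() helper are only for illustration;
the range -1..10 is the one from my snippet above.  Note that newer JDKs
flag the Integer constructor as deprecated, hence the annotation.

```java
/** Preallocated pool of Integer objects for a small range of values. */
final class SmallIntegers {
    private static final int MIN = -1;
    private static final int MAX = 10;
    private static final Integer[] ints = new Integer[MAX - MIN + 1];

    static {
        // Fill the pool once, when the class is loaded.
        for (int i = 0; i < ints.length; i++) {
            ints[i] = box(i + MIN);
        }
    }

    // Always allocates a fresh object; newer JDKs deprecate this constructor.
    @SuppressWarnings({"deprecation", "removal"})
    private static Integer box(int i) {
        return new Integer(i);
    }

    /** Returns the pooled object for MIN..MAX and a freshly allocated one otherwise. */
    static Integer make(int i) {
        return i >= MIN && i <= MAX ? ints[i - MIN] : box(i);
    }

    public static void main(String[] args) {
        System.out.println(make(5) == make(5));        // prints "true": same pooled object
        System.out.println(make(1000) == make(1000));  // prints "false": two allocations
    }
}
```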

> Microsoft will use "Windows Update" and other mechanisms to push out
> the v2.0 CLI as fast as possible.

That's definitely an advantage over Sun.


> It is unclear how good ILX is. Obviously the ILX people think it is
> somewhat useful. This presentation pans ILX pretty bad:
> 
>  http://www.dcs.ed.ac.uk/home/stg/MRG/comparison/slides.pdf

Where do you read that?  Is it the "but" in the sentence saying that it
uses unmanaged code modules to implement closures?  Or the restriction
that ML and Haskell still have semantics (higher order function modules)
which cannot be directly represented?

> [...] This would bring the
> hashtable performance even closer to the C/C++ "theoretical max".

Assuming that is a goal, yes.  I'd love to try it out but higher
performance isn't something I really care about.  I'd love to see stuff
added to the CLI which would enable efficient dynamic languages like
Smalltalk, Lisp, Python or Ruby but unfortunately, I'm afraid that
probably will not happen in the near (or even far) future.  Too bad.
Only then, I think, could one really argue that the CLI is able to
support multiple *different* kinds of languages.


bye
-- 
Stefan Matthias Aust   //
www.3plus4software.de // Inter Deum Et Diabolum Semper Musica Est