[Mono-list] C#/.NET Generics update and summary

David Jeske jeske@chat.net
Sat, 29 Mar 2003 09:34:42 -0800


On Sat, Mar 29, 2003 at 12:48:16PM +0100, Stefan Matthias Aust wrote:
> For me, improved type safety is the more important point.  

I have another proposal to handle checked exceptions without the
annoyance of Java. However, this would have to be in a C# superset
like the Java superset you referred to.

  http://mozart.chat.net/~jeske/unsolicitedDave/csharp_checked_exception_proposal.html

> Smalltalk - I know for sure - has no concept of static types, therefore 
> no concept for typecasts whatsoever and therefore has no such overhead 
> and its collections work with every type of object without any runtime 
> penalty.  So I don't think that your comparison works.  The performance 
> is worse because to regain static type-safeness, both Java and C# have 
> to inject runtime type checks into the getter methods.

My review was concise, so it obviously could not cover the detailed
issues.

In a dynamically typed system like Smalltalk or Python, there are many
possible sources of overhead in using collections such as a hashtable:
  a) dynamic method lookup to call a method on the key object to
     get its hash code
  b) dynamic method lookup to call hash table methods (get/set)
  c) overhead of handling basic datatypes such as integers or simple
     records in a manner which is similar to "objects" for the runtime
     (i.e. usually involving memory allocations)
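
To make (a) concrete, here is a minimal Python sketch. CountingKey is a
made-up class for illustration only. It counts how often the runtime
dispatches to the key's hash method during ordinary dict use.

```python
class CountingKey:
    """Hypothetical key type that counts calls to its hash method."""

    hash_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # The dict reaches this method by dynamic lookup on the key's type.
        CountingKey.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.value == other.value


def count_hash_calls():
    """Do one insert and one lookup; return how many times __hash__ ran."""
    CountingKey.hash_calls = 0
    table = {}
    key = CountingKey(42)
    table[key] = "answer"   # one dynamic __hash__ call for the insert
    _ = table[key]          # and another one for the lookup
    return CountingKey.hash_calls
```

Every insert and every lookup pays for one such dispatch, which is the
cost a statically compiled vtable call avoids.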

C# is faster at (a) and (b) because it uses C++-style static vtable
lookups for methods. "Really fancy" Smalltalk runtimes, like the
SELF/Smalltalk runtime which eventually became HotSpot, can sometimes
optimize out this overhead at runtime if a single type appears in the
hashtable. Java HotSpot JITs try to do the same thing. I'm not sure if
the MS JIT does this, but it could.
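
The "single type appears in the hashtable" case is roughly what an
inline cache exploits. The sketch below is only an illustration of the
bookkeeping, written in Python; CallSite and misses_for are hypothetical
names, and a real runtime does this in generated machine code rather
than in a helper class.

```python
class CallSite:
    """Hypothetical monomorphic inline cache for one method call site."""

    def __init__(self, method_name):
        self.method_name = method_name
        self.cached_type = None
        self.cached_method = None
        self.misses = 0

    def call(self, receiver):
        receiver_type = type(receiver)
        if receiver_type is not self.cached_type:
            # Slow path: full dynamic lookup, then remember the result.
            self.misses += 1
            self.cached_type = receiver_type
            self.cached_method = getattr(receiver_type, self.method_name)
        # Fast path: one type test, then a direct call.
        return self.cached_method(receiver)


def misses_for(keys):
    """Hash every key through one call site; return the lookup count."""
    site = CallSite("__hash__")
    for key in keys:
        site.call(key)
    return site.misses
```

With keys of a single type the full lookup happens once. If the key
types alternate, the cache misses on every call and nothing is gained.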

Generic C# improves (c). Most dynamic languages have some kind of
hack to handle this well, so that the object allocation does not
happen at the hashtable insertion. For example, Python pre-allocates
some number of "integer objects" so it can use and reuse them without
allocation. C# and Java must currently allocate an object to "box"
each value. This is the biggest reason that integer-keyed hash tables
are FASTER in Python than in C# today. Generics will fix this.
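
The Python integer pool can be observed directly. This is a small
sketch that assumes CPython, where the pool is an implementation detail
covering small values only; same_object is a hypothetical helper name.

```python
def same_object(n):
    """Compute n + 1 twice and report whether both results are one object."""
    first = n + 1
    second = n + 1
    return first is second
```

On CPython, same_object(100) is true because both results come from the
pre-allocated pool, while same_object(10**6) is false because each
addition allocates a fresh integer object.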

> Java furthermore suffers because you have to explicitly box and unbox 
> primitive types.  It might be possible that .NET could provide faster 
> boxing and unboxing because the VM deals with this issue but I don't 
> know. 

It's pretty much the same. It's the memory allocation and value
indirection that kills performance. A C# runtime could have an
"integer pool" similar to Python's for small integers. That would
likely help things, but Generics are a better solution.
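
As a rough analogy for the allocation and indirection cost, the Python
sketch below compares a list (pointers to separate integer objects)
with array('q') from the standard array module (raw 64-bit values
stored inline). storage_bytes is a hypothetical helper, and the exact
byte counts depend on the interpreter and platform.

```python
import sys
from array import array


def storage_bytes(values):
    """Return (boxed, unboxed) byte counts for the same integers."""
    boxed_list = list(values)
    # A list holds pointers; every int is a separate heap object.
    boxed = sys.getsizeof(boxed_list) + sum(
        sys.getsizeof(v) for v in boxed_list
    )
    # array('q') stores raw 64-bit integers inline, one slot per value.
    unboxed = sys.getsizeof(array("q", boxed_list))
    return boxed, unboxed
```

Calling storage_bytes(range(1000, 2000)) should report a clearly larger
figure for the boxed form. A specialized generic collection of value
types gets the second layout without giving up type safety.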

> > Generics for .NET support runtime specialization into static code,
> > eliminating the extra typechecks, and bringing the performance of
> > collections closer to that of C/C++.
> 
> Support, but not require.  As I understand the specification, it would 
> be pretty valid to use the same method as generic Java to fulfill the 
> specification.  However, there's an opportunity to generate special 
> kinds of collections for primitive types.

Yes. My words were chosen carefully. Obviously the type-safety benefits
come from Generics at the language level, and the performance comes
from code specialization at the JIT level.
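
To illustrate the language-level half on its own, here is a sketch
using Python's typing module, where generic parameters are visible to a
static checker but are not enforced or specialized at runtime. That is
loosely comparable to the approach you describe for generic Java. Stack
and erased_at_runtime are hypothetical names.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")


class Stack(Generic[T]):
    """Hypothetical generic container; T exists only for the type checker."""

    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()


def erased_at_runtime():
    """Push a str onto a Stack[int]; the runtime does not object."""
    ints = Stack[int]()
    # A static checker such as mypy would reject this call, but the
    # runtime runs the same unspecialized code for every T.
    ints.push("not an int")  # type: ignore[arg-type]
    return ints.pop()
```

The type parameter buys checking but no speed here. The speed only
arrives when the runtime generates specialized code per type.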

> It's still unclear for me what would be the better approach to spread 
> the use of generics.  The Java way which says we will not change the VM 
> so that any code written with generic types will still run on old 
> installations (although not faster than the old non-generic-code) or the 
> .NET way which says you'll need a new VM (aka VES) but you might get 
> better performance in exchange.

Microsoft will use "Windows Update" and other mechanisms to push out
the v2.0 CLI as fast as possible. There are tools which allow you to
"convert" Generic CIL to normal CIL. I'm sure VS.NET will let you
build "CLI v1.1 compatible" code from your Generic C# and we should
too -- even if it merely involves another conversion step.

> > 5) How is C# Generics different from Generic Java?
> 
> Isn't the extended type system which is the foundation for C# generics 
> also more powerful than what Java has because it's modeled after the ILX 
> requirements which shall help to generate more efficient code for 
> functional languages like F# (Ocaml) or Haskell(.NET)?

It is unclear how good ILX is. Obviously the ILX people think it is
somewhat useful. This presentation pans ILX pretty badly:

 http://www.dcs.ed.ac.uk/home/stg/MRG/comparison/slides.pdf

> >    run-time casts. Gyro, a reference implementation of Generic C#/CIL,
> >    is available as a patch to the Microsoft Shared Source CLI.
> 
> Does Gyro already have the mentioned CLI 2.0 and is there a measurable 
> performance improvement?

Yes, it has CLI changes. It's not clear if they are exactly the same
as what will be standardized. You can see their performance results in
their paper. The benefits are as you would expect: removing the
boxing allocations speeds up value-type hashtables.

  http://research.microsoft.com/projects/clrgen/generics.pdf

A JIT could specialize further by inlining a type test and the
"GetHashCode()" call for the expected base type, falling back to the
normal methods when the type test fails. This would bring the
hashtable performance even closer to the C/C++ "theoretical max".
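
The shape of that guarded fast path can be sketched in Python. This is
an illustration only, not what any JIT actually emits. specialized_hash
is a hypothetical name, and the fast path relies on the CPython
behavior that a non-negative int below sys.hash_info.modulus hashes to
itself.

```python
import sys


def specialized_hash(key):
    """Hash with an inlined fast path for the expected key type (int)."""
    # Guard: exact type test plus the range in which CPython's int hash
    # is the value itself.
    if type(key) is int and 0 <= key < sys.hash_info.modulus:
        return key
    # Guard failed: fall back to the normal dynamic hash call.
    return hash(key)
```

The result always matches hash(key). Only the expected case skips the
method dispatch.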

-- 
David Jeske (N9LCA) + http://www.chat.net/~jeske/ + jeske@chat.net