[Mono-list] mono performance, 20x differential with Java (what am i doing wrong)

Fri Jan 29 21:54:32 EST 2010

On Saturday 30 January 2010 00:20:11 Jonathan Shore wrote:
> On Jan 29, 2010, at 7:19 PM, Jon Harrop wrote:
> >> Jon, I saw your post about that on your blog some time ago.   Someone
> >> familiar with Mono claimed otherwise, so I was uncertain as to whether
> >> it had been addressed or not.
> >
> > You should be able to verify my results easily: just run the 8-line
> > example F# program I gave and Mono will stack overflow.
> >
> >> I can live with some inefficiency in tail calls provided one does not
> >> get stack overflow or some other fatal issue.
> >
> > TCO is broken on Mono, not merely inefficient.
>
> As I have no familiarity with the Mono VM code, I have no idea what it
> would take to fix this.

There are many different solutions. The simplest would be to use LLVM's fast 
calling convention and tail calls as HLVM does.
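
To make concrete what tail call elimination buys you, here is a minimal 
sketch in Java (chosen only because Java appears later in this post). It is 
not the 8-line F# program mentioned above, and the names TailCallSketch, 
sumRec and sumLoop are made up for illustration. A tail call is a call in 
return position; a VM that eliminates tail calls can reuse the current stack 
frame, so the recursive form would run in constant stack space just like the 
loop. HotSpot does not do this, so the sketch only recurses to a small depth:

```java
public class TailCallSketch {
    // Tail-recursive sum of 1..n: the recursive call is the last thing
    // the function does, so its frame could be reused by a VM with TCO.
    static long sumRec(long n, long acc) {
        if (n == 0) return acc;
        return sumRec(n - 1, acc + n);
    }

    // The loop that tail call elimination would effectively produce.
    static long sumLoop(long n) {
        long acc = 0;
        while (n != 0) {
            acc += n;
            n -= 1;
        }
        return acc;
    }

    public static void main(String[] args) {
        // Shallow recursion is fine without TCO.
        System.out.println(sumRec(1000, 0));
        // Deep iteration needs the loop form on a VM without TCO.
        System.out.println(sumLoop(10000000L));
    }
}
```

Without tail call elimination the stack depth of sumRec grows linearly with 
n, which is exactly the failure mode that ends in a stack overflow.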

> >> To be honest I
> >> would get more value out of an OCaml variant wedded to the .NET platform.
> >
> > Yes. F# is awesome but only on Windows/.NET and not on Mono.
>
> Hmm, very problematic for me ...

That's why Microsoft did it. ;-)

> >> There is just so much momentum and available libraries on the two major
> >> VMs (CLR and JVM), that would be a huge risk for me at the moment.
> >
> > I was actually disappointed with .NET's libraries in the context of
> > technical computing. I felt OCaml had better libraries and it turns out
> > that .NET was about as popular for technical computing as OCaml was when
> > I started. The main exception is WPF but you don't get that with Mono.
>
> I guess it depends where you come from.  First I'll have to be honest and
> say that I am new to OCaml.    My FP background is Scheme and some dabbling
> in Haskell.    I had heard from real-world users of OCaml (such as the Jane
> Street Capital guys), that the depth of libraries for OCaml is pretty
> shallow.    They've invested some years into building that up, but it is
> largely private work.
>
> Now if we are talking about numerical stuff, then yes, there is not much
> publicly available on either the CLR or JVM.    I was more referring to the
> tech libraries rather than scientific.

Yes. Technical libraries (e.g. graphing) are far more advanced on .NET. I was 
referring only to numerical libraries like BLAS, LAPACK, FFTW and GSL.

> >> I also
> >> have a significant body of imperative VM-bound code that I need to get
> >> access to.    If HLVM could interact with Java bytecode or .NET
> >> bytecode, that would work for me.
> >
> > You should be able to compile plain numerical code from JVM/CIL to HLVM
> > easily enough, particularly when HLVM is more complete.
>
> I'll look forward to seeing that.   Are you implying that I would be able
> to take a bunch of java classes and make them available?    I guess it
> depends on what you mean by "plain numerical" code.

I mean code like your test1 function. That has an obvious direct translation 
into HLVM code.

> Will or does HLVM support the F# dialect of OCaml as well?

HLVM is designed to be a language agnostic VM so it could support either in 
theory. In practice, I will probably create a new language and any others 
will be ports done by other people. Currently, both OCaml and F# box tuples 
which would be a disaster on HLVM because my GC is not optimized for 
short-lived values. Objectively, F# should not box tuples either. In fact, if 
Mono implemented TCO and structs correctly and had its own F# implementation 
then it could unbox tuples and would see huge performance improvements as a 
consequence.

> > That doesn't really interest me. F# is so far ahead now that everything
> > else is a toy in comparison from my point of view. HLVM is just a hobby
> > project designed to bring some of the benefits of F# to the open source
> > world for fun but it is a massive undertaking because the open source
> > world doesn't even have any reliable foundations like .NET, let alone
> > decent libraries like WPF built upon them. So I have to build everything
> > from scratch myself. I'm not even sure I will be able to use hardware
> > acceleration due to the poor state of OpenGL drivers on Linux.
>
> Fair enough.   I recognize that you have accomplished quite a bit with the
> performance of your design.   However, as you allude to, it is quite
> another thing to enrich it to the point of being a broad-use platform.  
> For that you need a group of dedicated developers and the momentum to
> foster that community.

Yes. I don't think that will be a problem. So many people love OCaml but want 
decent multicore support that they would leap on HLVM if only it had a decent 
front end and a couple more features. Those are easy enough to implement, it 
is just a question of me finding the time. :-)

> The MS CLR and Mono may never have the specializations that you have done,
> for instance, make boxing / unboxing a non-issue (or at least a lot
> cheaper).    However, they have momentum and breadth.    Getting the best
> of both would be super, but I understand ...

.NET has momentum and breadth that I can never hope to attain but Mono's level 
of adoption seems entirely achievable to me.

> > You cannot work around boxing on the JVM because it lacks value types.
> > Indeed, that is a major advantage of .NET over the JVM that Mono should
> > inherit.
>
> I'm totally with you on .NET over the JVM.    Sun sat on the JVM and Java
> design for many years.    Catchup now is too late.

Yep. They've left a huge gap in the market for Mono though. :-)

Just to clarify my point, if you benchmark these Java and C# programs that put 
10M double-precision floats into a hash table:

  import java.util.HashMap;

  public class Hashtbl {
    public static void main(String args[]){
      int n = 10000000;
      HashMap<Double, Double> hashtable = new HashMap<Double, Double>(n);

      for(int i=1; i<=n; ++i) {
        double x = i;
        hashtable.put(x, 1.0 / x);
      }

      System.out.println("hashtable(100.0) = " + hashtable.get(100.0));
    }
  }

  using System.Collections.Generic;

  public class Hashtbl {
    public static void Main(){
      int n = 10000000;
      Dictionary<double, double> hashtable = new Dictionary<double, double>(n);

      for(int i=1; i<=n; ++i) {
        double x = i;
        hashtable[x] = 1.0 / x;
      }

      System.Console.WriteLine("hashtable(100.0) = " + hashtable[100.0]);
    }
  }

You'll find that Mono is 24x faster than Java in real time (1.555s vs 
37.379s) and 94x faster in terms of user CPU time (1.360s vs 2m7.404s):

  $ java -version
  java version "1.6.0_17"
  Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
  Java HotSpot(TM) Server VM (build 14.3-b01, mixed mode)
  $ time java Hashtbl
  hashtable(100.0) = 0.01

  real    0m37.379s
  user    2m7.404s
  sys     0m2.788s

  $ mono --version
  Mono JIT compiler version 2.6 (tarball Fri Dec 18 02:02:28 GMT 2009)
  Copyright (C) 2002-2008 Novell, Inc and Contributors. www.mono-project.com
          TLS:           __thread
          GC:            Included Boehm (with typed GC and Parallel Mark)
          SIGSEGV:       altstack
          Notifications: epoll
          Architecture:  x86
          Disabled:      none
  $ time ./Hashtbl.exe
  hashtable(100.0) = 0.01

  real    0m1.555s
  user    0m1.360s
  sys     0m0.184s

Couple that with the fact that Java's FFI is disastrously slow as well and 
you've got a ticking time bomb of crippling design flaws in the JVM that you 
will not be able to escape from.
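
For anyone wondering where the boxing happens in the Java program, here is a 
sketch with the conversions written out by hand. The class name BoxingSketch 
and the helper fill are made up for illustration, and the use of 
Double.valueOf reflects what javac does in practice for autoboxing rather 
than anything measured above. Every key and every value goes into the 
HashMap as a reference to a Double object, whereas the C# 
Dictionary<double, double> stores the doubles themselves as value types:

```java
import java.util.HashMap;

public class BoxingSketch {
    // Same insertions as the benchmark, with the boxing made explicit.
    static HashMap<Double, Double> fill(int n) {
        HashMap<Double, Double> hashtable = new HashMap<Double, Double>(n);
        for (int i = 1; i <= n; ++i) {
            double x = i;
            // hashtable.put(x, 1.0 / x) boxes both arguments like this:
            Double key = Double.valueOf(x);
            Double value = Double.valueOf(1.0 / x);
            hashtable.put(key, value);
        }
        return hashtable;
    }

    public static void main(String[] args) {
        // A small n keeps the sketch quick; the benchmark used 10000000.
        HashMap<Double, Double> hashtable = fill(1000);
        // The lookup key is boxed as well before it is hashed and compared.
        Double result = hashtable.get(Double.valueOf(100.0));
        System.out.println("hashtable(100.0) = " + result);
    }
}
```

With n = 10000000 that is on the order of twenty million boxed doubles for 
the garbage collector to deal with, on top of the table's own entries.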

The moral: don't let Guy Steele drag you halfway to Lisp if you want 
performance that doesn't suck. ;-)

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e