[Mono-list] Gcc summit...interesting stuff

Chris Lattner sabre@nondot.org
Wed, 26 Nov 2003 13:48:28 -0600 (CST)


> > While possible, it would be _very_ difficult.  LLVM code is more
> > expressive/low-level than CIL code: for example array bounds checks are
> > not implicit, and there is no object model.  I'm not sure exactly how you
> > would map general C programs onto the managed runtime at least, much less
> > general LLVM programs.
>
> LLVM should map to *unverifiable* CIL without too much difficulty, I think.
> Well, actually you'd map to a subset of that: you wouldn't use the
> object model instructions at all.

Ah, ok.  I thought the unverifiable CIL was basically just machine code.
I didn't know it used the stack machine: cool!

> It's mostly fairly straight-forward to map general C programs onto
> unverifiable CIL.  Casting a pointer to int or vice versa is easy, just
> push as one type and pop as another.  Pointer arithmetic is just integer
> addition.  The C heap is unmanaged memory which can be allocated either
> as a global array or using OS-specific code.

Ok.  There are _inherently_ difficult parts though.  For example, you
can't really translate '#ifdef BIG_ENDIAN' style code into a portable
representation, no matter what it is: the preprocessor has already picked
one branch before the compiler ever sees the code.
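
To make that concrete, here is a rough sketch of the kind of code I mean.
HOST_BIG_ENDIAN and both function names are made up for illustration (they
are not from any real header), and the #else branch simply assumes a
little-endian host.  The first function bakes the target's byte order in
at preprocessing time; the second one avoids the problem by never looking
at the host byte order at all:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Target-dependent: HOST_BIG_ENDIAN is a hypothetical build-time macro.
 * After preprocessing only one branch is left, so whatever the compiler
 * emits is already tied to one byte order. */
uint32_t load_be32_ifdef(const unsigned char *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);            /* read in host byte order */
#ifdef HOST_BIG_ENDIAN
    return v;                           /* bytes are already in order */
#else
    /* assumed little-endian host: swap the four bytes */
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
#endif
}

/* Portable: builds the value from individual bytes, so the same code
 * gives the same answer on any host. */
uint32_t load_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

Code written in the second style translates fine; it is only the first
style that is stuck with whatever target it was preprocessed for.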

> There are some tricky parts,
> such as volatile and setjmp/longjmp, but these are not insurmountable
> hurdles -- they can be handled, it just requires a little more cleverness.

The hardest part is probably handling all of the libc functions that
everyone expects: signals, stdio, etc.  Running a subset of C programs
probably wouldn't be that hard.  Also, it's not volatile itself that is
the problem: it's the things volatile exists for (mmap'd IO, etc) that
you probably wouldn't be able to support.

Also, you might be interested to know that LLVM already maps
setjmp/longjmp into exception handling constructs, so I expect sjlj to not
be a big problem in a CIL mapping...
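
For reference, this is the sort of C usage I have in mind, as a minimal
sketch (the function names and the recover_point buffer are invented for
this example).  The setjmp call acts like the start of a "try" region
with its handler, and longjmp acts like a "throw" that unwinds back to it,
which is why treating the pair as exception handling is a natural fit:

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf recover_point;

/* Plays the role of "throw": unwinds back to the matching setjmp. */
static void fail_if_negative(int x)
{
    if (x < 0)
        longjmp(recover_point, 1);
}

/* Returns 0 and stores 2*x in *out, or returns -1 if the "exception"
 * fired.  *out is left untouched in the failure case. */
int checked_double(int x, int *out)
{
    assert(out != NULL);

    if (setjmp(recover_point) != 0)
        return -1;              /* "catch": reached via longjmp */

    fail_if_negative(x);        /* "try" body; may not return normally */
    *out = 2 * x;
    return 0;
}
```

Calling checked_double(21, &r) returns 0 with r set to 42, while
checked_double(-1, &r) returns -1 through the longjmp path.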

> > The best way to do this would be to make a _new_ C/C++ compiler like
> > Microsoft did, which adds language restrictions for managed mode.
>
> Microsoft's C++ compiler, and lcc, and the C compiler in Portable.NET
> can all compile almost every C construct to unverifiable IL.  I don't
> know if any of them handle volatile properly, and AFAIK none of them
> handle setjmp/longjmp.  But that's just lack of development resources.

LLVM preserves volatile correctly and maps SJLJ into exceptions.  It also
supports the full set of GCC extensions, and uses the G++ "3.4" parser.
If anyone would like to try out LLVM, please download it (or you can use
the webpage: http://llvm.cs.uiuc.edu/demo ).  Of course, I would be happy
to answer any questions...

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/