[Mono-devel-list] PaX and Mono

Mon Feb 14 13:32:02 EST 2005

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm interested in running Mono under a PaX system with full
restrictions.  PaX is used in the GrSecurity project to supply enhanced
memory protections.  The younger and less developed Exec Shield
technology is also worth considering, though it has flaws which make it
unsuitable for any truly secure environment.

PaX employs restrictions which prevent memory from being executable and
writable at the same time.  If the mprotect() restrictions are lifted
for a binary, that binary may mprotect() memory as both writable and
executable, or move memory from non-executable to executable.  It
should also be possible, however, to create a more complete and durable
solution for JIT compilers in which the full restrictions stay applied,
creating the most secure environment.
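
As a rough illustration of the transition a traditional JIT depends on,
the sketch below maps a writable page and then asks mprotect() to make it
executable.  The function name can_flip_to_executable() is made up for
this example.  On a kernel enforcing the PaX mprotect() restrictions the
second step would be expected to fail; on an unrestricted kernel it
normally succeeds.

```c
#define _DEFAULT_SOURCE /* expose MAP_ANONYMOUS under strict -std= modes on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical probe, not part of PaX or Mono.
 * Returns 1 if a writable page could be switched to executable,
 * 0 if the kernel refused, -1 if the initial mapping failed.
 */
int can_flip_to_executable(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;
    int rc;

    assert(page > 0);
    /* Step 1: a private, writable, non-executable page, as a JIT
     * would use for its code buffer. */
    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    /* Step 2: the non-executable to executable transition. */
    rc = mprotect(p, (size_t)page, PROT_READ | PROT_EXEC);
    munmap(p, (size_t)page);
    return rc == 0 ? 1 : 0;
}
```

A JIT could run such a probe once at startup to decide whether it needs
the file-backed path at all.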

I first explained this method to the Kaffe team, and the explanation is
available in their mailing list archives[1].  It entails using temporary
files during the JIT process, rather than building directly in memory.
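
A minimal sketch of that idea, assuming the generated code is already
sitting in an ordinary (non-executable) buffer: write it to a temporary
file, then map the file read+execute.  No page is ever writable and
executable at once.  The helper name map_code_from_tempfile() is invented
for this example, and unlinking the file right after creation is a choice
of this sketch, not something the method requires.

```c
#define _DEFAULT_SOURCE /* expose mkstemp() under strict -std= modes on glibc */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes len bytes of generated code to a fresh
 * temporary file in dir and maps it read+execute.
 * Returns the mapping, or NULL on any failure.
 */
void *map_code_from_tempfile(const char *dir, const void *code, size_t len)
{
    char path[4096];
    void *map = NULL;
    int fd, n;

    assert(dir != NULL && code != NULL);
    if (len == 0)
        return NULL;
    n = snprintf(path, sizeof(path), "%s/jit_code_XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return NULL;
    fd = mkstemp(path);           /* returns -1 on failure */
    if (fd == -1)
        return NULL;
    unlink(path);                 /* file lives on until unmapped */
    if (write(fd, code, len) == (ssize_t)len) {
        map = mmap(NULL, len, PROT_READ | PROT_EXEC, MAP_PRIVATE, fd, 0);
        if (map == MAP_FAILED)
            map = NULL;
    }
    close(fd);                    /* the mapping survives the close */
    return map;
}
```

Note that the mmap() will fail if the directory sits on a filesystem
mounted noexec, so the choice of directory matters.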

When I discussed this in #mono, it was pointed out that the design of
the JIT creates large overhead problems for this approach.  To build a
JIT which caches to disk, such problems have to be overcome.  As a
starting point I sketched an idea in about 5 minutes.  If you go this
way, perhaps you could use or refine it.

I have not looked at the code yet, so the idea does not take it into
account.  I've been told where to look to make changes, and will poke
around and see if I can at least get it to work.  The actual method
shown below is just off the top of my head and probably won't work
without more intrusive changes.

Below is the theoretical model, complete with small bits of pseudocode,
to illustrate 2-level caching on JIT compilation of a method.  This
should significantly lower the probability of invoking the JIT when a
method is called, especially more than a few seconds into a run.
Compiling 10 methods at a time is significantly faster than compiling
them one at a time.  Consider this call chain:

 - Method M() is called
 - M() calls N()
 - N() calls L()

We'll cache 2 levels deep, and only methods that are not yet compiled
are added to the cache.

if ( !(is_compiled(method_to_be_called)) )
    compile(method_to_be_called, 0, 0, NULL);

void compile(void *method, int level, int fd,
  void **indata) {
  void **call_list = NULL;  /* NULL-terminated list of callees */
  void **data;
  void *mydata = NULL;
  int i;
  char tmpfn[255] = {'\0'};
  if (!fd) {
    /* tmpfn must be a writable string because mkstemp()
     * rewrites the XXXXXX suffix in place; a local variable
     * is best.
     * secure_dn is the directory name returned by the
     * mkdtemp() call done on initialization of the JIT
     */
    snprintf(tmpfn, sizeof(tmpfn), "%s/jit_code_XXXXXX", secure_dn);
    fd = mkstemp(tmpfn);
    if (fd == -1) {  /* mkstemp() returns -1 on failure, not 0 */
      perror("Cannot create temporary file");
      abort();
    }
  }
  /* We use our own data buffer if we weren't passed one.
   * In the recursion, the level 0 buffer is passed down to every call
   */
  if (!indata)
    data = &mydata;
  else
    data = indata;
  /*do_compile() is a function which actually compiles
   * the bytecode, puts a list of called functions into
   * call_list, and writes the output to the end of data.
   * data is resized if needed.  Nothing is written to fd
   * yet; do_jit_mmap_in() does that before the mmap().
   *
   * It also enters the address of each method into a table
   * for the file, which is linked to by the major table once
   * do_jit_mmap_in() is called.
   */
  do_compile(method, &call_list, fd, data);
  /* up to 2 levels deep, iterate the
   * list of methods called by method and compile them
   */
  for (i = 0; level < 2 && call_list[i]; i++) {
    if ( !(is_compiled(call_list[i])) )
      compile(call_list[i], level + 1, fd, data);
  }
  /*dump our local call list*/
  free(call_list);
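  /*only level 0 writes, frees, maps, and closes*/
  if (!level) {
    /* This mmap()s the fd as executable.
     * It first flushes data to the file and frees
     * data from memory
     */
    do_jit_mmap_in(fd, data);
    /* Deeper levels share this fd, so only level 0 may close it.
     * The mapping stays valid after the close.
     */
    close(fd);
  }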
}
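
The pseudocode assumes a secure_dn directory created with mkdtemp() when
the JIT starts.  One way that setup might look is sketched below; the
function names jit_init_secure_dir() and jit_remove_secure_dir() and the
"jit_XXXXXX" template are assumptions made for this example.  mkdtemp()
creates the directory with mode 0700, so other users cannot plant files
in it.

```c
#define _DEFAULT_SOURCE /* expose mkdtemp() under strict -std= modes on glibc */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Directory that later holds the jit_code_XXXXXX files. */
char secure_dn[4096];

/* Hypothetical JIT start-up step: creates a private directory under
 * parent and records its name in secure_dn.  Returns 0 or -1.
 */
int jit_init_secure_dir(const char *parent)
{
    int n;

    assert(parent != NULL);
    n = snprintf(secure_dn, sizeof(secure_dn), "%s/jit_XXXXXX", parent);
    if (n < 0 || (size_t)n >= sizeof(secure_dn) ||
        mkdtemp(secure_dn) == NULL) {
        secure_dn[0] = '\0';
        return -1;
    }
    return 0;
}

/* Hypothetical shutdown step: removes the (empty) directory. */
int jit_remove_secure_dir(void)
{
    int rc;

    if (secure_dn[0] == '\0')
        return -1;
    rc = rmdir(secure_dn);
    if (rc == 0)
        secure_dn[0] = '\0';
    return rc;
}
```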

Above we see the following:

 1.  Upon entering an uncompiled method M(), that method is compiled
 2.  Upon compiling method M() as per (1), all methods N() called by
     M() are compiled if and only if they're not already compiled
 3.  Upon compiling methods N() as per (2), all methods L() called by
     N() are compiled if and only if they're not already compiled
 4.  No more methods are compiled

So, if M() is being compiled and calls N3(), which is already compiled,
and N3() calls L2(), which is not compiled, L2() is not compiled;
however, if M() is being compiled and calls N4(), which is not compiled,
and N4() calls L3(), which is not compiled and itself calls Z12(), then
M(), N4() and L3() are compiled, but not Z12().
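
The traversal rule can be checked with a toy model that replaces real
code generation with a flag.  The struct method type and compile_batch()
are invented for this sketch; only the 2-level rule itself comes from the
description above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 2  /* how many levels below the entry method we follow */

/* Toy stand-in for a method: a flag plus a NULL-terminated callee list. */
struct method {
    const char *name;
    int compiled;
    struct method **callees;  /* may be NULL when there are no callees */
};

/* Marks m as compiled, then descends into its uncompiled callees
 * while level < MAX_DEPTH.  Callees of an already compiled method
 * are never visited.  Returns how many methods this batch compiled.
 */
int compile_batch(struct method *m, int level)
{
    int count = 1;
    size_t i;

    assert(m != NULL);
    m->compiled = 1;  /* stands in for the real compilation */
    if (level >= MAX_DEPTH || m->callees == NULL)
        return count;
    for (i = 0; m->callees[i] != NULL; i++) {
        if (!m->callees[i]->compiled)
            count += compile_batch(m->callees[i], level + 1);
    }
    return count;
}
```

With M() calling N3() (already compiled, calls L2()) and N4() (calls
L3(), which calls Z12()), compile_batch(&M, 0) compiles M, N4 and L3 and
leaves L2 and Z12 alone.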

Everything compiled in one go is dumped to the same file.  That file
is then mmap()ed in at the end of the compile cycle, and the lookup
table built in memory while the file's methods were compiled is finally
linked into a main GOT in memory.
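
One possible shape for that linking step, as a sketch only: while
compiling, record each method's offset within the file; once the file is
mapped and its base address is known, turn the offsets into addresses in
the main table.  The names file_entry, main_table, link_file_table() and
lookup_method(), and the use of small integer method ids, are all
assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_METHODS 1024

/* Per-file table entry: where a method's code starts in the file. */
struct file_entry {
    int method_id;
    size_t offset;
};

/* Main table: method id -> address of compiled code, NULL if none. */
static void *main_table[MAX_METHODS];

/* Links n per-file entries into the main table, given the base
 * address at which the file was mapped.  Returns 0, or -1 if an id
 * is out of range (nothing is linked in that case).
 */
int link_file_table(void *base, const struct file_entry *entries, size_t n)
{
    size_t i;

    assert(base != NULL && entries != NULL);
    for (i = 0; i < n; i++) {
        if (entries[i].method_id < 0 || entries[i].method_id >= MAX_METHODS)
            return -1;
    }
    for (i = 0; i < n; i++)
        main_table[entries[i].method_id] = (char *)base + entries[i].offset;
    return 0;
}

/* Returns the compiled address for id, or NULL if not compiled. */
void *lookup_method(int id)
{
    if (id < 0 || id >= MAX_METHODS)
        return NULL;
    return main_table[id];
}
```

In this model is_compiled() would simply be a non-NULL check on the
main table.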

[1]http://www.kaffe.org/pipermail/kaffe/2004-October/099938.html

- --
All content of all messages exchanged herein are left in the
Public Domain, unless otherwise explicitly stated.

    Creative brains are a valuable, limited resource. They shouldn't be
    wasted on re-inventing the wheel when there are so many fascinating
    new problems waiting out there.
                                                 -- Eric Steven Raymond
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCEO6hhDd4aOud5P8RAjyUAJ9lmMpPoiTZUm7gGixv4qmAGvjrswCghs2h
95akqzumc5h0+a48zEP5DWo=
=PANP
-----END PGP SIGNATURE-----