[Mono-bugs] [Bug 81685][Wis] Changed - Stack Overflow detection

Wed May 23 07:29:49 EDT 2007

Please do not reply to this email- if you want to comment on the bug, go to the
URL shown below and enter your comments there.

Changed by lupus at ximian.com.

http://bugzilla.ximian.com/show_bug.cgi?id=81685

--- shadow/81685	2007-05-21 19:34:26.000000000 -0400
+++ shadow/81685.tmp.458	2007-05-23 07:29:49.000000000 -0400
@@ -74,6 +74,58 @@
                 } finally {
                         Recurse (1);
                 }
         }
 }
 
+
+------- Additional Comments From lupus at ximian.com  2007-05-23 07:29 -------
+Detection must also happen in the trusted code, because the user
+can trivially fill most of the stack and then call a function in the
+trusted code that makes it overflow, so we need to be able to handle
+the overflow there, too.
+
+It may be possible for us to implement the check without slowing down
+normal function calls (but we'd still need to use sigaltstack in this
+case). It should work as follows: we set up a small sigaltstack for
+each thread so that the SEGV handler can run. The handler would be
+very simple: it basically just checks whether the fault happened in
+managed or unmanaged code.
+
+In unmanaged code we have no sane option other than to abort. In the
+managed case we'd create a handler call frame in the user stack and
+transfer control to it. Note that no GC objects are manipulated on the
+alternate stack, so it doesn't need to be registered with the GC and
+tracked. 8 KB of memory should be plenty for this stack.
+
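The per-thread alternate stack and the minimal SEGV handler described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the runtime's actual code: setup_overflow_detection() and fault_in_managed_code() are hypothetical names, and the managed/unmanaged check is a placeholder that treats every fault as unmanaged.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose sigaltstack() under strict -std= modes on glibc */
#endif
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define ALT_STACK_SIZE (8 * 1024)   /* 8 KB, the size suggested above */

/* Hypothetical placeholder: a real runtime would extract the faulting
 * instruction pointer from the signal context and look it up in its
 * JIT tables. Here every fault is treated as unmanaged. */
static int fault_in_managed_code(void *sigctx)
{
    (void) sigctx;
    return 0;
}

/* Runs on the alternate stack, so it works even when the thread's own
 * stack is exhausted. Only async-signal-safe calls are used. */
static void segv_handler(int sig, siginfo_t *info, void *sigctx)
{
    static const char msg[] = "SIGSEGV in unmanaged code, aborting\n";
    ssize_t n;

    (void) sig;
    (void) info;
    if (!fault_in_managed_code(sigctx)) {
        n = write(STDERR_FILENO, msg, sizeof msg - 1);
        (void) n;
        abort();
    }
    /* Managed case (not implemented in this sketch): build a handler
     * call frame on the user stack and transfer control to it. */
    abort();
}

/* Per-thread setup. Returns 0 on success, -1 on failure. The alternate
 * stack is per thread; the SIGSEGV disposition is process-wide. */
int setup_overflow_detection(void)
{
    stack_t old, ss;
    struct sigaction sa;
    size_t size = ALT_STACK_SIZE;

    /* The alternate stack cannot be replaced while we are running on it. */
    if (sigaltstack(NULL, &old) != 0)
        return -1;
    assert(!(old.ss_flags & SS_ONSTACK));

    if (size < (size_t) MINSIGSTKSZ)    /* respect the platform minimum */
        size = (size_t) MINSIGSTKSZ;

    memset(&ss, 0, sizeof ss);
    ss.ss_sp = malloc(size);            /* plain memory, not tracked by the GC */
    if (ss.ss_sp == NULL)
        return -1;
    ss.ss_size = size;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0) {
        free(ss.ss_sp);
        return -1;
    }

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;   /* deliver on the alternate stack */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Each thread would call setup_overflow_detection() once at start-up. Without SA_ONSTACK the kernel would try to push the signal frame onto the already exhausted thread stack.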
+Now, before setting up the handler frame in the original stack, we
+need to mprotect() the stack area that was PROT_NONE, and whose access
+triggered the SEGV, so that it becomes usable (this is only needed if
+the fault was a stack overflow). As we unwind the stack we can
+mprotect(PROT_NONE) it again.
+
+We should always leave at least one page protected to handle cases
+like the finally one above (but Robert's test won't work with this new
+setup, because you'd need to trigger the handler with an actual stack
+overflow, not just a null reference exception).
+The handler running in the altstack can detect if a stack overflow
+handler was already executing and simply abort the finally clause that
+caused the new overflow (the call-finally thunk can set up an LMF
+entry, for example).
+
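The protect/unprotect cycle described above, including the page that always stays protected and the detection of a nested overflow, might look something like the sketch below. The GuardZone type and the guard_zone_* functions are hypothetical names made up for this illustration; it assumes the reserved pages sit at the low end of a stack that grows downwards.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose MAP_ANONYMOUS under strict -std= modes on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct {
    char  *low;        /* lowest address of the reserved zone (page aligned) */
    size_t page_size;
    size_t pages;      /* number of reserved pages, at least 2 */
    int    open;       /* nonzero while an overflow handler is running */
} GuardZone;

/* Mark the whole reserved zone PROT_NONE. Returns 0 on success, -1 on error. */
int guard_zone_init(GuardZone *z, void *low, size_t pages)
{
    size_t page_size = (size_t) sysconf(_SC_PAGESIZE);

    assert(pages >= 2);
    assert((uintptr_t) low % page_size == 0);   /* mprotect() needs page alignment */
    z->low = low;
    z->page_size = page_size;
    z->pages = pages;
    z->open = 0;
    return mprotect(z->low, pages * page_size, PROT_NONE);
}

/* Called when an overflow is detected, before the handler frame is built
 * on the user stack. Unprotects every reserved page except the lowest one.
 * Returns 0 on success, 1 if an overflow handler was already running
 * (nested overflow), -1 on error. */
int guard_zone_open(GuardZone *z)
{
    if (z->open)
        return 1;
    if (mprotect(z->low + z->page_size, (z->pages - 1) * z->page_size,
                 PROT_READ | PROT_WRITE) != 0)
        return -1;
    z->open = 1;
    return 0;
}

/* Called while unwinding, once the stack pointer has left the reserved
 * zone. Protects the pages again. Returns 0 on success, -1 on error. */
int guard_zone_close(GuardZone *z)
{
    if (!z->open)
        return 0;
    if (mprotect(z->low + z->page_size, (z->pages - 1) * z->page_size,
                 PROT_NONE) != 0)
        return -1;
    z->open = 0;
    return 0;
}
```

A return value of 1 from guard_zone_open() corresponds to the case where the handler on the alternate stack finds that an overflow handler was already executing and has to abort the finally clause instead of opening more stack.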
+The trick here is to reserve a few pages at the end of the stack for
+this purpose, in addition to the system-setup page. Managed code
+wouldn't need any code in the prolog, because the handler can run in
+the alt stack (for systems that don't support the alt stack we'll need
+to add the checks so that there is enough room for the kernel to put
+the signal handling frame there).
+
+Now we need to protect unmanaged code from overflowing the stack: to
+prevent this we need to add the stack-banging code in the
+managed->unmanaged wrappers (this includes compilation trampolines);
+they should touch a few pages (configurable), depending on how much
+stack our runtime needs. To reduce the need for stack in the runtime
+we need to audit and remove all the uses of alloca() and heavy stack
+users.
+
+Having the checks only in the managed->unmanaged wrappers wouldn't
+reduce our performance in a significant way.
+
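The stack-banging step in the managed->unmanaged wrappers can be approximated in C as below. This is a rough sketch only: a JIT would emit stores relative to the stack pointer directly, whereas here a local array stands in for them. stack_bang() and call_native_checked() are hypothetical names, and the code assumes a stack that grows downwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Touch `pages` pages below the current frame, from the highest address
 * to the lowest, so that an exhausted stack faults here, at a known
 * point, instead of somewhere inside unmanaged code.
 * Returns the number of pages touched. */
size_t stack_bang(size_t pages)
{
    size_t page = (size_t) sysconf(_SC_PAGESIZE);

    assert(pages > 0);
    volatile char probe[pages * page];   /* stands in for JIT-emitted probes */
    size_t touched = 0;

    for (size_t off = pages * page; off > 0; off -= page) {
        probe[off - 1] = 0;              /* one store per page */
        touched++;
    }
    return touched;
}

typedef int (*native_fn)(int);

/* Hypothetical wrapper shape: probe the configurable number of pages the
 * runtime needs, then call into unmanaged code. */
int call_native_checked(native_fn fn, int arg, size_t pages)
{
    stack_bang(pages);
    return fn(arg);
}
```

For example, call_native_checked(abs, -5, 4) probes four pages and then calls abs(). The page count is the configurable value mentioned above and depends on how much stack the runtime itself needs.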
+