[Mono-dev] Case unaware applications on Mono, IOMAP, FUSE and LD_PRELOAD - a call for discussion

Marek Habersack grendel at twistedcode.net
Tue Mar 10 10:59:18 EDT 2009


Hello,

	As most of you know, Mono has an I/O layer mode in which it is able to
cover problems with some .NET applications incoherence of dealing with
file names. Some operating systems have file systems which are
case-aware but not case-sensitive. Mono supports an environment variable
called MONO_IOMAP which allows one to quickly work around those issues.
When the variable is found in the application's environment, the runtime
will look at the file access/open requests and do two things (depending
on the IOMAP setting - see man mono(1) for details):

1. Remove any DOS drive references
2. If the file with the passed name is not found, it will become case
insensitive and attempt to find the file that way.

	#2 above can be a a big performance hit if the application is big
and/or makes a lot of requests to files on disks. MONO_IOMAP was
intended to be an aid utility for those wanting to quickly test some
application on Mono/Unix before making a decision of a move to Unix.
Ideally, after seeing the application works, developers should make sure
that the on-disk files are accessed with their actual names when the
application is running on Unix. The assumption is fine, but there are
some problems with it:

1. Mixed case in file names is so ubiquitous in Windows applications
that if an application is large, it can be a considerable effort to make
sure all file names are consistently used across the source code (and/or
markup in case of ASP.NET applications)

2. Many Windows developers are unaware of the case-insensitivity
problems which may result in bug reports against Mono (the best case) or
giving up to use Mono (the worst case) because "it does not work"

3. It works only for managed code (see below for why it is bad)

	Items #1 and #2 may lead to the decision of using MONO_IOMAP
permanently and, thus, hurting the application performance. Item #3 may
affect for instance ASP.NET applications running under Apache, as
mod_mono (until recently) wasn't taking any steps to make sure a file
name it is passed from the managed back-end to send to the client using
sendfile(2) (or an equivalent call) and there had been situations that
despite the file name being handled correctly by the manged code, apache
would issue a file not found error, thus causing some applications to
break. It may also affect applications embedding the Mono runtime, for
precisely the same reason - the unmanaged code being passed a file name
which doesn't really exist on disk.

	An intermediate solution for mod_mono (a VERY suboptimal one) has been
implemented a few days ago in which mod_mono, failing to locate a file
on disk, resorts to the same process as the Mono runtime to attempt to
locate the file (this process is triggered only if the application is
configured to use IOMAP), thus doubling the time required to locate the
file on disk. Of course, this hurts the performance of the application
even more if it has a lot of mixed-case file access attempts.

	The correct solution to this issue would be to modify the Mono iolayer
and some classes in System.IO to provide the managed code with the real
path to the file. It isn't hard to implement, but I think it's better to
remove the IOMAP capability from the Mono runtime and replace it with
something else:

      * FUSE solution.
        Long ago, Gonzalo developed a FUSE file system which would
        lowercase all file names and thus take care of the problem
        (http://gonzalo.name/blog/archive/Mono/2006/Aug-30.html). Given
        that today FUSE exists for all operating systems we run Mono on,
        this approach could be used, with small modifications to the
        code found above, to implement the IOMAP in this way.
        Mounting a FUSE file system is a user-level operation which does
        not require root privileges, can be done automatically from
        inside, say, mod_mono (if the IOMAP environment variable is
        found) and the fs can even take care of stripping the drive
        information from the paths. That way both managed and unmanaged
        code take advantage of the support, the IOMAP code doesn't
        clutter Mono runtime and also using FUSE may be a more conscious
        decision on the user part than using IOMAP in the runtime.
        This is an elegant and portable solution. The disadvantage is
        that we add a possible dependency on FUSE (not a problem for any
        modern Linux distribution, might be a problem for other
        operating systems)
      * LD_PRELOAD solution.
        This solution is technically different to FUSE, but the outcome
        is similar. Instead of using a mounted file system, we would put
        the code in a shared library which would take over system calls
        responsible for opening/accessing the files (open, stat, access)
        and do the rest in the same way as FUSE.
        While this is more seamless than FUSE and, in theory, lighter
        than it, it might pose technical problems on systems which don't
        support LD_PRELOAD or restrict its use to root accounts.
        
Both solutions can be used to implement an analysis mode in which they
act as logging proxies and only print problems with file access - this
can be wrapped in a nice GUI to please user's eyes and make consuming
the information easier.

So, to end this longish mail I call everybody interested in the subject
to discuss and suggest what (if anything) should be done about the issue
described above.

marek

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: This is a digitally signed message part
Url : http://lists.ximian.com/pipermail/mono-devel-list/attachments/20090310/f767fafd/attachment.bin 


More information about the Mono-devel-list mailing list