[Mono-list] Functional-style lists for Mono

Jim Blandy jimb@redhat.com
10 May 2003 19:39:25 -0500


As far as I can see, the Mono class library doesn't include any sort
of collection that corresponds to the sorts of lists used in
functional languages like Lisp, ML, or Haskell.  There are some
aspects of these languages' lists that make it difficult to fit them
into the Mono type world, but I have an idea about how it could be
done, if one is willing to change the language a bit.


* What do I mean by a "functional list"?

Most functional languages have a standard list type where the
fundamental construction operations are:

- make an empty list (call this 'nil'), and
- given a new element E and an existing list L, produce a new list
  whose first element is E and whose subsequent elements are L (call
  this 'cons').

So applying 'cons' to 4 and 'nil' gives you a list of element, 4.

The important thing to notice here is that the 'cons' operation
doesn't change the tail list L you give it at all.  L still has its
original, unchanged value.  Rather, 'cons' just allocates a new list
node for the new element, whose 'next' points to L.

The fundamental destruction operators on lists are:

- Is this list empty or not?,
- What is the first element of this (non-empty) list?, and
- What is the tail (all but the first element) of this (non-empty)
  list?

Again, none of these operations change the list at all.

>From these basic operations, you build the more common functions to
index lists, concatenate lists, extracting sublists, and search lists
for elements.


* What makes that a functional list?

Well, nothing, really.  When I call it 'functional', the important
thing is that none of the basic operations involve side-effects: you
can do all your computation without changing a list.

But this isn't the only data structure that could have that property.
It's not hard to invent a set of functions for working with arrays
that involves no side effects: for example, your fundamental
operations could be "create a singleton array", "append two arrays",
"what is this array's length?" and "what is the n'th element of this
array?".  No side-effects there.  Of course, building an array from
these operations is at best O(n log n), and O(n^2) if you do it in the
simplest way, but, whatever.

And even this functional list data structure could support side
effects.  Haskell doesn't let you mutate anything; but ML (sort of)
lets you change a list node's element, if you make it a list of refs;
and Lisp will let you change both the car (head) and the cdr (tail) of
a list node.

What's important, though, is that this is a traditional data structure
for functional languages.  Just about every functional language, pure
or otherwise, provides something like this.  The functional style is
full of idioms for working with this kind of list.


* Why is it hard to use functional lists in Mono?

There's nothing stopping you from saying:

    class List {
      object elt;
      List tail;  // null is end of list
    };

That's basically what you get in functional languages.  (Tagged
pointers are an implementation detail; the operations can be
implemented just fine.)

But if a type doesn't implement the IList interface, then I don't
think it should be called a true Mono list.  And the above type can't
implement IList.

The IList 'Add' operation adds an element to the list by mutation; in
particular, if L is the empty list, then 'L.Add (4)' should change L
into a singleton list, whose element is 4.  But the representation of
the empty list is 'null' --- there's no way to 'Add' something to
null.  null has no fields that could be changed.

In Lisp, all empty lists are actually the same object.  'nil' is
actually a single, specific value, like 'null' in C#.  In a running
Lisp program, the last pair in every non-empty proper list points to
the same object.  So if you applied 'Add' to 'nil', the Lisp system
can't tell which of all the lists in the system you wanted the element
to go into: it doesn't have enough information.

(In Haskell and ML, I don't think you can tell whether all empty lists
are the same object, since they don't have an operation like Lisp's
'eq', but they're typically implemented the same way.)

You could introduce a wrapper type for the above list type, say,
ListList (by analogy with ArrayList), like this:

    class ListList : IList
    {
      List l;
      ...
    }

Now you've got an object you can side-effect, even when it's the empty
list.  But I think this is kind of clumsy: you'll use on type for
lists in your Lisp / ML / Haskell programs, but you need to introduce
a new, wrapper type if you want to share data with the rest of the
Mono world.  This isn't really an ideal solution.


* So how should one do functional lists in Mono?

I don't think there's any way to implement, say, Common Lisp or Scheme
in Mono and have their lists be ILists.

But, what if you dispensed with the requirement that all empty lists
be the same object?  What if the 'nil' operation returned a fresh
"empty list" object each time?  Instead of writing (eq x nil), you'd
have to be careful to write (null x), and you'd need to write
something like (nil) instead of '().  But as long as the
representation allows you to mutate an empty list into a singleton
list, then you can support the Add operation.

Of course, when someone suggests throwing out compatibility with all
existing Lisps because they think the ListList wrapper type is ugly,
it might be appropriate to question their priorities --- or just laugh
at them.  But otherwise, a true Lisp in the Mono world will always
have an "accent" in its interfaces; if you want to avoid that accent,
you have to give up on a unique empty list object.

This is untested, and I'm very new to C#, so I may be doing something
dumb here, but this is what I have in mind.

    // An interface to lists built from "pair" objects.
    public interface IPairList: IList
    {
        // Return a new IPairList element whose Head is the arbitrary
        // object HEAD, and whose Tail is the IPairList TAIL.
        IPairList Pair (object head, IPairList tail);

        // Return true if this instance is a list Pair, false otherwise.
        bool IsPair { get; }

        // Return a new IPairList End.
        IPairList End ();

        // Return true if this instance is a list End, false otherwise.
        bool IsEnd { get; }

        // Get/set the head/tail of a Pair.  All of these operations
        // throw an IndexOutOfRangeException on an End, except setting
        // its Tail.
        object Head { get; set; }
        object Tail { get; set; }
    }

I do think this shows up a weakness in the whole CLS idea: it is, in
general, very difficult to allow many different languages to share
data transparently with each other, and still be true to their
original semantics.  Even allowing languages as similar as Emacs Lisp
and Scheme to share data has weird little issues involving the
appropriate meanings of #f, '(), and nil.  I think it will be
difficult to allow an even broader set of languages to share data
transparently with each other.  Simply providing a common runtime and
a common type system isn't enough.  The public interfaces will always
have a little "accent" hinting at their implementation language.  It's
not the end of the world, but it's a pity.