CSCI 5106: Programming Languages
Spring 2006
Some Comments on the ML Module Language


These are some brief comments that link our discussion of the core language of ML with its module language. For a more thorough exposition, you should look at the technical report by Mads Tofte that is available in three parts: part 1, part 2 and part 3. The examples from this report are also available online. The discussion here is based on that report; its only virtue is that it hooks into the discussions in our course and so might help you get started with Mads Tofte's report.

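Module notions are intended to support programming-in-the-large. Generally, this requires them to provide a means for grouping related declarations into units, for describing what each unit provides to the outside world, and for letting units interact with one another.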

SML provides these capabilities through three notions: structures, which encapsulate environments that contain a collection of declarations; functors, which are functions from structures to structures and so permit interactions between structures; and signatures, which are ``types'' of structures that indicate what they provide to the outside world and that determine whether or not a particular structure can be supplied as an argument to a given functor. In what follows, these notions are illustrated with an example from Mads Tofte's report.

First, there is the notion of a structure. This is in essence the objectification of a collection of declarations of the kind we have seen in the core language. The objectification is carried out by placing the declarations between the keywords struct and end. The object thus created can be given a name through a structure declaration that has the syntax

  structure  <Str-Name> =  <Str-Object>
As a particular example, consider a structure that defines heaps over integers, where a heap is a binary tree in which the root is, recursively, the smallest item in the tree. This structure is given by the SML code that you find here.
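
To make the idea concrete, the following is a simplified sketch of such a structure. It is not the code from Mads Tofte's report: it covers only part of the signature printed later in these notes, and the bodies of the functions are assumptions chosen to match the description of a heap given above.

```sml
(* A simplified sketch of an integer heap structure (an assumption,
   not the code from the report). *)
structure IntHeap =
struct
  type item = int
  val initial : item = 0

  fun leq (p : item, q : item) : bool = p <= q
  infix leq
  fun min (p, q) = if p leq q then p else q

  (* A leaf or an internal node, each carrying one item. *)
  datatype tree = L of item
                | N of item * tree * tree

  (* The item at the root, which is the smallest item in a heap. *)
  fun top (L i) = i
    | top (N (i, _, _)) = i

  (* Number of items stored in the tree. *)
  fun size (L _) = 1
    | size (N (_, l, r)) = 1 + size l + size r

  (* Heap property: each root is no larger than the roots of its subtrees. *)
  fun isHeap (L _) = true
    | isHeap (N (i, l, r)) =
        i leq top l andalso i leq top r andalso isHeap l andalso isHeap r

  (* Replace the root by i, then push i down to restore the heap property. *)
  fun insert (i, L _) = L i
    | insert (i, N (_, l, r)) =
        if i leq min (top l, top r) then N (i, l, r)
        else if top l leq top r then N (top l, insert (i, l), r)
        else N (top r, l, insert (i, r))
end
```

Note that insert, as sketched here, keeps the shape of the tree fixed: the old root is discarded and the new item sinks to its proper place.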

Once the SML system has consumed this declaration, one gets an environment whose objects can be used. However, they have to be accessed through the name of the structure. For example, we cannot use the name initial meaningfully at the top level, but we can use the name IntHeap.initial and will get the value 0 in response. It is also possible to ``open'' a structure at the top level. For the structure under consideration, this is done using the declaration

  open IntHeap
Once this is done, the name initial can be used directly at the top level. Of course, this runs counter to the purpose of defining the module, and so this feature should be used with care.
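
The different ways of reaching a component can be sketched as follows. The structure MiniHeap is a hypothetical, cut-down stand-in for IntHeap that is introduced only to keep the example self-contained. The middle form, which opens the structure inside a let expression, is one way of using open with care, since the names are visible only within that expression.

```sml
(* MiniHeap is a hypothetical, cut-down stand-in for IntHeap. *)
structure MiniHeap =
struct
  type item = int
  val initial : item = 0
  fun leq (p : item, q : item) = p <= q
end

(* Qualified access: always available, nothing needs to be opened. *)
val a = MiniHeap.initial

(* The effect of open is confined to a single expression. *)
val b = let open MiniHeap in leq (initial, 1) end

(* Opening at the top level: initial and leq are directly visible from here on. *)
open MiniHeap
val c = initial
```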

Just as ordinary value objects in ML have a type, so too do structure objects. The type of a structure object is called a signature. A signature lists the types and exceptions defined in the structure and the value and function objects defined in it together with their types. For example, the output produced by (an old version of) SML on consuming the structure definition shows this:

structure IntHeap : 
  sig
    eqtype item
    datatype tree
      con L : item -> tree
      con N : item * tree * tree -> tree
    exception InitHeap
    val depth : tree -> item
    val initHeap : int -> tree
    val initial : int
    val insert : item * tree -> tree
    val isHeap : tree -> bool
    val leq : item * item -> bool
    val max : item * item -> item
    val maxHeap : tree -> item
    val min : item * item -> item
    val replace : item * tree -> item * tree
    val size : tree -> int
    val top : tree -> item
    infix leq
  end
In the example shown, the signature of IntHeap was inferred by SML. This need not always be the case: the user may also specify the signature. In general, a signature is objectified by placing it between the keywords sig and end, and it can be named via a signature declaration, as illustrated by the following:
signature INTHEAP =
  sig
    eqtype item
    datatype tree = L of item 
      | N of item * tree * tree
    exception InitHeap
    val depth : tree -> item
    val initHeap : int -> tree
    val initial : int
    val insert : item * tree -> tree
    val isHeap : tree -> bool
    val leq : item * item -> bool
    val max : item * item -> item
    val maxHeap : tree -> item
    val min : item * item -> item
    val replace : item * tree -> item * tree
    val size : tree -> int
    val top : tree -> item
    infix leq
  end
This signature may then be used in typing a particular structure as shown here.

In reality, a signature is also like an interface declaration for a structure. The analogy with types is illuminating: just as a type determines whether or not an identifier can be used appropriately in a given context, so too does a signature determine whether or not a structure can be used sensibly in a desired fashion. In particular, ``signature checking'' determines whether or not the structure contains all the required types and functions. A structure may have more in it than what the signature requires. One may look at this as a manifestation of polymorphism. Notice, in fact, that in the heap example several components of the signature as defined above are irrelevant to the outside. For example, it is unnecessary to know that a tree representation is used, and it is also not necessary to have initial, max and min be visible from the outside. These could be dropped from the signature. By doing this we also make these components inaccessible from outside and thereby realize the important idea of hiding implementation details. A declaration that manifests this character is shown here. Compare the signature used here with the one in the earlier declaration to appreciate the difference.
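
A minimal sketch of this kind of hiding is shown next. The signature name HEAPVIEW and the particular choice of components are assumptions made for illustration, as is the reading of initHeap as building a full tree of the given depth; the declaration that the notes link to may differ.

```sml
(* HEAPVIEW is a hypothetical signature. It does not mention initial or
   the constructors L and N, so these are hidden from clients. *)
signature HEAPVIEW =
sig
  type item
  type tree
  exception InitHeap
  val initHeap : int -> tree
  val top : tree -> item
  val size : tree -> int
end

structure IntHeap : HEAPVIEW =
struct
  type item = int
  val initial : item = 0                               (* hidden *)
  datatype tree = L of item | N of item * tree * tree  (* constructors hidden *)
  exception InitHeap

  (* Assumed meaning: a full tree of depth n with initial at every node. *)
  fun initHeap n =
    if n < 1 then raise InitHeap
    else if n = 1 then L initial
    else let val t = initHeap (n - 1) in N (initial, t, t) end

  fun top (L i) = i
    | top (N (i, _, _)) = i

  fun size (L _) = 1
    | size (N (_, l, r)) = 1 + size l + size r
end

val h = IntHeap.initHeap 2
val n = IntHeap.size h        (* 3: one root and two leaves *)

(* IntHeap.initial and IntHeap.L are rejected at this point, because
   HEAPVIEW does not mention them. *)
```

Clients of this IntHeap can build heaps only through initHeap, so the tree representation could later be changed without affecting them.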

In matching a structure to a signature, which is the notion of ``type checking'' a structure, certain logical rules must be followed: the structure should contain definitions of all the things mentioned in the signature, the types of function and value objects in the structure should be at least as general as those in the signature, types should correspond to the extent specified in the signature, and exceptions should match totally.
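
The rule about generality can be illustrated with a small sketch. The names DUP, Dup, dup, Empty and unused are hypothetical and appear nowhere in the heap example; they are chosen only to keep the illustration short.

```sml
(* DUP and Dup are hypothetical names used only for this illustration. *)
signature DUP =
sig
  val dup : int -> int * int
  exception Empty
end

structure Dup : DUP =
struct
  (* The inferred type 'a -> 'a * 'a is more general than the
     int -> int * int that DUP asks for, so the match succeeds. *)
  fun dup x = (x, x)

  (* The exception carries no argument, exactly as in DUP. *)
  exception Empty

  (* An extra component is allowed but is not visible through DUP. *)
  val unused = "not part of DUP"
end

val p = Dup.dup 3
```

Seen through DUP, dup has only the type int -> int * int, so an application such as Dup.dup "a" is rejected even though the definition inside the structure is polymorphic.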

We have so far restricted our attention to integer heaps. However, what we usually want is a notion of heaps parameterized by the choice of the type of item on the heap. To achieve this, we would think of defining the same functions as in the signature, but we would want to parameterize these definitions by the type of the items, the ordering relation on items and the initial item.

We could think of these as identifying an item structure. Heap structures then need to be parameterized by item structures. This is where functors in SML become useful.

To realize the scheme described above, we first of all define the signature that item structures must satisfy:

signature ITEM =
    sig
	type item
	val leq : item * item -> bool
	val initial : item
    end;
Then we define a Heap functor using a declaration that has the following format:
functor Heap (Item : ITEM) : HEAP = 
    < a ``structure'' declaration that uses item type
        initial and comparison operation obtained from
        the structure Item that is provided as argument >
An elaboration of this definition can be found here.

What remains is to show how functors are used to generate particular structures. The idea is to create these structures via function application. For example, to create an integer heap, we first define an integer item structure:


structure IntItem : ITEM =
    struct
	type item = int;
	fun leq (i:item,j) = i <= j;
	val initial = 0
    end
Then we ``apply'' the heap functor to this structure to generate an integer heap:
structure IntHeap = Heap(IntItem)
The application here will first of all check for compatibility: if IntItem provides the components required by ITEM, the signature of the argument of Heap, then compatibility is assured. It will then carry out the linking of code that is necessary to get a usable object in IntHeap.

The Heap functor can be used to create other kinds of heaps, some of these being illustrated by the code you will find here.
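
The pieces described above, from the ITEM signature to functor application, can be put together in the following sketch. The HEAP signature, the body of the functor and the StringItem structure are assumptions: the notes use the name HEAP without listing it, and the elaboration they link to may differ.

```sml
signature ITEM =
sig
  type item
  val leq : item * item -> bool
  val initial : item
end

(* Assumed HEAP signature; the notes name HEAP but do not list it. *)
signature HEAP =
sig
  type item
  type tree
  exception InitHeap
  val initHeap : int -> tree
  val insert : item * tree -> tree
  val top : tree -> item
  val isHeap : tree -> bool
end

functor Heap (Item : ITEM) : HEAP =
struct
  type item = Item.item
  val leq = Item.leq
  infix leq
  fun min (p, q) = if p leq q then p else q

  datatype tree = L of item | N of item * tree * tree
  exception InitHeap

  fun top (L i) = i
    | top (N (i, _, _)) = i

  fun isHeap (L _) = true
    | isHeap (N (i, l, r)) =
        i leq top l andalso i leq top r andalso isHeap l andalso isHeap r

  (* Assumed meaning: a full tree of depth n with Item.initial at every node. *)
  fun initHeap n =
    if n < 1 then raise InitHeap
    else if n = 1 then L Item.initial
    else let val t = initHeap (n - 1) in N (Item.initial, t, t) end

  (* Replace the root by i, then push i down to restore the heap property. *)
  fun insert (i, L _) = L i
    | insert (i, N (_, l, r)) =
        if i leq min (top l, top r) then N (i, l, r)
        else if top l leq top r then N (top l, insert (i, l), r)
        else N (top r, l, insert (i, r))
end

structure IntItem : ITEM =
struct
  type item = int
  fun leq (i : item, j) = i <= j
  val initial = 0
end

(* A hypothetical second item structure: strings in dictionary order. *)
structure StringItem : ITEM =
struct
  type item = string
  fun leq (s : item, t) = s <= t
  val initial = ""
end

structure IntHeap = Heap (IntItem)
structure StringHeap = Heap (StringItem)

val ih = IntHeap.insert (7, IntHeap.initHeap 2)
val sh = StringHeap.insert ("pear", StringHeap.initHeap 2)
```

The two applications share all the code in the body of Heap; only the item type, the ordering and the initial item differ.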


Last updated by Gopalan Nadathur (gopalan@cs.umn.edu) on April 12, 2001