CSCI 5106: Programming Languages
Spring 2006, University of Minnesota
Assignment 7
Comments on Grading


General Remarks

This assignment was graded out of 28 points. The points distribution was as follows: Problem 1 was worth 5 points, Problem 2 was worth 5.5 points, Problem 3 was worth 3.3 or 3 points depending on which part you did, Problem 4 was worth 7 points, Problem 5 was worth 4 points and Problem 6 was worth 3.5 points. Subdivisions within problems are indicated in the more detailed comments below.

The average on this homework was 20.59, the standard deviation was 6.01 and the highest score was 27.75.


Problem 1.

The first part was really a `giveaway.' All that I wanted you to observe here was that the underlying representation of lists and, in general, of structured data would be very similar in both C and ML. The only difference is that ML does not require you to think of the layout explicitly.

The second part required you to go through the kind of exercise we went through in class in connection with the append function. I was worried that this was going to be too simple, but some of you still had difficulty. Well, here is an outline of how the type of mymap is inferred. Looking at the first clause in its definition, we can see that its type should have the general structure

 
    'a -> 'b list -> 'c list
Evidently, mymap takes two arguments and the two nils in relevant places tell us that the second argument and the result should both be of list type. There are, however, no further constraints.

Now from the second clause, we discover that the first argument of mymap should be a function that is applicable to an element of the list that is the second argument. Further, applying it produces an element of the list that is the result. Using these facts, the type of mymap gets refined to

    ('b -> 'c) -> 'b list -> 'c list
We need also to check that this type `works' in the use that is made of mymap in the second clause. This is easily done.

The last part required you to comment on the type of mymap. The function is clearly polymorphic: the variables in its type can be instantiated by any `concrete' type in an instance of use. However, this polymorphism does not come at the cost of type checking. The common occurrences of type variables constrain the function argument to mymap to be of a kind that works on elements of the list argument. Note also that this kind of checking is done statically. And where is this type checking done? At every use of mymap. The other thing to note is that if the argument types check out at that use, then we also get specific information about the result of the application. For instance, if the function argument had type int -> bool then we know from the type of mymap that the result has to have the type bool list. This type is, in turn, useful in further type checking at the context of use. Neither of these abilities, the ability to check the appropriateness of the function and the ability to predict the result type, was available with the C definition; the void * type loses all useful information.
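To make this concrete, here is a minimal sketch. The definition of mymap below is an assumption reconstructed from the description of its two clauses above, so the one in the assignment may differ in details; the names flags and lengths are hypothetical.

```sml
(* Assumed definition of mymap, reconstructed from the two clauses
   described above. Inferred type: ('a -> 'b) -> 'a list -> 'b list,
   the same as the type derived above up to renaming of variables. *)
fun mymap f nil = nil
  | mymap f (x :: xs) = f x :: mymap f xs

(* The function argument has type int -> bool, so the result is a bool list. *)
val flags = mymap (fn n => n > 0) [1, ~2, 3]

(* A different instantiation: string -> int yields an int list. *)
val lengths = mymap size ["a", "bc", "def"]

(* An ill-typed use such as
     mymap (fn n => n > 0) ["a", "b"]
   is rejected statically: the element type cannot be both int and string. *)
```

Both uses are checked against the single polymorphic type of mymap, and each one fixes the result type at that use.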


Problem 2.

The points for the parts were 2, 1.5 and 2 respectively.

For the first part, you simply had to define a datatype that had the capability of handling the different cases for logical expressions. Something simple like

     datatype logexp =  vbl of string 
                      | andex of logexp * logexp
                      | orex of logexp * logexp
                      | notex of logexp
would do. This would let you write down expressions in ML such as the following
    andex(orex(vbl "p", vbl "q"), notex(vbl "p"))
corresponding to the expression that appears in the homework writeup. You can also include true and false in your expressions with obvious meanings. The right way to do this would be to add extra cases to the datatype definition:
     datatype logexp =  truex
                      | falsex
                      | vbl of string 
                      | andex of logexp * logexp
                      | orex of logexp * logexp
                      | notex of logexp

You could use one of two approaches to encoding the assignment of truth values to variables. In the first case, this could just be a list of variables, the meaning being that these are given the value true and all others the value false. Thus the list [vbl "p"] means that p gets true and q gets false. The other alternative to encoding is to use a list of pairs of variables and truth values.

The function eval can be defined in an obvious way by recursion over the datatype structure; essentially, this function will have one clause for each of the cases.
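A minimal sketch of such an eval, assuming the first encoding (a list of the variables that are given the value true). The argument order of eval and the helper member are assumptions made for illustration, not something fixed by the assignment.

```sml
datatype logexp =  truex
                 | falsex
                 | vbl of string
                 | andex of logexp * logexp
                 | orex of logexp * logexp
                 | notex of logexp

(* Hypothetical helper: is x an element of the list? *)
fun member x nil = false
  | member x (y :: ys) = x = y orelse member x ys

(* One clause per case of the datatype. The first argument is the
   list of variables assigned true; all other variables are false. *)
fun eval trues truex = true
  | eval trues falsex = false
  | eval trues (vbl v) = member (vbl v) trues
  | eval trues (andex (e1, e2)) = eval trues e1 andalso eval trues e2
  | eval trues (orex (e1, e2)) = eval trues e1 orelse eval trues e2
  | eval trues (notex e) = not (eval trues e)

(* p gets true and q gets false, so (p or q) and (not p) is false. *)
val result =
  eval [vbl "p"] (andex (orex (vbl "p", vbl "q"), notex (vbl "p")))
```

With the second encoding, only the vbl clause would change: it would look up the variable in the list of pairs instead of testing membership.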


Problem 3.

If you did part 1, you were actually graded out of 3.3. If you did part 2, 1 point was reserved for subpart 2.1 and 2 for subpart 2.2.

For Part 1, there were two things to explain: the algorithm and its realization. The algorithm works as follows. We look at all the natural numbers starting from 2. At any point we would have collected a set of primes. To determine the remaining primes, we sift the remaining natural numbers by the property of not being divisible by those already found. We can actually do this sequentially: first sift by the property of not being divisible by the first prime already found, then sift the resulting sequence by the property of not being divisible by the second prime already found, and so on. After we have sifted by all the primes we have already found, the first element of the sequence is the next prime number. To find the rest, we simply repeat the process described. Now, if we are lazy in how we realize the sifting---and we had better be, because we cannot represent infinite sequences otherwise---then after we have found the new prime number, we can simply add the new sift to the sifted sequence that we have in suspended form. This is basically what the function sieve encapsulates, using the idea of a stream to realize sequences `in generation.' Once you have gotten this idea, the rest is easy.
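The algorithm just described can be sketched as follows. This is only an illustration under assumptions: the datatype declaration, the constructor name Stream and the helpers force, from, sift and take are hypothetical, and the sieve in the assignment may be written differently.

```sml
(* A stream wraps a function that, given a dummy argument, produces
   the first item together with a stream encoding the rest. *)
datatype 'a stream = Stream of unit -> 'a * 'a stream

(* Supply the dummy argument to expose the first item and the rest. *)
fun force (Stream f) = f ()

(* The natural numbers starting from n. *)
fun from n = Stream (fn () => (n, from (n + 1)))

(* Lazily sift out the multiples of p. *)
fun sift p (Stream f) =
  Stream (fn () =>
    let val (x, rest) = f ()
    in if x mod p = 0 then force (sift p rest) else (x, sift p rest)
    end)

(* The first element of the sifted sequence is the next prime; the
   new sift is added to the suspended sequence before continuing. *)
fun sieve (Stream f) =
  Stream (fn () =>
    let val (p, rest) = f ()
    in (p, sieve (sift p rest))
    end)

val primes = sieve (from 2)

(* Collect the first n items of a stream into a list. *)
fun take 0 _ = nil
  | take n s = let val (x, rest) = force s in x :: take (n - 1) rest end

val firstFive = take 5 primes
```

Nothing is computed until take forces the stream, which is what allows the infinite sequence of primes to be represented at all.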

The main ideas in the second part are discussed in the handout. There is a difference between what is in the handout and what is in the question that some of you missed. In the setup for the assignment, a stream is represented in the form (stream fn) where fn is a function that will generate a pair containing the first item and a stream encoding the rest when it is supplied with a dummy argument. In the handout, a stream is represented in the form stream pair where pair holds the first element of the stream and a stream representing the rest. In the latter setup, any function that returns a stream must be defined in such a way that it takes a dummy argument to produce a stream. Think about this a little and you should be able to see the difference and also understand how to use each mechanism.


Problem 4.

The grade breakup for this problem was 1.5 points for each of the functions whilestat and repeatstat and 2 points for each of the two example encodings.

This problem brings together our earlier discussions of control flow constructs and the present discussion of higher-order functions. For example, consider a statement of the form

while cond 
do body
This statement can be unravelled into one of the form
if cond
then { body; 
       while cond
       do body;
     }
else ()  /* i.e. do nothing */
Once you have done this, you can write down a definition of whilestat immediately:
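   fun whilestat cond body state =    (* the state is the explicit last argument *)
      (ifstat cond 
              (seq body (whilestat cond body))
              (fn x => x)
              state    (* without it, eager evaluation would unfold the recursion forever *)
      )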
Notice that in this definition we have used the previously defined encodings of sequence and if-then-else statements. Notice also that to represent a `statement' that does nothing, I have used the tuple transformer (fn x => x) that simply preserves the given tuple.

There are obviously many other ways to define whilestat. An important point to note is that, no matter how you define this function, you have to make sure that the last argument is the state or, more precisely, the tuple representing this. Not doing so destroys our ability to think of statements as tuple transformers. If we cannot think of them in this way, then we lose many conveniences. For example, in the definition displayed above, use is made of the fact that regardless of what body is, it can certainly be viewed as a tuple transformer. Other pitfalls to avoid are to build into the definitions a restriction of states to those based on two variables only (i.e. a pair for a state) and using ML imperative features for encoding. A while in ML is no good for us; evaluating a while returns a value of type unit and we need a tuple representing a state.

I will assume that you can write a similar definition for repeatstat based on this discussion.
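For comparison, here is one possible sketch of repeatstat. The definitions of seq and ifstat below are assumptions about the previously defined encodings, and so is the argument order of repeatstat (body first, then the exit condition); treat this as an illustration rather than the expected answer.

```sml
(* Assumed encodings of sequencing and if-then-else as tuple transformers. *)
fun seq s1 s2 state = s2 (s1 state)
fun ifstat cond s1 s2 state = if cond state then s1 state else s2 state

(* repeat body until cond: run body once, then stop if cond holds,
   otherwise go around again. The state is the explicit last argument,
   so the recursive call is only a partial application and is not
   evaluated until it is actually needed. *)
fun repeatstat body cond state =
  seq body
      (ifstat cond (fn x => x) (repeatstat body cond))
      state

(* Count the first component up until it reaches 3. *)
val counted =
  repeatstat (fn (a, b) => (a + 1, b)) (fn (a, _) => a >= 3) (0, 0)
```

Unlike a while loop, the body here runs at least once, even when the condition already holds in the initial state.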

When it came to encoding imperative programs, many of you ended up doing strange things like building up `state' using ML definitions and then eventually writing out a while encoding. This was not acceptable. As a simple example, consider the code

x = 0;
y = 2;
while (x < 10) do x = x + 1;
This would have to be rendered into an ML expression such as
(seq 
   (assignx (fn (_,b) => 0))                     (* encoding of x = 0 *)
   (seq (assigny (fn (a,_) => 2))                (* encoding of y = 2 *)
        (whilestat (fn (a,_) => (a < 10))        (* encoding for while *)
                   (assignx (fn (a,b) => a+1)))  (* encoding of x=x+1 *)
   )
)
Look at what I have written above carefully and see why what you did was not quite adequate as an answer to this question.
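To see that such an expression really is a tuple transformer, it can be applied to an initial state. The sketch below is self-contained only because it supplies its own definitions of seq, ifstat, whilestat, assignx and assigny; these are assumptions about the assignment's encodings, and the names program and final are hypothetical.

```sml
(* Assumed encodings; the state is a pair (x, y). *)
fun seq s1 s2 state = s2 (s1 state)
fun ifstat cond s1 s2 state = if cond state then s1 state else s2 state
fun whilestat cond body state =
  ifstat cond (seq body (whilestat cond body)) (fn x => x) state

(* Assignments evaluate an expression in the current state and
   replace one component of the pair with the result. *)
fun assignx e (a, b) = (e (a, b), b)
fun assigny e (a, b) = (a, e (a, b))

(* The whole program is one function from states to states. *)
val program =
  (seq
     (assignx (fn (_, b) => 0))                     (* x = 0 *)
     (seq (assigny (fn (a, _) => 2))                (* y = 2 *)
          (whilestat (fn (a, _) => (a < 10))        (* while (x < 10) *)
                     (assignx (fn (a, b) => a + 1)))))  (* x = x + 1 *)

(* The initial values do not matter, since both variables are assigned first. *)
val final = program (7, 7)
```

No ML definition is used to hold the values of x and y; the state exists only as the tuple passed from one transformer to the next.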


Problem 5.

One way to think of this problem is that it is asking for a definition of a function corresponding to the righthand side of each equation using reduce with suitable arguments. Here is a solution to the first part under this interpretation:
   fun length x = reduce (fn (x,y) => y + 1) x 0
Other parts have a similar solution.
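To try this out, one needs a definition of reduce. The one below is an assumption inferred from how reduce is applied above (a function on pairs, then the list, then the base value); the assignment's own definition may differ. The function sumlist is a hypothetical further example and is not claimed to be one of the parts of the problem.

```sml
(* Assumed definition: combine the elements from right to left,
   starting from the base value. *)
fun reduce f nil base = base
  | reduce f (x :: xs) base = f (x, reduce f xs base)

(* The solution above: ignore each element and add 1 to the count
   obtained for the rest of the list. *)
fun length x = reduce (fn (x, y) => y + 1) x 0

(* Hypothetical example in the same style: add up a list of integers. *)
fun sumlist x = reduce (fn (x, y) => x + y) x 0

val three = length [7, 8, 9]
val total = sumlist [7, 8, 9]
```

In each case the recursion over the list lives entirely inside reduce, and only the combining function and the base value change.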

Each part carried 1 point.


Problem 6.

The hint in this part practically gave away the solution. What you have to do is look at the elements in the list of trees one after another, throwing away empty trees and, for each nonempty tree, accumulating its integer value into the sum and adding its left and right subtrees to the list of trees for later processing. This gives us the following kind of definition:
   fun sumtree nil n = n
     | sumtree (Empty :: t) n = sumtree t n
     | sumtree ((Node(d,l,r)) :: t) n = sumtree (l :: r :: t) (n + d)
Some of you thought of accumulating the value at the node into the sum but recursing separately over the left and right branches. This does not result in a tail recursive program: after recursing over the left part, you still have to return to the body of the function. From an implementation perspective, this means that you cannot discard the activation record prior to the recursive call.
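The contrast can be sketched as follows. The datatype declaration is an assumption based on the constructors Empty and Node used above, and the names sumtreeDirect, example and total are hypothetical.

```sml
(* Assumed tree type, matching the constructors used above. *)
datatype tree = Empty | Node of int * tree * tree

(* Tail recursive: every recursive call is the last thing done, with
   the pending subtrees kept in the list and the sum in n. *)
fun sumtree nil n = n
  | sumtree (Empty :: t) n = sumtree t n
  | sumtree ((Node (d, l, r)) :: t) n = sumtree (l :: r :: t) (n + d)

(* Not tail recursive: after the call on l returns, the additions
   and the call on r still remain to be done. *)
fun sumtreeDirect Empty = 0
  | sumtreeDirect (Node (d, l, r)) = d + sumtreeDirect l + sumtreeDirect r

val example =
  Node (1, Node (2, Empty, Empty), Node (3, Empty, Node (4, Empty, Empty)))

(* Start with a one-element list of trees and a sum of 0. *)
val total = sumtree [example] 0
```

Both functions compute the same sum; only the first lets the implementation reuse the activation record across recursive calls.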


Last updated on May 1, 2006 by gopalan@cs.umn.edu and xqi@cs.umn.edu.