
CSCI 5106: Programming Languages
Spring 2006, University of Minnesota
Assignment 5
Comments on Grading


General Remarks

The assignment was graded out of 30, with 3 additional points available for the extra credit part. The breakdown was as follows: Problem 1 carried 8 points, with 2 points for each part; Problem 2 carried 8 points, with 1 point each for the first two parts and 2 points each for the rest; Problem 3 carried 6 points, with 2 points each for the three required parts; and Problem 4 carried 5 points. The remaining 3 points were reserved for testing of programs across the assignment. The extra credit part of Problem 3 was graded out of 3 points.

The mean on this homework was 22.62, the median was 23.75, the highest score was 28 and the standard deviation was 4.05.


Comments on Problem 1 Grading

This homework may have been a little more difficult than expected for some of you because not all of you have used C before. However, this should not have been a real obstacle; the conceptual part should have been familiar to you already, and the pointers to the online reference for C programming should have been the only other thing you needed.

It appears that some of you have some difficulty (or at least some hesitation) writing programs involving recursive data structures. You can, in general, realize the relevant computations over such structures using either recursion or iteration. Recursion is usually easier to program, but iteration may be more efficient, at least in C. Implementors of functional programming languages sweat a lot to make recursion efficient, so this may not be true there.

To understand the two different ways of programming, let us consider the append function. First, here is how I would define a list type:

   typedef struct ListCell *ListType;
   struct ListCell {
        int head;
        ListType tail;
   };
Now, a recursive version of append can be defined simply as
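   ListType append(ListType l1, ListType l2)
   {  ListType newl;
      if (l1 == NULL) return l2;      /* nothing left to copy: share l2 */
      else 
        { /* copy the first cell of l1; malloc is declared in <stdlib.h> */
          newl = (ListType)malloc(sizeof(struct ListCell));
          newl->head = l1->head;
          /* the rest of the result is the rest of l1 appended to l2 */
          newl->tail = append(l1->tail,l2);
          return newl;
        }
   }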
You should contrast this with the code for append that we have seen both in Scheme and in ML---the correspondence is fairly direct.

The iterative version involves a little more work. The main problem is that, as you walk down the first list, you see its elements in the opposite order to the one in which they would have to be added to the front of the second list. Once you have noted this fact, there are several ways to realize the required computation. Here is one way:

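   ListType append(ListType l1, ListType l2)
   { ListType applist, l1copy;

     if (l1 == NULL) return l2;
     /* applist is the first cell of the result; l1copy is the cell being filled */
     applist = (ListType)malloc(sizeof(struct ListCell));
     l1copy = applist;
     /* copy every cell of l1 except the last, allocating the next cell as we go */
     while (l1->tail != NULL)
     { l1copy->head = l1->head;
       l1copy->tail = (ListType)malloc(sizeof(struct ListCell));
       l1copy = l1copy->tail;
       l1 = l1->tail;
     }
     /* copy the last cell of l1 and link it to l2, which is shared, not copied */
     l1copy->head = l1->head;
     l1copy->tail = l2;
     return applist;
   }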
A point to note with regard to all the functions is that they should not modify their inputs unless they are explicitly required to do so. This is simply part of good programming style. If your functions do modify their inputs, they have side effects that are not apparent to someone who does not know how they are implemented. Also, a call such as append(L1, L1), which should be legal, no longer works: an append that destructively links the last cell of its first argument to its second argument would turn L1 into a cyclic list. Another point to note is that empty lists are legal inputs and so should be treated correctly by your functions. The code turned in by some of you did not handle these situations correctly.
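To make these two points concrete, here is a minimal sketch of a non-destructive append together with the calls that tend to expose problems. The helpers cons and length are hypothetical names introduced only for this illustration, and the assert on the result of malloc stands in for real error handling.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ListCell *ListType;
struct ListCell {
    int head;
    ListType tail;
};

/* Hypothetical helper: put a new cell in front of an existing list. */
ListType cons(int head, ListType tail)
{
    ListType cell = malloc(sizeof(struct ListCell));
    assert(cell != NULL);   /* sketch only: real code should report the failure */
    cell->head = head;
    cell->tail = tail;
    return cell;
}

/* Non-destructive append: copies the cells of l1 and shares l2 unchanged. */
ListType append(ListType l1, ListType l2)
{
    if (l1 == NULL) return l2;
    return cons(l1->head, append(l1->tail, l2));
}

/* Hypothetical helper: number of cells in a list (0 for the empty list). */
int length(ListType l)
{
    int n = 0;
    for (; l != NULL; l = l->tail) n++;
    return n;
}
```

With these definitions, append(L1, L1) on a two-element L1 yields a four-element list and leaves L1 with two elements, and append(NULL, NULL) is simply NULL. A version that overwrote the tail of the last cell of L1 would instead make L1 point back to itself.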


Comments on Problem 2 Grading

Apart from using a union type, the main new thing in this problem was to parameterize the printlist and map functions with functions. For map there is no choice. For printlist, the need becomes obvious if you think about it for a bit: how to print depends on what the element type really is, and the best solution is to pass in the element printing function along with the list.

Several of you used a type tag, either passed as a parameter to printlist or included in the union type. Passing a tag parameter to printlist (instead of a function to print elements) is problematic. For one, the programmer needs to know something about the innards of printlist to decide correctly what tag to send along. For another, the code for printlist itself gets cluttered; apart from the nested ifs or a case statement, this code has to be changed if you add one more possibility to the union.

Adding a tag to the node itself, and defining the element printing function so that it looks at the tag to decide what to do, does not change the modularity picture substantially. You still have to define a single element printing function that anticipates all possible cases for the data element. If you want to add one more type for the data element, you have to find this printing function and modify its innards as well. This distributes the modification over the code in an undesirable way.

In contrast, if you provide a function for printing items of the relevant element type as an argument to printlist, then the only changes needed to extend the allowed element types are adding a new case to the union type and defining a function for printing the new element type. As you can see, this change is much more modular. Of course, the element printing function under this model has to know how to interpret the union value it gets, but this is an acceptable burden; at the point where printlist is called, the caller must know what element type the list contains.
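The function-parameter approach might look roughly like the following. This is only a sketch under assumptions: the names Element, PrintFn, print_int, print_real and cons are invented for the illustration, the union has just two alternatives, and printlist returns the number of elements it printed purely to make the sketch easy to check.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical element type with two alternatives. */
typedef union {
    int i;
    double d;
} Element;

typedef struct ListCell *ListType;
struct ListCell {
    Element head;
    ListType tail;
};

/* An element printing function knows which alternative of the union to read. */
typedef void (*PrintFn)(Element e);

void print_int(Element e)  { printf("%d", e.i); }
void print_real(Element e) { printf("%g", e.d); }

/* Hypothetical helper: put a new cell in front of an existing list. */
ListType cons(Element head, ListType tail)
{
    ListType cell = malloc(sizeof(struct ListCell));
    assert(cell != NULL);   /* sketch only: real code should report the failure */
    cell->head = head;
    cell->tail = tail;
    return cell;
}

/* Prints the list as "(e1 e2 ...)" using print_elem for each element.
   Contains no knowledge of the alternatives in Element. */
int printlist(ListType l, PrintFn print_elem)
{
    int count = 0;
    printf("(");
    for (; l != NULL; l = l->tail) {
        print_elem(l->head);
        if (l->tail != NULL) printf(" ");
        count++;
    }
    printf(")\n");
    return count;
}
```

A caller holding a list of integers writes printlist(l, print_int). Supporting a new alternative means adding a member to Element and writing one more printing function; printlist itself does not change.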

One other comment concerns the structure of a union. If you wanted to, you could equalize the sizes of the different alternatives by using, for instance, a union of pointers. This can save space if the alternatives in the union have wildly different sizes, since otherwise every value of the union type takes as much room as the largest alternative. On the other hand, there is an extra indirection, with associated space and time costs.
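The space tradeoff can be seen in a small sketch. The types Big, InlineElement and PointerElement and the helper make_int_element are hypothetical names chosen for this illustration; the 64-element array simply stands in for some large alternative.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical large alternative. */
struct Big { double values[64]; };

/* Alternatives stored inline: every value is as large as the largest member. */
union InlineElement {
    int i;
    struct Big big;
};

/* Alternatives stored behind pointers: every value is the size of a pointer,
   but the data itself needs a separate allocation and an extra dereference. */
union PointerElement {
    int *i;
    struct Big *big;
};

/* Hypothetical helper: allocate an int and wrap it in the pointer union. */
union PointerElement make_int_element(int value)
{
    union PointerElement e;
    e.i = malloc(sizeof *e.i);
    assert(e.i != NULL);    /* sketch only: real code should report the failure */
    *e.i = value;
    return e;
}
```

Here a list of integers built from InlineElement pays for a whole struct Big in every cell, whereas one built from PointerElement pays for a pointer plus a separately allocated int.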


Comments on Problem 3 Grading

The definition of the list type should have been easy in this problem after the hints in the writeup and in class. If not, here is how I would have defined the type:
   typedef struct ListCell *ListType;

   struct ListCell {
     void *head;
     ListType tail; };
One problem with defining a generic type like this is ensuring that each cell has enough space for any type of data. The only way to do this for all cases is to use a pointer for the data. The other problem is to determine the type of the data. In C, the only solution is to essentially forget the type. This does pose problems but ones that can be eliminated with a richer type system.

The changes to the structure of map and printlist should be fairly obvious; essentially the function they take would have to work with void * data. For example, the header for map might be the following:

   typedef void *(*pfi)(void *i);

   ListType map(ListType l, pfi f);
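One possible body for map under this header is sketched below. The helper cons and the element function increment are hypothetical, and the sketch assumes that the function passed in returns freshly allocated data so that the input list is left untouched.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ListCell *ListType;
struct ListCell {
    void *head;
    ListType tail;
};

typedef void *(*pfi)(void *i);

/* Hypothetical helper: put a new cell in front of an existing list. */
ListType cons(void *head, ListType tail)
{
    ListType cell = malloc(sizeof(struct ListCell));
    assert(cell != NULL);   /* sketch only: real code should report the failure */
    cell->head = head;
    cell->tail = tail;
    return cell;
}

/* Builds a new list holding f applied to each element of l; l is not modified. */
ListType map(ListType l, pfi f)
{
    if (l == NULL) return NULL;
    return cons(f(l->head), map(l->tail, f));
}

/* Hypothetical element function: assumes its argument points to an int and
   returns a pointer to a newly allocated int holding the successor. */
void *increment(void *i)
{
    int *result = malloc(sizeof *result);
    assert(result != NULL);
    *result = *(int *)i + 1;
    return result;
}
```

The cast inside increment is where the element type is reasserted by hand. The compiler would accept map(l, f) just as readily for an f that treats its argument as a double *, whatever l actually contains.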


Comments on Problem 4 Grading

We were looking for a discussion in this problem of the issues this homework exposed you to. We did expect you to pay attention to the details of the other problems, and would have marked your writeup accordingly if you did not. For example, a general statement such as "the third approach leads to a loss of type safety" was not sufficiently informative. There are specific ways in which it leads to such a loss: for example, the compiler can give us no assistance in determining whether the function that is input to map is right for the list that is input to it. Similarly, we do not have any means of checking that, for instance, the result returned from map is a list of integers before we start trying to add these numbers. You will, hopefully, pay attention to such issues when doing the first problem in Homework 7.

A common kind of comment was that unions (in contrast to the approach in Problem 3) could lead to a substantial space penalty. This penalty is not inevitable; as pointed out already, you can use a union of pointers that would, in a sense, mimic the void * approach. Thus, there is a tradeoff between the extra space for a pointer (and the time needed to dereference it) and the potentially wasted space if you did not use a pointer but the list elements needed differing amounts of space. Some of you also pointed to the cost of the nested ifs in your code. However, this is not a necessary cost since, as pointed out already, nested ifs are probably not the only or even the best way to realize different alternatives in this case.

Another worry is that of releasing space in conjunction with the last approach. The problem arises only if you share the representation of list elements, something that is generally reasonable to do. However, this is a general problem, not one restricted to this approach. For example, what else would you do if you were manipulating lists of lists? Neither C nor C++ offers any real solution to this kind of problem, and programmers in these languages generally have to develop a method for cleaning up garbage (if this is important) in addition to solving the real problem. The saner, more elegant way to handle this is to leave garbage collection worries to some external process, as is done in languages such as ML, Java and Prolog.
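As an illustration of the kind of cleanup method a C programmer ends up writing, here is a sketch of a list-freeing function that leaves the decision about the elements to the caller. The names FreeFn, cons and freelist are hypothetical, and returning the number of cells released is only a convenience for checking the sketch.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ListCell *ListType;
struct ListCell {
    void *head;
    ListType tail;
};

typedef void (*FreeFn)(void *elem);

/* Hypothetical helper: put a new cell in front of an existing list. */
ListType cons(void *head, ListType tail)
{
    ListType cell = malloc(sizeof(struct ListCell));
    assert(cell != NULL);   /* sketch only: real code should report the failure */
    cell->head = head;
    cell->tail = tail;
    return cell;
}

/* Releases every cell of l. The elements are released only when free_elem is
   not NULL, so the caller must know whether they are shared with other lists. */
int freelist(ListType l, FreeFn free_elem)
{
    int released = 0;
    while (l != NULL) {
        ListType next = l->tail;    /* read the tail before the cell goes away */
        if (free_elem != NULL) free_elem(l->head);
        free(l);
        l = next;
        released++;
    }
    return released;
}
```

The burden noted above is visible in the second parameter: nothing in the language tells freelist whether an element is still reachable from elsewhere, so the programmer has to track that by hand. Note also that this sketch cannot safely release the output of an append that shares its second argument, since the shared cells would be freed out from under the other list.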


Last updated on April 11, 2006 by gopalan@cs.umn.edu and xqi@cs.umn.edu