There will be no final exam in this course.
Online discussion using Phorum
New Information:
5/6/2008
5/5/2008
4/25/2008
Earlier announcements still of interest
Appel's book is good for our course because it emphasizes
all aspects of compiler implementation evenly and it will
nicely complement a process of learning by constructing. However, it is
a little incomplete in its coverage of classic issues such as
parsing. You might want to look up a book such as
Like any sensible compiler course, this one involves a fair bit of
programming. Most of the programming will be done in SML and you will
also use some SML-based compiler tools. This
page provides links at the end to reference manuals that
should suffice for the purposes of this course.
However, if you would like to have a reference in hand, you might
consider
Contact Information
Office Hours: Tues 11:15 a.m. - 12:15 p.m., Thurs 2:00 - 3:00 p.m.
Office Hours: M 3:30 - 4:30 p.m., W 12:00 - 1:00 p.m.,
Room: EE/CSci 2-209,
612-626-7512.
Course Texts
The text for this course is
Modern Compiler Implementation in ML, Andrew Appel, Cambridge
University Press, ISBN 0521607647.
The bookstore has copies of this book and it can also be ordered
online; this may be especially pertinent if the bookstore has not
ordered enough copies. I looked at Barnes and Noble, for
example, and they seemed to have copies in stock. If you are not able
to get your hands on a copy, please let me know right away. I will try
to find some other means for you to get relevant portions of the text
till a copy becomes available.
Compilers: Principles, Techniques and Tools, Aho, Lam, Sethi and
Ullman, ISBN: 0321486811,
Addison-Wesley Press
to learn more about these aspects. This book was recently updated and
released in a second edition. The Walter Library has a copy
of the older edition, which I will have put on reserve. You can also
order copies of this book online if you are so inclined.
Elements of ML
Programming, J.D. Ullman.
Two online sources to get started with SML that I recommend especially
are Bob Harper's Introduction
to Standard ML and Mads Tofte's Four Lectures on
Standard ML. For more such sources, check out the SML/NJ literature page.
You will need to have significant familiarity with programming and
with programming languages as such. This
course will require you to think flexibly about programming language
constructs---you need this before you can understand how to
translate them---and you will also have to write and debug a
reasonably large SML program. The background in programming languages
can be obtained at the University of Minnesota by taking CSci 5106,
the programming languages course, and a number of undergraduate
courses could help you develop the skills needed for writing big
programs. Prior familiarity with SML could be a plus, but you
should be fine if you have programmed a fair amount in some other
language, especially one that supports functional programming.
However, you should be interested in, and not overwhelmed by, the
prospect of building a big and reasonably complicated program in order
to fully enjoy this course. Most of the grade will be oriented
around constructing a compiler that is significantly more complex as a program
than anything you would have seen in CSci 1901, CSci 1902 or CSci 3081W.
You will have to do a fair amount of reading outside of
class. Specifically, you will have to look at manuals for tools for
constructing parsers and lexers that will only be discussed cursorily
in class. The TA and I will be happy and available to
help you at designated times outside of class and we have also set up
Phorum bulletin boards that are very effective for discussions. In
short, reasonable forms of help will be available throughout the
term. However, there is a basic threshold that is needed to benefit
from all this: if you like to have everything explained
in detail in lectures before you undertake the assignments, then you
will have a lot of difficulty with this course. Please make this
determination at the very outset, using the first couple of
homeworks as a guide, so as to save agony or trauma in the middle of
the semester.
There will be a midterm in the course that will count for 15% of the
grade. This exam will be open book. You may be permitted to use
my class notes; this will be determined closer to the date of the
exam. The date for the midterm is March 11, 2008. There will be
no final exam in this course.
The last 5% of the grade is reserved for class participation. This
participation may occur through Phorum discussions, questions posed
or responses provided in class and other similar forms. The message
here is that we would like to see all of you being enthusiastic about
the material we study---I am hoping that everyone will
get the full 5 points in this category!
Certain topics, such as those
related to parsing and programming tools for constructing parsers,
will be covered somewhat cursorily in lectures. In these cases, the onus
will be on you to fill in the missing parts. The relevant sources will
usually be mentioned in class and you can also ask for these through
Phorum. The knowledge you obtain from such readings will usually be
relevant to the programming assignments but some of it may also be
needed for the exams.
Homeworks must be turned in at the beginning of class on the due date
indicated for each. Late homeworks will generally not be
accepted unless there is a substantial reason, such as illness, for
the lateness. A broad requirement for passing the course is that you
score over 50% on nearly every homework and that
your final program works on a significant number of the test
cases.
Grades for programming assignments will usually be determined by
a combination of the completeness of coverage, as determined by
(published and secret) test cases, the clarity and elegance
of the code, and how well you have structured the presentation
of the material. The last is not a whimsical matter; an
important component of any large piece of software (such as a
compiler) is structuring that facilitates modifiability and
documentation that supports understanding. Note that there will be
partial credit for homeworks but this is dependent on your
turning in something indicative of your work. Usually, a prerequisite
for a non-zero grade on a programming assignment is that your program
can be compiled and run successfully on some of the test cases.
Course Description
The translation of high-level directives into
machine-executable instructions is a spectacular success of applied
computer science. This course teaches formal and systematic techniques
for syntax-directed translation. Topics include lexical analysis,
parsing, abstract syntax, semantic analysis and elements of code
generation. A compiler will be built for a small Algol-like
block-structured programming language called Tiger. The
compiler will be implemented in the language SML.
Syllabus
The goal is to cover the first part of the book, i.e. Chapters
1-12. We will build a compiler for the Tiger language that is
described in the book as we go along.
Course Prerequisites
The formal prerequisites for this course are CSci 2021 and CSci 5106.
Computing Resources
All of you must have accounts on the IT Labs cluster and should
quickly find out how to compile and run SML programs in this
setup. (If you have
difficulty with the latter, let us know right away and we will put
some instructions up.) Whether or not you do your programming on the
IT Labs machines, it is your responsibility to make sure that the
programs that you submit run in this environment. This will be the
first criterion of judgement: programs that do not compile in
this framework will usually not be looked at further and ones that do
not run correctly here on relevant test cases will lose significant
credit. Starting code that we need to give you for the different
assignments will be Unix-based and will be guaranteed to work only in
the IT Labs setup.
Required Work and Grading
Most of the grade in this course will be determined by programming
assignments. Cumulatively, these will account for 80% of the
score. The scores for the individual assignments will add up to
60%. All but the first assignment will correspond to different parts
of a compiler. The remainder of the homework grade, 20%, will be for
the final program that you turn in for this compiler, i.e. one that
integrates all the parts. Hopefully, this will be a complete compiler
for the Tiger language. The due date for this final
program/project is May 9, 2008.
Collaboration and Academic Honesty
Discussion related to the homework assignments and especially related
to the construction of the compiler is strongly encouraged. Right
before we start on this project, I will ask you to organize yourselves
into groups of two or three individuals to facilitate such
interactions. The Phorum bulletin boards can also be used for this
purpose, even across these groups. However, I expect each one of
you to write the programs that you turn in completely
independently. In particular, you must not share any of
this code. If such sharing takes place, it is usually easy to
detect---there are even electronic tools available for this
purpose---and it will be dealt with as a breach of the academic
honesty guidelines for this course. Penalties for such
transgressions will range, at my discretion, from no credit for the
assignment in question to a failing grade in the course. Also,
all these cases will be reported to the relevant departmental and
university offices that monitor such offenses. At a personal, and
perhaps more important, level, this kind of dishonesty interferes
seriously with your learning process and is, for this reason,
extremely detrimental to your own intellectual development.
Other Course Information and Resources
Here are some links to software and manuals we will use in the
course. More will be added as needed.