The definitive reference for the C language is the ISO standard. The 1999 standard introduced a number of language extensions, some of which we may chose to ignore in class. A working draft of that standard was openly available and a copy may be downloaded here. A more recent revision known as "C11" which was adopted in 2011 is here
In this course, we will be building a heavily simplified compiler for
C-99. In fact, with all the features that will become optional,
the standard will be closer to C-89. Therefore, any of the
standards above can be used for reference. There will be small details
that affect backward compatability and we can discuss these in class.
Remember, these test cases are not a comprehensive test suite! Your output
does not have to match exactly.
Topics: Errors vs warnings, error recovery, error reporting,
memory allocation strategies within the compiler.
Topics: compile-time vs run-time, syntax-directed translation, abstract
syntax trees, type systems, symbol tables and scoping rules, feedback
between lexer, parser and semantic stages, overview of C language
type and namespace issues.
Lecture notes 3 (upd 2/13)(/A>
Unit 4 -- Semantic Analysis
Lecture notes 4 (upd 3/7)(
Suggested Readings
* Sections 6.2 and 6.7 of the C language standard.
* H & S, chapters 4 and 5.
Assignment #3: Parsing declarations, representing
types, managing symbol tables. There are some files here
that you might find useful in testing your code. They are not intended
as exhaustive compliance tests. As with previous test cases, your
output doesn't have to match my sample output exactly. Target: March 14
Graphical vs linear IR, internal vs external IR. 1-address and 3-address
linear IRs. Overview of IR approaches of popular languages including
Perl, Python, Java, PHP. Overview of LLVM IR. Techniques for generating
IR from AST.
Lecture notes (updated 3/29)
Having completed AST generation and declaration/symbol handling for the
entire language, it is now time to enter the final phase "how do we do it?"
Target: April 25
Assignment #5 PDF
Theory of optimization at the quad level. Examples of several optimization
algorithms.
Lecture notes (Revised Apr 5)
We'll explore the X86 assembly language in some depth, with special
attention to variables, addressing modes, and function calling/register
usage conventions. If time permits, we'll contrast this CISC architecture
with a RISC architecture.
Lecture notes
Overview of Instruction selection and Register Allocation.
Challenges in one- and two-address architectures.
Lecture notes
It's the final curtain!All work must be submitted by the last
day of the semester, May 12
Assignment #6 PDF
Here are some resources to help you complete this last assignment:
Note: these are old, I'm looking for more recent versions
Intel X86 reference manual (over 2000 pages!)
Note that the instruction set is presented according to Intel syntax, which differs from the "UNIX" or "AT&T" syntax which is used on Solaris, Linux and BSD systems.
Sun Microsystems x86 Assembly Language Reference
This booklet presents an overview of the X86 assembler in UNIX notation and
provides a translation of opcode names and addressing mode nomenclature between
UNIX and Intel formats. It does not describe the individual instructions
in detail
SPARC Assembly Reference Manual
Provides an overview of the SPARC V8 and V9 assemblers. Doesn't give a
detailed explanation of each instruction but there is a summary of all
instructions and addressing modes
SPARC Architecture Manual
Provides detailed coverage of the SPARC V9 architecture including
operation of the processor, register windowing, addressing modes,
opcodes and instruction formats. This manual was written by the
SPARC International group and as such is OS and assembler agnostic.
But unlike the Intel reference manuals, the notation used follows
the UNIX standard very closely.