ECE 466 - Compilers
Syllabus
Please read the syllabus which
contains information about suggested text references and course expectations!
Lecture Notes and Assignments
The 2023 lecture notes and assignments are posted here. Some of these
will be updated.
Unit 1 - Intro/Lexical
Topics include: overview of the compilation process, front-end vs back-end,
the role of lexical, syntactical and semantic analysis, regular expressions,
finite automata construction, lex/flex programs.
Lecture Notes .
Suggested readings and activities
* Read the documentation for flex. Try out some examples.
* Read section 6.4 (Lexical Elements) of the C language standard. And/or
read Chapter 2 of Harbison and Steele.
* Read Chapter 3 of the dragon book. Don't bother reading chapters 1 and 2;
they will just confuse you.
Assignment 1 - Lexer
The first assignment is to write the lexical analysis phase of the C compiler.
Assignment sheet
Support files
Unit 2 - Syntactical Analysis (Parsing)
Topics: context-free grammars, derivations and parse trees, LL and LR
grammars, top-down and bottom-up parsing, overview of yacc/bison.
Ambiguous and non-LALR grammars and resolution, precedence/associativity.
Embedded semantic actions, synthesized vs inherited attributes.
Suggested study
* Read the entire Bison manual.
* Read Chapters 4 and 5 of the Dragon Book.
* Read Chapter 7 (Expressions) in H&S. And/or read section 6.5 of the C standard.
Lecture notes 2
Assignment 2 - Expression Parsing
This is the first exploration of Bison. You will write an
interpreter which understands and evaluates the C expression syntax (limited
subset) and transforms it into Abstract Syntax Trees. Target for completion:
Feb 21
Assignment 2 worksheet
some test cases and sample output from test cases
Remember, these test cases are not a comprehensive test suite! Your output
does not have to match exactly.
Unit 3 -- Error handling and memory allocation
Topics: Errors vs warnings, error recovery, error reporting,
memory allocation strategies within the compiler.
Lecture notes 3
Unit 4 -- Semantic Analysis
Topics: compile-time vs run-time, syntax-directed translation, abstract
syntax trees, type systems, symbol tables and scoping rules, feedback
between lexer, parser and semantic stages, overview of C language
type and namespace issues.
Lecture notes 4Updated 2/26/2025
Suggested Readings
* Sections 6.2 and 6.7 of the C language standard.
* H & S, chapters 4 and 5.
Assignment #3: Parsing declarations, representing
types, managing symbol tables. There are some files here
that you might find useful in testing your code. They are not intended
as exhaustive compliance tests. As with previous test cases, your
output doesn't have to match my sample output exactly.
Assignment #4: Completion of the AST to include
all statements. There are a few test cases along
with example AST output. As always, these are examples only and there
is no requirement that your output match up exactly.
Assignment #4 PDF
Unit 5 -- Intermediate Representations
Graphical vs linear IR, internal vs external IR. 1-address and 3-address
linear IRs. Overview of IR approaches of popular languages including
Perl, Python, Java, PHP. Overview of LLVM IR. Techniques for generating
IR from AST.
Lecture notes Updated 3/26/2025
Assignment 5 -- Generation of Quads
Having completed AST generation and declaration/symbol handling for the
entire language, it is now time to enter the final phase "how do we do it?"
Assignment #5 PDF
Unit 6 -- Architecture-Neutral IR Optimization
Theory of optimization at the quad level. Examples of several optimization
algorithms.
Lecture notes revised 4/21/25
Unit 7 -- Target Environment
We'll explore the X86 assembly language in some depth, with special
attention to variables, addressing modes, and function calling/register
usage conventions. If time permits, we'll contrast this CISC architecture
with a RISC architecture.
Lecture notes revised 5/9/25
Unit 8 -- Target Code Generation
Overview of Instruction selection and Register Allocation.
Challenges in one- and two-address architectures.
Lecture notes revised 5/7/25
Assignment 6 -- Generation of Quads
All work must be submitted by May 15.
Assignment #6 PDF
C Language Standards
In this course, we'll be constructing a rudimentary compiler for the
C language. We won't have time to implement the entire C language standard,
but we'll do enough that our compiler can generate some demonstrative
results.
This raises the question of what is the C standard?
The original definition of C was known as K&R after the authors
of the language, Kernighan and Ritchie. It was not a formalized standard.
In 1989, an ANSI standard was created, which later became an ISO standard.
This is known as C-89. Another significant revision was made in C-99.
The C-11 standard was mostly concerned with multi-threading support
and library functions, which do not concern us in this course.
The C-18 standard (sometimes called C-17) made minor corrections to
the C-11 standard. Finally, C-23 has introduced a number of fairly
minor changes to the language itself (such as the creation of
true and false keywords) and deprecation of some legacy K&R
syntax.
For this course, we can implement a compiler which is somewhere between
C-89 and C-99 and that is more than enough work! Now, another problem with
these standards is that the final version is behind a paywall. But, working
draft versions are released which are almost identical to the final
versions. I've gathered some of these for you:
C-99 working draft
C-11 working draft
C-23 working draft
Assembly/Processor Standards
Intel Architecture Volume 1A
breezy 500 page introduction
Intel 32 and 64 bit Instruction Set Reference Manual 2605 pages of easy reading!
Intel System Programming ReferenceThis material would be more of interest for Operating Systems
Note that the instruction set is presented according to Intel syntax, which differs from the "UNIX" or "AT&T" syntax which is used on Solaris, Linux and BSD systems.
Sun Microsystems x86 Assembly Language Reference
This booklet presents an overview of the X86 assembler in UNIX notation and
provides a translation of opcode names and addressing mode nomenclature between
UNIX and Intel formats. It does not describe the individual instructions
in detail
SPARC Assembly Reference Manual
Provides an overview of the SPARC V8 and V9 assemblers. Doesn't give a
detailed explanation of each instruction but there is a summary of all
instructions and addressing modes
SPARC Architecture Manual
Provides detailed coverage of the SPARC V9 architecture including
operation of the processor, register windowing, addressing modes,
opcodes and instruction formats. This manual was written by the
SPARC International group and as such is OS and assembler agnostic.
But unlike the Intel reference manuals, the notation used follows
the UNIX standard very closely.