Chapter 2: Scanning
CS106 — Compiler Principles and Construction
Fall 2011
MUST FIT
Zhiyao Liang
Scanning Dr. Zhiyao Liang
Chapter Overview
We have already studied:
Regular Expressions
Finite Automata
From Regular Expressions to DFAs
We will study:
The scanning process
Implementation of a TINY Scanner
Overview of scanning
Scanning is also called lexical analysis.
It is a phase of the compiler.
Its task is to read the source file and divide it into tokens.
Tokens include:
keywords: if, while, return …
identifiers: x, sum, result, …
special symbols: +, -, *, /, >=, …
The common patterns of tokens are specified by regular expressions and finite automata.
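As an illustration of how regular expressions can specify token patterns, here is a minimal Python sketch. The names KEYWORDS, TOKEN_SPEC and tokenize are hypothetical and chosen only for this example, and the token classes cover just the samples listed above; this is not the TINY scanner.

```python
import re

# Assumption: only the sample keywords and symbols from the slide are covered.
KEYWORDS = {"if", "while", "return"}

# One regular expression per token class. ">=" is listed before ">"
# so that the longer symbol is tried first.
TOKEN_SPEC = [
    ("ID", r"[A-Za-z]+"),        # identifiers (keywords are separated later)
    ("SYMBOL", r">=|[-+*/>]"),   # special symbols
    ("SKIP", r"\s+"),            # white space, not a token
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Divide text into (kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "SKIP":
            continue
        if kind == "ID" and lexeme in KEYWORDS:
            kind = "KEYWORD"     # a keyword matches the identifier pattern
        tokens.append((kind, lexeme))
    return tokens


print(tokenize("while x >= sum return result"))
```

Keywords are matched by the identifier pattern first and then looked up in a table, which keeps the regular expressions small.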
DFA of the TINY scanner
Some characters must be consumed from the input before reaching DONE, such as =. In other cases, the “other” character that was just read must be pushed back onto the input stream before reaching DONE.
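The difference between consuming a character and pushing it back can be sketched in code. The following Python sketch is a reduced, assumed version of such a DFA: the state names (START, INNUM, INID, INASSIGN, DONE), the token kinds, and the function get_token are illustrative assumptions, not a transcription of the diagram. It assumes that assignment is written := and that comments are not handled.

```python
def get_token(text, pos):
    """Run the DFA from pos; return (kind, lexeme, next_pos)."""
    state = "START"
    lexeme = ""
    kind = None
    while state != "DONE":
        ch = text[pos] if pos < len(text) else ""   # "" stands for end of input
        pos += 1                                    # consume one character
        save = True                                 # add ch to the lexeme?
        if state == "START":
            if ch.isdigit():
                state = "INNUM"
            elif ch.isalpha():
                state = "INID"
            elif ch == ":":
                state = "INASSIGN"
            elif ch.isspace():
                save = False                        # skip white space
            else:
                state = "DONE"
                if ch == "":
                    kind, save = "EOF", False
                    pos -= 1                        # nothing was really read
                else:
                    kind = "SYMBOL" if ch in "+-*/=<();" else "ERROR"
        elif state == "INASSIGN":
            state = "DONE"
            if ch == "=":
                kind = "ASSIGN"                     # '=' belongs to the token: consumed
            else:
                kind, save = "ERROR", False
                pos -= 1                            # push the "other" character back
        elif state == "INNUM":
            if not ch.isdigit():
                state, kind, save = "DONE", "NUM", False
                pos -= 1                            # push the "other" character back
        elif state == "INID":
            if not ch.isalpha():
                state, kind, save = "DONE", "ID", False
                pos -= 1                            # push the "other" character back
        if save:
            lexeme += ch
    return kind, lexeme, pos


# Usage: scan a short input one token at a time.
pos = 0
while True:
    kind, lexeme, pos = get_token("x := 42", pos)
    print(kind, repr(lexeme))
    if kind == "EOF":
        break
```

Here pushing back is modelled by decrementing the position index; a scanner reading from a file would use an unget operation on its input buffer instead.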
DFA of the TINY scanner, drawn with JFLAP
token type
token attribute
token record
getToken
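One way to picture how these four terms relate is the following Python sketch. The names TokenType, TokenRecord and make_record, and the particular fields, are assumptions made for illustration only; they are not taken from the TINY source code.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class TokenType(Enum):
    """Token type: the category a token belongs to (a small assumed subset)."""
    IF = auto()
    ID = auto()
    NUM = auto()
    PLUS = auto()


@dataclass
class TokenRecord:
    """Token record: the token type together with its attributes."""
    type: TokenType
    lexeme: str                  # string attribute: the characters matched
    value: Optional[int] = None  # numeric attribute, used only for NUM


def make_record(token_type, lexeme):
    """Build a record, computing the numeric attribute for NUM tokens."""
    value = int(lexeme) if token_type is TokenType.NUM else None
    return TokenRecord(token_type, lexeme, value)


print(make_record(TokenType.NUM, "42"))
print(make_record(TokenType.ID, "sum"))
```

Under this sketch, a getToken function would return one such record (or just the token type, with the attributes stored elsewhere) each time it is called.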
Extensions to RE
Example of RE for Tokens
Ambiguity, White Space, and Lookahead
Implementation of Finite Automata in Code