代写代考

Nondeterministic Finite
Automata (NFA)

Copyright By PowCoder代写 加微信 powcoder

Previous Lecture

Scanner: converts a sequence of characters to
a sequence of tokens
Scanner implemented using FSMs
FSM: DFA or NFA

This Lecture

NFAs from a formal perspective
Theorem: NFAs and DFAs are equivalent
Regular languages and Regular expressions

Creating a Scanner

Scanner Generator

NFAs, formally

finite set of states

the alphabet
(characters) start state

final states

transition function 0 1
s1 {s1} {s1,

To check if string is in L(M) of NFA M, simulate
set of choices it could make

1 1 1
s1 s2 st st
s1 s1 s2 st
s1 s1 s1 s2
s1 s1 s1 s1

At least one sequence of transitions that:

Consumes all input (without getting stuck)

Ends in one of the final states

NFA and DFA are Equivalent

Two automata M and M’ are equivalent iff L(M) = L(M’)

Lemmas to be proven

Lemma 1: Given a DFA M, one can construct an NFA
M’ that recognizes the same language as M, i.e., L(M’)

Lemma 2: Given an NFA M, one can construct a DFA
M’ that recognizes the same language as M, i.e., L(M’)

Proving Lemma 2

Lemma 2: Given an NFA M, one can construct a DFA M’
that recognizes the same language as M, i.e., L(M’) =

Part 1: Given an NFA M without ε–transitions, one can
construct a DFA M’ that recognizes the same language as M

Part 2: Given an NFA M with ε–transitions, one can construct an
NFA M’ without ε–transitions that recognizes the same
language as M

NFA w/ ε NFA w/o ε DFA
Part 2 Part 1

NFA w/o ε–Transitions to

NFA M to DFA M’

Intuition: Use a single state in M’ to simulate
sets of states in M
M has |Q| states
M’ can have only up to 2|Q| states

Defn: let succ(s,c) be the set of choices the NFA
could make in state s with character c

succ(A,x) = {A,B}
succ(A,y) = {A}
succ(B,x) = {C}
succ(B,y) = {C}
succ(C,x) = {D}
succ(C,y) = {D}

A {A, B} {A}

NFA w/o ε–Transitions to

Build new DFA M’ where Q’ = 2Q

succ(A,x) = {A,B}
succ(A,y) = {A}
succ(B,x) = {C}
succ(B,y) = {C}
succ(C,x) = {D}
succ(C,y) = {D}

To build DFA: Add an edge from state S on character c to state S’
if S’ represents the set of all states that a state in

S could possibly transition to on input c

A {A, B} {A}

Proving Lemma 2

Lemma 2: Given an NFA M, one can construct a DFA M’
that recognizes the same language as M, i.e., L(M’) =

Part 1: Given an NFA M without ε–transitions, one can
construct a DFA M’ that recognizes the same language as M

Part 2: Given an NFA M with ε–transitions, one can construct an
NFA M’ without ε–transitions that recognizes the same
language as M

ɛ-transitions
E.g.: xn, where n is even or divisible by 3

Useful for taking union of two FSMs

In example, left side accepts even n;
right side accepts n divisible by 3

Eliminating ɛ-transitions

Definition: Epsilon Closure
eclose(s) = set of all states reachable from s using

zero or more epsilon transitions

We want to construct ɛ-free NFA M’ that is equivalent to M

P {P, Q, R}

Proving Lemma 2

Lemma 2: Given an NFA M, one can construct a DFA M’
that recognizes the same language as M, i.e., L(M’) =

Part 1: Given an NFA M without ε–transitions, one can
construct a DFA M’ that recognizes the same language as M

Part 2: Given an NFA M with ε–transitions, one can construct an
NFA M’ without ε–transitions that recognizes the same
language as M

Summary of FSMs

DFAs and NFAs are equivalent

An NFA can be converted into a DFA, which can be
implemented via the table-driven approach

ɛ-transitions do not add expressiveness to NFAs
Algorithm to remove ɛ-transitions

Regular Languages and
Regular Expressions

Regular Language

Any language recognized by an FSM is a
regular language
• Single-line comments beginning with //
• Integer literals
• {ε, ab, abab, ababab, abababab, …. }
• C/C++ identifiers

Regular Expression

A pattern that defines a regular language

Regular language: set of (potentially infinite) strings

Regular expression: represents a set of (potentially
infinite) strings by a single pattern

{ε, ab, abab, ababab, abababab, …. } ⇔ (ab)*

Why do we need them?

Each token in a programming language can be
defined by a regular language

Scanner-generator input: one regular
expression for each token to be recognized by

Regular expressions are inputs to a scanner

Regular Expression

operands: single characters, epsilon
operators: from low to high precedence
“or”: a | b
“followed by”: a.b, ab
“Kleene star”: a* (0 or more a-s)

Regular Expression

Conventions:
aa is a . a
letter is a|b|c|d|…|y|z|A|B|…|Z
digit is 0|1|2|…|9
not(x) all characters except x
. is any character
parentheses for grouping, e.g., (ab)* is {ɛ, ab, abab, ababab, … }

Regexp, example
Precedence: * > . > |
digit | letter letter

(digit) | (letter . letter)
one digit, or two letters

digit | letter letter*

(digit) | (letter . (letter)*)

one digit, or one or more letters
digit | letter+

Regexp, example

Hex strings
start with 0x or 0X
followed by one or more hexadecimal digits
optionally end with l or L

0(x|X)hexdigit+(L|l|ɛ)
where hexdigit = digit|a|b|c|d|e|f|A|…|F

Regexp, example

Integer literals: sequence of digits preceded by
optional +/-

Example: -543, +15, 0007

Regular expression
(+|-|ε)digit+

Regexp, example

Single-line comments
Example: // this is a comment

Regular expression
//(not(‘\n’))*’\n’

Regexp, example

C/C++ identifiers: sequence of letters/digits/
underscores; cannot begin with a digit; cannot
end with an underscore

Example: a, _bbb7, cs_536

Regular expression
letter | (letter|_)(letter|digit|_)*(letter|digit)

Regular Languages
Languages recognized/defined by FSMs

Regular Expressions

Single-pattern representations of regular languages

Used for defining tokens in a scanner generator

Creating a Scanner

Scanner Generator

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com