程序代写代做代考 algorithm c/c++ C Nondeterministic Finite Automata (NFA)

Nondeterministic Finite Automata (NFA)
CS 536

Previous Lecture
Scanner: converts a sequence of characters to a sequence of tokens
Scanner implemented using FSMs FSM: DFA or NFA

This Lecture
NFAs from a formal perspective
Theorem: NFAs and DFAs are equivalent Regular languages and Regular expressions
3

Creating a Scanner
Last lecture: DFA to code
This lecture: NFA to DFA
This lecture: token to Regexp
Next lecture: Regexp to NFA
Scanner
Scanner Generator

NFAs, formally
M≡
finite set of states
the alphabet (characters)
final states start state
0
1
s1
{s1}
{s1, s2}
s2
transition function
5

NFA
To check if string is in L(M) of NFA M, simulate set of choices it could make
111
s1 s2 st st s1 s1 s2 st s1 s1 s1 s2 s1 s1 s1 s1
At least one sequence of transitions that: Consumes all input (without getting stuck) Ends in one of the final states
6

NFA and DFA are Equivalent
Two automata M and M’ are equivalent iff L(M) = L(M’) Lemmas to be proven
Lemma 1: Given a DFA M, one can construct an NFA M’ that recognizes the same language as M, i.e., L(M’) = L(M)
Lemma 2: Given an NFA M, one can construct a DFA M’ that recognizes the same language as M, i.e., L(M’) = L(M)

Proving Lemma 2
Lemma 2: Given an NFA M, one can construct a DFA M’ that recognizes the same language as M, i.e., L(M’) = L(M)
Part 1: Given an NFA M without ε–transitions, one can construct a DFA M’ that recognizes the same language as M
Part 2: Given an NFA M with ε–transitions, one can construct an NFA M’ without ε–transitions that recognizes the same language as M
NFA w/ ε
Part 2
NFA w/o ε
Part 1
DFA

NFA w/o ε–Transitions to DFA
NFA M to DFA M’
Intuition: Use a single state in M’ to simulate sets of states in M
M has |Q| states
M’ can have only up to 2|Q| states

x,y
Defn: let succ(s,c) be the set of choices the NFA succ(A,x) = {A,B}
could make in state s with character c
succ(A,y) = {A}
xY
A succ(B,{xA) ,=B{}C}
{A}
succ(B,y) = {C}
B {C}
succ(C,x) = {D}
C {D}
succ(C,y) = {D}
{C}
{D}
D {}
{}
NFA w/o ε–Transitions to DFA
A x B x,y C x,y
D
10

A x B x,y C x,y Build new DFA M’ where Q’ = 2Q
D
x,y
succ(A,x) = {A,B}
xy
succ(A,y) = {A}
A {A,B} {A} succ(B,x) = {C}
B {C} {C}
succ(B,y) = {C}
C {D} {D}
succ(C,x) = {D}
succ(C,y) =
D {}
{D}
{}
To build DFA: Add an edge from state S on character c to state S’ if S’ represents the set of all states that a state in
S could possibly transition to on input c 11

Proving Lemma 2
Lemma 2: Given an NFA M, one can construct a DFA M’ that recognizes the same language as M, i.e., L(M’) = L(M)
Part 1: Given an NFA M without ε–transitions, one can construct a DFA M’ that recognizes the same language as M
Part 2: Given an NFA M with ε–transitions, one can construct an NFA M’ without ε–transitions that recognizes the same
language as M

ɛ-transitions
E.g.: xn, where n is even or divisible by 3
Useful for taking union of two FSMs
In example, left side accepts even n; right side accepts n divisible by 3
13

Eliminating ɛ-transitions
We want to construct ɛ-free NFA M’ that is equivalent to M
Definition: Epsilon Closure
eclose(s) = set of all states reachable from s using zero or more epsilon transitions
eclose
P
{P, Q, R}
Q
{Q}
R
{R}
Q1
{Q1}
R1
{R1}
R2
{R2}
14

Proving Lemma 2
Lemma 2: Given an NFA M, one can construct a DFA M’ that recognizes the same language as M, i.e., L(M’) = L(M)
Part 1: Given an NFA M without ε–transitions, one can construct a DFA M’ that recognizes the same language as M
Part 2: Given an NFA M with ε–transitions, one can construct an NFA M’ without ε–transitions that recognizes the same
language as M

Summary of FSMs
DFAs and NFAs are equivalent
An NFA can be converted into a DFA, which can be implemented via the table-driven approach
ɛ-transitions do not add expressiveness to NFAs Algorithm to remove ɛ-transitions

Regular Languages and Regular Expressions

Regular Language
Any language recognized by an FSM is a regular language
Examples:
• Single-line comments beginning with //
• Integer literals
• {ε, ab, abab, ababab, abababab, …. }
• C/C++ identifiers

Regular Expression
A pattern that defines a regular language
Regular language: set of (potentially infinite) strings
Regular expression: represents a set of (potentially infinite) strings by a single pattern
{ε, ab, abab, ababab, abababab, …. } ⇔ (ab)*

Why do we need them?
Each token in a programming language can be defined by a regular language
Scanner-generator input: one regular expression for each token to be recognized by scanner
Regular expressions are inputs to a scanner generator

Regular Expression
operands: single characters, epsilon operators: from low to high precedence
“or”: a|b
“followed by”: a.b, ab
“Kleene star”: a* (0 or more a-s)
22

Regular Expression
Conventions:
aa is a . a
a+ is aa*
letter is a|b|c|d|…|y|z|A|B|…|Z
digit is 0|1|2|…|9
not(x) all characters except x
. is any character
parentheses for grouping, e.g., (ab)* is {ɛ, ab, abab, ababab, … }
23

Regexp, example
Precedence: * > . > | digit | letter letter
(digit) | (letter . letter)
one digit, or two letters digit | letter letter*
(digit) | (letter . (letter)*)
one digit, or one or more letters digit | letter+
24

Regexp, example
Hex strings
start with 0x or 0X
followed by one or more hexadecimal digits optionally end with l or L
0(x|X)hexdigit+(L|l|ɛ)
where hexdigit = digit|a|b|c|d|e|f|A|…|F
25

Regexp, example
Integer literals: sequence of digits preceded by optional +/-
Example: -543, +15, 0007
Regular expression (+|-|ε)digit+
26

Regexp, example
Single-line comments
Example: // this is a comment
Regular expression //(not(‘\n’))*’\n’
27

Regexp, example
C/C++ identifiers: sequence of letters/digits/ underscores; cannot begin with a digit; cannot end with an underscore
Example: a, _bbb7, cs_536
Regular expression
letter | (letter|_)(letter|digit|_)*(letter|digit)
28

Recap
Regular Languages
Languages recognized/defined by FSMs
Regular Expressions
Single-pattern representations of regular languages Used for defining tokens in a scanner generator

Creating a Scanner
Last lecture: DFA to code
This lecture: NFA to DFA
Next lecture: Regexp to NFA
This lecture: token to Regexp
Scanner
Scanner Generator