Nondeterministic Finite Automata
COMP1600 / COMP6260
Australian National University
Semester 2, 2021
DFA Minimisation
Elimination of equivalent states.
if two states are equivalent, one can be eliminated
Elimination of Unreachable States
if a state cannot be reached from the initial state then it can also be eliminated.
Example. S not reachable ?0
3 1-0
– S0 S1 1
6
A6: 1
0
– S 3
1/40
The Standard Minimisation Algorithm
Main Idea.
aggregate states into groups (of possibly equivalent states) initially, all states are possibly equivalent
split a group of possibly equivalent states if we have evidence that they are not equivalent.
a non-final state is never equivalent to a final state
two states are non-equivalent if the transition function takes them into
different groups (with the same letter) repeat until no more groups can be split.
Realisation.
The working data structure for the algorithm is a list of lists (“groups”) of states
On each iteration, we test one of the groups with a symbol from the alphabet.
If we notice differing behaviour, we split the group.
2/40
The Algorithm Details
Input: A list containing two “groups”. (a group is represented as a list of states). One group consists of the Final states and the other consists of the non-final states.
Data: The working data structure, WDS : [[State]], is a list of groups of states. When two states are in different groups, we know they are not equivalent.
Loop: Pick a group, {s1, …sj } and a symbol, x.
If the states {N(si,x) | i = 1,…,j} are all in the same group, then the group {s1 , …sj } is not split.
If the states {N(si,x) | i = 1,…,j} belong to different groups of WDS, then the group {s1, …sj } should be split accordingly.
Continue until we cannot, by any choice of letter, split any group.
3/40
Our Previous Example
Our running example is trivial. The initial split is it.
[[s0, s2], [s1, s3]]
?0 0 ?0 ?
10
– S0 – S1 [[s0,s2],[s1,s3]] – Sa
?0 ′
A: 6 1 [[s0,s2],[s1,s3]] A: 61 111
? ? ?
000
- S3 1 S2
[[s0,s2],[s1,s3]] 1
?
[[s0, s2], [s1, s3]]
Sb
4/40
Minimisation: Second Example
Q. What is the language of this automaton? Can you find a simpler automaton with the same language?
a ++ a S4
a
// S0 ee
b
b
b b && //
S1 a S3WW a,b
5/40
Minimisation Step by Step
a ++
<
As a transition table.
01
→ S0 {S0, S1} {S0, S3}
// S0WW 0,1 1
>>S2 hh 0,1
S1 {S2} ⊙S2 {S2}
S3 ∅
∅ {S2} {S2}
S3
1
Both convey precisely the same information. What is the language of this automaton?
14/40
Acceptance for NFAs
Given. An NFA A = (Σ,S,F,s0,R). Then A accepts a word
w = a1a2 . . . an (in symbols: w ∈ L(A)) if there exists a sequence of states
a1 a2 an−1 an
s0 −→s1 −→···−→sn−1 −→sn
a where s0 is the starting state, sn ∈ F is an accepting state, and s −→ t if
(s,a,t) ∈ R.
Aside. This is like for deterministic automata, the only difference is that
for
a
non-deterministic automata we have s −→ t if (s, a, t) ∈ R (that is, the automaton can make a transition)
a
deterministic automata we have s −→ t if N(s,a) = t (that is, the automaton makes the transition)
15/40
Eventual State Relation for NFAs
Basic Idea. The eventual state relation R∗(s,w,s′) is true if s′ is a state that the NFA can reach, starting in state s and reading string w.
Formal Definition. The eventual state relation has type R∗ ⊆ S × Σ∗ × S
or R∗ : S × Σ∗ × S → Bool and is defined inductively as follows:
R∗(s, ε, s)
R∗(s, xα, s′) = ∃s′′.R(s, x, s′′) ∧ R∗(s′′, α, s′)
16/40
Eventual State Relation: Example
The “double digits” automaton
>> S1 00
//S0WW 0,1 1
>>S2hh 0,1 1
Eventual State Relation.
(S0, ε, S0) ∈ R∗ by definition
S0 →0 S0 →0 S0 →1 S0, hence (S0, “001”, S0) ∈ R∗. S0 →0 S1 →0 S2 →1 S2, hence (S0, “001”, S2) ∈ R∗. S1 →0 S2 →0 S2 →1 S2, hence (S1, “001”, S2) ∈ R∗.
S3
17/40
An Important (but Unsurprising) Theorem about R∗
For all states s, s′ and for all strings α, β ∈ Σ∗
R∗(s, αβ, s′) if and only if ∃s′′. R∗(s, α, s′′) ∧ R∗(s′′, β, s′)
The proof is similar to the corresponding result for N∗ in DFAs.
18/40
Language of a NFA
Let A = (Σ,S,s0,F,R) be a NFA. Theorem. A string w is accepted by A if
∃s ∈ F. R∗(s0,w,s)
(Compare with the definition of acceptance for NFAs earlier)
Language of an NFA.
The language accepted by A is the set of all strings accepted by A L(A)={w∈Σ∗ |∃s∈F.R∗(s0,w,s)}
Informally. That is, w ∈ L(A) iff there exists a path through the diagram for A, from s0 to a final state s (s ∈ F), such that the symbols on the path match the symbols in w
19/40
Power of Nondeterminism?
Q. Is there a language that is accepted by an NFA for which we cannot find a DFA that (also) accepts it?
it seems easier to construct NFAs but in examples, DFAs did also exist
A. A simple “no”.
Theorem. If language L is accepted by a NFA, then there is some DFA
which accepts the same language.
Moreover, this DFA can be computed using an algorithm.
just like the minimal automaton can be computed using state equivalence
Drawback. The resulting DFA may have exponentially many states Have to record a set of states that the NFA could be in.
20/40
Constructing the Equivalent DFA from an NFA
Assumption. We have an NFA with state set {q0, . . . , qn}.
Basic Idea.
consider all possible runs of the NFA in parallel as a consequence, can be in a set of states
Construction.
A state of the DFA is a set of states of the NFA
e.g. {q3, q7} or ∅
signifies the states that the NFA can be in after reading some input
transition function: records possible next states
e.g. from {q3,q7} with letter x, take union of transitions (with x) from q3 and q7
final states are state sets that contain a final state.
21/40
Subset Construction: The Finer Points
Given. NFAA=(Σ,S,s0,F,R). Subset Construction.
states are subsets of S but each subset plays the role of a single state! transitions: for a state Q ⊆ S and a letter a ∈ Σ:
N(Q,a)={s1∈S|s→a s1forsomes∈Q}
= {s1 ∈ S | (s,a,s1) ∈ R for some s ∈ Q}
22/40
Determinisation: Example
The “double digits” automaton
Subset Construction: transition table
01
>> S1
→ {S0} {S0, S1} {S0, S3}
0,1 {S0, S1, S2} {S0, S2, S3} {S0, S2}
{S0, S1} {S0, S1, S2} {S0, S1} {S0, S1, S2} {S0, S2} {S0, S1, S2}
{S0, S3}
{S0, S3} {S0, S2, S3} {S0, S2} {S0, S2, S3} {S0, S2, S3}
0
0
>>S2
1
//
0,1 1
S0 WW
hh
Note.
S3
don’t have transition for all states, just those that are reachable from {S0 }
all others are not relevant (cf. elimination of unreachable states) having all states would require 24 = 16 entries.
23/40
Determinisation Example, as Diagrams
Double Digits, as NFA.
>> S1 00
//S0WW 0,1 1
>>S2hh 0,1 1
S3
0
Double Digits as DFA.
//
S
0 // S
>> KK01
01ZZ2 S001 S02
0
1 0
1
S03 S023
DD 1
// 0 1 WW
1
24/40
Recall Minimisation . . .
Q. Can there be a simpler DFA (with fewer states) that recognises the same language?
S
0 0 //
initial split: {S ,S ,S }, 0 01 03
{S012, S02, S023}
next split: {S0}, {S01}, {S03}, {S012, S02, S023}
no more splits, so S012, S02 and S023 can be merged.
>> KK01
S
01ZZ2
1
//
0
0
S001 S02 DD
1
S03
// S023
1
1 WW 1
0
25/40
More Expressive Power: ε-transitions
Extra Ingredient: Spontaneous transitions that don’t “eat” a letter NFAs that may change state without consuming a symbol. NFAs of this kind are called NFAs with ε-transitions
can convert NFAs with ε-transitions to (standard) NFAs
Formal Definition. An NFA with ε-transitions is an NFA, but the transition relation has the form
R ⊆ S × Σ ∪ {ε} × S
cf. NFAs with transition relation R ⊆ S × Σ × S
R(s, ε, s′) is a spontaneous transition (without reading input symbol) ε is not an element of the alphabet!
26/40
ε-NFA: Example
General Pattern. ε-transitions say “or”
ε
11 s 0 ))s
661ii0 2
// s0
ε (( 1 ))
Interpretation.
s3YYii s4YY 1
“top” automaton (with start state s1) requires even number of 0’s “bottom” automaton (with start state s3) requires even number of 1’s
entire automaton (with start state s0) accepts either an even number of 1’s or an even number of 0’s
00
27/40
Example and Acceptance
Language of this Automaton?
abc
//s ε //s ε //s 012
Acceptance. An ε-NFA A accepts a word w = a1 …an if there is a sequence of states
ε∗ a1 ′ ε∗ a2 ′ an ′ ε∗
s0 −→r1 −→r1 −→r2 −→r2…rn −→rn −→f
where s0 is the starting state, f ∈ F is an accepting state and a
s −→ t if there is an a-transition from s to t, i.e (s,a,t) ∈ R ε∗
s −→ t if there is a sequence of ε-transitions (only!) from s to t. ε∗
In particular: the empty string ε ∈ L(A) if s0 −→ f for a final state f ∈ F.
28/40
Eventual State Relation for ε-NFAs
Given. Anε-NFA(Σ,S,s0,F,R)(i.e. R⊆Q×(Σ∪{ε})×Q)thenthe
ε-closure of a state s ∈ S is given by
eclose(s) = {s′ ∈ S | there is a sequence of ε-transitions from s to s′}
and the eventual state relation is given by
R∗(s, ε, s′) ⇐⇒ s′ ∈ eclose(s)
R∗(s, aw, s′) ⇐⇒ there are s0 and s1 such that
s0 ∈ eclose(s),(s0,a,s1) ∈ R,(s1,w,s′) ∈ R∗
As for DFAs / NFAs:
A string w is accepted by an ε-NFA A (in symbols: w ∈ L(A)) if (s0,w,f)∈R∗ forsomefinalstatef ∈F,thatis
L(A)={w ∈Σ∗ |∃f ∈F.(s0,w,f)∈R∗} Q. How does this relate to the notion of acceptance earlier?
29/40
Relationship Between NFAs and ε-NFAs Q. Are there languages only accepted by ε-NFAs?
A. No. Every ε-NFA A = (Σ,S,s0,F,R) can be converted to an NFA A′ without ε-transitions so that L(A) = L(A′).
Construction. Put A′ = (Σ,S,s0,F′,R′) where
Make s ∈ S an accepting state in A′ if s can reach an accepting state
in A by ε-transitions:
F′ = {s ∈ S | eclose(s) ∩ F ̸= ∅}
a′′a Putanarcs−→tintoA ifthereisatransitions −→tinAwith
s′ ∈ eclose(s):
R′ = {(s,a,t) | (s′,a,t) ∈ R for some s′ ∈ eclose(s)}
(and convince yourself that A and A′ accept the same strings!)
30/40
Regular Expressions
Challenge. Understand the computational power of DFAs / NFAs. Approach. Characterise the languages that can be accepted by an NFA in
a different form.
One Characterisation. Regular expressions (cf. Perl, Ruby, grep)
Basic Operators used to construct new expressions from old: vertical bar (pipe): choose either the left or right expression Kleene star: repeat strings from an expression
ε, the empty string, and every letter of the alphabet concatenation, for sequencing expressions
parentheses, for grouping
Example.
a∗ indicates 0 or more as.
yes | no is the language with just the 2 given strings. (0 | 1)∗ indicates the set of binary numerals.
31/40
Regular Expressions — More Examples
0|(1(0|1)∗) is the set of binary numerals with no leading zeros.
(a | b)∗c(a | b)∗ is the set of strings over {a,b,c} with just one c.
(0∗10∗10∗)∗ is the language of bit-strings that have an even number of ones. (Alternatively 0∗(10∗10∗)∗)
(z∗(x∗ | y∗) z))∗ is the set of strings over {x,y,z} with no x and y adjacent.
1 | (0 ( ε |(.(0 | 1)∗1)))) is binary fractional numerals between 0 and 1 with no trailing zeroes. (e.g. 0.1, 0.110011 but not .1 or 0.10)
32/40
The Definition of Regular Expressions
Key Concept.
regular expressions are purely syntactical – just like formulae
but: every expression denotes a set of strings – this is the meaning. Definition. The regular expressions over alphabet Σ and the sets that
they denote are:
∅ is a regular expression and denotes the empty set ∅
ε is a regular expression and denotes the set {ε}
for each a ∈ Σ, a is a regular expression and denotes the set {a}
If α and β are regular expressions denoting languages R and S respectively, then:
α | β denotes R ∪ S
α β denotes RS which is {xy | x ∈ R ∧ y ∈ S }
α∗ denotes R∗, ie, the set of finitely many ri ∈ R, concatenated
R∗ is (inductively) defined as {ε} ∪ RR∗
33/40
Regular Expressions and DFAs
Key Insight.
Regular expressions and NFAs / DFAs are equivalent.
for every DFA A, have regular expression r with L(A) = L(r)
for every regular expression r, have DFA A with L(r) = L(A)
so the “power” of NFAs / DFAs are completely described by regular expressions.
Q. Can we “compute” more than what can be described by regular expressions?
34/40
Regular Expressions to ε-NFAs
Key Insight.
regular expressions are an inductively defined structure e.g. representable by an inductive data type in Haskell
as a consequence, we can give inductive definition of the corresponding automaton
Construction. (start state on left, final state on right)
When the regular expression is a symbol a of the alphabet (language
is {a}) the automaton is
When the regular expression is ε (language is {ε}) the automaton is
ε
When the regular expression is ∅ (language is ∅) the automaton has no edges
a
35/40
Regular Expressions to NFAs, ctd
Suppose the NFA corresponding to some R is:
Then NFAs corresponding to composite regular expressions are defined as follows:
R1 R2
R*
R1 R2
εε
εε
R1
R
ε
ε εε
R2
R
R1
R2
36/40
Example
Given the regular expression for binary numerals without leading zeros, (0 | 1(0|1)∗), the above algorithm gives this NFA.
0
εε
ε
ε
ε εε
0
ε
ε
1
1
εε
37/40
Closing the Loop
Given. A finite alphabet Σ and a language L ⊆ Σ∗. The following are equivalent:
L can be described by a regular expression L can be recognised by an ε-NFA
L can be recognised by an NFA
L can be recognised by a DFA …
as we can convert regular expressions into ε-NFAs into NFAs into DFAs.
Missing Link. Construction of regular expressions from DFAs (not covered in this course)
38/40
Summary.
Starting Point. Finite Automata
motivated by computers having finite memory (only) solving simple problems: is string s accepted?
Limitations of Finite Automata
e.g. cannot recognise L = {anbn | n ≥ 0} Characterisation of expressive power
can go back and forth between automata and regular expressions
Q. Are finite automata a “good” model of computation? if yes, why?
if not, why not? What is missing?
39/40
Literature.
Introduction to Automata Theory, Languages, and Computation By Hopcroft, Motwani, and Ullman.
A classic text that has been re-worked from a standard textbook. Introduction To The Theory Of Computation by
The part on Automata and Languages covers (more than) what we have discussed here.
40/40