BU CS 332 – Theory of Computation
Lecture 5:
• More on pumping
• Regular expressions
• Regular expressions = regular languages
Mark Bun February 5, 2020
Reading: Sipser Ch 1.3
More on Pumping
2/5/2020 CS332 – Theory of Computation 2
Pumping Lemma (Formal)
Let 𝐿 be a regular language.
Then there exists a “pumping length” 𝑝 such that
For every 𝑤 ∈ 𝐿 where |𝑤| ≥ 𝑝,
𝑤 can be split into three parts 𝑤 = 𝑥𝑦𝑧 where:
1. |𝑦| > 0
2. |𝑥𝑦| ≤ 𝑝
3. 𝑥𝑦𝑖𝑧𝐿forall𝑖 ≥ 0
2/5/2020 CS332 – Theory of Computation 3
General Strategy for proving 𝐿 is not regular Proof by contradiction: assume 𝐿 is regular.
Then there is a pumping length 𝑝.
1.Find𝑤∈𝐿with 𝑤 ≥𝑝
2. Show that 𝑤 cannot be pumped
3. Conclude 𝐿 must not have been regular
2/5/2020 CS332 – Theory of Computation 4
Pumping down
Claim:𝐿= 0𝑖1𝑗 𝑖>𝑗≥0}isnotregular
Proof: Assume 𝐿 is regular with pumping length 𝑝
1.Find𝑤∈𝐿with 𝑤 ≥𝑝
2. Show that 𝑤 cannot be pumped
Formally If 𝑤 = 𝑥𝑦𝑧 with |𝑥𝑦| ≤ 𝑝, then…
2/5/2020 CS332 – Theory of Computation 5
Reusing a Proof
Pumping a language can be lots of work… Let’s try to reuse that work!
How might we show that
𝐵𝐴𝐿𝐴𝑁𝐶𝐸𝐷 = 𝑤 𝑤 has an equal # of 0s and 1s}
is not regular?
0𝑛1𝑛 𝑛 ≥ 0} = 𝐵𝐴𝐿𝐴𝑁𝐶𝐸𝐷 ∩ 𝑤 all 0s in 𝑤 appear before all 1s}
2/5/2020 CS332 – Theory of Computation 6
Using Closure Properties
If 𝐴 is not regular, we can show a related language 𝐵 is not regular
𝐵∩𝐶= (regular)
𝐴
(not regular) any of {∘, ∪, ∩} or, for one language, {¬, R, *}
By contradiction: If 𝐵 is regular, then 𝐵 ∩ 𝐶 (= 𝐴) is regular.
But 𝐴 is not regular so neither is 𝐵! 2/5/2020 CS332 – Theory of Computation 7
Example
Prove 𝐵 = {0𝑖1𝑗|𝑖 ≠ 𝑗} is not regular
using nonregular language 𝐴 = 0𝑛1𝑛 𝑛 ≥ 0
and regular language
𝐶 = 𝑤 all 0s in 𝑤 appear before all 1s}
2/5/2020
CS332 – Theory of Computation 8
Regular Expressions
2/5/2020 CS332 – Theory of Computation 9
Regular Expressions
• A different way of describing regular languages
• A regular expression expresses a (possibly complex) language by combining simple languages using the regular operations
“Simple” languages: ∅, 𝜀 , {𝑎} for some 𝑎 ∈ Σ Regular operations:
Union: 𝐴 ∪ 𝐵
Concatenation:𝐴 ∘𝐵= 𝑎𝑏 𝑎∈𝐴,𝑏∈𝐵} Star:𝐴∗ ={𝑎1𝑎2…𝑎𝑛|𝑛 ≥0and𝑎𝑖 ∈𝐴}
2/5/2020 CS332 – Theory of Computation 10
Regular Expressions – Syntax
A regular expression 𝑅 is defined recursively using the following rules:
1. 𝜀, ∅, and 𝑎 are regular expressions for every 𝑎 ∈ Σ 2. If 𝑅 and 𝑅 are regular expressions, then so are
12
(𝑅 ∪ 𝑅 ), (𝑅 ∘ 𝑅 ), and (𝑅∗) 12121
Examples: (over Σ = {𝑎, 𝑏, 𝑐})
𝑎 ∘ 𝑏 ((((𝑎 ∘ (𝑏∗)) ∘ 𝑐) ∪ (((𝑎∗) ∘ 𝑏))∗)) (∅∗)
2/5/2020 CS332 – Theory of Computation 11
Regular Expressions – Semantics
𝐿(𝑅) = the language a regular expression describes
1. 𝐿(∅)=∅
2. 𝐿 𝜀 = 𝜀
3. 𝐿(𝑎) = {𝑎}forevery𝑎∈Σ
4. 𝐿((𝑅 ∪𝑅 ))=𝐿(𝑅 )∪𝐿(𝑅 ) 1212
5. 𝐿((𝑅 ∘𝑅 ))=𝐿(𝑅 )∘𝐿(𝑅 ) 1212
6.𝐿𝑅∗ =(𝐿𝑅)∗ 11
Example: 𝐿(((𝑎∗) ∘ (𝑏∗))) =
2/5/2020 CS332 – Theory of Computation 12
Simplifying Notation
• Omit∘symbol: 𝑎𝑏 = 𝑎∘𝑏
• Omit many parentheses, since union and concatenation
are associative:
𝑎∪𝑏∪𝑐 = 𝑎∪(𝑏∪𝑐) = (𝑎∪𝑏)∪𝑐
• Order of operations: Evaluate star, then concatenation, then union
𝑎𝑏∗ ∪ 𝑐 = (𝑎 𝑏∗ ) ∪ 𝑐
2/5/2020 CS332 – Theory of Computation 13
Examples
Let Σ = {0,1}
1.
𝑤 𝑤 contains exactly one 1}
2.
𝑤 𝑤 has length at least 3 and its third symbol is 0}
𝑤 everyoddpositionof𝑤is1}
2/5/2020 CS332 – Theory of Computation 14
3.
Syntactic Sugar
• For alphabet Σ, the regex Σ represents 𝐿(Σ) = Σ • For regex 𝑅, the regex 𝑅+ = 𝑅𝑅∗
Not captured by regular expressions: Backreferences
2/5/2020 CS332 – Theory of Computation 15
Equivalence of Regular Expressions, NFAs, and DFAs
2/5/2020 CS332 – Theory of Computation 16
Regular Expressions Describe Regular Languages
Theorem: A language 𝐴 is regular if and only if it is described by a regular expression
Theorem 1: Every regular expression has an equivalent NFA Theorem 2: Every NFA has an equivalent regular expression
2/5/2020 CS332 – Theory of Computation 17
Regular expression -> NFA
Theorem 1: Every regex has an equivalent NFA Proof: Induction on size of a regex
Base cases:
𝑅=∅ 𝑅=𝜀 𝑅=𝑎
2/5/2020 CS332 – Theory of Computation 18
Regular expression -> NFA
Theorem 1: Every regex has an equivalent NFA Proof: Induction on size of a regex
Inductive step:
𝑅 = (𝑅 ∪ 𝑅 ) 12
𝑅 = (𝑅 𝑅 ) 12
𝑅 = 𝑅∗ 1
2/5/2020
CS332 – Theory of Computation 19
Example
Convert (1(0 ∪ 1))∗ to an NFA
2/5/2020 CS332 – Theory of Computation 20
NFA -> Regular expression
Theorem 2: Every NFA has an equivalent regex
Proof idea: Simplify NFA by “ripping out” states one at a
time and replacing with regexes
0 01*0 0
1
2/5/2020 CS332 – Theory of Computation 21
Generalized NFAs
• Every transition is labeled by a regex
• One start state with only outgoing transitions
• Only one accept state with only incoming transitions • Start state and accept state are distinct
2/5/2020 CS332 – Theory of Computation 22
Generalized NFA Example
𝑎∪𝑏
𝑎∗𝑏 𝑞𝑠
𝑞 𝑎 𝑞𝑎 𝑅(𝑞𝑠,𝑞) =
𝑅(𝑞𝑎,𝑞) = 𝑅(𝑞,𝑞𝑠) =
2/5/2020
CS332 – Theory of Computation 23
NFA -> Regular expression
NFA
GNFA
𝑘 states
𝑘 + 2 states
𝑘 + 1 states
GNFA
GNFA
2 states
2/5/2020 CS332 – Theory of Computation
24
Regex
…
NFA -> GNFA
NFA
ε
ε
ε
ε
• Add a new start state with no incoming arrows.
• Make a unique accept state with no outgoing arrows.
2/5/2020 CS332 – Theory of Computation 25
GNFA -> Regular expression
Idea: While the machine has more than 2 states, rip one out and relabel the arrows with regexes to account for the missing state
𝑎∗𝑏 𝑞1
𝑞𝑎𝑞 23
𝑞1
𝑞3
2/5/2020
CS332 – Theory of Computation
26
GNFA -> Regular expression
Idea: While the machine has more than 2 states, rip one out and relabel the arrows with regexes to account for the
missing state
𝑎∪𝑏
𝑞𝑎𝑞 23
𝑎∗𝑏 𝑞1
𝑞1
𝑞3
2/5/2020
CS332 – Theory of Computation
27
GNFA -> Regular expression
Idea: While the machine has more than 2 states, rip one out and relabel the arrows with regexes to account for the
missing state
𝑎∪𝑏
𝑞𝑎𝑞 23
𝑏
𝑎∗𝑏 𝑞1
𝑞1
𝑞3
2/5/2020
CS332 – Theory of Computation
28
GNFA -> Regular expression
Idea: While the machine has more than 2 states, rip one out and relabel the arrows with regexes to account for the
missing state
𝑅2
𝑞 𝑅3 𝑞
𝑅4
𝑅1
𝑞1
23
𝑞1
𝑞3
2/5/2020
CS332 – Theory of Computation
29
Example
𝑎 𝑎
1
2
𝑎
3
𝑏 𝑏
2/5/2020
CS332 – Theory of Computation 30
2/5/2020 CS332 – Theory of Computation 31
2/5/2020 CS332 – Theory of Computation 32
2/5/2020 CS332 – Theory of Computation 33
2/5/2020 CS332 – Theory of Computation 34