CS 332: Theory of Computation Prof. Mark Bun
Boston University February 15, 2021
Homework 3 – Due Thursday, February 18, 2021 at 11:59 PM
Reminder Collaboration is permitted, but you must write the solutions by yourself without as-
sistance, and be ready to explain them orally to the course staff if asked. You must also identify
your collaborators and write “Collaborators: none” if you worked by yourself. Getting solutions from
outside sources such as the Web or students not enrolled in the class is strictly forbidden.
Problems There are 5 required problems. Problem 2 will be autograded by AutomataTutor.
1. (Regex to description) Give plain English descriptions of the languages generated by each of the
following regular expressions
(a) (a ∪ b)∗ ∪ c∗
(b) 1(000)∗1
(c) a(ba)∗b
(d) ∅∗
(e) (∅ ∪ ε)∗
2. (Regular expressions vs. finite automata) Please log on to AutomataTutor to submit solutions
for this question.
(a) (Description to regex) Give regular expressions generating the following languages:
i. {w ∈ {0, 1}∗ | w has exactly two 0’s and at least one 1}
ii. {w ∈ {0, 1}∗ | w is not the string 01}
iii. {w ∈ {0, 1}∗ | the number of 1’s in w is divisible by 3}.
(b) (Regex to NFA) Use the procedure described in class (also in Sipser, Lemma 1.55) to convert
(AT ∪ TA ∪ CG ∪ GC)∗ to an equivalent NFA. Simplify your NFA.
(c) (NFA to regex) Convert the following NFA to an equivalent regular expression.
q2
q0start q1
ε
B
A
A
A
3. (Conversion procedures as algorithms) Consider the following pseudocode describing an algo-
rithm taking as input a regex and outputting the description of an equivalent NFA.
Here, you can assume that the subroutines NFA.emptyLanguage(), NFA.emptyString(), and NFA.symbol(a)
return NFAs recognizing the languages ∅, {ε}, {a}, respectively, as described in Sipser’s proof of
Lemma 1.55 or in Lecture 5, slide 24. Moreover, NFA.union(N1, N2) takes as input two NFAs
and outputs the NFA recognizing L(N1) ∪ L(N2) described in Sipser’s proof of Theorem 1.45, and
similarly for NFA.concatenate and NFA.star.
1
RegexToNFA(R)
Input : Regular expression R
Output: Equivalent NFA N
if R = ∅ then
return NFA.emptyLanguage();
else if R = ε then
return NFA.emptyString();
else if R = a then
return NFA.symbol(a);
else if R = R1 ∪R2 then
return NFA.union(RegexToNFA(R1), RegexToNFA(R2));
else if R = R1 ◦R2 then
return NFA.concatenate(RegexToNFA(R1), RegexToNFA(R2));
else if R = R∗1 then
return NFA.star(RegexToNFA(R1));
(a) If N1 and N2 are NFAs with s1 and s2 states, respectively, how many states does NFA.union(N1,
N2) have? How about NFA.concatenate(N1, N2)? NFA.star(N1)?
(b) The size of a regular expression R is the the number of appearances of ∅, ε,∪, ◦,∗ and alphabet
symbols in R. If R is a regular expression of size 1, what is the maximum number of states in
RegexToNFA(R)?
(c) For a natural number k, let S(k) be the maximum number of states RegexToNFA(R) can have
over all regexes R of size k. Prove by induction on k that S(k) ≤ 2k.
Now consider the following pseudocode describing an algorithm taking as input an NFA and out-
putting an equivalent regex.
NFAtoRegex(N)
Input : NFA N
Output: Equivalent regular expression R
M0 ← NFAtoGNFA(N);
k ← number of states of M0;
for i← 1 to k − 2 do
Obtain Mi from Mi−1 by ripping out state qi and updating transitions appropriately;
end
return the regex labeling the transition from q0 to qaccept in Mk−2;
(d) Suppose the starting NFA N has exactly one symbol labeling each transition present in its
state diagram. (This simplifying assumption makes the calculations cleaner, and in particular,
independent of the alphabet size.)
Let `(i) be the maximum possible size of a regular expression appearing on any transition in
Mi. Prove by induction on i that `(i) ≤ 4i+1 − 3.
(e) Show that if N is an NFA with s states, then NFAtoRegex(N) is a regular expression of size
at most 4s+1.
4. (Distinguishing set method)
2
(a) Let REP2 = {ww | w ∈ {0, 1}2}. Show that S = {00, 01, 10, 11} is pairwise distinguishable by
REP2. That is, for every pair x, y ∈ S, argue that there is a string z such that exactly one of
xz and yz is in REP2.
(b) What is the smallest number of states a DFA recognizing REP2 can have? Explain your
answer.
(c) For any k ≥ 1, let REPk = {ww | w ∈ {0, 1}k}. Show that every DFA recognizing REPk
requires at least 2k states.
(d) Show that every NFA recognizing REPk requires at least k states.
5. (Non-regular languages) Prove that the following languages are not regular. You may use the
distinguishing set method and the closure of the class of regular languages under union, intersection,
complement, and reverse.
(a) L1 = {0n12n | n ≥ 0}.
(b) L2 =
{
w ∈ {0, 1}∗ | w 6= wR
}
.
(c) L3 =
{
www | w ∈ {0, 1}∗
}
.
(d) L4 =
{
x/y/z | x, y, z ∈ {0, 1}∗ are binary numbers such that x + y = z}. The alphabet for
this language is {0, 1, /}. For example, 10/10/100 ∈ L3 and 11/1/001 /∈ L3.
3