A Subset of the URM Language; FA and NFA
0.0.1 Definition. If M is a FA, then its L(M) is called the regular set associated with M, or even the reg- ular language recognised/accepted (decided, actually) by M.
This Note continues from where Note #9 left but we will present first a few more simple examples of automata∗ that decide /recognise some given set of strings over some alphabet.
∗Plural of automaton.
Intro to Automata© 2020, by George Tourlakis
2
0.1. Examples
0.1.1 Example. We want to specify (to “program”!) an automaton M over Σ = {0,1}, such that L(M) = {0n1 : n ≥ 0}.
We recall that, for any string x, x0 =Def λ, while
n+1 copies of x n+1 Def n induction! x =x∗x = x∗x∗…∗x
where I denoted concatenation by ∗. Thus the strings in {0n1 : n ≥ 0} are
1, 01, 001, 0001, 00001, . . . (1)
We readily see that the following automaton’s only ac- cepting paths will follow zero or more times the “loop” labeled 0 (attached to the start state), and then the edge labeled 1 to end up with an accepting state.
0
1
01
0, 1
Intro to Automata© 2020, by George Tourlakis
0.1. Examples 3 The state at the very bottom is a trap state. What is
the need for it?
Well, the FA must be fully specified, so I am obliged to say what the accepting state does when it sees one or the other legal input.
And remember: Accepting states do NOT stop the machine! Any state stops the machine IFF it has just scanned eof .
Intro to Automata© 2020, by George Tourlakis
4
A new thing we learnt in the above example is that in depicting an automaton as a graph we do not necessarily need to name the states!
As in all mathematical arguments, we will of course assign names to objects (in particular to states) if we need to refer to them in the course of the argument —it is convenient to refer to them by name!
The reader should also note the use of two shorthand notations in labeling:
One, we used two labels on the vertical down-pointing edge.
This abbreviates the use of two edges going from the accepting to the trap state, one labeled 0, the other 1.
We could also have used the label “0, 1” both at the left or right of the arrow, “,” serving as a separator. This latter notational convention was used in labeling the loop attached to the trap state.
Intro to Automata© 2020, by George Tourlakis
0.1. Examples 5
0.1.2 Example. The two FAs below, each over the in- put alphabet {0, 1}, accept the languages ∅ (the top one) and {0, 1}∗ (the bottom one).
0, 1
Intro to Automata© 2020, by George Tourlakis
0, 1
6
0.1.3 Example. The FA below over the input alphabet {0, 1} accepts the language {λ}.
0, 1
0, 1
Indeed, we saw in Notes #9 that making the start (ini- tial) state also accepting we do accept λ. Moreover, the FA above accepts nothing else since any input symbol leads to the rejecting trap state.
Intro to Automata© 2020, by George Tourlakis
0.2. Some Closure Properties of Regular Languages 7 0.2. Some Closure Properties of Regular Lan-
guages
0.2.1 Theorem. The set of all regular languages over an alphabet Σ is closed under complement. That is, if L⊆Σ∗ is regular, then so is L=Def Σ∗ −L.
Proof. Let L = L(M) for some FA M over input alpha- bet Σ and state alphabet Q. Moreover, let F ⊆ Q be the set of accepting states of M.
We need a FA that recognises/decides L.
Trivially, we want to swap the “yes” (accepting state) and “no” (rejecting state) behaviour of M , changing noth- ing else.
Thus, L = L(M), where the FA M is the same as M,
except that M ’s set of accepting states is Q − F .
What makes the above proof tick is that FA are “total”: Every input string will be scanned all the way to its eof . Only the yes/no decision changes.
Intro to Automata© 2020, by George Tourlakis
8
0.2.2 Example. The automaton that accepts the com- plement of the language in Example 0.1.1 is found with- out comment below, just following the construction of the L(M) complement for some FA M, given above.
0
1
01
0, 1
Intro to Automata© 2020, by George Tourlakis
0.2. Some Closure Properties of Regular Languages 9
0.2.3 Theorem. The set of all regular languages over an alphabet Σ is closed under union. That is, if L ⊆ Σ∗ and L′ ⊆Σ∗ are regular, then so is L∪L′.
Proof. This proof will wait until after the introduction of NFA
which make the proof much easier!
0.2.4 Corollary. The set of all regular languages over an alphabet Σ is closed under intersection. That is, if L⊆Σ∗ and L′ ⊆Σ∗ are regular, then so is L∩L′.
Proof. L∩L′ =L∪L′.
Intro to Automata© 2020, by George Tourlakis
10
0.3. Proving Negative Results for FA; Pumping Lemma
Lecture #20, Nov. 25
IsthereaFAM suchthatL(M)={0n1n :n≥0}?
How can we tell?
Surely, not by trying each FA (infinitely many) out there as a possible fit for this language!
The following theorem, known as the pumping lemma can be used to prove “negative” results such as: There isnoFAM suchasL(M)={0n1n :n≥0}. Inshort, the language {0n1n : n ≥ 0} is not regular.
Intro to Automata© 2020, by George Tourlakis
0.3. Proving Negative Results for FA; Pumping Lemma 11
0.3.1 Theorem. (Pumping Lemma) If the language S is regular, i.e., S = L(M) for some FA M, then there is a constant C that we will refer to as a pumping con- stant such that for any string x ∈ S, if |x| ≥ C, then we can decompose it as x = uvw so that
(1) v ̸= λ
(2) uviw ∈ S, for all i ≥ 0
and
(3) |uv| ≤ C.
A pumping constant is not uniquely determined by S.
Proof. So, let S = L(M) for some FA M of n states. We will show that if we take C = n† this will work.
Letthenx=a1a2···an···am beastringofS. Ascho- sen, it satisfies |x| ≥ C. An accepting computation path of M with input x looks like this:
….. Say repeats as
…..
where p1, p2, . . . denotes a (notationally) convenient renaming‡ of the states visited after q0 in the computation.
†You see why C is not unique, since for any S that is an L(M) we can have infinitely many different M that accept S. Can we not?
‡ Why rename? What is wrong with q1 , q2 , . . .? Well, the set Q is given as something like {q0, q1, q2, q3, . . .} using some arbitrary fixed enumeration order without repetition for its members. Now, it would be wrong to expect that the arbitrary input x caused the FA to walk precisely along q1, q2, q3, etc., after it saw the first symbol of x.
Intro to Automata© 2020, by George Tourlakis
12
q0,p1,p2,…,pn
we have named n + 1 states, while we only have n states
in the FA’s “Q”.
Thus, at least two names “pi and pj” refer to the same
state “qr” —as we say, two states repeat.
We may redraw the computation above as follows tak- ing without loss of generality that pi = pj indicating the
repeating state pi = pj:
…
…
picture above: We set
u = a1a2 …ai
In the sequence
… …
We can now partition x into u, v and w parts from the
and
Note that
v = ai+1ai+2 …aj w = aj+1aj+2 …am
Intro to Automata© 2020, by George Tourlakis
0.3. Proving Negative Results for FA; Pumping Lemma 13
(1) v ̸= λ, since there is at least one edge (labelled ai+1) emanating from pi on the sub-path that connects this state to the (identical) state pj. In short ai+1 is part of v.
(2) We may utilize the loop v zero or more times (along with u in the front and w at the tail) to always obtain a NEW accepting path. Thus, all of uviw belong to L(M) —i.e., S.
(3) Since |uv| = j ≤ n, we have also verified that |uv| ≤
C.
The repeating pair pi, pj may occur anywhere between q0 and pn. A different “graphical proof” is common in the literature.
Let x = a1a2 …am as above.
Below we show the original x as an input stream array x = a1 . . . ai+1 . . . ajaj+1 . . . an . . . am, where the repeating pi = pj is shown.
upi w ↓
x= a1 a2 … ai
Observe:
v
aj+1 an an+1 am ↑
pj
ai+1
…
aj
…
…
1. By determinism, the subcomputation that starts at symbol aj+1 (blue) —while in state pj ( = pi)— will end at the eof after consuming the string w and will be uniquely at (accepting) state pm.
Intro to Automata© 2020, by George Tourlakis
14
2. After consuming the prefix a1 …ai of x the FA is
uniquely at state pi.
3. By determinism, the subcomputation that starts at symbol ai+1 (red) in state pi, will consume v and end at pj —uniquely, today, tomorrow and in 10350000 years from now— ready to process aj+1 (blue).
Thus, all of uv,uvvw,uvvvw,…,uvnw,… are in L(M).
Of course, so is x = uvw (given).
Intro to Automata© 2020, by George Tourlakis
0.3. Proving Negative Results for FA; Pumping Lemma 15 0.3.2 Example. The language over {0, 1} given as L =
{0n1n : n ≥ 0} is not regular.
Suppose it is. Then the pumping lemma holds for L, so let C be an appropriate pumping constant and con- sider the string x = 0C1C of L. We can then decompose x as uvw with |uv| ≤ C so that we can “pump” v ̸= λ as much as we like and the obtained uviw will all be in L.
We will prove the statement in red false, so we cannot pump; but then L cannot be regular!§
“The red statement” is false due to the observations: 1. By |uv| ≤ C, uv (and hence v) lie entirely in the
0C-part of the chosen x = 0C1C.
2. So, if we pump down —or above with i ≥ 2 (i.e., use
v0 orvi,i≥2)weobtainuw∈Loruviw∈L,i≥2.
Butuw=0K1C whereK
–Subcase b > 1. Then taking i = b, one of the numbers of the form (1) is (a + 1)b. But (a + 1)b is not prime (recall that a + 1 ≥ 2 since a ≥ 1).
– Subcase b = 1. Then take i = 2+a to obtain the number (of type (1)) a(2+a)+1 = a2 +2a+1 =
(a + 1)2. But this is not prime!
Intro to Automata© 2020, by George Tourlakis
20
The preceding shows that we can have a set that is suf- ficiently complex and thus fails to be FA-decidable even over a single-symbol alphabet. Here is another such case.
0.3.7 Example. Consider Q = {0n2 : n ≥ 0} over the alphabet A = {0}. It will not come as a surprise that Q is not FA-decidable.
For suppose it is. Then, if C is an appropriate pump- ing constant, consider x = 0C2.
• Clearly, x ∈ Q and is long enough.
So, split it as x = uvw with |uv| ≤ C and v ̸= λ.
Now, by 0.3.1, But
uvvw ∈ Q (1) C2 =|uvw|<|uvvw|≤|uvw|+|uv|≤C2 +C