Slide 1
Decidability of Languages That Do Not Ask Questions about Turing Machines
Chapter 22
Undecidable Languages That Do Not Ask Questions About TMs
Diophantine Equations, Hilbert’s 10th Problem
Post Correspondence Problem
Tiling problems
Context-free languages
Hilbert’s 10th Problem
A Diophantine system is a system of Diophantine equations such as:
4×3 + 7xy + 2z2 – 23x4z = 0
Hilbert’s 10th problem: given a Diophantine system, does it have an integer solution?
Tenth = {
Tenth is not in D?
Proved in 1970 by Yuri Matiyasevich.
Restricted Diophantine Problems
Suppose all exponents are 1:
A farmer buys 100 animals for $100.00. The animals
include at least one cow, one pig, and one chicken, but
no other kind. If a cow costs $10.00, a pig costs $3.00,
and a chicken costs $0.50, how many of each did he
buy?
● Diophantine problems of degree 1 and Diophantine problems of a single variable of the form axk = c are efficiently solvable.
● The quadratic Diophantine problem is NP-complete.
● The general Diophantine problem is undecidable, so not
even an inefficient algorithm for it exists.
Emil Post (1940)
Consider the following blocks:
Assuming an unlimited supply of each block, does there exist a sequence of such blocks such that the strings on top and bottom are the same?
Solution:
Post Correspondence Problem
a
baa
ab
aa
bba
bb
a
baa
ab
aa
bba
bb
bba
bb
Consider two equal length, finite lists, X and Y, of strings over
some alphabet :
X = x1, x2, x3, …, xn
Y = y1, y2, y3, …, yn
Does there exist a finite sequence of integers i1, i2,…, ik such that
xi1xi2…xik = yi1yi2…yik ?
Post Correspondence Problem
PCP Example
Solution:
b
aab
abb
b
aba
a
baaa
baba
b
aab
aba
a
baaa
baba
PCP Example
No solution!
bb
abb
ab
a
aab
bba
PCP Example
Shortest solution has length 252!
bbab
b
abba
bb
b
bba
The Language PCP
= (x1, x2, x3, …, xn)(y1, y2, y3, …, yn)
The problem of determining whether a particular instance P of
the Post Correspondence Problem has a solution can be recast
as the problem of deciding the language:
PCP = {
: P has a solution}
The language PCP is in SD \ D.
PCP is in SD
Theorem. PCP is in SD.
Proof: Build a Turing Machine that tries all possible solutions of length 1, then all possible solutions of length 2, and so on.
PCP is not in D
Theorem. PCP is not in D.
Proof: Two reductions:
La ≤ MPCP
MPCP ≤ PCP
La = {
La is not in D
MPCP (modified PCP): defined in the same way except that any solution must start with the first block:
x1xi2…xik = y1yi2…yik
PCP is not in D
La ≤ MPCP
Given an instance of MPCP able to simulate derivations in G = (V, , R, S). Using % and & new symbols, not in V, a derivation S x1 x2 … w is simulated as %S x1 x2 … w& PCP is not in D : end: derivate: , for each rule α β in R It can be formally proved by induction on the length of the derivation that G can derive w iff P has a solution. The idea is that any solution of P must start with the first block (it is an MPCP) and then the only way to continue is by the bottom string to catch up to the top by using blocks corresponding to rules of G. %S & β PCP is not in D %S & ABc b a B A S ABSc BA bc a a PCP is not in D S ABc BAc ac P simulates: %S ABc BAc ac& as follows: %S & ABc BA a c PCP is not in D Given an instance Assume X = x1, x2, …, xn Construct A = a0, a1, a2, …, an, an+1 ai = xi with # after each symbol, for 1 ≤ i ≤ n a0 = #a1 b0 = b1 PCP is not in D a baa #a# a# b#a#a# $ PCP is not in D If Example: baa #a# b#a#a# $ A Tiling Problem Is it possible to tile an arbitrary surface in the plane? A Set of TilesThat Cannot Tile the Plane Is the Tiling Language in D? as Let TILES = { Is the Tiling Language in D? Proof: Lexicographically enumerate partial solutions. Theorem: TILES is not in SD. Proof (sketch): If TILES were in SD, then it would be in D. But we show that it is not by reduction from H. We map an arbitrary Turing machine M into a set of tiles T: ● Each row of tiles corresponds to a configuration of M. 1. Given a CFL L and a string s, is s L? A configuration of a TM M is a 4 tuple: A computation of M is a sequence of configurations: ● C0 is the initial configuration of M, A computation history encodes a computation: (s, , q, x)(q1, , a, z)( …. )( …. )(qn, r, s, t), Example: (s, , q, x) We show that CFGALL is not in D. The proof is by reduction (R) from H: R will build G to generate the language L# composed of: all strings in *, Then: If M does not halt on w, there are no computation histories of M on w so G generates * and Oracle will reject. CFGALL = { It is easier for R to build a PDA than a grammar. So R will first build a PDA P, then convert P to a grammar. For a string s to be a computation history of M on w: It must be a syntactically valid computation history. 2. C0 must be a starting configuration. 3. The last configuration must be a halting configuration. 4. Each configuration after C0 must be derivable from the valid syntax can be checked with a regular expression Precisely, P will guess that a configuration is not derivable from the previous one and check that. (q1, aaaa, b, aaaa)(q2, aaa, a, baaaa). Okay. P has to use its stack to record the first configuration and then compare it to the second. But … The stack cannot be used to check that two strings are equal but it can be used to check that two strings are the reverse of each other (WWR is context-free, WW is not). Consider modified computation histories: For C1C2… computation history, the modified one is: C1C2RC3C4R… Consider the language B# of all strings except modified computation histories. P will accept B#. R( , where P accepts all strings in B#. If Oracle exists, then C = Oracle(R( But no machine to decide H can exist, so neither does Oracle. CFG= = { R is a reduction from CFGALL to CFG= defined as follows: If Oracle exists, then C = Oracle(R( But no machine to decide CFGALL can exist, so neither does Oracle. PDAMIN = { R( : P# is a PDA with a single state s that is both the . If Oracle exists, then C = Oracle(( = (x1, x2, x3, …, xn)( y1, y2, y3, …, yn), Example: (b, abb, aba, bbaaa)(bab, b, a, babaaa). Gx: Sx bSx1 Sx b1 Gx could generate: b abb aba abb 2 3 2 1 IntEmpty = { : P has a solution} R (?Oracle) L2 = { R( ) = If Oracle exists, then C = Oracle(R( )) decides PCP: PCP: P has at least one solution. So both Gx and Gy PCP: P has no solution. So there is no string that can But no machine to decide PCP can exist, so neither does Oracle. CFGUNAMBIG = { : P has a solution} R (?Oracle) CFGUNAMBIG = { R( ) = G generates L(G1) L(G2) by generating all the derivations that G1 can S Sx or S Sy. CFGUNAMBIG = { ) = If Oracle exists, then C = Oracle(R( )) decides PCP: PCP: P has a solution. Both Gx and Gy generate some string: w(i1, i2, …ik)R, where w = xi1xi2…xik = yi1yxi2…yik. PCP: P has no solution. No string can be generated by But no machine to decide PCP can exist, so neither does Oracle.
start: copy the rest: , for c V
c
c
%
w&
α
Example:
G = ({S,A,B,a,b,c}, {a,b,c}, R, S), w = ac
R: S ABc
S ABSc
AB BA
Bc bc
BA a
A a
P:
c
c
%
ac&
S
b
a
B
A
S
S
AB
Bc
BA
A
For the derivation of w:
c
c
%
ac&
S
AB
BA
c
MPCP ≤ PCP
Y = y1, y2, …, yn
B = b0, b1, b2, …, bn, bn+1
bi = yi with # before each symbol, for 1 ≤ i ≤ n
an+1 = $ bn+1 = #$
Example:
ab
aa
#a#b
#a#b
#a#a
#$
Then has the solution 0, i2, i3, …, ik, n+1.
aa
#a#b
#a#a
#$
Given a finite set T of tiles of the form:
We can represent any set of tiles as a string. For example, we could
represent
Theorem: TILES is in SD.
● The first row corresponds to M’s initial configuration when started on a blank tape.
● The next row corresponds to M’s next configuration, and so forth.
● There is always a next configuration of M and thus a next row in the
tiling iff M does not halt.
● T is in TILES iff there is always a next row.
● So if it were possible to semidecide whether T is in TILES it would
be possible to semidecide whether M fails to halt on . But H is
not in SD. So neither is TILES.
The Undecidability of the Tiling Language
2. Given a CFL L, is L = ?
3. Given a CFL L, is L = *?
4. Given CFLs L1 and L2, is L1 = L2?
5. Given CFLs L1 and L2, is L1 L2 ?
6. Given a CFL L, is L context-free?
7. Given a CFL L, is L regular?
8. Given two CFLs L1 and L2, is L1 L2 = ?
9. Given a CFL L, is L inherently ambiguous?
10. Given PDAs M1 and M2, is M2 a minimization of M1?
11. Given a CFG G, is G ambiguous?
Context-Free Languages
( M’s current state,
the nonblank portion of the tape before the read head,
the character under the read head,
the nonblank portion of the tape after the read head).
C0, C1, …, Cn for some n 0 such that:
● Cn is a halting configuration of M, and:
● C0 |-M C1 |-M C2 |-M … |-M Cn.
Reduction via Computation History
where qn HM.
…
(q1, aaabbbaa, a, bbbbccc)
(q2, aaabbbaaa, b, bbbccc)
…
Computation Histories
except any that represent a computation history of M on w.
If M halts on w, then there exists a computation history of M on w, so there will be a string that G will not generate; Oracle will accept.
previous one according to the rules in M.
Computation Histories as Strings
hardwire into P the initial configuration
hardwire into P the final configuration
P will nondeterministically check that a flaw exists.
(q1, aaaa, b, aaaa)(q2, bbbb, a, bbbb). Not okay.
Computation Histories as Strings
Modified Computation Histories
1. Construct
2. From P, construct a grammar G that generates L(P).
3. Return
●
history of M on w. So there is a string that G does not
generate. Oracle rejects. C accepts.
●
computation history of M on w. G generates *. Oracle
accepts. C rejects.
The Conclusion
Proof by reduction from: CFGALL = {
R(
1. Construct the description
that generates *.
2. Return
● R is correct:
●
●
generates everything. Oracle rejects.
Recall that M2 is a minimization of M1 iff:
(L(M1) = L(M2)) M2 is minimal.
1. Invoke CFGtoPDAtopdown(G) to construct the description
of a PDA that accepts the language that G generates.
2. Write
start state and an accepting state. Make a transition from s
back to itself on each input symbol. Never push anything onto
the stack. L(P#) = * and P# is minimal.
3. Return
●
L(P) = L(P#). And P# is minimal. Thus P# is a minimization of P.
Oracle accepts.
●
L(P) L(P#). So Oracle rejects.
No machine to decide CFGALL can exist, so neither does Oracle.
where j (xj + and yj +)
Reductions from PCP
i X Y
1 b bab
2 abb b
3 aba a
4 bbaaa babaaa
Sx abbSx2 Sx abb2
Sx abaSx3 Sx aba3
Sx bbaaaSx4 Sx bbaaa4
Gy: Sy babSy1 Sy bab1
Sy bSy2 Sy b2
Sy aSy3 Sy a3
Sy babaaaSy4 Sy babaaa4
From PCP to Grammar
i X Y
1 b bab
2 abb b
3 aba a
4 bbaaa babaaa
PCP = {
1. From P construct Gx and Gy.
2. Return
●
will generate some string:
w(i1, i2, …ik)R, where w = xi1xi2…xik = yi1yxi2…yik.
So L(G1) L(G2) . Oracle rejects, so C accepts.
●
be generated by both Gx and Gy. So L(G1) L(G2) = . Oracle
accepts, so C rejects.
PCP = {
1. From P construct Gx and Gy.
2. Construct G as follows:
2.1. Add to G all the rules of both Gx and Gy.
2.2. Add S and the two rules S Sx and S Sy.
3. Return
produce plus all the ones that G2 can produce, except that each has a
prepended:
R(
1. From P construct Gx and Gy.
2. Construct G as follows:
2.1. Add to G all the rules of both Gx and Gy.
2.2. Add S and the two rules S Sx and S Sy.
3. Return
So G can generate that string in two different ways. G is
ambiguous. Oracle accepts.
both Gx and Gy. Since both Gx and Gy are unambiguous, so is G.
Oracle rejects.