Validity of Formulas in First-order Logic is Undecidable
Prakash Panangaden 22nd March 2021
First-order logic is a formal language for capturing the notion of quantified formulas. I assume that you are familiar with it but I will quickly describe it and give its semantics. The syntax is defined as follows. First we define the ingredients of the language.
• There is a countable set of symbols called variables: x, y, z, . . ..
• There is a set of symbols called constants: these vary from theory to theory. For example they may be the natural numbers: 0, 1, 2, 3, . . . or arbitrary symbols: a, b, c, . . ..
• There is a finite set of function symbols: f,g,h,…. Associated with each function symbol is a positive integer called its arity, this repre- sents the number of arguments that it takes.
• There is a finite set of predicate symbols: P, Q, R, . . .. As with function symbols, each predicate symbol has an arity.
Next we define a set of terms, these denote “things”.
1. A constant is a term.
2. A variable is a term.
3. If f is a function symbol of arity n and t1,…,tn are terms then f(t1,…,tn) is a term.
A term with no variables is called a ground term. Next we define formulas.
1. If t1, t2 are terms then t1 = t2 is a formula. 1
2. If P is a predicate symbol of arity n and t1,…,tn are terms then P(t1,…,tn) is a formula.
3. If φ is a formula then ¬φ is a formula.
4. Ifφandψareformulasthenφ∧ψ,φ∨ψandφ⇒ψareformulas.
5. If φ is a formula and x is a variable then ∀x φ and ∃x φ are formulas.
A formula as such does not mean anything; we cannot say if a formula is true or false if we do not know what it means. We have to define its meaning through an interpretation. In order to start we assume that we have a set D that is the collection of things that we are talking about. An interpretation I consists of: (i) for each constant symbol a an element da of D, (ii) for each function symbol f of arity n a function fD : Dn −→ D and (iii) for each predicate symbol P of arity n a subset PD of Dn. Here Dn stands for the n-fold cartesian product. A valuation is a map ρ : V ar −→ D where V ar is the collection of variables. We will write ρ[x → d] for the valuation that is exactly the same as ρ except for x which is mapped to d. Given a valuation we can define what it means for a formula to be “true”.
We write I,ρ |= φ to mean that the formula φ is true with the given interpretation and valuation. The notion of truth is defined inductively. First we define the interpretation of a term inductively. We write t for the interpretation of a term. We assume a fixed interpretation I throughout and, for the interpretation of terms we assume a fixed valuation ρ.
1. a = da
2. x = ρ(x)
3. f(t1,…,tn) = f(t1,…,tn).
We can now define whether a formula is true in an interpretation with a valuation: we write I , ρ |= φ if the formula φ is true with the interpretation I and the valuation ρ.
1. I,ρ|=t1 =t2 ift1=t2
2. I,ρ |= P(t1,…,tn) if (t1,…,tn) ∈ PD
3. I,ρ|=φ∧ψifI,ρ|=φandI,ρ|=ψ
4. I,ρ|=¬φifI,ρ̸|=φ
5. I,ρ|=∃xφifthereissomeelementdofDsuchthatI,ρ[x→d]|=φ
2
6. I,ρ|=∀xφifforeveryd∈DI,ρ[x→d]|=φ.
Given a formula φ three things could happen. It could be true in every interpretation, it could true in some interpretations and false in others and it could never be true. A formula that is true in every interpretation is said to be valid. These are statements that are true because of the meaning of the logical symbols alone. A formula that is true in some interpretation with some valuation is said to be satisfiable; we say that the interpretation and valuation is a model of the formula and that this model satisfies formula. The word “model” is a short way of saying “interpretation and valuation.” A formula that is never true is said to be invalid. These are all semantic notions and have nothing to do with axioms or provability. We write |= φ to signify a valid formula. We also write Γ |= φ, where Γ is a set of formulas, if every model for which all the formulas in Γ are true also satisfies φ. We will not talk about axioms, rules of inference and provability in this note but I assume that you are familiar with these concepts. A major theorem that links these ideas is soundness and completeness. We write ⊢ φ if φ can be proved from the axioms and rules of logic and Γ ⊢ φ if φ can be proved using the formulas in Γ as assumptions; we are not required to use all the formulas of Γ. Soundness says that Γ ⊢ φ implies Γ |= φ; crudely speaking “only true statements can be proved”. A logical system that is not sound is worthless. Completeness says that
Γ |= φ ⇒ Γ ⊢ φ.
This implies that every valid formula can be proved. It would be nice1 if every logical system were complete; but this is not the case. First-order logic is sound and complete.
The purpose of this note is not to discuss completeness but to prove
Theorem 1. The set of valid formulas of first-order logic is recursively enumerable but not recursive.
Put another way, there is no algorithm for deciding whether a formula is valid or not. There is an algorithm that searches through proofs and if a formula is valid there will be a proof of it (completeness) which the algorithm will find. Thus we can Turing recognize the set of valid formulas. However we cannot decide whether a formula is valid or not. If a formula is not valid at any given stage we will not know whether our algorithm will find a proof later on or whether the formula is not valid.
1Actually the World would be a dull place, so perhaps it would not be nice. 3
We now turn to the undecidability proof. We will exhibit a mapping re- duction to the Post Correspondence Problem or PCP2 We will show that if we could solve the validity problem for FOL we could solve any instance of PCP, which we know is impossible.
The strings we consider will be over the alphabet {0,1}. In our logical formulas we use one constant symbol, a and two unary function symbols f0 and f1. If we have a word w = w1w2 . . . wk over our alphabet, where each wi is 0 or 1, we write fw(a) as shorthand for fwk (fwk−1 (. . . fw2 (fw1 (a)) . . .)). For example we would write f1101(a) for f1(f0(f1(f1(a)))). This allows us to encode words as terms. We will also use a single binary predicate p.
Suppose that we are given an instance of PCP over the alphabet {0, 1}: {(α1, β1), (α2, β2), . . . , (αn, βn)}.
Here there are n domino types, the αi are the strings on the top halves of the dominoes and the βi are the strings on the bottom halves of the dominoes. Now we define a formula nn
p(fαi (a), fβi (a)) ∧ ∀x∀y [p(x, y) ⇒ p(fαi (x), fβi (y))] ⇒ ∃z p(z, z). i=1 i=1
Before we proceed with the proof let us understand the intuition behind this formula which is hard to deconstruct at first sight. The strings that appear in the dominoes are encoded in the terms using f0 and f1 in the manner described above. The a is just there to start the encoding of the string. The predicate p says that the first argument and the second argument describe a pair of strings that one can obtain as the top and bottom strings of some sequence of dominoes. The first conjunct just says that if one takes any single domino one gets a pair that are related in this way. The second conjunct says that if one has a pair of strings that are obtained in this way then by adding any one domino one gets an extended pair of strings that are related. Finally the last formula outside the curly brackets says that there is a string that appears as the top and the bottom string of some sequence of dominoes. Thus this formula says that the PCP has a solution.
More formally we can prove that this formula is valid if and only if the PCP instance has a solution. This will complete the reduction of PCP to the validity problem.
2Please note that in complexity theory, the acronym PCP is used for something com- pletely different.
4
First we assume that the formula is valid: this means that it is true in every interpretation, which means that we are free to construct a convenient interpretation. We will construct an interpretation that will establish that the PCP instance has a solution. The domain of the interpretation is {0, 1}∗ and the constant a will denote ε, the empty string. The function symbol f0 denotes the function that adds the symbol 0 to the end of any word, f0(w) = w0 and f1(w) = w1. The predicate symbol p is interpreted as the set of pairs of words of the form
(αi1αi2αi3 …αik,βi1βi2βi3 …βik)
for some sequence of integers i1, i2, . . . , ik. This interpretation formalises what we have been saying informally in the preceding paragraph. Now since the formula φ is assumed to be valid it is true in this specific interpreta- tion. The antecedents are clearly true. The first conjunct just describes the Post system and and the second conjunct is just the inductive statement of what it means to build pairs of strings with dominoes so the consequent ∃z p(z,z) must be true, but this says precisely that the Post system has a solution.
For the reverse direction we assume that the Post system has a solution; in other words there is a sequence of integers i1, i2, . . . , ik such that
αi1αi2 …αik = βi1βi2 …βik = z;
we are naming this z in anticipation of the argument below. Now we must show that the formula φ is true in every interpretation. To show that φ is true means that we assume the formula in curly brackets and show that the formula ∃z p(z, z) must follow. Now it is a simple induction on the length of sequences of integers that if we have any sequence of integers in the range {1,…,n}, say j1,j2,…,jm that
p(αj1αj2 …αjm,βj1βj2 …βjm).
The first conjunct in the curly brackets asserts the base case and the second conjunct is precisely the inductive step. Thus, it applies to the specific sequence that arises from the solution to the Post system, in short we have ∃z p(z, z). We have shown that the formula is valid.
The consequences are immense. There is no algorithm to check whether formulas in first-order logic are valid. We can decide validity of propositional logic using truth tables (and as far as we know there is no better way) but
5
there is no such method for first-order predicate calculus. There are no fully automatic theorem-proving procedures for even a wimpy logic like first-order predicate calculus. You cannot even express induction in first-order logic. There is no hope at all for arithmetic.
For second-order predicate calculus the set of valid formulas is not even CE or co-CE. Interestingly if we allow only unary predicates we regain decidability but that is another story.
What can we do? An important development is computer-assisted proofs. Here the possibilities are unlimited. A number of major theorems have been proved this way recently. Humans and computers working together are a very powerful combination.
6