Validity of Formulas in First-order Logic is
Undecidable
Prakash Panangaden
22nd March 2021
First-order logic is a formal language for capturing the notion of quantified
formulas. I assume that you are familiar with it but I will quickly describe
it and give its semantics. The syntax is defined as follows. First we define
the ingredients of the language.
• There is a countable set of symbols called variables: x, y, z, . . ..
• There is a set of symbols called constants: these vary from theory to
theory. For example they may be the natural numbers: 0, 1, 2, 3, . . . or
arbitrary symbols: a, b, c, . . ..
• There is a finite set of function symbols: f, g, h, . . .. Associated with
each function symbol is a positive integer called its arity, this repre-
sents the number of arguments that it takes.
• There is a finite set of predicate symbols: P,Q,R, . . .. As with function
symbols, each predicate symbol has an arity.
Next we define a set of terms, these denote “things”.
1. A constant is a term.
2. A variable is a term.
3. If f is a function symbol of arity n and t1, . . . , tn are terms then
f(t1, . . . , tn) is a term.
A term with no variables is called a ground term.
Next we define formulas.
1. If t1, t2 are terms then t1 = t2 is a formula.
1
2. If P is a predicate symbol of arity n and t1, . . . , tn are terms then
P (t1, . . . , tn) is a formula.
3. If φ is a formula then ¬φ is a formula.
4. If φ and ψ are formulas then φ ∧ ψ, φ ∨ ψ and φ⇒ ψ are formulas.
5. If φ is a formula and x is a variable then ∀x φ and ∃x φ are formulas.
A formula as such does not mean anything; we cannot say if a formula is
true or false if we do not know what it means. We have to define its meaning
through an interpretation. In order to start we assume that we have a set D
that is the collection of things that we are talking about. An interpretation
I consists of: (i) for each constant symbol a an element da of D, (ii) for
each function symbol f of arity n a function fD : D
n −→ D and (iii) for each
predicate symbol P of arity n a subset PD of D
n. Here Dn stands for the
n-fold cartesian product. A valuation is a map ρ : V ar −→ D where V ar is
the collection of variables. We will write ρ[x 7→ d] for the valuation that is
exactly the same as ρ except for x which is mapped to d. Given a valuation
we can define what it means for a formula to be “true”.
We write I , ρ |= φ to mean that the formula φ is true with the given
interpretation and valuation. The notion of truth is defined inductively.
First we define the interpretation of a term inductively. We write JtK for the
interpretation of a term. We assume a fixed interpretation I throughout
and, for the interpretation of terms we assume a fixed valuation ρ.
1. JaK = da
2. JxK = ρ(x)
3. Jf(t1, . . . , tn)K = Jf(Jt1K, . . . , JtnK)K.
We can now define whether a formula is true in an interpretation with a
valuation: we write I , ρ |= φ if the formula φ is true with the interpretation
I and the valuation ρ.
1. I , ρ |= t1 = t2 if Jt1K = Jt2K
2. I , ρ |= P (t1, . . . , tn) if (Jt1K, . . . , JtnK) ∈ PD
3. I , ρ |= φ ∧ ψ if I , ρ |= φ and I , ρ |= ψ
4. I , ρ |= ¬φ if I , ρ 6|= φ
5. I , ρ |= ∃x φ if there is some element d of D such that I , ρ[x 7→ d] |= φ
2
6. I , ρ |= ∀x φ if for every d ∈ D I , ρ[x 7→ d] |= φ.
Given a formula φ three things could happen. It could be true in every
interpretation, it could true in some interpretations and false in others and
it could never be true. A formula that is true in every interpretation is said
to be valid. These are statements that are true because of the meaning of
the logical symbols alone. A formula that is true in some interpretation with
some valuation is said to be satisfiable; we say that the interpretation and
valuation is a model of the formula and that this model satisfies formula.
The word “model” is a short way of saying “interpretation and valuation.”
A formula that is never true is said to be invalid. These are all semantic
notions and have nothing to do with axioms or provability. We write |= φ to
signify a valid formula. We also write Γ |= φ, where Γ is a set of formulas,
if every model for which all the formulas in Γ are true also satisfies φ. We
will not talk about axioms, rules of inference and provability in this note
but I assume that you are familiar with these concepts. A major theorem
that links these ideas is soundness and completeness. We write ` φ if φ can
be proved from the axioms and rules of logic and Γ ` φ if φ can be proved
using the formulas in Γ as assumptions; we are not required to use all the
formulas of Γ. Soundness says that Γ ` φ implies Γ |= φ; crudely speaking
“only true statements can be proved”. A logical system that is not sound is
worthless. Completeness says that
Γ |= φ⇒ Γ ` φ.
This implies that every valid formula can be proved. It would be nice1 if
every logical system were complete; but this is not the case. First-order
logic is sound and complete.
The purpose of this note is not to discuss completeness but to prove
Theorem 1. The set of valid formulas of first-order logic is recursively
enumerable but not recursive.
Put another way, there is no algorithm for deciding whether a formula is
valid or not. There is an algorithm that searches through proofs and if a
formula is valid there will be a proof of it (completeness) which the algorithm
will find. Thus we can Turing recognize the set of valid formulas. However
we cannot decide whether a formula is valid or not. If a formula is not valid
at any given stage we will not know whether our algorithm will find a proof
later on or whether the formula is not valid.
1Actually the World would be a dull place, so perhaps it would not be nice.
3
We now turn to the undecidability proof. We will exhibit a mapping re-
duction to the Post Correspondence Problem or PCP2 We will show that if
we could solve the validity problem for FOL we could solve any instance of
PCP, which we know is impossible.
The strings we consider will be over the alphabet {0, 1}. In our logical
formulas we use one constant symbol, a and two unary function symbols f0
and f1. If we have a word w = w1w2 . . . wk over our alphabet, where each
wi is 0 or 1, we write fw(a) as shorthand for fwk(fwk−1(. . . fw2(fw1(a)) . . .)).
For example we would write f1101(a) for f1(f0(f1(f1(a)))). This allows us to
encode words as terms. We will also use a single binary predicate p.
Suppose that we are given an instance of PCP over the alphabet {0, 1}:
{(α1, β1), (α2, β2), . . . , (αn, βn)}.
Here there are n domino types, the αi are the strings on the top halves of the
dominoes and the βi are the strings on the bottom halves of the dominoes.
Now we define a formula{
n∧
i=1
p(fαi(a), fβi(a)) ∧ ∀x∀y [p(x, y)⇒
n∧
i=1
p(fαi(x), fβi(y))]
}
⇒ ∃z p(z, z).
Before we proceed with the proof let us understand the intuition behind this
formula which is hard to deconstruct at first sight. The strings that appear
in the dominoes are encoded in the terms using f0 and f1 in the manner
described above. The a is just there to start the encoding of the string. The
predicate p says that the first argument and the second argument describe
a pair of strings that one can obtain as the top and bottom strings of some
sequence of dominoes. The first conjunct just says that if one takes any
single domino one gets a pair that are related in this way. The second
conjunct says that if one has a pair of strings that are obtained in this way
then by adding any one domino one gets an extended pair of strings that are
related. Finally the last formula outside the curly brackets says that there
is a string that appears as the top and the bottom string of some sequence
of dominoes. Thus this formula says that the PCP has a solution.
More formally we can prove that this formula is valid if and only if the PCP
instance has a solution. This will complete the reduction of PCP to the
validity problem.
2Please note that in complexity theory, the acronym PCP is used for something com-
pletely different.
4
First we assume that the formula is valid: this means that it is true in
every interpretation, which means that we are free to construct a convenient
interpretation. We will construct an interpretation that will establish that
the PCP instance has a solution. The domain of the interpretation is {0, 1}∗
and the constant a will denote ε, the empty string. The function symbol
f0 denotes the function that adds the symbol 0 to the end of any word,
f0(w) = w0 and f1(w) = w1. The predicate symbol p is interpreted as the
set of pairs of words of the form
(αi1αi2αi3 . . . αik , βi1βi2βi3 . . . βik)
for some sequence of integers i1, i2, . . . , ik. This interpretation formalises
what we have been saying informally in the preceding paragraph. Now since
the formula φ is assumed to be valid it is true in this specific interpreta-
tion. The antecedents are clearly true. The first conjunct just describes the
Post system and and the second conjunct is just the inductive statement
of what it means to build pairs of strings with dominoes so the consequent
∃z p(z, z) must be true, but this says precisely that the Post system has a
solution.
For the reverse direction we assume that the Post system has a solution; in
other words there is a sequence of integers i1, i2, . . . , ik such that
αi1αi2 . . . αik = βi1βi2 . . . βik = z;
we are naming this z in anticipation of the argument below. Now we must
show that the formula φ is true in every interpretation. To show that φ is
true means that we assume the formula in curly brackets and show that the
formula ∃z p(z, z) must follow. Now it is a simple induction on the length of
sequences of integers that if we have any sequence of integers in the range
{1, . . . , n}, say j1, j2, . . . , jm that
p(αj1αj2 . . . αjm , βj1βj2 . . . βjm).
The first conjunct in the curly brackets asserts the base case and the second
conjunct is precisely the inductive step. Thus, it applies to the specific
sequence that arises from the solution to the Post system, in short we have
∃z p(z, z). We have shown that the formula is valid.
The consequences are immense. There is no algorithm to check whether
formulas in first-order logic are valid. We can decide validity of propositional
logic using truth tables (and as far as we know there is no better way) but
5
there is no such method for first-order predicate calculus. There are no fully
automatic theorem-proving procedures for even a wimpy logic like first-order
predicate calculus. You cannot even express induction in first-order logic.
There is no hope at all for arithmetic.
For second-order predicate calculus the set of valid formulas is not even
CE or co-CE. Interestingly if we allow only unary predicates we regain
decidability but that is another story.
What can we do? An important development is computer-assisted proofs.
Here the possibilities are unlimited. A number of major theorems have been
proved this way recently. Humans and computers working together are a
very powerful combination.
6