Introduction to AI Knowledge Representation and Reasoning
Introduction to AI
Non-monotonic Reasoning
Francesca Toni
(thanks to Marek Sergot)
Outline
• Classical logic: the qualification problem
• Closed World Assumption
• (non-)Monotonicity
• Non-monotonic – defeasible – reasoning
• Negation-as-failure in logic programming (and Prolog)
We will use the Prolog convention on variables/constant/function
symbols. Clauses/rules will be implicitly universally quantified. 2
Additional :
Section 9.4.5
Section 12.6
Recommended:
Section 5.6
+ sections 1-3 of paper
on stable models (on cate)
The Qualification Problem (1)
“All birds can fly…”
flies(X) ← bird(X)
“…unless they are penguins…”
flies(X) ← bird(X), ¬ penguin(X)
“…or ostriches…”
flies(X) ← bird(X), ¬ penguin(X), ¬ ostrich(X)
“…or wounded…”
flies(X) ← bird(X), ¬ penguin(X), ¬ ostrich(X), ¬wounded(X)
“…or dead, or sick, or glued to the ground, or…”
3
The Qualification Problem (2)
Let BIRDS be the set of sentences about flying birds.
Even if we could list all these exceptions, classical logic would
still not allow:
BIRDS ∪ {bird(tweety)} ╞ flies(tweety)
unless we also affirm all the qualifications, viz.:
¬ penguin(tweety) ¬ ostrich(tweety)
¬ wounded(tweety) ¬ dead(tweety)
¬ sick(tweety) ¬ glued_to_the_ground(tweety)
…
4
Classical logic is inadequate
What we want is a new kind of ‘entailment’:
BIRDS ∪ {bird(tweety)} ╞* flies(tweety)
Namely, from BIRDS and bird(tweety) it follows
by default – in the absence of information to the
contrary – that flies(tweety).
This kind of reasoning will be defeasible/non-
monotonic.
5
Non-monotonic logics
Classical logic is monotonic. For a set of sentences S:
if S ╞ α then S ∪ X ╞ α
Namely, new information X always preserves old conclusions α.
Reasoning by default is typically non-monotonic. We may have that:
S ╞* α but S ∪ X ╞* α
Examples:
• DRINKS ∪ {coffee} ╞* tastes nice
• DRINKS ∪ {coffee} ∪ {diesel oil} ╞* tastes nice
• BIRDS ∪ {bird(tweety)} ╞* flies(tweety)
• BIRDS ∪ {bird(tweety)} ∪ {penguin(tweety)} ╞* flies(tweety)
There is a branch of AI focusing on non- monotonic logic for default
reasoning – within knowledge representation and reasoning in AI.
6
The ‘Closed World Assumption’
Consider the set of sentences that could be describing a database DB:
has-office-in(IBM, Winchester) city(Winchester)
has-office-in(IBM, London) city(London)
has-office-in(IBM, Paris) city(Paris)
has-office-in(MBI, London) city(NewYork)
capital-city(London) company(IBM)
capital-city(Paris) company(MBI)
Does MBI have an office in Paris?
Does IBM have an office in New York?
We do not know.
But it is usual in databases (and many other contexts) to
make a Closed World Assumption (CWA):
if α is not in the DB, assume ¬α
7
Normal logic programs
A normal logic program is a set of clauses of the form:
A ← L1, …, Ln (n ≥ 0)
where A is an atom and each Li is a literal.
A literal is either an atom (a ‘positive literal’) or of the form
not B
where B is an atom. (not B is a ‘negative literal’).
The atom A is the head of the clause; the literals L1, …, Ln are the body of
the clause. When the body is empty (n = 0 above) the arrow ← is usually
omitted.
not is negation as failure (NAF): informally (for now)
not B succeeds when all attempts to prove B fail
8
SLD+Negation-as-failure: SLDNF (1)
We have already seen the computation of a goal (query)
G = L1, …, Lm as a series of derivation steps:
9
G
G1
Gn-1
θ1
θ2
θn
…
• i is the mgu at the i-th
derivation step
• The answer computed is
(the restriction to the vars
of G of) the composition
of all these mgus:
= 1 ◦ … ◦ n
SLD+Negation-as-failure: SLDNF (2)
Now there are two kinds of derivation steps:
(a) select a positive literal Lj = B from the current goal:
L1,…,Lj-1, B, Lj+1,…,Ln
i match B with B’M1,…,Mk , with Bi =B’i
(L1,…,Lj-1, M1,…,Mk, Lj+1,…,Ln)i
(b) select a negative literal Lj = not B from the current goal:
L1,…,Lj-1, not B, Lj+1,…,Ln
i ={} subcomputation:
all ways of computing goal B must fail (finitely)
(L1,…,Lj-1, Lj+1,…,Ln)
Note that the NAF sub-computation just checks not B: it does not generate
bindings (substitutions) for variables.
10
11
SLDNF Example (S is ground)
S: p q, not r
r b
r not q
q
Goal: p
p
q, not r
not r r
b not q q
fail fail
computed answer: {}
subcomputation
subcomputation
12
SLDNF Example (S is not ground)
S: p(X) q(X), not r(X)
r(X) b(X)
r(X) not q(X)
q(a)
Goal: p
p(X)
q(X), not r(X)
not r(a) r (a)
b(a) not q(a) q(a)
fail fail
computed answer: {X/a}
subcomputation
subcomputation
Selection of sub-goals in SLDNF
• Sub-goals can be selected in any order. The answers are
the same, whatever the selection rule
– Prolog selects always the leftmost sub-goal in the current goal.
• However, the selection of sub-goals must be safe: it must
not pick a non-ground negative literal
– Most Prolog systems do not implement this!
E.g. S={p(a), q(b)}
not p(X), q(X) not p(X), q(X)
… not p(b)
fail …
13Wrongly interprets ∃X ¬ p(X) as ¬ ∃X p(X)
SLDNF is non-monotonic
Consider the set of sentences S:
p(X) ← q(X), not r(X)
q(a)
r(b)
Then S ⊢SLDNF p(a)
but S ∪ {r(a)} ⊢SLDNF p(a)
Note: SLDNF builds in a kind of Closed World
Assumption:
S ∪ {r(a)} ⊢SLDNF¬ p(a)
14
Solving the qualification problem using NAF
Example
• Typically (by default, unless there is reason to think
otherwise,…) a bird can fly.
• Except that dead birds cannot fly
can fly(X) ← bird(X), not abnormal_bird(X)
abnormal_bird(X) ← dead(X)
Example
People are innocent by default (unless they can be
proven to be guilty)
innocent(X) ← not guilty(X) 15
Credulous vs Sceptical non-monotonic reasoning:
“The Nixon diamond”
• Quakers are typically pacifists.
• Republicans are typically not pacifists.
• Richard Nixon is a Quaker
• Richard Nixon is a Republican
Do we conclude that Nixon is a pacifist or not?
Sceptical (or cautious) reasoning:
No, since we cannot choose between two possible, conflicting
default conclusions.
Credulous (or brave) reasoning:
Yes, to both, since we have reason to believe both.
Which form of reasoning to choose? It depends on the needs of
our application/problem
16
The Nixon diamond using NAF
• Quakers are typically pacifists.
• Republicans are typically not pacifists.
• Richard Nixon is a Quaker
• Richard Nixon is a Republican
Do we conclude that Nixon is a pacifist or not?
pacifist(X) ← quacker(X), not abnormal_quacker(X)
not_pacifist(X) ← republican(X), not abnormal_republican (X)
abnormal_quacker(X) ← not_pacifist(X)
abnormal_republican (X) ← pacifist(X)
quacker(nixon)
republican(nixon)
17
18
SLDNF for the Nixon diamond
By shortening predicates (pacifist becomes p etc)
p(X) ← q(X), not ab_q(X) n_p(X) ← r(X), not ab_r (X)
ab_q(X) ← n_p(X) ab_r (X) ← p(X)
q(nixon) r(nixon)
Goal (for example. Here nixon becomes nix): p(nix)
p(nix)
q(nix), not ab_q(nix)
not ab_q(nix) ab_q(nix)
n_p(nix)
r(nix), not ab_r (nix)
not ab_r (nix) ab_r (nix)
…
Infinite computation!
Semantics for normal logic programs
• Completion semantics
• Well-founded model *
• Stable models (see answer sets in Part II) *
* these semantics equate each normal logic
program S to the set of all its ground instances
over the underlying Herbrand universe
19
Semantics of sets of definite clauses =
positive logic programs (recap)
Given a set of definite clauses S:
• the Herbrand universe is the set of all ground terms
obtained from constants and function symbols in S, and
the Herbrand base (HB) is the set of all atoms from
predicate symbols in S and terms in the Herbrand
universe
• A Herbrand model is a subset of HB that renders S true
• The meaning of S is the least Herbrand model (LHM) of S
Note: the LHM is the least fixed point of the (continuous)
immediate consequence operator
20
From positive to normal logic program
semantics
Given a normal logic program S:
TS(X)= {a ∈ 𝐻𝐵 | a b1, …, bm,not bm+1, …,not bm+n∈ S,
{b1, …, bm} ⊆ X,
{bm+1, …, bm+n}∩ X={}}
• TS is no longer guaranteed to be continuous,
as it may not be monotonic:
– E.g. S={p not q}
TS({})={p}
TS({q})={}
21
Completion semantics (1)
1. Rewrite all rules in S as rules of the form (with ∧, ¬)
p(𝑋) 𝑋 = Ԧ𝑡, body
– e.g. p(X,3) q(X), not r(3)
becomes p(X,Y) Y=3 ∧ q(X) ∧¬ r(3)
2. Combine all rules with the same head
p(𝑋) 𝑋 = Ԧ𝑡1 ∧ body1 … p(𝑋) 𝑋 = Ԧ𝑡n ∧ bodyn
into (∀)p(𝑋)(𝑋 = Ԧ𝑡1 ∧ body1)∨…∨(𝑋 = Ԧ𝑡n ∧ bodyn)
– e.g. p(X,Y) Y=3 ∧ q(X) ∧¬ r(3), p(X,Y) X=2
becomes (∀) p(X,Y) (Y=3 ∧ q(X) ∧¬ r(3)) V (X=2)
3. If some q(_) (in vocabulary of S) is not the head of any
rule in S, add
(∀) q(𝑋) false 22
Completion semantics (2)
4. Finally, add Clark’s Equality Theory (CET):
– f(Ԧ𝑡) g( Ԧ𝑠) for all pairs of distinct terms in the Herbrand
universe of S
– (∀) X=X, X=Y→Y=X , X=Y∧Y=Z→X=Z
– (∀) Ԧ𝑋= 𝑌→p( Ԧ𝑋)= p(𝑌) for all predicate symbols p in S
– Ԧ𝑋 t[ Ԧ𝑋] for all terms t where Ԧ𝑋 occurs
23
The resulting logical theory is called
the completion of S
Completion: example
S: p(X) q(X), not r(X)
r(X) b(X)
r(X) not q(X)
q(a)
Completion of S: CET+
∀ X (p(X) q(X) ∧¬ r(X))
∀ X (r(X) b(X) ∨¬ q(X))
∀ X (q(X) X=a)
∀ X (b(X) false)
24
Soundness of SLDNF (with respect to
the completion semantics)
Let Comp(S) be the completion of S
• If there exists an SLDNF-refutation of P from S
with answer then Comp(S) ╞ (∀) P*
• If there exists a finitely failed SLDNF-
refutation of P from S then Comp(S) ╞¬ ∃P
25
* Where not is interpreted as ¬
Negation as finite failure
Completion semantics for SLDNF: issues
What if SLDNF does not terminate? What does non-
termination “mean”?
• S={p q, p not q, q q}, P=p
Completion of S = {p q ∨¬q, q q}+CET ╞ p
• S={p not p}, P=p
Completion of S = {p ¬p}+CET ╞ p,¬p
(inconsistent)
26
Well-founded model semantics
27
Negation as infinite failure
• S={p q, p not q, q q}, P=p
Well-founded model of S is ({p},{not q}):
S “wfm-entails” P
• S={p not p}, P=p
Well-founded model of S is ({},{}):
S “does not wfm-entail” P
SLDNF+tabling: XSB Prolog: http://xsb.sourceforge.net/
Well-founded model
(In, Out) such that
– In ⊆ HB
– Out ⊆ HBnot={not a | a ∈ 𝐻𝐵}
Note:
• atoms in HB- (In ∪ { a | not a ∈ Out}) are
“undecided”
• e.g. for S={p not p}, the well-founded model is
({},{}) and p is “undecided”
28
(definition of well-founded model omitted
and thus non examinable)
Nixon diamond using the completion
semantics and the well-founded model
p(V) ← q(V), not ab_q(V) n_p(V) ← r(V), not ab_r(V)
ab_q(V) ← n_p(V) ab_r (V) ← p(V)
q(nix) r(nix)
• Completion= CET+
p(V) q(V) ∧¬ ab_q(V) n_p(V) r(V) ∧¬ ab_r(V)
ab_q(V) n_p(V) ab_r (V) p(V)
q(V) V=nix r(V) V=nix
– p(nix), n_p(nix) hold in different models
• Well-founded model =({q(nix), r(nix)},{})
– p(nix), n_p(nix) are both undecided
31
Stable models
• Given a (ground) normal logic program S and
XHB, the reduct of S by X (referred to as SX) is
obtained in two steps:
1. Eliminate all rules with not p in the body, for every
pX
2. Eliminate all negative literals from the body of all
remaining rules
(SX is a set of definite clauses)
• XHB is a stable model of S iff the LHM of SX is X
Several efficient answer set solvers exist – see Part II
32
Nixon diamond using stable models (1)
p(V) ← q(V), not ab_q(V) n_p(V) ← r(V), not ab_r (V)
ab_q(V) ← n_p(V) ab_r (V) ← p(V)
q(nix) r(nix)
Here HB={q(nix), r(nix), p(nix),not_p(nix), ab_q(nix), ab_r(nix)}. Let S be
the set of all ground instances of these rules over HB.
Consider X={p(nix), ab_r (nix), q(nix), r(nix),}. Is X a stable model? YES.
To see this, let us construct SX
1. Eliminates n_p(nix) ← r(nix), not ab_r(nix)
2. Gives p(nix) ← q(nix),
ab_q(nix) ← n_p(nix), ab_r (nix) ← p(nix),
q(nix), r(nix)
The LHM of SX is {q(nix), r(nix), p(nix), ab_r (nix) }=X, as required. 33
Nixon diamond using stable models (2)
X={q(nix), r(nix), n_p(nix), ab_q(nix)} is also a
stable model (Exercise: verify this)
• Multiple “extensions” = stable models
• Credulous (whatever holds in some stable
model) vs sceptical (whatever holds in all stable
models)
34
Summary
• Classical logic: the qualification problem
• Closed World Assumption
• (non-)Monotonicity
• Non-monotonic/defeasible reasoning
• Negation-as-failure/SLDNF for non-monotonic
reasoning
• Completion semantics, well-founded semantics
(just a mention) and stable model semantics
35