Datalog
Datalog
P.J. McBrien
Imperial College London
P.J. McBrien (Imperial College London) Datalog 1 / 19
The Datalog Language Data
Data is held as extensional predicates
branch
sortcode bname cash
56 ’Wimbledon’ 94340.45
34 ’Goodge St’ 8900.67
67 ’Strand’ 34005.00
account
no type cname rate sortcode
100 ’current’ ’McBrien, P.’ NULL 67
101 ’deposit’ ’McBrien, P.’ 5.25 67
103 ’current’ ’Boyd, M.’ NULL 34
107 ’current’ ’Poulovassilis, A.’ NULL 56
119 ’deposit’ ’Poulovassilis, A.’ 5.50 56
125 ’current’ ’Bailey, J.’ NULL 56
movement
mid no amount tdate
1000 100 2300.00 5/1/1999
1001 101 4000.00 5/1/1999
1002 100 -223.45 8/1/1999
1004 107 -100.00 11/1/1999
1005 103 145.50 12/1/1999
1006 100 10.23 15/1/1999
1007 107 345.56 15/1/1999
1008 101 1230.00 15/1/1999
1009 119 5600.00 18/1/1999
branch(56, ‘Wimbledon’, 94340.45).
branch(34, ‘Goodge St’, 8900.67).
branch(67, ‘Strand’, 34005.00).
account(100, ‘current’, ‘McBrien, P.’, null, 67).
account(101, ‘deposit’, ‘McBrien, P.’, 5.25, 67).
account(103, ‘current’, ‘Boyd, M.’, null, 34).
account(107, ‘current’, ‘Poulovassilis, A.’, null, 56).
account(119, ‘deposit’, ‘Poulovassilis, A.’, 5.50, 56).
account(125, ‘current’, ‘Bailey, J.’, null, 56).
movement(1000, 100, 2300.00, 5/1/1999).
movement(1001, 101, 4000.00, 5/1/1999).
movement(1002, 100,−223.45, 8/1/1999).
movement(1004, 107,−100.00, 11/1/1999).
movement(1005, 103, 145.50, 12/1/1999).
movement(1006, 100, 10.23, 15/1/1999).
movement(1007, 107, 345.56, 15/1/1999).
movement(1008, 101, 1230.00, 15/1/1999).
movement(1009, 119, 5600.00, 18/1/1999).
P.J. McBrien (Imperial College London) Datalog 2 / 19
The Datalog Language Rules
Rules defined as intentional predicates
current account(No,Name,Sortcode) :-
account(No, ‘current’,Name, ,Sortcode).
deposit account(No,Name,Rate,Sortcode) :-
account(No, ‘deposit’,Name,Rate,Sortcode).
active customers(CName,BName) :-
branch(Sortcode,BName, ),
account(No, ,CName, ,Sortcode),
movement( ,No, , ).
Datalog Rules
Datalog rules take the form
Head :- Body.
Logical semantics:
if Body the Head
Head must be a single
predicate
Body may be any conjunction
of predicates.
Naming of predicates and variables
You cannot use the same name for intentional and extensional predicates
Convention is the start predicate name with small letter
Variables start with a capital letter
A variable that only appears once can be replaced by ‘ ’
P.J. McBrien (Imperial College London) Datalog 3 / 19
The Datalog Language Rules
Quiz 1: Valid Datalog Knowledgebase
Which Datalog Knowledgebase is invalid?
A
single male(‘Peter’).
married to(‘Paul’, ‘Jane’).
male(M) :- married to(M, ).
male(M) :- single male(M).
female(F) :- married to( ,F).
female(F) :- single female(F).
B
male(‘Peter’).
married to(‘Paul’, ‘Jane’).
male(M) :- married to(M, ).
female(F) :- married to( ,F).
C
male(‘Peter’).
male(‘Paul’).
female(‘Jane’).
married to(‘Paul’, ‘Jane’).
D
married to(‘Peter’, null).
married to(‘Paul’, ‘Jane’).
male(M) :- married to(M, ), isNotNull(M).
female(F) :- married to( ,F), isNotNull(F).
P.J. McBrien (Imperial College London) Datalog 4 / 19
The Datalog Language Rules
Model-Theoretic Interpretation
deposit account(No,Name,Rate,Sortcode) :-
account(No, ‘deposit’,Name,Rate,Sortcode).
account(100, ‘current’, ‘McBrien, P.’, null, 67).
account(101, ‘deposit’, ‘McBrien, P.’, 5.25, 67).
account(103, ‘current’, ‘Boyd, M.’, null, 34).
account(107, ‘current’, ‘Poulovassilis, A.’, null, 56).
account(119, ‘deposit’, ‘Poulovassilis, A.’, 5.50, 56).
account(125, ‘current’, ‘Bailey, J.’, null, 56).
Minimal Model
If we can assign any combination of values to the variables, what is the minimum set
of predicates that must be true.
Minimal Model
deposit account(101, ‘McBrien, P.’, 5.25, 67).
Is not a model, since it implies deposit account(119, ‘Poulovassilis, A.’, 5.50, 56) is
false, but deposit account(119, ‘Poulovassilis, A.’, 5.50, 56) is true due to the rule for
deposit account.
P.J. McBrien (Imperial College London) Datalog 5 / 19
The Datalog Language Rules
Model-Theoretic Interpretation
deposit account(No,Name,Rate,Sortcode) :-
account(No, ‘deposit’,Name,Rate,Sortcode).
account(100, ‘current’, ‘McBrien, P.’, null, 67).
account(101, ‘deposit’, ‘McBrien, P.’, 5.25, 67).
account(103, ‘current’, ‘Boyd, M.’, null, 34).
account(107, ‘current’, ‘Poulovassilis, A.’, null, 56).
account(119, ‘deposit’, ‘Poulovassilis, A.’, 5.50, 56).
account(125, ‘current’, ‘Bailey, J.’, null, 56).
Minimal Model
If we can assign any combination of values to the variables, what is the minimum set
of predicates that must be true.
Minimal Model
deposit account(101, ‘McBrien, P.’, 5.25, 67).
deposit account(119, ‘Poulovassilis, A.’, 5.50, 56).
deposit account(127, ‘Poulovassilis, A.’, 4.50, 56).
Is not a minimal model, since deposit account(127, ‘Poulovassilis, A.’, 4.50, 56) could be
made false, and the model still be consistent.
P.J. McBrien (Imperial College London) Datalog 5 / 19
The Datalog Language Rules
Model-Theoretic Interpretation
deposit account(No,Name,Rate,Sortcode) :-
account(No, ‘deposit’,Name,Rate,Sortcode).
account(100, ‘current’, ‘McBrien, P.’, null, 67).
account(101, ‘deposit’, ‘McBrien, P.’, 5.25, 67).
account(103, ‘current’, ‘Boyd, M.’, null, 34).
account(107, ‘current’, ‘Poulovassilis, A.’, null, 56).
account(119, ‘deposit’, ‘Poulovassilis, A.’, 5.50, 56).
account(125, ‘current’, ‘Bailey, J.’, null, 56).
Minimal Model
If we can assign any combination of values to the variables, what is the minimum set
of predicates that must be true.
Minimal Model
deposit account(101, ‘McBrien, P.’, 5.25, 67).
deposit account(119, ‘Poulovassilis, A.’, 5.50, 56).
Is a minimal model
P.J. McBrien (Imperial College London) Datalog 5 / 19
The Datalog Language Rules
Quiz 2: Datalog Queries
active current account(No) :-
account(No, ‘current’, , , ),
movement( ,No, , ).
What is the minimum model?
A
active current account(100).
active current account(101).
active current account(103).
active current account(107).
active current account(119).
active current account(125).
B
active current account(100).
active current account(101).
active current account(103).
active current account(107).
active current account(119).
C
active current account(100).
active current account(103).
active current account(107).
active current account(125).
D
active current account(100).
active current account(103).
active current account(107).
P.J. McBrien (Imperial College London) Datalog 6 / 19
Datalog with Negation
Datalog¬: Datalog with Negation
Safe Negation
Use ¬ infront of a predicate to mean that it must not hold.
Any variable that appears in a negated predicate must have previously appeared in a
non-negated predicate.
✓Find accounts without any
movements
dormant account(No) :-
account(No, , , , ),
¬movement( ,No, , ).
✗Unsafe
dormant account(No) :-
¬movement( ,No, , ).
Minimal Model
dormant account(125).
P.J. McBrien (Imperial College London) Datalog 7 / 19
Datalog with Negation
Quiz 3: Safe Datalog¬ Predicates
Which predicate uses safe negation?
A
non current accounts(No,Type) :-
account(No,Type, , , ),
¬Type = ‘current’.
B
non current accounts(No,Type) :-
¬Type = ‘current’,
account(No,Type, , , ).
C
non current accounts(No) :-
¬Type = ‘current’,
account(No,Type, , , ).
D
non current accounts(No,Type) :-
account(No, , , , ),
¬Type = ‘current’.
P.J. McBrien (Imperial College London) Datalog 8 / 19
Datalog with Negation
Quiz 4: Datalog¬ Queries (1)
branch without recent debit(BName) :-
branch(Sortcode,BName, ),
account(No, , , ,Sortcode),
¬account with recent debit(No).
account with recent debit(No) :-
movement( ,No,Value,TDate),
Value < 0,
TDate > 10/1/1999.
What is the minimum model?
A B
branch without recent debit(′Wimbledon′).
C
branch without recent debit(′Goodge St′).
branch without recent debit(′Strand′).
D
branch without recent debit(′Wimbledon′).
branch without recent debit(′Goodge St′).
branch without recent debit(′Strand′).
P.J. McBrien (Imperial College London) Datalog 9 / 19
Datalog with Negation
Quiz 5: Datalog¬ Queries (2)
branch without recent debit(BName) :-
branch(Sortcode,BName, ),
¬branch with recent debit(Sortcode).
branch with recent debit(Sortcode) :-
account(No, , , ,Sortcode),
movement( ,No,Value,TDate),
Value < 0,
TDate > 10/1/1999.
What is the minimum model?
A B
branch without recent debit(′Wimbledon′).
C
branch without recent debit(′Goodge St′).
branch without recent debit(′Strand′).
D
branch without recent debit(′Wimbledon′).
branch without recent debit(′Goodge St′).
branch without recent debit(′Strand′).
P.J. McBrien (Imperial College London) Datalog 10 / 19
Relational Algebra and Datalog
Projection
π
RA projection is performed by only using a subset of rule body variables in the head
of a rule.
πsortcode account
account sortcode(Sortcode) :-
account( , , , ,Sortcode).
Minimal Model
account sortcode(34).
account sortcode(56).
account sortcode(67).
P.J. McBrien (Imperial College London) Datalog 11 / 19
Relational Algebra and Datalog
Selection
σ
RA selection is performed by naming a variable more than once, or by putting a data
value in the rule body.
σamount>1000movement
big credit(Mid,No,Amount,Date) :-
movement(Mid,No,Amount,Date),
Amount > 1000.
Minimal Model
big credit(1000, 100, 2300.00, 5/1/1999).
big credit(1001, 101, 4000.00, 5/1/1999).
big credit(1008, 101, 1230.00, 15/1/1999).
big credit(1009, 119, 5600.00, 18/1/1999).
P.J. McBrien (Imperial College London) Datalog 12 / 19
Relational Algebra and Datalog
Product
×
RA product is performed by naming two predicates in the rule body.
branch× σrate>0account
product example(BSortcode,BName,Cash,No,Type,CName,Rate,ASortcode) :-
branch(BSortcode,BName,Cash),
account(No,Type,CName,Rate,ASortcode),
Rate > 0.
Minimal Model
(56, ‘Wimbledon’, 94340.45, 101, ‘deposit’, ‘McBrien, P.’, 5.25, 67)
(56, ‘Wimbledon’, 94340.45, 119, ‘deposit’, ‘Poulovassilis, A.’, 5.50, 56)
(34, ‘Goodge St’, 8900.67, 101, ‘deposit’, ‘McBrien, P.’, 5.25, 67)
(34, ‘Goodge St’, 8900.67, 119, ‘deposit’, ‘Poulovassilis, A.’, 5.50, 56)
(67, ‘Strand’, 34005.00, 101, ‘deposit’, ‘McBrien, P.’, 5.25, 67)
(67, ‘Strand’, 34005.00, 119, ‘deposit’, ‘Poulovassilis, A.’, 5.50, 56)
P.J. McBrien (Imperial College London) Datalog 13 / 19
Relational Algebra and Datalog
Join
⋊⋉
RA join is performed by naming two predicates in the rule body, and then comparing
their attributes.
πbname,cname σbranch.sortcode=account.sortcode(branch× account)
branch customers(BName,CName) :-
branch(BSortcode,BName, ),
account( , ,CName, ,ASortcode),
BSortcode = ASortcode.
≡
branch customers(BName,CName) :-
branch(Sortcode,BName, ),
account( , ,CName, , Sortcode).
Minimal Model
branch customers(‘Wimbledon’, ‘Poulovassilis, A.’).
branch customers(‘Wimbledon’, ‘Bailey, J.’).
branch customers(‘Goodge St’, ‘Boyd, M.’).
branch customers(‘Strand’, ‘McBrien, P.’).
P.J. McBrien (Imperial College London) Datalog 14 / 19
Relational Algebra and Datalog
Quiz 6: Translating RA to Datalog
πbname σaccount.sortcode=branch.sortcode∧type=‘deposit’(account× branch)
Which datalog rule for query is not equivalent to the above RA query?
A
query(BName) :-
account( , ‘deposit’, , ,Sortcode),
branch(Sortcode,BName, ).
B
query(BName) :-
branch(Sortcode1,BName, ),
account( , ‘deposit’, , ,Sortcode2),
Sortcode1 = Sortcode2.
C
query(BName) :-
branch( ,BName, ).
query(BName) :-
branch(Sortcode,BName, ),
account( , ‘deposit’, , ,Sortcode).
D
query(BName) :-
branch(Sortcode,BName, ),
deposit branch(Sortcode).
deposit branch(Sortcode) :-
account( , ‘deposit’, , ,Sortcode).
P.J. McBrien (Imperial College London) Datalog 15 / 19
Relational Algebra and Datalog
Quiz 7: Self Joins
query(CName,CAcc,DAcc) :-
account(DAcc, ‘deposit’,CName, , ),
account(CAcc, ‘current’,CName, , ).
account
no type cname rate sortcode
100 ’current’ ’McBrien, P.’ NULL 67
101 ’deposit’ ’McBrien, P.’ 5.25 67
103 ’current’ ’Boyd, M.’ NULL 34
107 ’current’ ’Poulovassilis, A.’ NULL 56
119 ’deposit’ ’Poulovassilis, A.’ 5.50 56
125 ’current’ ’Bailey, J.’ NULL 56
What is the result of the Datalog query?
A
CName CAcc DAcc
B
CName CAcc DAcc
‘McBrien, P.’ 100 101
‘Poulovassilis, A.’ 107 119
C
CName CAcc DAcc
‘McBrien, P.’ 101 100
‘Poulovassilis, A.’ 119 107
D
CName CAcc DAcc
‘McBrien, P.’ 100 101
‘Boyd, M.’ 103 null
‘Poulovassilis, A.’ 107 119
‘Bailey, J.’ 103 null
P.J. McBrien (Imperial College London) Datalog 16 / 19
Relational Algebra and Datalog
Union
∪
RA union is performed by having more than one rule definition for an intentional
predicate.
σamount>1000movement ∪ σamount<−100movement big movement(Mid,No,Amount,Date) :- movement(Mid,No,Amount,Date), Amount > 1000.
big movement(Mid,No,Amount,Date) :-
movement(Mid,No,Amount,Date),
Amount < −100.
Minimal Model
big movement(1000, 100, 2300.00, 5/1/1999).
big movement(1001, 101, 4000.00, 5/1/1999).
big movement(1002, 100,−223.45, 8/1/1999).
big movement(1008, 101, 1230.00, 15/1/1999).
big movement(1009, 119, 5600.00, 18/1/1999).
P.J. McBrien (Imperial College London) Datalog 17 / 19
Relational Algebra and Datalog
Difference
−
RA difference is performed using a negation on the predicate being ‘subtracted’:
need Datalog ¬.
πno account− πno movement
dormant account(No) :-
account(No, , , , ),
¬movement( ,No, , ).
Minimal Model
dormant account(125).
P.J. McBrien (Imperial College London) Datalog 18 / 19
Relational Algebra and Datalog
Worksheet: Datalog
branch
sortcode bname cash
56 ’Wimbledon’ 94340.45
34 ’Goodge St’ 8900.67
67 ’Strand’ 34005.00
movement
mid no amount tdate
1000 100 2300.00 5/1/1999
1001 101 4000.00 5/1/1999
1002 100 -223.45 8/1/1999
1004 107 -100.00 11/1/1999
1005 103 145.50 12/1/1999
1006 100 10.23 15/1/1999
1007 107 345.56 15/1/1999
1008 101 1230.00 15/1/1999
1009 119 5600.00 18/1/1999
account
no type cname rate sortcode
100 ’current’ ’McBrien, P.’ NULL 67
101 ’deposit’ ’McBrien, P.’ 5.25 67
103 ’current’ ’Boyd, M.’ NULL 34
107 ’current’ ’Poulovassilis, A.’ NULL 56
119 ’deposit’ ’Poulovassilis, A.’ 5.50 56
125 ’current’ ’Bailey, J.’ NULL 56
key branch(sortcode)
key branch(bname)
key movement(mid)
key account(no)
movement(no)
fk
⇒ account(no)
account(sortcode)
fk
⇒ branch(sortcode)
P.J. McBrien (Imperial College London) Datalog 19 / 19
The Datalog Language
Data
Rules
Datalog with Negation
Relational Algebra and Datalog