COMP2022|2922 Models of Computation
Lesson 1: Propositional Logic
Presented by
Sasha Rubin
School of Computer Science
Which logics?
1. Propositional logic
A logic for computing with, and reasoning about, statements such as conditionals in programming and digital circuits.
2. Predicate logic
A logic for computing with, and reasoning about, e.g., the correctness of programs, knowledge-bases in AI, declarative programs, and database queries.
1/38
What makes up a logic?
1. Syntax tells us the structure of expressions, or the rules for putting symbols together to form an expression.
– (1 + (y × 3)) is an arithmetic expression/formula
2. Semantics refers to the meaning of expressions, or how they are evaluated.
– It evaluates to 13 if we know that y = 4.
3. Deduction is a syntactic mechanism for deriving new true expressions from existing true expressions.
– If we know that z = x + y then we can deduce that z = y + x.
2/38
Propositional logic
– You can write propositional logic formulas in most programming languages.
– In Python and Java these are called Boolean expressions
– They are usually used as conditions for control flow, e.g., inside an if or a while.
3/38
Warmup
Do the following two Java Boolean expressions say the same thing?
1. !(x >= 5 && x != y)
2. (x < 5 || x == y)
Note
- They are syntactically different (they look different).
- They are semantically equivalent (they mean the same thing).
- In the sense that no matter the values of x, y, they evaluate to the same thing (i.e., either both true or both false).
- If we know one is true we can infer the other is true.
- Use that not(A and B) is equivalent to (not A or not B), no matter what conditions A and B represent.
4/38
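The de Morgan equivalence behind this warmup can be checked mechanically. Below is a small Python sketch (the helper names cond1 and cond2 are ours, and checking sample integers is an illustration, not a proof over all ints) comparing the two Java conditions:

```python
# Sketch: exhaustively compare the two Boolean conditions over sample values.
def cond1(x, y):
    return not (x >= 5 and x != y)

def cond2(x, y):
    return x < 5 or x == y

# They agree on every pair of sample values, as the equivalence predicts.
assert all(cond1(x, y) == cond2(x, y)
           for x in range(-10, 11)
           for y in range(-10, 11))
```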
Warmup
Do the following two statements say the same thing?
1. There are fires outside Sydney and the air in Sydney is polluted.
2. The air in Sydney is polluted and there are fires outside Sydney.
Note
- We are not asking if the statements are true or not.
- The statements create different images: the first conveys that the fires are causing the pollution, the second that there is already pollution that may be made worse by the fires.
- However, logically, the statements are saying the same thing.
- If we know that one is true, we can infer the other is true.
- The reason for this is that we can use a law that says that “F and P” means the same as “P and F”, no matter what F and P represent.
5/38
Warmup
Do the following two statements say the same thing?
1. If the food is good, then the food is not cheap
2. If the food is cheap, then the food is not good
Note
- We are not asking if the statements are true or not.
- The statements create different images: Good but not cheap food conveys that it is expensive, while cheap but not good food conveys that it is rotten.
- However, logically, the statements are saying the same thing, i.e., that this food is not both good and cheap
- If we know that one is true, we can infer the other is true.
- The reason for this is that “if G then not C” means the same as “if C then not G”, no matter what G and C represent.
6/38
Propositions
Propositional logic is about statements that can be true or false (although we may not know which). Not every statement can be thought of as a proposition.
Which of the following are propositions?
- If this food is good, then this food is not cheap.
- Where are we?
- 2 < 5
7/38
Propositions
Propositional logic is about statements that can be true or false (although we may not know which). Not every statement can be thought of as a proposition.
Which of the following are propositions?
Assume that i is an integer variable, and x is a Boolean variable.
- i
- i = 7
- x
- x = true
- (i < 2 || i > 5)
- (i < 2 && x)
7/38
In a nutshell
Logic of propositional formulas
1. Syntax
(p ∧ (q ∨ p)) is a propositional logic formula
2. Semantics
It evaluates to 0 if p = 0, q = 1.
3. Deduction
If we know the condition (p ∧ (q ∨ p)) is true, we can deduce that p is true.
8/38
Syntax tells us the structure of expressions, or the rules for putting symbols together to form an expression.
9/38
Syntax
Java and Python each have their own syntax for writing propositional formulas. We will use another syntax which is the standard in computer science and mathematics.
Name | Prop. Logic | Python | Java
conjunction | ∧ | and | &&
disjunction | ∨ | or | ||
negation | ¬ | not | !
implication | → | (none) | (none)
bi-implication | ↔ | == | ==
top/verum | ⊤ | True | true
bottom/falsum | ⊥ | False | false
atoms | p, q, r, . . . | Boolean variables | Boolean variables
formulas | F, G, H, . . . | Boolean expressions | Boolean expressions
10/38
Syntax
We now precisely define the syntactic rules for defining formulas of Propositional Logic.
Definition
- A propositional atom (or simply atom1) is a variable of the form p1,p2,p3,... (or alternatively p,q,r,...)
- A propositional formula (or simply formula2) is defined by the following recursive process:
1. Every atom is a formula
2. If F is a formula then ¬F is a formula.
3. If F, G are formulas then (F ∨ G) as well as (F ∧ G) are formulas.
1Also called: propositional variable, atomic proposition, atomic formula.
2Also called: Boolean formula.
11/38
Syntax
Example
Which of the following are formulas?
1. ¬p
2. (¬p)
3. ¬¬p
4. ¬p ∨ q
5. (¬p ∨ q)
6. ¬(p ∨ q)
7. (r ∧ ¬(p ∨ q))
12/38
Syntax
Why do we need this mathematical definition of formula?
1. It is unambiguous.
- If we disagree on whether something is a formula, we just consult the definition
2. It allows one to design algorithms that process formulas using recursion as follows:
- The base case is rule 1 of the definition
- The recursive cases are rules 2 and 3 of the definition
3. One can prove things about propositional formulas.
13/38
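Point 2 above can be illustrated directly. The sketch below uses an encoding of our own devising (atoms as Python strings, compound formulas as tagged tuples; this is not the course's notation) and processes a formula by recursion on the three syntax rules:

```python
# Encoding (an illustrative choice):
#   atom: "p"; negation: ("not", F); conjunction/disjunction: ("and"/"or", F, G)
def render(f):
    if isinstance(f, str):                   # base case: rule 1 (atoms)
        return f
    if f[0] == "not":                        # recursive case: rule 2
        return "¬" + render(f[1])
    op = " ∧ " if f[0] == "and" else " ∨ "   # recursive case: rule 3
    return "(" + render(f[1]) + op + render(f[2]) + ")"

assert render(("and", "r", ("not", ("or", "p", "q")))) == "(r ∧ ¬(p ∨ q))"
```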
Syntax
A formula which occurs in another formula is called a subformula.
Exercise
Give a recursive process that defines the set SubForms(G) of all subformulas of G.
14/38
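One possible answer, sketched in Python with formulas encoded as tagged tuples ("not", F), ("and", F, G), ("or", F, G) and atoms as strings (an encoding of our own choosing):

```python
def subforms(f):
    # SubForms(G) = {G} together with the subformulas of G's immediate parts
    if isinstance(f, str):                 # an atom's only subformula is itself
        return {f}
    if f[0] == "not":
        return {f} | subforms(f[1])
    return {f} | subforms(f[1]) | subforms(f[2])

# For example, ¬((p ∧ q) ∨ ¬r) has 7 subformulas.
f = ("not", ("or", ("and", "p", "q"), ("not", "r")))
assert len(subforms(f)) == 7
```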
Syntax
Abbreviations
1. (F1 → F2) instead of (¬F1 ∨ F2)
The symbol → is called “implication”. (F1 → F2) is read “F1 implies F2”.
2. (F1 ↔ F2) instead of ((F1 → F2) ∧ (F2 → F1))
The symbol ↔ is called “bi-implication”. (F1 ↔ F2) is read “F1 if and only if F2”.
3. ⊥ instead of (F ∧ ¬F)
The symbol ⊥ is called “bottom” or “falsum”.
4. ⊤ instead of (F ∨ ¬F)
The symbol ⊤ is called “top” or “verum”.
5. (⋁_{i=1}^{n} Fi) instead of (. . . ((F1 ∨ F2) ∨ F3) · · · ∨ Fn)
6. (⋀_{i=1}^{n} Fi) instead of (. . . ((F1 ∧ F2) ∧ F3) · · · ∧ Fn)
15/38
Homework
Example
Consider the formula ¬((p ∧ q) ∨ ¬r).
- How do you apply the rules to build this formula? Give the sequence of rules you use to build it.
- This formula has 7 subformulas. What are they?
16/38
Semantics refers to the meaning of expressions, or how they are evaluated
17/38
Semantics of Propositional Logic = Assignments + Truth-values of Formulas
18/38
Semantics: truth values
Recall: Semantics refers to how you derive the value of a formula based on the values of its atomic subformulas.
- The elements of the set {0, 1} are called truth values
- We read 1 as “true”, and 0 as “false”
- After we assign truth values to atoms . . . (e.g., p = 0, q = 1)
- we can compute the truth values of formulas. (e.g., the truth value of (p ∨ q) is 1).
Here are the rules for doing this computation . . .
19/38
Semantics: truth tables
The formula ¬F is true if and only if the formula F is false.
F | ¬F
0 | 1
1 | 0
20/38
Semantics: truth tables
The formula (F ∧ G) is true if and only if both the formulas F and G are true.
F | G | (F ∧ G)
0 | 0 | 0
0 | 1 | 0
1 | 0 | 0
1 | 1 | 1
20/38
Semantics: truth tables
The formula (F ∨ G) is true if and only if either F or G or both are true.
F | G | (F ∨ G)
0 | 0 | 0
0 | 1 | 1
1 | 0 | 1
1 | 1 | 1
20/38
Semantics
Example
Evaluate the formula F = ¬((p ∧ q) ∨ r) under the assignment p = 1, q = 1, r = 0.
p | q | r | ¬((p ∧ q) ∨ r)
1 | 1 | 0 |
21/38
Semantics
Evaluate the formula F = ¬((p ∧ q) ∨ r) under all assignments of the atoms p, q, r.
p | q | r | ¬((p ∧ q) ∨ r)
0 | 0 | 0 |
0 | 0 | 1 |
0 | 1 | 0 |
0 | 1 | 1 |
1 | 0 | 0 |
1 | 0 | 1 |
1 | 1 | 0 |
1 | 1 | 1 |
22/38
Semantics
We now formalise these truth tables with the following definition of the semantics.
Definition
- An assignment is a function α assigning truth values to atoms. So α(p) is the truth value of p under assignment α.
- The truth value tv(F, α) of a formula F under the assignment α is given by the following recursive procedure:
1. The truth value of an atom p is just α(p).
2. The truth value of a formula of the form ¬F is 1 if the truth value of F is 0, and 0 otherwise.
3. The truth value of a formula of the form (F ∧ G) is 1 if the truth values of both F and G are 1, and 0 otherwise.
4. The truth value of a formula of the form (F ∨ G) is 1 if the truth values of either F or G or both are 1, and 0 otherwise.
23/38
Semantics
We now formalise these truth tables with the following definition of the semantics.
Definition
- An assignment is a function α assigning truth values to atoms. So α(p) is the truth value of p under assignment α.
- The truth value tv(F, α) of a formula F under the assignment α is given by the following recursive procedure:
1. tv(p, α) = α(p) for every atom p
2. tv(¬F, α) = 1 if tv(F, α) = 0, and 0 if tv(F, α) = 1
3. tv((F ∧ G), α) = 1 if tv(F, α) = 1 and tv(G, α) = 1, and 0 otherwise
4. tv((F ∨ G), α) = 1 if tv(F, α) = 1 or tv(G, α) = 1, and 0 otherwise
23/38
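The recursive definition of tv translates almost line-for-line into code. A Python sketch (the formula encoding, atoms as strings and compound formulas as tagged tuples, is our own illustrative choice):

```python
def tv(f, alpha):
    # alpha is an assignment: a dict from atom names to truth values 0/1
    if isinstance(f, str):
        return alpha[f]                                          # rule 1
    if f[0] == "not":
        return 1 if tv(f[1], alpha) == 0 else 0                  # rule 2
    if f[0] == "and":
        return 1 if tv(f[1], alpha) == 1 and tv(f[2], alpha) == 1 else 0  # rule 3
    return 1 if tv(f[1], alpha) == 1 or tv(f[2], alpha) == 1 else 0       # rule 4

# ¬((p ∧ q) ∨ r) under p = 1, q = 1, r = 0 evaluates to 0
f = ("not", ("or", ("and", "p", "q"), "r"))
assert tv(f, {"p": 1, "q": 1, "r": 0}) == 0
```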
Semantics
Example
Suppose α(p1) = 1 and α(p2) = 0. Compute the truth value of (¬p1 ∨ p2) under α.
24/38
Semantics
Why do we need this precise definition of semantics?
1. It is unambiguous.
- If we disagree on the truth value of F under α, we just consult the definition.
2. This definition is implemented by the runtime environment of Python and Java to compute the values of Boolean expressions.
3. One can prove things about formulas.
25/38
Semantics
Terminology
- In case the truth value of F under α is equal to 1, we say that α makes F true or that α satisfies F .
- The standard notation in logic for this is different: we say that α models F, written: α |= F.
The symbol |= is called the “double-turnstile”.
- A formula F is satisfiable if at least one assignment satisfies F. Otherwise F is unsatisfiable.
26/38
Exercise
Convince yourself that the following is an equivalent (and more succinct) way to define the truth value of formulas under an assignment α:
1. tv(p, α) = α(p) for atoms p
2. tv(¬F, α) = 1 − tv(F, α).
3. tv((F ∧ G), α) = min{tv(F, α), tv(G, α)}.
4. tv((F ∨ G), α) = max{tv(F, α), tv(G, α)}.
27/38
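The succinct min/max formulation also runs as-is. A sketch over the same kind of tuple encoding (our own choice, for illustration):

```python
def tv(f, alpha):
    if isinstance(f, str):
        return alpha[f]                        # 1. atoms
    if f[0] == "not":
        return 1 - tv(f[1], alpha)             # 2. negation
    vals = [tv(g, alpha) for g in f[1:]]
    return min(vals) if f[0] == "and" else max(vals)   # 3. and 4.

# (¬p1 ∨ p2) under α(p1) = 1, α(p2) = 0 evaluates to 0
assert tv(("or", ("not", "p1"), "p2"), {"p1": 1, "p2": 0}) == 0
```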
Semantics
Example
Suppose α(p1) = 1 and α(p2) = 0. Compute the truth value of (¬p1 ∨ p2) under α.
tv((¬p1 ∨ p2), α) = max{tv(¬p1, α), tv(p2, α)}   (Rule 4)
                  = max{1 − tv(p1, α), tv(p2, α)}   (Rule 2)
                  = max{1 − α(p1), α(p2)}   (Rule 1)
                  = max{1 − 1, 0} = 0
28/38
Semantics: implication
F | G | (F → G)
0 | 0 | 1
0 | 1 | 1
1 | 0 | 0
1 | 1 | 1
29/38
Semantics: implication
- We now discuss why (F →G) is defined as (¬F ∨ G).
- When is the formula (F → G) true?
- It is true whenever, if F is true, then G is also true.
- So, if F is not true, then it doesn’t matter whether G is true or not, the formula (F → G) will still be true.
- Note. Implication is not the same as the programming construct “If condition is true then do instruction”. And it does not mean that F causes G to be true.
F | G | (F → G)
0 | 0 | 1
0 | 1 | 1
1 | 0 | 0
1 | 1 | 1
30/38
Semantics: implication
Let’s write a function that returns whether or not you are ready to go outside. The constraint is that if it is raining, then, to be ready, you should be carrying_umbrella.
def ready(raining, carrying_umbrella):
    return (not raining or carrying_umbrella)
If Python had a symbol for implication, say →, then we could have written:
def ready(raining, carrying_umbrella):
    return (raining → carrying_umbrella)
31/38
Connection with natural language
What is the connection between propositional logic and natural language? Although logic can be used to model propositions of natural language, there are important differences.
- In logic, semantics is determined by the assignment and the syntactic structure of the formula.
- In natural language, the semantics also depends on the context.
32/38
Connection with natural language
Disjunction
- Inclusive or: (F ∨ G)
- Exclusive or: ((F ∨ G) ∧ ¬(F ∧ G))
33/38
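The two readings of “or” can be compared directly in code. Exclusive or, built from ∨, ∧ and ¬ as above, agrees with inequality of truth values (the function name is ours):

```python
def xor(f, g):
    # ((F ∨ G) ∧ ¬(F ∧ G)): true when exactly one of F, G is true
    return (f or g) and not (f and g)

# Check against inequality of truth values on all four assignments.
for f in (False, True):
    for g in (False, True):
        assert xor(f, g) == (f != g)
```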
Connection with natural language
Disjunction
- Inclusive or: (F ∨ G)
- Exclusive or: ((F ∨ G) ∧ ¬(F ∧ G))
Examples
- One can reach the airport by taxi or bus (Inclusive)
- You can choose to save your money or your life (Exclusive)
- The error is in the program or the sensor data (Exclusive?)
- The program or the sensor data are erroneous (Inclusive?)
33/38
Connection with natural language
Implication/equivalence
“Eating fast-food is equivalent to aiding the destruction of the world’s rainforests”
- This looks like an equivalence,
- but is more likely an implication (it’s unlikely that the speaker is claiming that destroying rainforests results in eating fast food).
34/38
For you to think about
How does the following important idea apply to syntax and semantics of propositional logic?
In Computer Science we start with the simplest possible systems, and sets of rules that we haven’t necessarily confirmed by experiment, but which we just suppose are true, and then ask what sort of complex systems we can and cannot build. . . it is a mathematical set of tools, or body of ideas, for understanding just about any system—brain, universe, living organism, or, yes, computer.
Scott Aaronson — theoretical computer scientist
35/38
References
- Chapter 1 of Schöning covers syntax and semantics of propositional logic
- We use slightly more readable notation:
- We use p,q,r,... for atoms instead of A,B,C,...
- We use α for assignments instead of A
- We use tv(−) for truth values of a formula instead of A
36/38
Coda: algorithmic issue
Input: A propositional formula F
Output: Yes if F is satisfiable, and No otherwise
- Can one check every assignment? There are infinitely many.
- Fact. Assignments that agree on the atoms occurring in F also agree on F.
- So we only need to check finitely many assignments.
- If F contains atoms p1, · · · , pn, there are exactly 2^n different assignments. Why?
- Test all of them using truth-tables.
37/38
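The truth-table test sketched above fits in a few lines of Python (the formula encoding and function names are our own illustration):

```python
from itertools import product

def tv(f, alpha):
    # evaluator over tagged tuples: ("not", F), ("and", F, G), ("or", F, G)
    if isinstance(f, str):
        return alpha[f]
    if f[0] == "not":
        return 1 - tv(f[1], alpha)
    vals = [tv(g, alpha) for g in f[1:]]
    return min(vals) if f[0] == "and" else max(vals)

def satisfiable(f, atoms):
    # try all 2**n assignments over the atoms occurring in f
    return any(tv(f, dict(zip(atoms, vals)))
               for vals in product([0, 1], repeat=len(atoms)))

assert satisfiable(("or", "p", ("not", "p")), ["p"])
assert not satisfiable(("and", "p", ("not", "p")), ["p"])
```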
Coda: algorithmic issue
Question. How efficient is this test for satisfiability?
- Suppose F has n atoms.
- Then the truth-table has 2^n rows — exponential growth.
- Suppose you can generate a table at the rate of one row per second.
- If n = 80, you will need 2^80 seconds to write out the table.
- This is about 2.8 million times the age of the universe.
Question. Is there a substantially faster method?
- Probably not.
- This is related to the P vs NP problem. See COMP3027: Algorithm Design.
38/38
COMP2022|2922
Models of Computation
Lesson 2a: Propositional Logic Equivalences and Consequences
Presented by
Sasha Rubin
School of Computer Science
Formulas that “mean the same thing” are called equivalent. We now study common equivalences, also called laws.
1/16
Equivalences
Recall that although the formulas (A ∧ B) and (B ∧ A) are syntactically different formulas, they mean the same thing.
Let’s make this idea precise.
Definition
Two formulas F and G are (logically) equivalent if they are assigned the same truth value under every assignment. This is written F ≡ G.
2/16
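The definition suggests an immediate (exponential) procedure: compare truth values under every assignment. A Python sketch, with our own tuple encoding of formulas (atoms as strings, compound formulas as tagged tuples):

```python
from itertools import product

def tv(f, alpha):
    # evaluator over tagged tuples: ("not", F), ("and", F, G), ("or", F, G)
    if isinstance(f, str):
        return alpha[f]
    if f[0] == "not":
        return 1 - tv(f[1], alpha)
    vals = [tv(g, alpha) for g in f[1:]]
    return min(vals) if f[0] == "and" else max(vals)

def equivalent(f, g, atoms):
    # F ≡ G iff F and G get the same truth value under every assignment
    return all(tv(f, dict(zip(atoms, vals))) == tv(g, dict(zip(atoms, vals)))
               for vals in product([0, 1], repeat=len(atoms)))

assert equivalent(("and", "p", "q"), ("and", "q", "p"), ["p", "q"])
assert not equivalent("p", "q", ["p", "q"])
```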
Equivalences
Are the following pairs of formulas equivalent?
1. p and q?
2. (p → q) and (q → p)?
3. (p → q) and (¬q → ¬p)?
4. (p ∨ ¬p) and (q ∨ ¬q)?
3/16
Some Equivalences
For all formulas F, G, H:
(Idempotent Laws) F ≡ (F ∧ F) and F ≡ (F ∨ F)
(Commutative Laws) (F ∧ G) ≡ (G ∧ F) and (F ∨ G) ≡ (G ∨ F)
(Associative Laws) (F ∧ (G ∧ H)) ≡ ((F ∧ G) ∧ H) and (F ∨ (G ∨ H)) ≡ ((F ∨ G) ∨ H)
(Distributive Laws) (F ∧ (G ∨ H)) ≡ ((F ∧ G) ∨ (F ∧ H)) and (F ∨ (G ∧ H)) ≡ ((F ∨ G) ∧ (F ∨ H))
(de Morgan’s Laws) ¬(F ∧ G) ≡ (¬F ∨ ¬G) and ¬(F ∨ G) ≡ (¬F ∧ ¬G)
(Double Negation Law) ¬¬F ≡ F
(Validity Law) (⊤ ∨ F) ≡ ⊤ and (⊤ ∧ F) ≡ F
(Unsatisfiability Law) (⊥ ∨ F) ≡ F and (⊥ ∧ F) ≡ ⊥
4/16
Equivalences
Which equivalence can be used to justify that the following two statements, logically speaking, say the same thing?
1. There are fires outside Sydney and the air in Sydney is polluted.
2. The air in Sydney is polluted and there are fires outside Sydney.
A = “There are fires outside Sydney”
B = “The air in Sydney is polluted”
By the Law of Commutativity of ∧, we know that (A ∧ B) is equivalent to (B ∧ A).
5/16
Equivalences
There are a number of ways to verify an equivalence.
1. Use truth tables.
2. Deduce it from other equivalences (using the fact that ≡ is transitive).
6/16
Equivalences
Verify the following equivalence:
(p → q) ≡ (¬q → ¬p)
. . . using truth tables
7/16
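The same truth-table check can be run mechanically; in the sketch below → is expanded into its definition (¬F ∨ G), with the helper name implies being our own:

```python
from itertools import product

def implies(a, b):
    return (not a) or b       # (F → G) abbreviates (¬F ∨ G)

# (p → q) and (¬q → ¬p) agree on all four assignments.
for p, q in product([False, True], repeat=2):
    assert implies(p, q) == implies(not q, not p)
```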
Equivalences
Verify the following equivalence:
(p → q) ≡ (¬q → ¬p)
... using deduction from other equivalences
(p → q) ≡ (¬p ∨ q)   (Definition of →)
        ≡ (q ∨ ¬p)   (Commutative Law for ∨)
        ≡ (¬¬q ∨ ¬p)   (Double Negation Law)
        ≡ (¬q → ¬p)   (Definition of →)
Aside. What justifies the use of Double Negation inside the formula?
- Fact. If F is a subformula of H, and F ≡ G, then H is equivalent to formulas that result by substituting an occurrence of F in H by G. This is called the Substitution Rule.
7/16
Equivalences: applications (1)
- Equivalences can be used to rewrite a formula into an equivalent one having a special structure, called a normal form.
- You will see normal forms in the tutorials/assignments.
8/16
Equivalences: applications (2)
- We defined the syntax of propositional formulas to only use the connectives ∧, ∨ and ¬.
- Other connectives like → were defined in terms of these ones.
– Could we have made a different choice of connectives when defining the syntax? . . . Yes!
Fact. Every propositional formula is equivalent to a formula which only contains the operators ¬ and ∧.
- e.g., (¬p ∨ q) is equivalent to ¬(p ∧ ¬q).
Fact. Every propositional formula is equivalent to a formula which only contains the operators ¬ and →.
- e.g., (p ∧ q) is equivalent to ¬(p → ¬q).
9/16
Logic allows us to formalise what it means for an argument (“if these facts are true, then that fact must be true”) to be correct. This is done with an idea called logical consequence.
10/16
Logical consequence
Definition
Say that F is a logical consequence of {E1, · · · , Ek} if every assignment that makes all of the formulas Ei (for 1 ≤ i ≤ k) true also makes F true. We write this as
{E1, · · · , Ek} |= F
11/16
Logical consequence
Example
Use truth tables to determine whether {p, (p → q)} |= q.
12/16
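The check on this slide can be automated: enumerate all assignments and look for one that makes every premise true and the conclusion false. A sketch with our own tuple encoding of formulas:

```python
from itertools import product

def tv(f, alpha):
    # evaluator over tagged tuples: ("not", F), ("and", F, G), ("or", F, G)
    if isinstance(f, str):
        return alpha[f]
    if f[0] == "not":
        return 1 - tv(f[1], alpha)
    vals = [tv(g, alpha) for g in f[1:]]
    return min(vals) if f[0] == "and" else max(vals)

def consequence(premises, conclusion, atoms):
    # {E1, ..., Ek} |= F: every assignment making all Ei true makes F true
    for vals in product([0, 1], repeat=len(atoms)):
        alpha = dict(zip(atoms, vals))
        if all(tv(e, alpha) for e in premises) and not tv(conclusion, alpha):
            return False
    return True

# {p, (p → q)} |= q, writing (p → q) via its definition (¬p ∨ q)
assert consequence(["p", ("or", ("not", "p"), "q")], "q", ["p", "q"])
# but (q → p) is not a consequence of (p → q)
assert not consequence([("or", ("not", "p"), "q")],
                       ("or", ("not", "q"), "p"), ["p", "q"])
```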
Logical consequence
Example
Use truth tables to determine whether { ( p → r ) , ( q → r ) , ( p ∨ q ) } |= r .
13/16
Logical consequence
Example
Use truth tables to determine whether {(p → q)} |= (q → p).
14/16
Logical consequence: Application
- Logic is used to study correct argumentation and reasoning.
- This is useful for coding, reasoning about the world, mathematics, etc.
Example
Why is the following argument correct?
1. If x = 5 then z = 4
2. x = 5
3. Conclude z = 4
The reason is that it is a particular instance of the fact that
{(p → q), p} |= q
15/16
Logical consequence: Application
Example
Why is the following argument correct?
1. If x = 5 then z = 4
2. If x < 1 then z = 4
3. Either x = 5 or x < 1
4. Conclude z = 4
The reason is that it is a particular instance of the fact that
{(p → r), (q → r), (p ∨ q)} |= r
16/16
COMP2022|2922
Models of Computation
Lesson 2b: Deduction in Propositional Logic
Presented by
Sasha Rubin
School of Computer Science
Deduction is a syntactic mechanism for deriving validities as well as logical consequences from assumptions
1/27
Deduction: motivation
– The most famous deductive system is surely found in Euclid’s Elements for deducing facts in elementary geometry and number theory.
– Euclid made some assumptions E1, · · · , Ek about points and lines ...
– . . . and produced formal proofs that established logical consequences F such as
1. “the angles in a triangle sum to 180 degrees”
2. “there are infinitely many prime numbers”.
2/27
Deduction: Formal proofs
A deductive system gives you rules for constructing formal proofs.
Formal proofs:
1. A highly disciplined way of reasoning (good for computers)
2. A sequence of formulas where each step is a deduction based on earlier steps
3. Based entirely on rewriting formulas – no semantics.
3/27
Deduction: Formal proofs
– A formal proof usually starts with some assumptions.
– Each step creates another formula which is justified by applying an inference rule to formulas from previous steps.
– If we manage to prove a formula F from assumptions E1, · · · , Ek, then we will write
{E1, · · · , Ek} ⊢ F
which is read E1, · · · , Ek proves F.1
1The symbol ⊢ is called “turnstile”.
4/27
|= is a semantic notion ⊢ is a syntactic notion
5/27
Natural deduction
We use a deductive system called Natural Deduction (ND).2
1. Every connective has two types of inference rules:
– Introduction rules introduce the connective
– Elimination rules remove the connective
2. There is also a rule to introduce assumptions.
3. Some of the connective rules cancel assumptions.
4. In this system, → and ⊥ are not abbreviations.
5. If we collect all the assumptions {E1, · · · , Ek} that have not been cancelled at the end of the derivation, and if F is the last formula in the proof, then we write
{E1, · · · , Ek} ⊢ F
2Developed by Gerhard Gentzen (20th century German logician), and Stanisław Jaśkowski (20th century Polish logician).
6/27
Natural deduction
Here is one of the rules:
(∧ E)
S ⊢ (A ∧ B)
-----------
S ⊢ A
Let’s analyse it:
1. The name of the rule is (∧ E) which is read “Conjunction Elimination”.
2. S denotes a set of formulas, and A, B denote formulas.
3. We use this rule as follows: if we have proven (A ∧ B) using the formulas in S, then we can apply this rule to get a proof of A that uses the formulas in S.
4. This rule formalises the following type of reasoning: if we have a proof of (A∧B), then we have a proof of A.
7/27
Natural deduction
Let’s look at another rule:
(∧ I)
S1 ⊢ A    S2 ⊢ B
-----------------
S1 ∪ S2 ⊢ (A ∧ B)
Let’s analyse it:
1. The name of the rule is (∧ I) which is read “Conjunction Introduction”.
2. S1 and S2 denote sets of formulas, and A, B denote formulas.
3. We use this rule as follows: if we have proven A using the formulas in S1, and we have proven B using the formulas in S2, then we can apply this rule to get a proof of (A ∧ B) that uses the formulas in S1 ∪ S2.
4. This rule formalises the following type of reasoning: if we have a proof of A and we have a proof of B, then we have a proof of (A ∧ B).
8/27
Natural deduction: formal proofs
Here is how we will write a formal proof in ND.
1. Write the initial assumptions (sometimes called premises).
2. Devise a sequence of formulas, in which the last one is the desired conclusion.
3. Each step in the sequence must be:
– line numbered so that we can refer to it later in the proof,
– annotated with the line numbers of the assumptions which that step depends on,
– justified by the name of an inference rule,
– annotated with the line numbers referenced by the justification.
Line | Assumptions | Formula | Justification | References
9/27
Natural deduction: formal proofs
Each line of the proof means the following:
If we know the assumptions hold, then we conclude the formula holds, because it can be justified by applying the rule to the referenced lines above it.
Line | Assumptions | Formula | Justification | References
...  | ...         | ...     | ...           | ...
8    | 1,2,4       | ¬C      | (∧ E)         | 4
...  | ...         | ...     | ...           | ...
10/27
Natural deduction: Rules involving ∧
(∧ E)
S ⊢ (A ∧ B)
-----------
S ⊢ A

(∧ E)
S ⊢ (A ∧ B)
-----------
S ⊢ B

(∧ I)
S1 ⊢ A    S2 ⊢ B
-----------------
S1 ∪ S2 ⊢ (A ∧ B)
– The ∧E rules formalise the following reasoning: if we have a proof of (A∧B), then we have a proof of A and we have a proof of B.
– The ∧I rule formalises the following reasoning: if we have a proof of A and we have a proof of B, then we have a proof of (A∧B).
11/27
Natural deduction: Proof involving ∧
{(F ∧ G)} ⊢ (G ∧ F)
Line | Assumptions | Formula | Justification | References
1    | 1           | (F ∧ G) | Asmp. I       |
2    | 1           | F       | ∧E            | 1
3    | 1           | G       | ∧E            | 1
4    | 1           | (G ∧ F) | ∧I            | 2,3
12/27
Natural deduction: Introducing an assumption
(Asmp. I)
{F} ⊢ F
– This rule allows one to introduce a formula F as an assumption, without reference to earlier lines.
– Its only assumption is the formula itself.
13/27
Natural deduction: rules involving →
(→ E)
S1 ⊢ A    S2 ⊢ (A → B)
----------------------
S1 ∪ S2 ⊢ B

(→ I)
S ∪ {A} ⊢ B
-----------
S ⊢ (A → B)
– The (→ E) rule formalises Modus Ponens.
– The (→ I) rule formalises that if we have a proof of B from A, then we have a proof of “if A then B”.
14/27
Natural deduction: proofs involving →
The following consequence is called Hypothetical Syllogism (HS):
{(A → B), (B → C)} ⊢ (A → C)
Line | Assumptions | Formula | Justification | References
1    | 1           | (A → B) | Asmp. I       |
2    | 2           | (B → C) | Asmp. I       |
3    | 3           | A       | Asmp. I       |
4    | 1,3         | B       | →E            | 1,3
5    | 1,2,3       | C       | →E            | 2,4
6    | 1,2         | (A → C) | →I            | 3,5
15/27
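Since natural deduction is sound (a fact stated at the end of this lesson), the sequent proved above must also hold semantically. A quick exhaustive check of the Hypothetical Syllogism consequence, with → read via its definition (¬F ∨ G):

```python
from itertools import product

def implies(a, b):
    return (not a) or b   # semantic reading of →

# Every assignment satisfying (A → B) and (B → C) also satisfies (A → C).
for a, b, c in product([False, True], repeat=3):
    if implies(a, b) and implies(b, c):
        assert implies(a, c)
```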
Natural deduction: rules involving ∨
(∨ I)
S ⊢ A
-----------
S ⊢ (A ∨ B)

(∨ I)
S ⊢ A
-----------
S ⊢ (B ∨ A)

(∨ E)
S1 ⊢ (A ∨ B)    S2 ∪ {A} ⊢ C    S3 ∪ {B} ⊢ C
---------------------------------------------
S1 ∪ S2 ∪ S3 ⊢ C
– The ∨I rule formalises the following reasoning: if we have a proof of A, then we have a proof of (A ∨ B), and similarly we have a proof of (B ∨ A), no matter what B is.
– The ∨E rule formalises “reasoning by cases”: if we have a proof of (A∨B), a proof of C from A, and a proof of C from B, then we have a proof of C.
16/27
Natural deduction: proof involving ∨
{(A ∨ B), (A → C), (B → C)} ⊢ C
Line | Assumptions | Formula | Justification | References
1    | 1           | (A ∨ B) | Asmp. I       |
2    | 2           | (A → C) | Asmp. I       |
3    | 3           | (B → C) | Asmp. I       |
4    | 4           | A       | Asmp. I       |
5    | 2,4         | C       | →E            | 2,4
6    | 6           | B       | Asmp. I       |
7    | 3,6         | C       | →E            | 3,6
8    | 1,2,3       | C       | ∨E            | 1,4,5,6,7
In line 8, we use (∨ E) with S1 = {(A ∨ B)}, S2 = {(A → C)}, S3 = {(B → C)}.
17/27
Natural deduction: proofs
The following consequence is called Constructive Dilemma (CD):
{((A → B) ∧ (C → D)), (A ∨ C)} ⊢ (B ∨ D)
Line | Assumptions | Formula             | Justification | References
1    | 1           | ((A → B) ∧ (C → D)) | Asmp. I       |
2    | 2           | (A ∨ C)             | Asmp. I       |
3    | 1           | (A → B)             | ∧E            | 1
4    | 1           | (C → D)             | ∧E            | 1
5    | 5           | A                   | Asmp. I       |
6    | 1,5         | B                   | →E            | 3,5
7    | 1,5         | (B ∨ D)             | ∨I            | 6
8    | 8           | C                   | Asmp. I       |
9    | 1,8         | D                   | →E            | 4,8
10   | 1,8         | (B ∨ D)             | ∨I            | 9
11   | 1,2         | (B ∨ D)             | ∨E            | 2,5,7,8,10
18/27
Natural deduction: rules involving ¬
(¬ E)
S1 ⊢ A    S2 ⊢ ¬A
-----------------
S1 ∪ S2 ⊢ ⊥

(¬ I)
S ∪ {A} ⊢ ⊥
-----------
S ⊢ ¬A
– The (¬ E) rule formalises that ⊥ follows from any contradiction.
– The (¬ I) rule formalises that if we have a proof of ⊥ from A, then we have a proof of ¬A.
19/27
Natural deduction: proofs
{F} ⊢ ¬¬F
Line | Assumptions | Formula | Justification | References
1    | 1           | F       | Asmp. I       |
2    | 2           | ¬F      | Asmp. I       |
3    | 1,2         | ⊥       | ¬E            | 1,2
4    | 1           | ¬¬F     | ¬I            | 2,3

To show
∅ ⊢ (F → ¬¬F)
we can continue the proof above with one more line:
5    |             | (F → ¬¬F) | →I          | 1,4
20/27
Natural deduction: proofs
The following consequence is called Modus Tollens:
{(F → G), ¬G} ⊢ ¬F
Line | Assumptions | Formula | Justification | References
1    | 1           | (F → G) | Asmp. I       |
2    | 2           | ¬G      | Asmp. I       |
3    | 3           | F       | Asmp. I       |
4    | 1,3         | G       | →E            | 1,3
5    | 1,2,3       | ⊥       | ¬E            | 2,4
6    | 1,2         | ¬F      | ¬I            | 3,5
21/27
Natural deduction: another rule
(⊥)
S ⊢ ⊥
-----
S ⊢ A
– The (⊥) rule formalises that from a false assumption, anything can be derived.
22/27
Natural deduction: proofs
{(F ∨ G), ¬G} ⊢ F
Line | Assumptions | Formula | Justification | References
1    | 1           | (F ∨ G) | Asmp. I       |
2    | 2           | ¬G      | Asmp. I       |
3    | 3           | G       | Asmp. I       |
4    | 2,3         | ⊥       | ¬E            | 2,3
5    | 2,3         | F       | ⊥             | 4
6    | 6           | F       | Asmp. I       |
7    | 1,2         | F       | ∨E            | 1,3,5,6,6
We apply (∨ E) using S1 = {(F ∨ G)}, S2 = ∅, S3 = {¬G}.
23/27
Natural deduction: last rule
(RA)
S ∪ {¬A} ⊢ ⊥
------------
S ⊢ A
– The (RA) rule is called Reductio ad absurdum or Reduction to the absurd. It formalises the method of proof by contradiction: if the assumption that A is false (i.e., that ¬A is true) leads to a contradiction, then A must be true.
24/27
Natural deduction: proofs
{¬¬F} ⊢ F
Line | Assumptions | Formula | Justification | References
1    | 1           | ¬¬F     | Asmp. I       |
2    | 2           | ¬F      | Asmp. I       |
3    | 1,2         | ⊥       | ¬E            | 1,2
4    | 1           | F       | RA            | 2,3
We apply (RA) using S = {¬¬F}.
25/27
Natural Deduction: wrap-up
We should definitely ask two questions about Natural Deduction:
1. Can it prove only logical consequences? Such a system is called sound.
2. Can it prove all logical consequences? Such a system is called complete.
Roughly speaking:
1. A deductive system that is not sound might give us wrong results.
– E.g., it might have a proof of (B→A) from (A→B).
2. A deductive system that is not complete might not give us all the right results.
– E.g., it might not have a proof of ¬¬A from A.
Theorem
Natural deduction for propositional logic is sound and complete.
26/27
For you to think about
How does the following idea apply to Natural Deduction for propositional logic?
In Computer Science we start with the simplest possible systems, and sets of rules that we haven’t necessarily confirmed by experiment, but which we just suppose are true, and then ask what sort of complex systems we can and cannot build. . . it is a mathematical set of tools, or body of ideas, for understanding just about any system—brain, universe, living organism, or, yes, computer.
Scott Aaronson — theoretical computer scientist
27/27
COMP2022|2922 Models of Computation
Lesson 3: Predicate Logic
Presented by
Sasha Rubin
School of Computer Science
The world has many objects, some of which are related to each other. These can be modeled by predicate logic.
1/42
In predicate logic we will be able to represent formulas like the following:1
∀x∀y (x + y = y + x)
This is not a proposition since whether or not it is true depends on missing information: the type of the variables x, y and the meaning of the symbol +.
– If x and y are integer variables and + means ordinary addition, then the formula is true.
– If x and y are string variables and + means concatenation, then the formula is false.
"hello" + "world" != "world" + "hello"
1The symbol ∀ is read “for all”.
2/42
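The contrast on this slide can be checked directly in Python, where + happens to mean addition on ints and concatenation on strings:

```python
# Integer + is commutative ...
assert 2 + 3 == 3 + 2
# ... but string + (concatenation) is not:
assert "hello" + "world" != "world" + "hello"
assert "hello" + "world" == "helloworld"
```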
So, in order to give meaning to formulas of predicate logic, we first have to fix the possible values that variables can take, as well as the interpretation of symbols like +.
Structure = Domain + Functions + Predicates + Constants
3/42
Background: domains
When we reason about objects, we have in mind a certain domain of discourse.
1. In programming, the domain may be the integers, or strings, or both, etc.
2. In the world, the domain includes people, animals, etc.
For us, a domain is simply a non-empty set D.
– The set Z of integers is a domain.
– The set S of binary strings is a domain.
– The set H of humans is a domain.
In predicate logic, variables x, y, z, . . . vary over elements of the domain.
4/42
Background: functions
Mappings from several objects in a domain to a single object of the domain are called functions.
1. Here are some functions in the domain Z: add, double.
2. Here are some functions in the domain S: concat, reverse.
3. Here is a function in the domain H: mother_of.
So mother_of(x) is the mother of x.
The number of arguments of a function is called its arity.
By the way, it is traditional in logic to use notation like f(x,y) or even add(x, y) instead of infix notation like x + y.
5/42
Background: predicates
Properties of objects in a domain, as well as relations between objects in a domain, are called predicates (sometimes also called relations).
1. Here are some predicates in the domain Z: is_even, is_odd, is_bigger_than.
2. Here is a predicate in the domain S: is_longer_than.
3. Here is a predicate in the domain H: loves.
loves(x,y) says that x loves y.
6/42
Background: arity
– Predicates (like functions) take some number of arguments, and so we write is_even(x) and is_bigger_than(x,y).
– The number of arguments is called the arity of the predicate.
– is_even takes one argument, so it is called a unary predicate.
– is_bigger_than takes two arguments, so it is called a binary predicate.
7/42
Background: constants
Single elements of the domain are called constants.2
1. Here are some constants in the domain Z: 0, 1, 2, 3 and −3.
2. Here are some constants in the domain S: 0, 1, 10 and 11.
3. Here are some constants in the domain H: Sasha and Paul.
2They may also be seen as functions of arity zero, i.e., functions that take no arguments and always return the same element.
8/42
Background: structures
Definition
A structure A consists of a domain D, functions on D, predicates on D, and constants from D.
Examples
– Z = (Z,add,eq,0,1) where Z is the set of integers, add is integer addition (a binary function) , eq is equality of integers (a binary predicate), and 0, 1 are constants.
– S = (S, add, eq, 0, 1) where S is the set of binary strings, add is string concatenation (a binary function), eq is equality of strings (a binary predicate), and 0, 1 are constants.
– H = (H, loves, mother_of) where loves is a binary predicate and mother_of is a unary function.
9/42
Background: structures
Important notation
– We have conflicting notation: e.g., which function is add? Integer addition or string concatenation?
– To distinguish these, we may superscript with the name of the structure: so addZ is the integer-addition function, and addS is the string-concatenation function.
– So, if we want to be precise, we should write the structures before as:
Z = (Z, addZ, eqZ, 0Z, 1Z)
and
S = (S, addS, eqS, 0S, 1S)
10/42
Background: structures
Important notation
– So if add is not a function, what is it, and how do we use it?
– It is just a name, it is just a symbol.
– Just as different people may have the same name, so too different functions (addZ and addS) can have the same name (add).
– Why do we need the names at all? We can use names in formulas of predicate logic to talk about multiple structures!
– Consider the formula ∀x∀y eq(add(x, y), add(y, x)) that says that addition is commutative.
– This formula makes sense about integers and about strings!
– It happens to be true about the integers but false about strings.
11/42
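The last two bullet points can be spot-checked in Python. This is only a finite test over sample values of our own choosing, so it can refute commutativity but never prove it:

```python
# Spot-check of ∀x∀y eq(add(x, y), add(y, x)) in two "structures".
# Python's + is integer addition on ints and concatenation on strings,
# mirroring addZ and addS. (The sample values below are our own choice.)

def commutes(add, samples):
    """Check add(x, y) == add(y, x) on finitely many sample values."""
    return all(add(x, y) == add(y, x) for x in samples for y in samples)

print(commutes(lambda a, b: a + b, [0, 1, -3, 7]))     # True: integers
print(commutes(lambda a, b: a + b, ["0", "1", "10"]))  # False: strings
```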
Plan
What makes up a logic?
1. Syntax tells us the structure of expressions, or the rules for putting symbols together to form an expression.
2. Semantics refers to the meaning of expressions, or how they are evaluated.
3. Deduction is a syntactic mechanism for deriving new true expressions from existing true expressions.
12/42
Syntax + Semantics: In a nutshell
The Predicate-Logic Formulas for a structure A are built from:
1. symbols for the functions, predicates and constants of A;
called the vocabulary of A,
2. the usual connectives ¬, ∧, ∨,
3. variables x,y,z,..., and
4. the quantifier symbols ∀ (read "for all") and ∃ (read "exists").
13/42
Syntax + Semantics: In a nutshell
What is the vocabulary of the following structures?
1. The vocabulary of Z = (Z,addZ,eqZ,0Z,1Z) consists of
add, eq, 0, 1
2. The vocabulary of S = (S,addS,eqS,0S,1S) consists of
add, eq, 0, 1
3. The vocabulary of H = (H, lovesH, mother_ofH) consists of loves, mother_of.
14/42
Syntax + Semantics: In a nutshell
– Variables x, y, z, . . . vary over elements in the domain D.
– Quantifier symbols ∀, ∃ allow one to talk about their quantity.
– Universal quantifier: ∀ “for every”, “for all”
– Existential quantifier: ∃ “there exists”, “some”
15/42
Syntax + Semantics: In a nutshell
Example
Here are some formulas and their meaning in the structure H = (H, lovesH, mother_ofH).
∀x loves(x, x)                   Every human loves themself
∀x loves(x, mother_of(x))        Everyone loves their mother
∀x loves(mother_of(x), x)        Everyone’s mother loves them
∀x∀y loves(x, y)                 Everyone loves everyone
∃x∀y loves(x, y)                 Someone loves everyone
∀x∃y loves(x, y)                 Everyone loves someone
∃x∃y loves(x, y)                 Someone loves someone
∀x∃y loves(y, mother_of(x))      Everyone’s mother is loved by someone
16/42
Syntax tells us the structure of expressions, or the rules for putting symbols together to form an expression.
17/42
Syntax of Predicate Logic = Symbols + Terms + Formulas
18/42
Syntax: symbols
Suppose structure A has functions fA, gA, hA, . . . , predicates PA,QA,RA,..., and constants aA,bA,cA,....
Then predicate logic formulas for A are built from the following symbols:
1. The vocabulary of A, i.e.,
1.1 Function symbols f, g, h, ...
1.2 Predicate symbols P, Q, R, ...
1.3 Constant symbols a, b, c, ...
2. Connectives ¬, ∧, ∨ (as in Propositional Logic)
3. Variables u, v, w, x, y, z, ...
4. Quantifier symbols ∀, ∃
As we have seen, we may use more suggestive symbols like add (instead of f) and eq (instead of P).
19/42
Syntax: terms
A term refers to an object without naming it.
– f(2)
– father(Alan)              Alan’s father
– round_up(2.5)             The smallest integer ≥ 2.5
Terms may have variables
– f(x)
– father(x)                 x’s father
– round_up(x)               The smallest integer ≥ x
Terms may be composed
– f(g(x,y))
– father(father(Alan))      Alan’s father’s father
– half(round_up(x))         Half of the smallest integer ≥ x
20/42
Syntax: terms
Here is a precise definition.
Definition
Terms are defined by the following recursive process:
1. Every variable is a term.
2. Every constant symbol is a term.
3. If f is a k-ary function symbol, and if t1,··· ,tk are terms, then f(t1,··· ,tk) is a term.
Examples
– So, if f is unary, g is binary, and c is a constant, then the following are terms: x, c, f(x), g(c, f(y)), f(f(y)), g(f(x), g(x, y)).
– In the language of Z, the following are terms: add(x, 1), add(x, add(x, y)), add(0, add(1, add(1, x))).
21/42
Syntax: formulas
We can finally give the syntax of predicate logic.
Definition
– An atomic formula (of predicate logic) has the form P(t1,...,tk) where P is a k-ary predicate symbol and t1,...,tk are terms.
– A formula (of predicate logic) is defined by the following recursive process:
1. Every atomic formula is a formula.
2. If F is a formula then ¬F is a formula.
3. If F, G are formulas then (F ∧ G) and (F ∨ G) are formulas.
4. If x is a variable and F is a formula then ∃xF and ∀xF are formulas.
As for propositional logic, we use the same shorthands, i.e., (F → G), (F ↔ G), (⋀i Fi), (⋁i Fi), ⊤, ⊥.
If F is a formula and F occurs as part of the formula G then F is called a subformula of G.
22/42
Syntax: formulas
Example
If f is binary, P is binary, then the following are formulas:
– P(f(x,y), f(y,x))
– ∀xP(f(x,y),f(y,x))
– ∀x∀y(P(f(x,y),f(y,x))∨¬P(x,x))
23/42
Syntax: free and bound occurrences of variables
– An occurrence of the variable x in the formula F is bound if x occurs within a subformula of F of the form ∃xG or ∀xG. Otherwise it is a free occurrence.
– Programming analogy: bound/local, free/global
– A variable may have both free and bound occurrences in a formula F.
– Let Free(F) be the set of all variables that occur free in F.
– A formula without free variables (i.e., Free(F) = ∅) is called a sentence.
Example
∀x(P(x,y) → ∃yQ(x,y,z))
Here Free(F) = {y, z}: the y in P(x,y) and the z are free, while x and the y in Q(x,y,z) are bound.
24/42
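The recursive structure of formulas makes Free(F) easy to compute. Here is a sketch on tuple-encoded formulas (an encoding of our own devising; for brevity, atoms list bare variable names instead of general terms):

```python
# A sketch of Free(F) on tuple-encoded formulas:
# ('pred', name, *vars), ('not', F), ('and'/'or'/'imp', F, G),
# ('forall'/'exists', x, F).

def free(F):
    if F[0] == 'pred':
        return set(F[2:])                   # every variable in the atom
    if F[0] == 'not':
        return free(F[1])
    if F[0] in ('and', 'or', 'imp'):
        return free(F[1]) | free(F[2])
    if F[0] in ('forall', 'exists'):
        return free(F[2]) - {F[1]}          # the quantifier binds its variable

# The example formula ∀x(P(x,y) → ∃y Q(x,y,z)):
F = ('forall', 'x',
     ('imp', ('pred', 'P', 'x', 'y'),
             ('exists', 'y', ('pred', 'Q', 'x', 'y', 'z'))))
print(free(F))  # the set {'y', 'z'}
```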
Semantics refers to the meaning of expressions, or how they are evaluated
25/42
Semantics of Predicate Logic = Structures + Assignments + Values of Terms + Truth-values of Formulas
26/42
Semantics: assignments
Definition
Let A be a structure with domain D. An assignment is a function α that maps variables x, y, z, . . . to elements of the domain D.
– So an assignment α gives every variable a value in the domain.
– This means that α can also be seen to give every term a value in the domain.
27/42
Semantics: values of terms
Example
Recall the structure Z = (Z, addZ , eqZ , 0Z , 1Z ).
– Suppose assignment α maps x to 1 and y to 2, i.e.,
α(x) = 1, α(y) = 2.
– Then α gives the term add(x, y) the integer value 3.
– And α gives the term add(x, add(x, y)) the integer value 4.
28/42
Semantics: values of terms
Definition
The value of the term t under assignment α in the structure A is defined by the following recursive process:
1. If t is a variable, say x, then its value is just α(x).
2. If t is a constant symbol, say c, then its value is cA.
3. If t is a term of the form f(t1,··· ,tk) and t1,··· ,tk are terms with values d1,··· ,dk, then its value is fA(d1,··· ,dk).
29/42
Semantics: values of terms
We can write this more concisely:
Definition
The value of the term t under assignment α in the structure A, denoted val(t, α, A), is defined by the following recursive process:
1. val(x, α, A) = α(x).
2. val(c, α, A) = cA.
3. val(f(t1, · · · , tk), α, A) = fA(val(t1, α, A), · · · , val(tk, α, A)).
Notation
– If the structure A is clear from context, write val(t, α).
– If also the assignment α is clear from context, write val(t).
30/42
Semantics: values of terms
Example
Recall the structure Z = (Z, addZ , eqZ , 0Z , 1Z ).
– Suppose assignment α maps x to 1 and y to 2.
– What is the value of the term add(x, add(x, y))?
val(add(x, add(x, y)))
  = addZ(val(x), val(add(x, y)))          by rule 3
  = addZ(val(x), addZ(val(x), val(y)))    by rule 3
  = addZ(α(x), addZ(α(x), α(y)))          by rule 1
  = addZ(1, addZ(1, 2)) = 1 + (1 + 2) = 4
31/42
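The recursive definition of val is directly executable. Here is a sketch in Python, with terms encoded as nested tuples (an encoding of our own devising) and the structure given by dictionaries:

```python
# A sketch of val(t, α, A): terms are ('var', name), ('const', name),
# or ('fn', name, t1, ..., tk); the structure supplies consts and funcs.

def val(t, alpha, consts, funcs):
    if t[0] == 'var':
        return alpha[t[1]]                    # rule 1: val(x) = α(x)
    if t[0] == 'const':
        return consts[t[1]]                   # rule 2: val(c) = cA
    # rule 3: evaluate the arguments, then apply the function fA
    return funcs[t[1]](*(val(s, alpha, consts, funcs) for s in t[2:]))

# The worked example: add(x, add(x, y)) in Z with α(x) = 1, α(y) = 2.
Z_funcs = {'add': lambda a, b: a + b}
term = ('fn', 'add', ('var', 'x'), ('fn', 'add', ('var', 'x'), ('var', 'y')))
print(val(term, {'x': 1, 'y': 2}, {'0': 0, '1': 1}, Z_funcs))  # 4
```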
Semantics: truth-values of atomic-formulas
We can now assign truth values to atomic formulas P(t1,··· ,tk). Example
Recall the structure Z = (Z, addZ , eqZ , 0Z , 1Z ). Suppose α(x) = 2, α(y) = 3.
– What is the truth value of the following atomic formula?
eq(add(y, x), add(x, y))
– To answer this, we must ask if
(val(add(y, x)), val(add(x, y))) ∈ eqZ
– But this is just asking if 3 + 2 = 2 + 3, which is true, so the truth value is 1.
32/42
Semantics: truth-values of formulas
Definition
The truth value of a formula F under the assignment α in the structure A, denoted TV(F, α, A) or just TV(F, α), is defined by the following recursive process:
1. For an atomic formula P(t1,...,tk),
TV(P(t1,...,tk), α) = 1 if (val(t1,α), · · · , val(tk,α)) ∈ PA, and 0 otherwise.
33/42
Semantics: truth-values of formulas
Definition
The truth value of a formula F under the assignment α in the structure A, denoted TV(F, α, A) or just TV(F, α), is defined by the following recursive process:
2. TV(¬F, α) = 1 if TV(F, α) = 0, and 0 if TV(F, α) = 1.
3. TV((F ∧ G), α) = 1 if TV(F, α) = 1 and TV(G, α) = 1, and 0 otherwise.
4. TV((F ∨ G), α) = 1 if TV(F, α) = 1 or TV(G, α) = 1, and 0 otherwise.
33/42
Semantics: truth-values of formulas
Definition
The truth value of a formula F under the assignment α in the structure A, denoted TV(F, α, A) or just TV(F, α), is defined by the following recursive process:
5. Define TV(∀xG, α) = 1 if for every element d in the domain of A, TV(G, α[x → d]) = 1, and 0 otherwise.
6. Define TV(∃xG, α) = 1 if there is an element d in the domain of A such that TV(G, α[x → d]) = 1, and 0 otherwise.
α[x → d] is the assignment which is identical to α except that it maps the variable x to d. This can be expressed by the equation:
α[x → d](y) = d if y = x, and α(y) if y ≠ x.
33/42
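On assignments encoded as Python dictionaries (our encoding), α[x → d] is just a copy-and-update:

```python
# α[x → d]: a fresh assignment that agrees with alpha everywhere except at x.

def update(alpha, x, d):
    new = dict(alpha)  # identical to alpha ...
    new[x] = d         # ... except that x now maps to d
    return new

alpha = {'x': 1, 'y': 2}
print(update(alpha, 'x', 9))  # {'x': 9, 'y': 2}
print(alpha)                  # {'x': 1, 'y': 2} -- alpha itself is unchanged
```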
Semantics: truth-values of formulas
We can write the definition more succinctly:
Definition
The truth value of a formula F under the assignment α in the structure A, denoted TV(F, α, A) or just TV(F, α), is defined by the following recursive process:
1. TV(P(t1,··· ,tk),α) = 1 if (val(t1,α),··· ,val(tk,α)) ∈ PA, and 0 otherwise.
2. TV(¬F, α) = 1 − TV(F, α)
3. TV((F ∧ G), α) = min{TV(F, α), TV(G, α)}
4. TV((F ∨ G), α) = max{TV(F, α), TV(G, α)}
5. TV(∀xG, α) = min{TV(G, α[x → d]) : d ∈ D}
6. TV(∃xG, α) = max{TV(G, α[x → d]) : d ∈ D}
34/42
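Over a finite domain, the min/max clauses above can be run directly. A hedged sketch on tuple-encoded formulas (an encoding of our own devising; to keep it short, atoms apply predicate symbols directly to variable names rather than general terms):

```python
# A sketch of TV(F, α) over a FINITE domain D, following the min/max clauses.

def TV(F, alpha, D, preds):
    op = F[0]
    if op == 'pred':                                   # clause 1
        args = tuple(alpha[x] for x in F[2:])
        return 1 if args in preds[F[1]] else 0
    if op == 'not':                                    # clause 2
        return 1 - TV(F[1], alpha, D, preds)
    if op == 'and':                                    # clause 3
        return min(TV(F[1], alpha, D, preds), TV(F[2], alpha, D, preds))
    if op == 'or':                                     # clause 4
        return max(TV(F[1], alpha, D, preds), TV(F[2], alpha, D, preds))
    x, G = F[1], F[2]                                  # clauses 5 and 6:
    values = [TV(G, {**alpha, x: d}, D, preds) for d in D]
    return min(values) if op == 'forall' else max(values)

# A tiny structure: domain {1, 2}, P = {1, 2}, Q = {2}.
D, preds = [1, 2], {'P': {(1,), (2,)}, 'Q': {(2,)}}
F = ('forall', 'x', ('and', ('pred', 'P', 'x'), ('pred', 'Q', 'x')))
G = ('exists', 'x', ('and', ('pred', 'P', 'x'), ('pred', 'Q', 'x')))
print(TV(F, {}, D, preds))  # 0: not every element is in both P and Q
print(TV(G, {}, D, preds))  # 1: element 2 is in both
```

Note the quantifier clauses really are loops over the domain, which is why this only works when D is finite.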
Semantics
Example
Recall the structure Z = (Z,addZ,eqZ,0Z,1Z). Let α be any assignment. Compute TV(∃xG, α) where G is the formula eq(x, add(x, x)).
You can think of this as a loop.
35/42
Semantics
Example
Recall the structure Z = (Z,addZ,eqZ,0Z,1Z). Let α be any assignment. Compute TV(∃xG, α) where G is the formula eq(x, add(x, x)).
– By the rule for ∃, TV(∃xG, α) = 1 iff there is some d ∈ Z: TV(G, α[x → d]) = 1
– By the rule for predicates, this means that
(val(x, α[x → d]), val(add(x, x), α[x → d])) ∈ eqZ
– By evaluating the terms, this means that (d, d + d) ∈ eqZ.
– By evaluating the predicate, this means that d = d + d.
– Is there such an integer d? Yes, d = 0.
– So TV(∃xG,α) = 1.
35/42
Semantics
Example
Recall the structure Z = (Z,addZ,eqZ,0Z,1Z). Let α be any assignment. Compute TV(∀x∀yG, α) where G is the formula eq(add(x, y), add(y, x)).
– By the rule for ∀ (applied twice), TV(∀x∀yG, α) = 1 iff for all d,e ∈ Z: TV(G,α[x,y → d,e]) = 1.
– By the rule for predicates, this means that
(val(add(x, y), α[x, y → d, e]), val(add(y, x), α[x, y → d, e])) ∈ eqZ
– By evaluating the terms, this means that (d + e, e + d) ∈ eqZ .
– By evaluating the predicate, this means that d + e = e + d.
– Is this true for all integers d, e? Yes.
– So TV(∀x∀yG, α) = 1.
36/42
Semantics
– Intuitively, the truth value of a formula F does not depend on α(z) if the variable z does not occur in F.
– More is true:
Fact. The truth value TV(F, α) only depends on the values α(z) for z ∈ Free(F).
– In the previous examples the formula F had no free variables. So the truth value of F does not depend at all on the α we start with.
– Of course, as we evaluated the formula, we updated the assignment and the formula, and so some variables became free and the current assignment did matter.
37/42
Semantics: sentences
– A formula without free occurrences of variables (i.e., Free(F ) = ∅) is called a sentence.
– Fact. The truth value of a sentence does not depend on the variable assignment.
– So, for a sentence F and a structure A, if the truth-value is 1 under some assignment, then it is 1 under every assignment.
– In this case we say that F is true in A.
38/42
Semantics
So in the previous examples, we have that the following formulas are true in the integer structure Z = (Z, addZ , eqZ , 0Z , 1Z ):
– the formula ∃x eq(x, add(x, x)), which in ordinary notation is written ∃x (x = x + x);
– the formula ∀x∀y eq(add(x, y), add(y, x)) which in ordinary notation is written ∀x∀y (x + y = y + x).
39/42
Semantics: satisfiable sentences
A sentence F is satisfiable if it is true in some structure A.
Example
– The formula ∀x∀y eq(add(x, y), add(y, x)) is satisfiable (since it is true in the structure Z).
– The formula ∃x(eq(x, x) ∧ ¬eq(x, x)) is not satisfiable.
40/42
For you to think about
We saw that we can use truth-tables to decide if a given propositional logic formula is satisfiable. Is there an algorithm that decides if a given predicate logic formula is satisfiable?
41/42
Reference
– Chapter 2 of Schöning covers Predicate Logic
– NB. We use the same syntax, but slightly different notation for semantics.
42/42
COMP2022|2922
Models of Computation
Lesson 4a: Equivalences of Predicate Logic
Presented by
Sasha Rubin
School of Computer Science
Formulas that “mean the same thing” are called equivalent. We now study common equivalences, also called laws.
1/32
Equivalences
Definition
Two formulas F and G are (logically) equivalent if they are assigned the same truth value in every structure under every assignment. This is written F ≡ G.
Fact. All equivalences which have been proved for formulas in propositional logic also hold for formulas of predicate logic.
Example
– De Morgan’s Law: ¬(F ∧ G) ≡ (¬F ∨ ¬G) holds for all formulas F, G of predicate logic.
– E.g., ¬(∃xP (x) ∧ Q(y)) ≡ (¬∃xP (x) ∨ ¬Q(y)).
2/32
Equivalences involving quantifiers
For all formulas F, G:
(Q. Negation)      ¬∀xF ≡ ∃x¬F
                   ¬∃xF ≡ ∀x¬F
(Q. Unification)   (∀xF ∧ ∀xG) ≡ ∀x(F ∧ G)
                   (∃xF ∨ ∃xG) ≡ ∃x(F ∨ G)
(Q. Transposition) ∀x∀yF ≡ ∀y∀xF
                   ∃x∃yF ≡ ∃y∃xF
(Q. Extraction)    if x ̸∈ Free(G):
                   (∀xF ∧ G) ≡ ∀x(F ∧ G)
                   (∀xF ∨ G) ≡ ∀x(F ∨ G)
                   (∃xF ∧ G) ≡ ∃x(F ∧ G)
                   (∃xF ∨ G) ≡ ∃x(F ∨ G)
3/32
Equivalences
Here are informal reasons behind some of these equivalences:1
1. ¬∀xF ≡ ∃x¬F
– the LHS says that not all x satisfy F ,
– which means the same thing as some x doesn’t satisfy F,
– which means that some x does satisfy ¬F,
– which is what the RHS says.
2. (∀xF ∧∀xG) ≡ ∀x(F ∧G)
– the LHS says that F holds for every x and G holds for every x,
– which is the same as saying both F and G hold for every x,
– which is what the RHS says.
3. ∀x∀yF ≡ ∀y∀xF
– Both sides say that F holds for all values of the listed variables.
4. (∀xF ∧ G) ≡ ∀x(F ∧ G) if x ̸∈ Free(G)
– LHS says F holds for every x, and G holds.
– RHS says F and G hold for every x; but G doesn’t depend on the value of x.
1To prove them formally, use the inductive definition of truth-value.
4/32
Equivalences
Example
Show that ¬(∃xP (x, y) ∨ ∀z¬R(z)) ≡ ∀x∃z(¬P (x, y) ∧ R(z))
¬(∃xP(x,y) ∨ ∀z¬R(z))
≡ (¬∃xP(x,y) ∧ ¬∀z¬R(z))     De Morgan’s Laws
≡ (∀x¬P(x,y) ∧ ∃z¬¬R(z))     Quantifier Negation
≡ (∀x¬P(x,y) ∧ ∃zR(z))       Double Negation
≡ ∀x(¬P(x,y) ∧ ∃zR(z))       Quantifier Extraction
≡ ∀x(∃zR(z) ∧ ¬P(x,y))       Comm. ∧
≡ ∀x∃z(R(z) ∧ ¬P(x,y))       Quantifier Extraction
≡ ∀x∃z(¬P(x,y) ∧ R(z))       Comm. ∧
5/32
Equivalences
To show that F1 ̸≡ F2 we should find a counterexample, i.e., a structure A and assignment α such that
TV(F1, α, A) ̸= TV(F2, α, A).
Show that (∀xP(x) ∧ Q(x)) ̸≡ ∀x(P(x) ∧ Q(x))
Here is a counter-example:
– The structure A = (A, PA, QA) has domain A = {1, 2}, predicates PA = {1, 2}, QA = {2},
– the assignment α maps variable x to domain element 2.
1. Then TV((∀xP(x) ∧ Q(x)), α, A) = 1 (since every element of A is in P and α(x) is in Q).
2. And TV(∀x(P(x) ∧ Q(x)), α, A) = 0 (since not every element of A is in P and in Q).
This also shows why we can’t drop x ̸∈ Free(G) for Quantifier Extraction.
6/32
Equivalences
Example
Show that (∀xP (x) ∧ Q(x)) ≡ ∀y(P (y) ∧ Q(x)).
(∀xP (x) ∧ Q(x)) ≡ (∀yP (y) ∧ Q(x)) rename bound variable
≡ ∀y(P (y) ∧ Q(x)) Quantifier Extraction
– Bound variables are like dummy variables, and so one can rename them as follows.
– Let F be a formula in which the variable z does not occur. The renaming of variable x by z in F is the formula F′ that results from F by simultaneously replacing all occurrences of x, that are bound by the same occurrence of a quantifier, by the variable z. We say that F′ is a renaming of F.
– Fact. Renamings preserve the formula up to logical equivalence.
– In other words, renamings only change the syntax, not the semantics (truth-value) of a formula.
7/32
Equivalences: renaming
Examples
1. ∃zQ(z, y) is a renaming of ∃xQ(x, y).
2. (∃xP(x) ∨ ∃zQ(z,x)) is not a renaming of (∃wP(w) ∨ ∃zQ(z,x)) since x already occurred in the formula.
3. (∃zP (z) ∧ ∀zQ(z)) is not a renaming of (∃xP (x) ∧ ∀xQ(x)) since a renaming only applies to variables bound by a single quantifier.
8/32
Application of equivalences
Every formula can be transformed into an equivalent one that has a rigid structure, called a normal form.
9/32
Normal forms
We will look at two normal forms.
1. Negation normal form (NNF)
2. Prenex normal form (PNF)
10/32
Normal forms: NNF
Definition
A formula F is in negation normal form (NNF) if negations only occur immediately in front of atomic formulas.
Theorem
For every formula F there is an equivalent formula in NNF. Algorithm (“push negations inwards”)
Substitute in F every occurrence of a subformula of the form ¬¬G by G, and
¬∀xF by ∃x¬F
¬∃xF by ∀x¬F
¬(G ∧ H) by (¬G ∨ ¬H)
¬(G ∨ H) by (¬G ∧ ¬H)
until no such subformulas occur, and return the result.
Why is this algorithm correct?
11/32
Normal forms: NNF
Example
Put ¬∃x(P (x) ∧ ¬∃yQ(y, x)) into NNF.
12/32
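The "push negations inwards" algorithm is a direct recursion on the formula. Here is a sketch over a tuple encoding of our own devising (atoms are kept opaque), run on the exercise above:

```python
# A sketch of NNF conversion on tuple-encoded formulas: ('atom', s),
# ('not', F), ('and', F, G), ('or', F, G), ('forall', x, F), ('exists', x, F).

def nnf(F):
    if F[0] == 'not':
        G = F[1]
        if G[0] == 'not':    return nnf(G[1])                      # ¬¬G => G
        if G[0] == 'forall': return ('exists', G[1], nnf(('not', G[2])))
        if G[0] == 'exists': return ('forall', G[1], nnf(('not', G[2])))
        if G[0] == 'and':    return ('or',  nnf(('not', G[1])), nnf(('not', G[2])))
        if G[0] == 'or':     return ('and', nnf(('not', G[1])), nnf(('not', G[2])))
        return F                                           # ¬atom: already NNF
    if F[0] in ('and', 'or'):
        return (F[0], nnf(F[1]), nnf(F[2]))
    if F[0] in ('forall', 'exists'):
        return (F[0], F[1], nnf(F[2]))
    return F                                               # atom

# ¬∃x(P(x) ∧ ¬∃y Q(y,x))  =>  ∀x(¬P(x) ∨ ∃y Q(y,x))
F = ('not', ('exists', 'x',
     ('and', ('atom', 'P(x)'),
             ('not', ('exists', 'y', ('atom', 'Q(y,x)'))))))
print(nnf(F))
```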
Normal forms: PNF
Definition
A formula F is in prenex normal form (PNF) if it has the form Q1x1Q2x2 · · · QnxnG
where each Qi ∈ {∃, ∀} is a quantifier symbol, the xis are variables, n ≥ 0 (so, there may be no quantifiers in the prefix), and G does not contain a quantifier.
Example
– ∀x(¬P(x) ∨ ∃y(N(y) ∧ L(x,y))) is not in PNF.
– ∀x∃y(¬P(x) ∨ (N(y) ∧ L(x,y))) is in PNF.
13/32
Normal forms: PNF
Theorem
For every formula F there is an equivalent formula in PNF. Algorithm (“pull quantifiers out the front”)
1. Put F in NNF, call the result F′.
2. Substitute in F′ every occurrence of a subformula of the form
(∀xF ∧ G) by ∀x(F ∧ G)
(∀xF ∨ G) by ∀x(F ∨ G)
(∃xF ∧ G) by ∃x(F ∧ G)
(∃xF ∨ G) by ∃x(F ∨ G)
until no such subformulas occur (use commutativity to handle (G ∧ ∀xF), etc.), and return the result.
3. NB. To apply these equivalences we need that x ̸∈ Free(G). This can always be achieved by renaming the bound variable x.
14/32
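Step 2 can be sketched as a recursion, under the simplifying assumption that the input is already in NNF and every bound variable is distinct from all other variables, so the x ̸∈ Free(G) side condition holds without any renaming (the encoding is our own):

```python
# A sketch of "pull quantifiers out the front" on tuple-encoded NNF formulas:
# ('atom', s), ('not', atom), ('and'/'or', F, G), ('forall'/'exists', x, F).
# Assumes bound variables are pairwise distinct and distinct from free ones.

def pnf(F):
    if F[0] in ('forall', 'exists'):
        return (F[0], F[1], pnf(F[2]))
    if F[0] in ('and', 'or'):
        L, R = pnf(F[1]), pnf(F[2])
        prefix = []
        while L[0] in ('forall', 'exists'):   # extract quantifiers on the left
            prefix.append((L[0], L[1])); L = L[2]
        while R[0] in ('forall', 'exists'):   # ... and, via commutativity, right
            prefix.append((R[0], R[1])); R = R[2]
        body = (F[0], L, R)
        for q, x in reversed(prefix):         # rebuild the quantifier prefix
            body = (q, x, body)
        return body
    return F                                  # atom or negated atom

# (∀y P(z,y) ∨ ∃w Q(w))  =>  ∀y∃w(P(z,y) ∨ Q(w))
F = ('or', ('forall', 'y', ('atom', 'P(z,y)')),
           ('exists', 'w', ('atom', 'Q(w)')))
print(pnf(F))
```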
Normal forms: PNF
Put the following into PNF
– (Q(x)∨∀xR(x,x))
15/32
Normal forms: PNF
Put the following into PNF
– (∀yP(z,y)∨∃wQ(w))
– (∀yP(z,y)∨∃yQ(y))
16/32
Logical consequence
Definition
A sentence F is a logical consequence2 of the set {E1, · · · , Ek} of sentences if F is true in every structure A in which all of E1, · · · , Ek are true. In this case we write
{E1, · · · , Ek} |= F
or simply
E1, · · · , Ek |= F
Example
– ∀x R(x, x) is a logical consequence of {∀x∀y R(x, y)}.
– P (c) is a logical consequence of {Q(c), ∀x (Q(x) → P (x))}.
2This definition mimics the one for Propositional Logic.
17/32
COMP2022|2922
Models of Computation
Lesson 4b: Deductive system for Predicate Logic
Presented by
Sasha Rubin
School of Computer Science
Deductive systems are a syntactic mechanism for deriving validities as well as logical consequences from assumptions
18/32
Natural deduction
– We extend ND for propositional logic with rules to handle quantifiers.
– Each quantifier symbol ∃, ∀ has two types of rules:
1. Introduction rules introduce the quantifier
2. Elimination rules remove the quantifier
19/32
Natural Deduction: replacing free variables by terms
Definition
For a formula F , variable x, term t, we can obtain a formula F [t/x] by simultaneously replacing all free occurrences of x in F by t.
– The idea is that whatever F said about x, now F[t/x] says about t.
– In a few slides, we will restrict when we are allowed to make such replacements.
20/32
Natural Deduction: inference rules involving ∀
(∀E)
S ⊢ ∀xF
----------
S ⊢ F[t/x]
where t is a term that is free to replace x in F.
(∀I)
S ⊢ F[c/x]
----------
S ⊢ ∀xF
where c is a constant symbol, not occurring in F, nor in any formula in S.
(∀E) formalises the reasoning
If we know that F holds for every x, then it must hold, in particular, taking x = t.
21/32
Natural Deduction: inference rules involving ∀
(∀E)
S ⊢ ∀xF
----------
S ⊢ F[t/x]
where t is a term that is free to replace x in F.
(∀I)
S ⊢ F[c/x]
----------
S ⊢ ∀xF
where c is a constant symbol, not occurring in F, nor in any formula in S.
(∀I) formalises the reasoning
Let c be any element . . . [prove F[c/x]]. Since c was arbitrary, deduce F holds for all x.
That c is arbitrary is captured by requiring that c is not in the assumptions used to prove F [c/x], and so c is not constrained in any way.
21/32
Natural Deduction: example using ∀
∀x∀yP(x,y) ⊢ ∀y∀xP(x,y)
Line  Assumptions  Formula       Justification  References
1     1            ∀x∀yP(x,y)    Asmp. I
2     1            ∀yP(c,y)      ∀E             1
3     1            P(c,d)        ∀E             2
4     1            ∀xP(x,d)      ∀I*            3
5     1            ∀y∀xP(x,y)    ∀I**           4
* the constant c does not occur in F (i.e., P(x,d)), nor in the formula of its assumption (in line 1).
** the constant d does not occur in F (i.e., ∀xP(x,y)), nor in the formula of its assumption (in line 1).
The conditions on (∀ E) that talk about "free to replace" will be explained next.
22/32
Natural Deduction: free to replace
Example
Say F is the formula ∃y(x < y) expressing, about numbers, that there is some number bigger than x. Then the formula F[t/x] taking t = z + 1 is ∃y(z + 1 < y) which says that there is some number bigger than z + 1.
Note
The formula F [t/x] taking t = y + 1 is the formula ∃y(y + 1 < y) which says that there is some number bigger than its successor.
– What went wrong? We replaced the free variable x by the term y + 1, and the variable y in that term was captured by the quantifier ∃y.
– We disallow such substitutions.
If no variable y of t is in the scope of a quantifier Qy in F[t/x] then we say that t is free to replace x in F.
23/32
Natural Deduction: example using ∀
∀x∀yP(x,y) ⊢ ∀y∀xP(x,y)
Line  Assumptions  Formula       Justification  References
1     1            ∀x∀yP(x,y)    Asmp. I
2     1            ∀yP(c,y)      ∀E*            1
3     1            P(c,d)        ∀E**           2
4     1            ∀xP(x,d)      ∀I             3
5     1            ∀y∀xP(x,y)    ∀I             4
* The term c is free to replace x in F = ∀yP(x,y).
** The term d is free to replace y in F = P(c,y).
24/32
Natural Deduction: example using ∀
∀x(P(x) ∧ Q(x)) ⊢ (∀xP(x) ∧ ∀xQ(x))
Line  Assumptions  Formula              Justification  References
1     1            ∀x(P(x) ∧ Q(x))      Asmp. I
2     1            (P(c) ∧ Q(c))        ∀E*            1
3     1            P(c)                 ∧E             2
4     1            ∀xP(x)               ∀I*            3
5     1            Q(c)                 ∧E             2
6     1            ∀xQ(x)               ∀I*            5
7     1            (∀xP(x) ∧ ∀xQ(x))    ∧I             4,6
* Check that the conditions in lines 2,4,6 are satisfied.
25/32
Natural Deduction: inference rule ∃I
(∃I)
S ⊢ F[t/x]
----------
S ⊢ ∃xF
where t is a term that is free to replace x in F.
(∃I) formalises the reasoning
If we know that F holds for a specific term t, then we know it holds for some x.
26/32
Natural Deduction: example using ∃I
∀xP(x) ⊢ ∃xP(x)
Line  Assumptions  Formulas   Just.     Ref.
1     1            ∀xP(x)     Asmp. I
2     1            P(c)       ∀E        1
3     1            ∃xP(x)     ∃I*       2
* the constant c is free to replace x in P (x)
27/32
Natural Deduction: inference rule ∃E
(∃E)
S1 ⊢ ∃xF    S2 ∪ {F[c/x]} ⊢ G
------------------------------
S1 ∪ S2 ⊢ G
where c is a constant symbol, not occurring in F, nor in G, nor in any formula in S2.
(∃E) formalises the reasoning
If we prove G from F[c/x], but we did not use any other properties of c, then we can prove G from the weaker assumption that some x satisfies F.
28/32
Natural Deduction: inference rule ∃E
(∃E)
S1 ⊢ ∃xF    S2 ∪ {F[c/x]} ⊢ G
------------------------------
S1 ∪ S2 ⊢ G
where c is a constant symbol, not occurring in F, nor in G, nor in any formula in S2.
How to use (∃E)?
1. Assume F[c/x] ensuring that c does not occur in F.
2. Derive G making sure that c is not in the assumption set of G except for F[c/x].
3. Cancel the assumption F [c/x], and conclude G.
28/32
Natural Deduction: example using ∃E
∀x(Q(x) → P(y)), ∃xQ(x) ⊢ P(y)
Line  Assumptions  Formulas           Just.     Ref.
1     1            ∀x(Q(x) → P(y))    Asmp. I
2     2            ∃xQ(x)             Asmp. I
3     1            (Q(c) → P(y))      ∀E        1
4     4            Q(c)               Asmp. I
5     1,4          P(y)               →E        3,4
6     1,2          P(y)               ∃E        2,4,5
29/32
Natural Deduction: incorrect usage of (∃E)
∃xP(x) ⊢ ∀xP(x) ...?
Line  Asmp.  Form.     Just.     Ref.
1     1      ∃xP(x)    Asmp. I
2     1      P(c)      ∃E        ?,?,?
3     1      ∀xP(x)    ∀I        2
Here is the faulty argument in natural language:
We are given that some x has F. Let c be such an x. Since c was chosen arbitrarily (?!), conclude that every x has F .
30/32
Natural Deduction: e.g. using ∃ and ∀
¬∃x¬P(x) ⊢ ∀xP(x)
Line  Asmp.  Form.       Just.     Ref.
1     1      ¬∃x¬P(x)    Asmp. I
2     2      ¬P[c/x]     Asmp. I
3     2      ∃x¬P(x)     ∃I*       2
4     1,2    ⊥           ¬E        1,3
5     1      P[c/x]      RA        2,4
6     1      ∀xP(x)      ∀I**      5
* c is free to replace x in ¬P(x)
** c does not appear in P(x) nor in the assumption 1
31/32
Deductive system: wrapping up
– We introduced Natural Deduction (ND) for Predicate Logic as a way to derive logical consequences in a way that can be checked by machine.
– Just like for Propositional Logic, ND for Predicate Logic is sound (meaning it can only derive logical consequences) and complete (meaning that it can derive all logical consequences).
– However, unlike Propositional Logic, there is no algorithm that can decide, given E1,··· ,Ek,F whether or not F is a consequence of E1, · · · , Ek.
– We will discuss this when we cover undecidability (Lec 10).
– This means that finding proofs in Predicate Logic inherently requires human ingenuity.
32/32
COMP2022|2922
Models of Computation
Lesson 5a: Introduction to Machine Models
Presented by
Sasha Rubin
School of Computer Science
What sort of problems are programs meant to solve?
– A computational problem specifies what the input can be and what the output should be.
1. Given numbers x, y, output x + y?
2. Given numbers x, y, output x × y?
3. Given strings w, t, decide if w is a substring of t?
4. Given a string e, decide if e is well-bracketed?
– The last two are called decision problems because their output is either 1 (meaning Yes) or 0 (meaning No).
1/42
Main modeling assumption
We will focus on decision problems where the input is a string.
For you to think about
1. Is focusing on decision problems a serious restriction? What if we replaced the problem
Given numbers x, y, output x + y. by the decision problem
Given numbers x, y, z, decide if x + y = z.
2. Is focusing on input strings a serious restriction?
– We can encode almost anything as a string.
– How would you encode an integer, or a set of integers, or a graph?
2/42
Strings
Definition
An alphabet Σ is a nonempty finite set of symbols.
Example
– Σ = {0, 1} is the classic binary alphabet.
– Σ = {a, b, c, · · · , z} is the lower-case English alphabet.
3/42
Strings
Definition
A string over Σ is a finite sequence of symbols from Σ. The number of symbols in a string is its length.
Example
– 0110 is a string of length 4 over alphabet {0, 1}
– bob is a string of length 3 over alphabet {a,b,c,··· ,z}.
– If w has length n we write w = w1w2 ···wn where each wi ∈ Σ.
– There is only one string of length 0. It is denoted ε.
– The set of all strings over Σ is denoted Σ∗.
– The reason for this notation will be made clear later.
3/42
Strings
Concatenation of strings
– The concatenation of strings x, y is the string xy formed by appending y to the end of x.
– The concatenation of x = 010 and y = 01 is 01001.
– wε = εw = w for all strings w.
– 01ε = 01
3/42
Decision problems
Definition
A decision problem is a function P : Σ∗ → {0, 1}.
Example
P(s) = 1 if the length of the string s ∈ {0, 1}∗ is odd, and 0 otherwise.
We will also write the decision problem like this:
Input: String s over alphabet {0, 1}.
Output: 1 if the length of the string s is odd, and 0 otherwise.
4/42
Programs that solve problems
Definition
A program solves a decision problem P : Σ∗ → {0, 1} if for every input string x ∈ Σ∗, the program on input x outputs P(x).
5/42
Programs that solve problems
Definition
A program solves a decision problem P : Σ∗ → {0, 1} if for every input string x ∈ Σ∗, the program on input x outputs P(x).
Example
The program
def L(s): # s: binary string
    # a % b returns the remainder of a divided by b
    return len(s) % 2
solves the decision problem
Input: A string s.
Output: 1 if the length of the string s is odd, and 0 otherwise.
5/42
Programs that solve problems
Definition
A program solves a decision problem P : Σ∗ → {0, 1} if for every input string x ∈ Σ∗, the program on input x outputs P(x).
Example
The program
def Q(x): # binary string x encoding a natural number,
    # least significant digit first
    return int(x[0])
solves the decision problem
Input: A string encoding a natural number. Output: 1 if the integer is odd, and 0 otherwise.
5/42
What decision problems can be solved by programs?
Input: Strings encoding integers x, y, z. Output: 1 if z = x + y, and 0 otherwise.
Input: Strings w, t.
Output: 1 if w is a substring of t, and 0 otherwise.
Input: Strings encoding integers x, y, z. Output: 1 if z = x × y, and 0 otherwise.
Input: String encoding an arithmetic expression e. Output: 1 if e is well-bracketed, and 0 otherwise.
Input: String encoding a (Python) program p. Output: 1 if p enters an infinite loop, and 0 otherwise.
6/42
What decision problems can be solved by programs?
We will see that, in a very precise sense:
1. “addition” and “substring” can be decided with limited memory,
2. “multiplication” and “well-bracketed” require recursion,
3. “infinite loop” cannot be decided by any program.
To do this, we introduce mathematical models of programs that solve decision problems.
1. Limited memory: regular expressions and finite automata (L5-L7).
2. Plus recursion: context-free grammars and pushdown automata (L8-L9).
3. Universal: Turing machines and Lambda-calculus (L10-L11).
7/42
One more modeling choice
We will think of decision problems as deciding if the input is in some set of strings.
Example
Here are two ways of describing the same problem.
Input: A binary string encoding a natural number. Output: 1 if the integer is odd, and 0 otherwise.
Input: A binary string encoding a natural number.
Output: 1 if the string is in the set {w ∈ {0,1}∗ : w[0] = 1}, and 0 otherwise.
8/42
Sets of strings are called languages. They are so important, they have their own field of study called Formal Language Theory.
9/42
Languages
Definition
A set L ⊆ Σ∗ of strings is called a (formal) language over Σ.
Example
Let Σ = {a, b}.
– L1 = {x ∈ Σ∗ : x starts with ab}
– L2 = {x ∈ Σ∗ : x has an even number of bs}
– L3 = {x ∈ Σ∗ : x has the same number of as as bs}
– The empty set ∅.
– Note. The empty set ∅ has no elements in it, but the set {ε} has one element in it, the empty string.
10/42
For you to think about after class
1. Is HTML a formal language? If so, what is the alphabet and what are the strings?
2. Is music a formal language? If so, what is the alphabet and what are the strings?
11/42
Important operations on languages
Definition
Let A, B be languages over Σ.
– The union of A and B, written A ∪ B, is the language
{x ∈ Σ∗ : x ∈ A or x ∈ B}.
– The concatenation of A and B, written A ◦ B, is the language {xy ∈ Σ∗ : x ∈ A, y ∈ B}.
– Write A^k for A ◦ A ◦ A ··· A (k times).
– The star of A, written A∗, is the language
{ε} ∪ A^1 ∪ A^2 ∪ A^3 ∪ ...
i.e., A∗ = {x1 x2 ··· xk ∈ Σ∗ : k ≥ 0, k ∈ Z, each xi ∈ A}.
12/42
Important operations on languages
Examples
1. {a, ab} ∪ {ab, aab} =
2. {a, ab} ◦ {ab, aab} =
3. {a, ab}∗ =
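These operations are easy to experiment with on finite languages. A small Python sketch (illustrative only; the function names are mine, and star is necessarily truncated to a bound k, since A∗ is infinite whenever A contains a nonempty string):

```python
# Union, concatenation, and a bounded approximation of star on finite languages.
def union(A, B):
    return A | B

def concat(A, B):
    return {x + y for x in A for y in B}

def star_upto(A, k):
    # All concatenations of at most k strings from A (the full star is infinite).
    result = {""}  # epsilon: the concatenation of zero strings
    layer = {""}
    for _ in range(k):
        layer = concat(layer, A)
        result |= layer
    return result

A = {"a", "ab"}
B = {"ab", "aab"}
print(sorted(union(A, B)))      # ['a', 'aab', 'ab']
print(sorted(concat(A, B)))     # ['aaab', 'aab', 'abaab', 'abab']
print(sorted(star_upto(A, 2)))  # ['', 'a', 'aa', 'aab', 'ab', 'aba', 'abab']
```

Note that |A ◦ B| can be smaller than |A| · |B| when different pairs concatenate to the same string.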
13/42
COMP2022|2922
Models of Computation
Lesson 5b: Regular Languages Chapter 1 of Sipser (eReserve)
Presented by
Sasha Rubin
School of Computer Science
Regular expressions in a nutshell
– Expressions that describe languages
– Extremely useful:
– Pattern matching (Control-F)
– Specification of data formats
– Lexical analysis
– Based on three operations:
– Language Union L1 ∪ L2
– Language Concatenation L1 ◦ L2
– Language Iteration/Kleene-star L∗
14/42
Regular expressions: Syntax
Definition
Let Σ be an alphabet. The regular expressions over Σ are defined by the following recursive process:
R1. The symbols ∅ and ε are regular expressions.
R2. Each symbol a ∈ Σ is a regular expression.
R3. If R1, R2 are regular expressions then so is (R1 ∪ R2)
R4. If R1, R2 are regular expressions then so is (R1 ◦ R2), also written (R1R2)
R5. If R is a regular expression then so is R∗.1
1Note that Sipser writes (R∗), although we don’t need the parentheses in this case.
15/42
Regular expressions
Examples
Let Σ = {a, b}.
– (a ∪ ∅)
– (a ◦ ε)
– b∗
– (a∗ ∪ b∗)
– (a ∪ b)∗
– (a ◦ b)∗
16/42
Regular expressions: Semantics
A regular expression R matches certain strings, collectively called L(R).
Definition
Let Σ be an alphabet. The language L(R) represented by a regular expression R is defined by the following recursive procedure:
1. L(∅) = ∅ and L(ε) = {ε}.
2. L(a) = {a} for a ∈ Σ.
3. L((R1 ∪ R2)) = L(R1) ∪ L(R2).
4. L((R1 ◦ R2)) = L(R1) ◦ L(R2).
5. L(R∗) = L(R)∗ = {ε} ∪ L(R)^1 ∪ L(R)^2 ∪ L(R)^3 ∪ ···
17/42
Regular expressions
Notation.
– We overload the symbol ∪, i.e., it is a symbol used in regular expressions, and it is an operation on sets of strings. The same is true of other symbols like a and ε.
– We do this for convenience (otherwise we would have to use different notation for the symbols in the regular expressions and the operations on languages, e.g., some texts use + or | in regular expressions instead of ∪).
– We may drop the outermost parentheses to improve readability.
– E.g., we may write a∪∅ instead of (a∪∅), and a◦b∗ instead of (a ◦ b∗), which is different from (a ◦ b)∗.
– We may write (R1 ∪ R2 ∪ R3) instead of ((R1 ∪ R2) ∪ R3), and similarly (R1 ◦ R2 ◦ R3) instead of ((R1 ◦ R2) ◦ R3).
– The reason we can do this is that, as we will see, ◦ and ∪ are associative.
18/42
Regular expressions
Examples
Let Σ = {a, b}.
– L((a ∪ ∅)) =
– L((a ◦ ε)) =
– L(b∗) =
– L((a∗ ∪ b∗)) =
– L((a ∪ b)∗) =
– L((a ◦ b)∗) =
19/42
Regular expressions
Examples
Let Σ = {a, b}. Write regular expressions for the following languages:
1. The set of strings ending in a.
– ((a ∪ b)∗ ◦ a), also written ((a ∪ b)∗ a).
2. The set of strings whose second last symbol is an a.
– ((a ∪ b)∗ ◦ a ◦ (a ∪ b))
3. The set of strings that have aba as a substring.
– ((a ∪ b)∗ ◦ a ◦ b ◦ a ◦ (a ∪ b)∗)
4. The set of strings of even length.
5. The set of strings with an even number of a’s.
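These formal regular expressions translate directly into Python's `re` syntax: ∪ becomes `|`, ◦ is juxtaposition, and ∗ stays `*`. A sketch of the first three examples (variable names are mine, not from the slides):

```python
import re

# Formal regular expressions in Python's re syntax over the alphabet {a, b}.
ends_in_a     = re.compile(r"(a|b)*a")          # ((a ∪ b)∗ ◦ a)
second_last_a = re.compile(r"(a|b)*a(a|b)")     # ((a ∪ b)∗ ◦ a ◦ (a ∪ b))
has_aba       = re.compile(r"(a|b)*aba(a|b)*")  # ((a ∪ b)∗ ◦ aba ◦ (a ∪ b)∗)

# fullmatch asks whether the WHOLE string is in the language (not just a prefix).
print(bool(ends_in_a.fullmatch("bba")))      # True
print(bool(second_last_a.fullmatch("bab")))  # True
print(bool(has_aba.fullmatch("bbabab")))     # True
print(bool(has_aba.fullmatch("bbb")))        # False
```

Using `fullmatch` rather than `match` or `search` is what makes the Python test agree with the set-membership question "is the string in L(R)?".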
20/42
Regular expressions
Important questions
1. Which languages can be described by regular expressions? All languages?
2. There are natural decision problems associated with regular expressions. Are there programs that solve them?
Input: Regular expression R and string s Output: 1 if s ∈ L(R), and 0 otherwise.
Input: Regular expressions R1, R2
Output: 1 if L(R1) = L(R2), and 0 otherwise.
21/42
COMP2022|2922 Models of Computation
Lesson 5c: Finite Automata Presented by
Sasha Rubin
School of Computer Science
Finite automata in a nutshell
A deterministic finite automaton is like a program that can only scan its input string once, and cannot write anything. At the end of reading the input string, it must decide to accept or reject the input string.
22/42
Why study such a simple model of computation?
1. It is part of computing culture.
– first appeared in McCulloch and Pitts’s model of a neural network (1943)
– then formalised by Kleene (American mathematician) as a model of stimulus and response
2. It has numerous practical applications.
– lexical analysis (tokeniser)
– pattern matching
– communication protocols with bounded memory
– circuits with feedback
– finite-state reactive systems
– finite-state controllers
– non-player characters in computer games
– ...
3. It is simple to implement/code.
23/42
Drawing deterministic finite automata (DFA)
24/42
Definition of a deterministic finite automaton (DFA)
Definition
A deterministic finite automaton (DFA) M is a tuple (Q,Σ,δ,q0,F) where
1. Q is a finite set of states,
2. Σ is a finite set called the alphabet,
3. δ : Q × Σ → Q is the transition function,
4. q0 ∈ Q is the start state, and
5. F ⊆ Q is the set of accept states, sometimes called final states.
If q′ = δ(q, a) we write q −a→ q′, called a transition.
25/42
Language described by a DFA
A DFA M describes a language L(M): all strings that label paths from the start state to an accepting state.
Examples
Let Σ = {a, b}. Draw DFAs for the following languages:
1. {aba}
2. all strings over Σ
3. the set of strings ending in a
4. L((a ◦ b)∗)
5. the set of strings with an even number of a’s
6. the set of strings with an odd number of b’s
26/42
Designing DFAs
Draw a DFA for the language {aba}
27/42
Designing DFAs
Draw a DFA for the set of all strings over {a, b}
28/42
Designing DFAs
Draw a DFA for the set of all strings ending in a
29/42
Designing DFAs
Draw a DFA for the set of all strings matching the regular expression (ab)∗
30/42
Designing DFAs
Draw a DFA for the set of all strings with an even number of a’s
31/42
Designing DFAs
Draw a DFA for the set of all strings with an odd number of b’s
32/42
Definition of the language recognised by a DFA
Definition
Let M = (Q,Σ,δ,q0,F) be a DFA, and let w = w1w2 ···wn be a string over Σ.
– A run of M on w is a sequence of transitions q0 −w1→ q1 −w2→ q2 −w3→ ··· −wn→ qn
– The run is accepting if qn ∈ F.
– If w has an accepting run then we say that M accepts w.
The set L(M) = {w ∈ Σ∗ : M accepts w} is the language recognised by M.
33/42
DFAs
Important questions
1. Which languages can be described by DFAs? All languages?
2. There are natural decision problems associated with DFAs. Are there programs that solve them?
Input: DFA M and string w
Output: 1 if w ∈ L(M), and 0 otherwise.
Input: DFA M
Output: 1 if L(M) = ∅, and 0 otherwise.
Input: DFAs M1,M2
Output: 1 if L(M1) = L(M2), and 0 otherwise.
34/42
The problem
Input: DFA M and string w
Output: 1 if w ∈ L(M), and 0 otherwise.
is solved by the program
def run(M, w):
    cur = q_0
    for i = 1 to len(w):
        cur = δ(cur, w_i)
    if cur in F:
        return 1
    else:
        return 0
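As a concrete, runnable instance of this algorithm (a sketch, not part of the slides: the names delta, start, accept are mine), here is the earlier DFA for strings with an even number of a’s, with δ stored as a Python dict:

```python
# DFA for strings over {a, b} with an even number of a's; the transition
# function δ is a dict from (state, symbol) to state.
delta = {
    ("even", "a"): "odd",  ("even", "b"): "even",
    ("odd",  "a"): "even", ("odd",  "b"): "odd",
}
start, accept = "even", {"even"}

def run(w):
    cur = start
    for c in w:              # scan the input once, left to right
        cur = delta[(cur, c)]
    return 1 if cur in accept else 0

print(run("abab"))  # 1 (two a's)
print(run("abb"))   # 0 (one a)
```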
35/42
Definition
A language L is called regular if L = L(M) for some DFA M.
36/42
Theorem
The regular languages are closed under complementation. That is, if A is regular, then Σ∗ \ A is regular.
Example
Let Σ = {a, b}. Draw DFAs for the following languages:
1. strings that contain aba as a substring
2. strings that do not contain aba as a substring
37/42
Theorem
The regular languages are closed under complementation. That is, if A is regular, then Σ∗ \ A is regular.
Construction ("swap accept and reject states")
1. Given A = L(M) for some DFA M = (Q, Σ, δ, q0, F).
2. We will construct a DFA M′ recognising Σ∗ \ L(M).
3. Define M′ = (Q, Σ, δ, q0, F′) where F′ = Q \ F.
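A sketch of this construction in Python (the tuple representation and helper names are mine, not from the slides): complementation just replaces F by Q \ F.

```python
# Complement a DFA by swapping accept and reject states: F' = Q \ F.
def complement(dfa):
    Q, Sigma, delta, q0, F = dfa
    return (Q, Sigma, delta, q0, Q - F)

def accepts(dfa, w):
    Q, Sigma, delta, q0, F = dfa
    cur = q0
    for c in w:
        cur = delta[(cur, c)]
    return cur in F

# DFA for strings over {a, b} ending in a.
M = ({"end_a", "end_b"}, {"a", "b"},
     {("end_a", "a"): "end_a", ("end_a", "b"): "end_b",
      ("end_b", "a"): "end_a", ("end_b", "b"): "end_b"},
     "end_b", {"end_a"})

print(accepts(M, "ba"), accepts(complement(M), "ba"))  # True False
```

This trick relies on M being deterministic: every string has exactly one run, so flipping the accept states flips exactly the accepted strings.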
37/42
Theorem
The regular languages are closed under union. That is, if A and B are regular, then A ∪ B is regular.
Example
Let Σ = {a, b}. Draw DFAs for the following languages:
1. the strings consisting of an even number of a’s
2. the strings consisting of an odd number of b’s
3. the strings consisting of an even number of a’s or an odd number of b’s.
38/42
Theorem
The regular languages are closed under union. That is, if A1 and A2 are regular, then A1 ∪ A2 is regular.
Construction ("product")
1. Given DFA Mi = (Qi, Σ, δi, qi, Fi) recognising Ai, i = 1, 2.
2. We will construct a DFA M recognising A1 ∪ A2.
3. Define M = (Q, Σ, δ, q0, F) where
– Q = Q1 × Q2,2
– δ maps state (s1, s2) on input a ∈ Σ to state (δ1(s1, a), δ2(s2, a)),
– q0 = (q1, q2),
– F = {(s1, s2) : s1 ∈ F1 or s2 ∈ F2}.
2Recall that Q1 ×Q2 = {(s,t) : s ∈ Q1,t ∈ Q2}
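A sketch of the product construction in Python (representation and names mine): states are pairs, and the two component automata are stepped in lockstep.

```python
# Product construction: a DFA for L(M1) ∪ L(M2).
# Each DFA is (states, alphabet, delta-dict, start, accept-set).
def product_union(M1, M2):
    Q1, Sigma, d1, q1, F1 = M1
    Q2, _,     d2, q2, F2 = M2
    Q = {(s1, s2) for s1 in Q1 for s2 in Q2}
    delta = {((s1, s2), a): (d1[(s1, a)], d2[(s2, a)])
             for (s1, s2) in Q for a in Sigma}
    F = {(s1, s2) for (s1, s2) in Q if s1 in F1 or s2 in F2}
    return (Q, Sigma, delta, (q1, q2), F)

def accepts(dfa, w):
    _, _, delta, q0, F = dfa
    cur = q0
    for c in w:
        cur = delta[(cur, c)]
    return cur in F

# Even number of a's (M1) and odd number of b's (M2), over {a, b}:
M1 = ({"e", "o"}, {"a", "b"},
      {("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"},
      "e", {"e"})
M2 = ({"E", "O"}, {"a", "b"},
      {("E", "b"): "O", ("E", "a"): "E", ("O", "b"): "E", ("O", "a"): "O"},
      "E", {"O"})
M = product_union(M1, M2)
print(accepts(M, "b"), accepts(M, "a"))  # True False
```

Changing the "or" in the definition of F to "and" gives the product DFA for the intersection A1 ∩ A2 instead.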
42/42
COMP2022|2922
Models of Computation
Lesson 6: Nondeterministic finite automata
Presented by
Sasha Rubin
School of Computer Science
Theorem
The regular languages are closed under concatenation. That is, if A and B are regular, then A ◦ B is regular.
Idea:
– Let MA be a DFA recognising A, let MB be a DFA recognising B.
– We want to build a DFA M for A ◦ B.
– One way would be for M to simulate MA, and at some point, as long as the state of MA is an accepting state, “guess” that it is time to switch, and simulate MB on the remaining input.
Question. How do we capture this “guess”?
1/17
We will introduce a model of computation, called a nondeterministic finite automaton (NFA), which is like a DFA except that states can have more than one outgoing transition on the same input symbol. Later we will show how DFAs can simulate NFAs.
2/17
Definition of NFA
Definition
A nondeterministic finite automaton M is a tuple (Q, Σ, δ, q0 , F ) the same as a DFA except the transition function is1
δ : Q × (Σ ∪ {ε}) → P(Q). If q′ ∈ δ(q, a) we write q −a→ q′.
1P(Q) is the set of subsets of Q. E.g., P({q0, q1}) = {∅, {q0}, {q1}, {q0, q1}}.
3/17
Definition of NFA
– A run of an NFA M on string w is a sequence of transitions q0 −y1→ q1 −y2→ q2 ··· −ym→ qm such that each yi ∈ Σ ∪ {ε} and w = y1 y2 ··· ym.2
– The run is accepting if qm ∈ F.
– If w has at least one accepting run, then we say that w is accepted by M.
– The language recognised by M is L(M) = {w ∈ Σ∗ : w is accepted by M}.
That is: an NFA M describes the language L(M) of all strings that label paths from the start state to an accepting state.
2Recall that εx = xε = x for all strings x.
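The definition above can be simulated directly by tracking the set of states the NFA could be in, taking ε-closures as we go. A Python sketch (names mine; ε is encoded as the empty string ""):

```python
# Simulate an NFA by tracking the set of states it could be in.
# delta maps (state, symbol) -> set of states; the symbol "" stands for ε.
def eps_closure(delta, states):
    # All states reachable from `states` using ε-transitions only.
    stack, closed = list(states), set(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, ""), set()):
            if p not in closed:
                closed.add(p)
                stack.append(p)
    return closed

def nfa_accepts(delta, q0, F, w):
    current = eps_closure(delta, {q0})
    for c in w:
        step = set()
        for q in current:
            step |= delta.get((q, c), set())
        current = eps_closure(delta, step)
    return bool(current & F)  # accepted iff some run ends in F

# NFA for strings whose last letter is a: in q0, guess that the final a starts.
delta = {("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q0"}}
print(nfa_accepts(delta, "q0", {"q1"}, "bba"))  # True
print(nfa_accepts(delta, "q0", {"q1"}, "ab"))   # False
```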
4/17
Language described by NFAs
Examples
Let Σ = {a, b}. Draw an NFA for the language consisting of the strings whose last letter is an a (also, draw a DFA, for comparison).
5/17
Comparing NFAs and DFAs
1. In a DFA, every input has exactly one run. The input string is accepted if this run is accepting.
2. In an NFA, an input may have zero, one, or more runs. The input string is accepted if at least one of its runs is accepting.
3. For every DFA M there is an NFA N such that L(M) = L(N).
– Given DFA M = (Q, Σ, δ, q0, F) with δ : Q × Σ → Q, define the NFA N = (Q, Σ, δ′, q0, F) where δ′(q, a) = {δ(q, a)}.
– To see that L(M) = L(N) observe that the diagrams of M and N are the same.
6/17
From Regular Expressions to NFAs
Theorem
For every regular expression R there is an NFA MR such that L(R) = L(MR).
The construction is by recursion on the structure of R. The base cases are R = ∅, R = ε, and R = a for a ∈ Σ. The recursive cases are R = (R1 ◦ R2), R = (R1 ∪ R2), and R = R1∗.
Base cases:
7/17
NFAs are closed under ◦
Lemma
If N1, N2 are NFAs, there is an NFA N recognising L(N1) ◦ L(N2).
Construction ("Simulate N1 followed by N2")
Given NFAs Ni = (Qi, Σ, δi, qi, Fi) construct NFA N with states Q1 ∪ Q2 as in the figure. The idea is that N guesses how to break the input into two pieces, the first accepted by N1, and the second by N2.
8/17
NFAs are closed under ◦ (reference slide)
Here is the construction. From Ni = (Qi, Σ, δi, qi, Fi) construct N = (Q1 ∪ Q2, Σ, δ, q1, F2) where
δ(q, a) =
– δ1(q, a) if q ∈ Q1 \ F1
– δ1(q, a) if q ∈ F1 and a ̸= ε
– δ1(q, a) ∪ {q2} if q ∈ F1 and a = ε
– δ2(q, a) if q ∈ Q2
9/17
NFAs are closed under ∪
Lemma
If N1, N2 are NFAs, there is an NFA N recognising L(N1) ∪ L(N2).
Construction ("Simulate N1 or N2")
Given NFAs Ni = (Qi, Σ, δi, qi, Fi) construct NFA N with states Q1 ∪ Q2 ∪ {q0} as in the figure. The idea is that N guesses which of N1 or N2 to simulate.
10/17
NFAs are closed under ∪ (reference slide)
Here is the construction. From Ni = (Qi, Σ, δi, qi, Fi) construct N = (Q1 ∪ Q2 ∪ {q0}, Σ, δ, q0, F1 ∪ F2) where
δ(q, a) =
– δ1(q, a) if q ∈ Q1
– δ2(q, a) if q ∈ Q2
– {q1, q2} if q = q0 and a = ε
– ∅ if q = q0 and a ̸= ε
11/17
NFAs are closed under ∗
Lemma
If N1 is an NFA, there is an NFA N recognising L(N1)∗.
Construction ("Simulate N1 and go back to the initial state")
Given NFA N1 = (Q1, Σ, δ1, q1, F1) construct NFA N with states Q1 ∪ {q0} as in the figure. The idea is that N guesses how to break the input into pieces, each of which is accepted by N1.
12/17
NFAs are closed under ∗ (reference slide)
Here is the construction. From N1 = (Q1, Σ, δ1, q1, F1) construct N = (Q1 ∪ {q0}, Σ, δ, q0, F1 ∪ {q0}) where
δ(q, a) =
– δ1(q, a) if (q ∈ Q1 \ F1) or (q ∈ F1 and a ̸= ε)
– δ1(q, a) ∪ {q1} if q ∈ F1 and a = ε
– {q1} if q = q0 and a = ε
– ∅ if q = q0 and a ̸= ε
13/17
From RE to NFA: example
Convert the regular expression ((a ∪ b)∗ ◦ a) to an NFA (using the construction just given).
14/17
COMP2022|2922
Models of Computation
Lesson 7a: NFAs, DFAs, REs have the same expressive power
Presented by
Sasha Rubin
School of Computer Science
Where are we going?
We are going to show that DFA, NFA and regular expressions have the same expressive power. I.e., they recognise the same languages.
1. From Regular Expressions to NFAs
2. From NFAs to NFAs without ε-transitions
3. From NFAs without ε-transitions to DFAs
4. From DFAs to Regular Expressions
15/17
From NFAs to NFAs without ε-transitions
Theorem
For every NFA N there is an NFA N′ without ε-transitions such that L(N) = L(N′).
Construction ("skip over ε-transitions")
Given NFA N = (Q, Σ, δ, q0, F) write s −ε∗→ t if there is a path from s to t labelled by a string from ε∗. Construct N′ = (Q, Σ, δ′, q0, F′) where
– t ∈ δ′(s, a) if, in N, there is a state q such that s −ε∗→ q −a→ t.
– s ∈ F′ if, in N, there is a state q ∈ F such that s −ε∗→ q.
The idea is that N′ skips over the ε-transitions of N.
16/17
From NFAs to NFAs without ε-transitions Example
17/17
From NFAs without ε-transitions to DFAs
Theorem
For every NFA N without ε-transitions there is a DFA M such that L(M) = L(N).
Construction ("subset construction")
Intuitively, M keeps track of all possible states that N can be in, and M accepts if at least one of these states is final.
2/26
From NFAs without ε-transitions to DFAs Example
3/26
From NFAs without ε-transitions to DFAs (Reference Slide)
Given NFA N = (Q,Σ,δ,q0,F) the subset construction builds a DFA M = (P(Q),Σ,δ′,{q0},F′) where
– for X ⊆ Q, δ′(X, a) = ∪q∈X δ(q, a),
– F′ = {X ⊆ Q : X ∩ F ̸= ∅}.
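A Python sketch of the subset construction (names mine). Unlike the reference slide, it only builds the subsets reachable from {q0}, a standard optimisation that does not change the recognised language:

```python
# Subset construction: determinise an ε-free NFA.
# delta maps (state, symbol) -> set of states.
def subset_construction(Q, Sigma, delta, q0, F):
    start = frozenset({q0})
    states, todo = {start}, [start]
    ddelta = {}
    while todo:
        X = todo.pop()
        for a in Sigma:
            # The DFA-successor of subset X on symbol a.
            Y = frozenset(p for q in X for p in delta.get((q, a), set()))
            ddelta[(X, a)] = Y
            if Y not in states:
                states.add(Y)
                todo.append(Y)
    accept = {X for X in states if X & F}  # X accepts iff it contains a final state
    return states, ddelta, start, accept

# NFA for "last letter is a":
delta = {("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q0"}}
states, dd, start, acc = subset_construction({"q0", "q1"}, {"a", "b"},
                                             delta, "q0", {"q1"})
print(len(states))  # 2 reachable subset-states: {q0} and {q0, q1}
```

Frozensets are used because DFA states must be hashable dict keys.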
4/26
From DFAs to Regular Expressions
Theorem
For every DFA M there is a regular expression R such that L(M) = L(R).
Idea:
1. We convert M into a generalised nondeterministic finite automaton (GNFA) which is like an NFA except that it can have regular expressions label the transitions.
2. We successively remove states from the automaton, until there is just one transition (from the start state to the final state).
3. We return the regular expression on this last transition.
5/26
GNFAs
– A run of a GNFA N on string w is a sequence of transitions q0 −R1→ q1 −R2→ q2 ··· −Rm→ qm in N such that w matches the regular expression R1 R2 ··· Rm.
– The run is accepting if qm ∈ F. The language recognised by N is L(N) = {w ∈ Σ∗ : w is accepted by N}.
Example
6/26
From DFAs to Regular Expressions
Generalised NFAs. We make sure that:
1. the initial state has no incoming edges.
2. there is one final state, and it has no outgoing edges.
3. there are edges from every state (that is not final) to every state (that is not initial).
We show how to translate DFAs into GNFAs, and then successively remove states from the GNFA until there is just a single edge left.
7/26
From DFAs to GNFAs
For every DFA M there is a GNFA N such that L(M) = L(N).
1. Add a new initial state and an ε-transition to the old initial state.
2. Add a final state and ε-transitions to it from all the old final states.
3. Add transitions for every pair of states, including from a state to itself (except transitions leaving the new final state or entering the new initial state).
8/26
From DFAs to GNFAs
Example
9/26
Removing states from GNFAs
For every GNFA N with > 2 states there is a GNFA N′ with one less state such that L(N) = L(N′).
1. Pick a state q to eliminate.
2. For every pair of remaining states (s, t), replace the label from s to t by the regular expression
(Rs,t ∪ (Rs,q ◦ (Rq,q)∗ ◦ Rq,t)),
where Rx,y is the regular expression labelling the transition from state x to state y.
10/26
Removing states from GNFAs
Example
11/26
Summary
1. The following models of computation describe the same languages: DFAs, NFAs, Regular Expressions.
2. The regular languages are closed under the Boolean operations union, intersection, complementation, as well as concatenation and Kleene star.
12/26
Decision problems about regular languages
We now look at some decision problems which take automata/regular-expressions as input.
13/26
Decision Problems about DFAs
Input: DFA M and string w
Output: 1 if w ∈ L(M), and 0 otherwise.
Input: DFA M
Output: 1 if L(M) = ∅, and 0 otherwise.
Input: DFAs M1,M2
Output: 1 if L(M1) = L(M2), and 0 otherwise.
14/26
Emptiness problem for DFAs
Input: DFA M
Output: 1 if L(M) = ∅, and 0 otherwise.
Algorithm (“mark reachable states”)
1. Mark the states using the following procedure.
– Mark the initial state.
– If q is marked and q −a→ p (for some a ∈ Σ), then mark p.
– Stop when no new states are marked.
2. Return 0 if a final state is marked, else return 1.
Marking a state can be thought of as putting it in a set called “marked”.
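The marking procedure is a graph reachability search. A Python sketch (names mine):

```python
# Emptiness check by marking reachable states; delta is a dict on (state, symbol).
def is_empty(Q, Sigma, delta, q0, F):
    marked, todo = {q0}, [q0]   # mark the initial state
    while todo:
        q = todo.pop()
        for a in Sigma:
            p = delta[(q, a)]
            if p not in marked:  # a newly marked state
                marked.add(p)
                todo.append(p)
    return not (marked & F)      # L(M) = ∅ iff no final state is marked

# DFA whose accept state t is unreachable from the start state s: L(M) = ∅.
delta = {("s", "a"): "s", ("t", "a"): "t"}
print(is_empty({"s", "t"}, {"a"}, delta, "s", {"t"}))  # True
```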
15/26
Equivalence problem for DFAs
Input: DFAs M1,M2
Output: 1 if L(M1) = L(M2), and 0 otherwise.
Algorithm
1. Form DFA M recognising
(L(M1) \ L(M2)) ∪ (L(M2) \ L(M1)).
– Note that L(M1) = L(M2) iff L(M) = ∅.
2. So we just need a way to check if L(M) = ∅. Use the algorithm for the Emptiness problem for DFAs.
16/26
Decision Problems about Regular Expressions
Input: Regular expression R and string w Output: 1 if w ∈ L(R), and 0 otherwise.
Input: Regular expressions R1, R2
Output: 1 if L(R1) = L(R2), and 0 otherwise.
17/26
COMP2022|2922 Models of Computation
Lesson 7b: Non-regularity Presented by
Sasha Rubin
School of Computer Science
To show that a language is regular, it is sufficient to find a DFA, NFA, or Regular Expression for it. To show that a language L is not regular, one must show that there is no DFA that recognises it. This can be done using a fooling argument.
1. Assume there is a DFA M recognising the language L.
2. Reason about the transitions of this DFA and find a string not in the language which it accepts (or a string in the language which it rejects).
3. This contradicts the assumption that M recognises L.
4. Conclude that there is no such DFA.
18/26
Pigeonhole principle (PHP)
If there are n holes and > n objects to put in the holes, then at least one hole must get at least two objects.
How will we apply this?
19/26
L = {a^i b^i : i ≥ 0} is not regular
Proof
1. Assume there is a DFA M that recognises L.
2. Write f(w) for the state that M reaches after reading input w.
3. By PHP, there exist n ̸= m such that f(a^n) = f(a^m).
4. But a^n b^n ∈ L means that the path from state f(a^n) labeled b^n ends in a final state.
5. So there is a path from the initial state to a final state labeled a^m b^n. So a^m b^n is accepted by M.
6. But since m ̸= n, M accepts a word not in L.
7. This contradicts the assumption that M recognises L.
20/26
L = {ww : w ∈ {a, b}∗} is not regular
Proof
1. Assume there is a DFA M that recognises L.
2. Write f(w) for the state that M reaches after reading input w.
3. By PHP, there exists u ̸= v such that |u| = |v| and f(u) = f(v).
4. But uu ∈ L means that the path from state f(u) labeled u ends in a final state.
5. So there is a path from the initial state to a final state labeled vu. So vu is accepted by M.
6. But since u ̸= v and |u| = |v|, vu ̸∈ L.
7. This contradicts the assumption that M recognises L.
21/26
L = {u ∈ {a, b}∗ : |u| is a power of 2} is not regular
Proof
1. Assume there is a DFA M that recognises L.
2. Write f(w) for the state that M reaches after reading input w.
3. By PHP, there exist u, v such that f(u) = f(v) and |u| = 2^n, |v| = 2^m, n < m.
4. But uu ∈ L since 2^n + 2^n = 2^(n+1), so the path from state f(u) labeled u ends in a final state.
5. So there is a path from the initial state to a final state labeled vu. So vu is accepted by M.
6. But vu is not in L since 2^m < |vu| = 2^m + 2^n < 2^m + 2^m = 2^(m+1). So M accepts a word not in L.
7. This contradicts the assumption that M recognises L.
22/26
Other techniques
Once we know that a language is not regular, we can deduce that other languages are not regular using the closure properties of regular languages.
23/26
Showing a language is not regular
Example
We have seen that L = {a^i b^i : i ≥ 0} is not regular. Use this fact to prove that the set L′ of strings with the same number of as as bs is not regular.
1. Assume L′ is regular.
2. Then L′ ∩ L(a∗b∗) is regular since the intersection of regular languages is regular.
3. But L = L′ ∩ L(a∗b∗), which we know is not regular.
4. This contradicts the assumption that L′ is regular.
5. So L′ is not regular.
24/26
Showing a language is not regular
Example
We have seen that the language L′ consisting of strings over {a, b} with the same number of as as bs is not regular. Use this fact to prove that the language L′′ consisting of strings over {a, b} with a different number of as as bs is not regular.
1. Assume L′′ is regular.
2. Then {a, b}∗ \ L′′ is regular since the complement of a regular language is regular.
3. But L′ = {a, b}∗ \ L′′, which we know is not regular.
4. This contradicts the assumption that L′′ is regular.
5. So L′′ is not regular.
25/26
Where are we going?
Next week we start learning about more powerful models of computation that can recognise non-regular languages. We start with context-free grammars.
26/26
COMP2022|2922 Models of Computation
Lesson 8: Context-free Grammars Presented by
Sasha Rubin
School of Computer Science
Course topics
Here are some important mathematical models of computation: 1. Propositional Logic, Predicate Logic
2. Finite Automata, Regular Expressions, Regular Grammars 3. Pushdown Automata, Context-free grammars
4. Turing Machines, Lambda Calculus
1/37
What questions about computation can be answered with mathematical models?
2/37
What problems can be solved by programs with very limited memory?
– E.g., Basic pattern matching, lexical analysis
– COMP3109:Programming Languages and Paradigms – Model: Finite automata, regular grammars
3/37
What about parentheses matching?
{0^n 1^n : n ≥ 0} is not regular.
My webpage!
4/37
What problems can be solved by programs with very limited memory but allowing recursion?
– E.g., Sophisticated string processing, parsing
– COMP3109:Programming Languages and Paradigms – Model: Context-free grammars, Pushdown automata.
5/37
Theme
– State-machines recognise languages. – Grammars generate languages.
6/37
Grammars in a nutshell
– A grammar is a set of rules which can be used to generate/derive strings
– The language of the grammar is all strings that can be derived by the grammar
7/37
Context-free grammars
Program Syntax
statements: statement+
statement : compound_stmt | simple_stmt
Document Description Definition
Our Syntax
S → T S
T → C | D
8/37
Context-free grammars
S → 0S1
S → T
T → ε
This grammar generates the language L = {0^n 1^n | n ≥ 0}. How does it derive 000111?
9/37
Context-free grammars
S → NounPhrase VerbPhrase
NounPhrase → the Noun
VerbPhrase → Verb NounPhrase
Noun → girl | ball
Verb → likes | sees
This grammar generates the language:
{ the girl likes the girl, the girl sees the girl, the ball likes the girl, the ball sees the girl,
the girl likes the ball, the girl sees the ball, the ball likes the ball,
the ball sees the ball }
10/37
Context-free grammars
Anatomy of a grammar:
– Variables aka Non-terminals: used to generate strings
– Start variable: variable used to start every derivation
– Terminals: alphabet of symbols making up strings in the language
– Rules aka Production rules: A→u
where A is a variable and u is a string of variables and terminals.
A variable can have many rules:
Noun → girl
Noun → ball
They can be written together:
Noun → girl | ball
11/37
Context-free grammars
S → a | b | ε | ∅ | (S ∪ S) | (S ◦ S) | S∗
– Variable S
– Terminals a,b,ε,∅,(,),∪,◦,∗
12/37
Context-free grammars
S → p | q | (S ∧ S) | (S ∨ S) | ¬S
– Variable S
– Terminals p,q,(,),∧,∨,¬
13/37
Context-free grammars
Notation
– A,B,C,… and S are variables
– S is the start variable
– a,b,c,… are terminals
– u, v, w, . . . are strings of terminals and variables
14/37
Context-free grammars: Syntax
Definition
A context-free grammar G is a 4-tuple (V, Σ, R, S) where:
– V is a finite set of variables
– Σ is a finite set of terminals
– R is a finite set of rules in the form A → v where A ∈ V and v ∈ (V ∪ Σ ∪ {ε})⋆
– S ∈ V is a special variable called the start variable
15/37
Context-free grammars: Semantics
Definition
– If A → w is a rule and u, v are strings of terminals and variables then uAv ⇒ uwv. The symbol ⇒ is read yields.
– u ⇒∗ v means u yields v in zero or more steps. The symbol ⇒∗ is read derives.
– The language generated by G is the set of strings over Σ that are derived from the start variable S:
L(G) = {u ∈ Σ∗ : S ⇒∗ u}
Sometimes we will use u ⇒+ v to mean that u yields v in at least one step.
16/37
Derivation of a string
– Begin with the start variable
– Repeatedly replace one variable with the right hand side of one of its productions
– … until the string is composed only of terminal symbols
Example, derivation of 000111 from the grammar S → 0S1 | ε:
S ⇒ 0S1 ⇒ 00S11 ⇒ 000S111 ⇒ 000111
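A derivation like this can be mechanised: store the rules in a dict and repeatedly rewrite the leftmost variable. A Python sketch (names mine) reproducing the derivation of 000111:

```python
# Leftmost derivation with the grammar S -> 0S1 | ε (ε as the empty string).
rules = {"S": ["0S1", ""]}

def leftmost_step(sentential, rule_index):
    # Replace the leftmost variable using the chosen right-hand side.
    for i, sym in enumerate(sentential):
        if sym in rules:
            return sentential[:i] + rules[sym][rule_index] + sentential[i + 1:]
    return sentential  # no variable left: the string is fully derived

s = "S"
for choice in [0, 0, 0, 1]:  # apply S -> 0S1 three times, then S -> ε
    s = leftmost_step(s, choice)
print(s)  # 000111
```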
17/37
Leftmost and Rightmost Derivations
Leftmost derivation: always derive the leftmost variable first Rightmost derivation: always derive the rightmost variable first
Example: “the girl sees the ball”
S ⇒ NounPhrase VerbPhrase
⇒ the Noun VerbPhrase
⇒ the girl VerbPhrase
⇒ the girl Verb NounPhrase
⇒ the girl sees NounPhrase
⇒ the girl sees the Noun
⇒ the girl sees the ball

S ⇒ NounPhrase VerbPhrase
⇒ NounPhrase Verb NounPhrase
⇒ NounPhrase Verb the Noun
⇒ NounPhrase Verb the ball
⇒ NounPhrase sees the ball
⇒ the Noun sees the ball
⇒ the girl sees the ball
18/37
Constructing grammars
Let M and N be two languages whose grammars have disjoint sets of non-terminals (rename them if necessary). Let SM and SN be their start variables. Then we can construct grammars generating the following languages, with a new start variable S:
– Union: the grammar for M ∪ N starts with S → SM | SN
– Concatenation: the grammar for MN starts with S → SM SN
– Star closure: the grammar for M⋆ starts with S → SM S | ε
All other productions remain unchanged (aside for renaming of variables as needed)
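A sketch of these three constructions with grammars stored as dicts mapping a variable to its list of right-hand sides (the representation and names are mine; variables are single characters so sentential forms can be plain strings):

```python
# Union/concatenation/star of grammars given as rule dicts with start symbols.
# Assumes the two grammars use disjoint variable names.
def union_grammar(RM, SM, RN, SN):
    R = {"S": [SM, SN], **RM, **RN}   # S -> SM | SN
    return R, "S"

def concat_grammar(RM, SM, RN, SN):
    R = {"S": [SM + SN], **RM, **RN}  # S -> SM SN
    return R, "S"

def star_grammar(RM, SM):
    R = {"S": [SM + "S", ""], **RM}   # S -> SM S | ε
    return R, "S"

RM = {"M": ["aM", ""]}  # M -> aM | ε, generating a*
RN = {"N": ["bN", ""]}  # N -> bN | ε, generating b*
print(union_grammar(RM, "M", RN, "N")[0]["S"])  # ['M', 'N']
```

All other productions are carried over unchanged, exactly as the slide says.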
19/37
Using the union rule
Let L = {ε, a, b, aa, bb, …, a^n, b^n, …}
Then L = M ∪ N where M = {a^n | n ≥ 0}, N = {b^n | n ≥ 0}
So a grammar GM of M is SM → ε | aSM and a grammar GN of N is SN → ε | bSN
Using the union rule we get:
S → SM | SN
SM → ε | aSM
SN → ε | bSN
20/37
Using the concatenation rule
Let L = {a^m b^n | m ≥ 0, n ≥ 0}
Then L = MN where M = {a^m | m ≥ 0}, N = {b^n | n ≥ 0}
So a grammar GM of M is SM → ε | aSM and a grammar GN of N is SN → ε | bSN
Using the concatenation rule we get:
S → SM SN
SM → ε | aSM
SN → ε | bSN
21/37
Using the star closure rule
Let L be strings consisting of 0 or more occurrences of aa or bb, i.e. (aa | bb)⋆
Then L = M⋆ where M = {aa, bb}
So a grammar GM of M is SM → aa | bb
Using the star closure rule we get:
S → SM S | ε
SM → aa | bb
22/37
Context-Free Languages
Definition
A language is context-free if it is generated by a CFG.
Facts.
– Every regular language is context-free (why?)
– The union of two CFL is also context-free
– The concatenation of two CFL is also context-free – The star closure of a CFL is also context-free
23/37
Example
Consider the grammar G:
S → AB
A → ε | aA
B → ε | bB
What is L(G)?
S ⇒ AB ⇒ aAB
⇒∗ aaaaAB
⇒ aaaaB
⇒ aaaabB
⇒∗ aaaabbbbbbB ⇒ aaaabbbbbb
i.e. L(G) = L(a⋆b⋆) = {a^n b^m | n ≥ 0, m ≥ 0}
24/37
Parsing
The problem of parsing is determining how the grammar generates a given string.
25/37
Parse Tree
A parse tree is a tree labelled by symbols from the CFG – root = the start variable
– interior node = a variable
– leaf node = a terminal or ε
– children of X = the right hand side of a production rule for X, in order
– Example parse tree for “0011” in S → 0S1 | 01
– An in-order traversal of the leaf nodes retrieves the string
[Parse tree: S at the root with children 0, S, 1; the inner S has children 0, 1.]
26/37
Parse Tree or Derivation Tree
The parse tree defines the meaning of a string in the grammar’s language.
S → S · S
S → x | y | z
[Parse tree for x · y · z in which the root S splits into S · S and the right child splits again into S · S, with leaves x, y, z.]
This parse tree says that the expression means x · (y · z) rather than ((x · y) · z).
27/37
Natural Language Processing (NLP) example
S → NounPhrase VerbPhrase
NounPhrase → ComplexNoun | ComplexNoun PrepPhrase
VerbPhrase → ComplexVerb | ComplexVerb PrepPhrase
PrepPhrase → Prep ComplexNoun
ComplexNoun → Article Noun
ComplexVerb → Verb | Verb NounPhrase
Article → a | the
Noun → girl | dog | stick | ball
Verb → chases | sees
Prep → with
28/37
Ambiguity: example
Ambiguity: several meanings for the same sentence.
“The girl chases the dog with a stick” has two leftmost derivations
Sentence ⇒ NounPhrase VerbPhrase
⇒ ComplexNoun VerbPhrase
⇒ Article Noun VerbPhrase
⇒ the Noun VerbPhrase
⇒ the girl VerbPhrase
⇒ the girl ComplexVerb
⇒ the girl Verb NounPhrase
⇒ the girl chases NounPhrase
⇒ the girl chases ComplexNoun PrepPhrase
⇒ the girl chases Article Noun PrepPhrase
⇒ the girl chases the Noun PrepPhrase
⇒ the girl chases the dog PrepPhrase
⇒ the girl chases the dog Prep ComplexNoun
⇒ the girl chases the dog with ComplexNoun
⇒ the girl chases the dog with Article Noun
⇒ the girl chases the dog with a Noun
⇒ the girl chases the dog with a stick

Sentence ⇒ NounPhrase VerbPhrase
⇒ ComplexNoun VerbPhrase
⇒ Article Noun VerbPhrase
⇒ the Noun VerbPhrase
⇒ the girl VerbPhrase
⇒ the girl ComplexVerb PrepPhrase
⇒ the girl Verb NounPhrase PrepPhrase
⇒ the girl chases NounPhrase PrepPhrase
⇒ the girl chases Article Noun PrepPhrase
⇒ the girl chases the Noun PrepPhrase
⇒ the girl chases the dog PrepPhrase
⇒ the girl chases the dog Prep ComplexNoun
⇒ the girl chases the dog with ComplexNoun
⇒ the girl chases the dog with Article Noun
⇒ the girl chases the dog with a Noun
⇒ the girl chases the dog with a stick
29/37
Ambiguity: example
Ambiguity: several meanings for the same sentence.
“The girl chases the dog with a stick” has two leftmost derivations
Sentence ⇒∗ the girl VerbPhrase
⇒ the girl ComplexVerb
⇒∗ the girl chases the dog with a stick

Sentence ⇒∗ the girl VerbPhrase
⇒ the girl ComplexVerb PrepPhrase
⇒∗ the girl chases the dog with a stick
Who has the stick?
30/37
First Leftmost Derivation Tree
[Parse tree in which the PrepPhrase attaches inside the object NounPhrase: the girl chases (the dog with the stick), i.e., the dog has the stick.]
31/37
Second Leftmost Derivation Tree
[Parse tree in which the PrepPhrase attaches to the VerbPhrase: the girl (chases the dog) with a stick, i.e., the girl has the stick.]
32/37
Ambiguous Grammars
Definition
– A string is ambiguous on a given grammar if it has two different parse trees.
– A grammar is ambiguous if it contains an ambiguous string.
Good to know
– Each parse tree corresponds to one leftmost derivation.
– So, a string is ambiguous if it has two leftmost derivations.
– The same thing is true of rightmost derivations.
33/37
Is this grammar ambiguous?
E → E − E
E → a | b | c
Rightmost derivations of a − b − c:
E ⇒ E − E ⇒ E − c ⇒ E − E − c ⇒ E − b − c ⇒ a − b − c, i.e. (a − b) − c
E ⇒ E − E ⇒ E − E − E ⇒ E − E − c ⇒ E − b − c ⇒ a − b − c, i.e. a − (b − c)
34/37
Is this grammar ambiguous?
E → E − E
E → a | b | c
The two rightmost derivations of a − b − c as parse trees:
[Parse tree grouping (a − b) − c]   [Parse tree grouping a − (b − c)]
35/37
Removing ambiguity (example)
Suppose we want a − b − c to always mean (a − b) − c. Introduce a new nonterminal symbol T:
E → E − T | T
T → a | b | c
Now the only rightmost derivation of a − b − c is:
E ⇒ E − T
  ⇒ E − c
  ⇒ E − T − c
  ⇒ E − b − c
  ⇒ T − b − c
  ⇒ a − b − c

(The corresponding parse tree nests E − T on the left, so the expression groups as (a − b) − c.)
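The grouping enforced by the new grammar can be sanity-checked with a small evaluator (a sketch, not part of the slides): parsing E → E − T | T left-recursively corresponds to folding from the left, so a − b − c evaluates as (a − b) − c. The numeric values for a, b, c are assumptions chosen for illustration.

```python
# Evaluator for the unambiguous grammar  E -> E - T | T,  T -> a | b | c.
# The left recursion is handled iteratively, which makes '-' left-associative.
VALUES = {"a": 10, "b": 3, "c": 2}   # assumed values, for illustration only

def evaluate(expr):
    tokens = list(expr)
    result = VALUES[tokens[0]]           # the first T
    i = 1
    while i < len(tokens):
        assert tokens[i] == "-"
        result -= VALUES[tokens[i + 1]]  # fold each further T in from the left
        i += 2
    return result

print(evaluate("a-b-c"))  # (10 - 3) - 2 = 5, not 10 - (3 - 2) = 9
```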
36/37
Types of grammars
The Chomsky Hierarchy consists of 4 classes of grammars, depending on the type of production rules that they allow:
Type 0 (unrestricted): χ → α
Type 1 (context-sensitive): uAv → uχv (allowing S → ε)
Type 2 (context-free): A → u
Type 3 (regular): A → cB and A → d

– χ string of one or more variables and terminals, excluding ε
– α string of variables and terminals, including ε
– u, v strings of variables and terminals, including ε
– A,B variables
– c, d terminals
37/37
COMP2022|2922 Models of Computation
Lesson 9a: Chomsky Normal Form Presented by
Sasha Rubin
School of Computer Science
– A context-free grammar (CFG) generates strings by rewriting.
– Today we will see how to tell whether a given CFG generates a given string.
– Here is the decision problem:
Input: a CFG G and string w.
Output: 1 if the string w is generated by G, and 0 otherwise.
– This basic problem is solved by compilers and parsers.
1/32
Possible approaches…
1. Systematically search through all derivations (or all parse-trees) until you find one that derives w.
– Try all i-step derivations for i = 1, 2, 3, etc.
– Problem: When to stop and declare “the string w cannot be
derived from G”?
– This problem can be fixed, but the resulting algorithm takes
exponential time in the worst case, i.e., is very slow.
2. Use a table-filling algorithm (aka tabulation, aka dynamic programming).
– Systematically compute, for every substring v of w, which non-terminals of G derive v (if any).
– Then check if the start state S is in the set computed for the whole string w.
– The resulting algorithm takes polynomial time in the worst case, i.e., is acceptably fast.
– This is the approach we will take today! It is called the CYK algorithm
2/32
Input: a CFG G and string w.
Output: 1 if the string w is generated by G, and 0 otherwise.
Plan
1. Convert G into a grammar G′ such that L(G) = L(G′) and G′ is in a special normal form called Chomsky Normal Form.
2. Apply the table-filling algorithm to G′ and w, and read off the answer.
Skeptic
Q. Why do we do the normal form?
A. In order to make the table small and easier to implement.
E.g., the parse-trees of a grammar in CNF are binary trees!
3/32
Chomsky Normal Form
Definition
A grammar G is in Chomsky Normal Form (CNF) if every rule is of one of these forms:
– A → BC (A, B, C are any variables, except that neither B nor C is the start symbol)
– A → a (A is any variable and a is a terminal)
– S → ε (where S is the start symbol).
In the next slides, we will give a 5-step algorithm that transforms every CFG into an equivalent one in Chomsky Normal Form:
1. START: Eliminate the start symbol from the RHS of all rules
2. TERM: Eliminate rules with terminals, except for rules A → a
3. BIN: Eliminate rules with more than two variables
4. DEL: Eliminate epsilon productions
5. UNIT: Eliminate unit rules
4/32
Chomsky Normal Form: algorithm
1. Eliminate the start symbol from the RHS of all rules
2. Eliminate rules with terminals, except for rules A → a
3. Eliminate rules with more than two variables
4. Eliminate epsilon productions
5. Eliminate unit rules
Add the new start symbol S0 and the rule S0 → S
5/32
Chomsky Normal Form: algorithm
1. Eliminate the start symbol from the RHS of all rules
2. Eliminate rules with terminals, except for rules A → a
3. Eliminate rules with more than two variables
4. Eliminate epsilon productions
5. Eliminate unit rules
– Replace every terminal a on the RHS of a rule (that is not of the form A → a) by the new variable Na.
– For each such terminal a create the new rule Na → a.
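As a rough Python sketch of the TERM step (the representation of rules as (head, body) pairs, with bodies as tuples of symbols, is an assumption, not from the slides):

```python
# TERM: replace each terminal a on a long RHS by a fresh variable N_a,
# and add the rule N_a -> a.  Rules are (head, body) pairs.
def eliminate_terminals(rules, variables):
    new_rules = []
    fresh = {}                            # terminal -> its fresh variable
    for head, body in rules:
        if len(body) == 1:                # rules A -> a are already allowed
            new_rules.append((head, body))
            continue
        patched = []
        for sym in body:
            if sym in variables:
                patched.append(sym)
            else:                         # a terminal inside a long RHS
                if sym not in fresh:
                    fresh[sym] = f"N_{sym}"
                    new_rules.append((fresh[sym], (sym,)))
                patched.append(fresh[sym])
        new_rules.append((head, tuple(patched)))
    return new_rules

print(eliminate_terminals([("S", ("a", "B"))], {"S", "B"}))
# [('N_a', ('a',)), ('S', ('N_a', 'B'))]
```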
6/32
Chomsky Normal Form: algorithm
1. Eliminate the start symbol from the RHS of all rules
2. Eliminate rules with terminals, except for rules A → a
3. Eliminate rules with more than two variables
4. Eliminate epsilon productions
5. Eliminate unit rules
For every rule of the form A → X1X2 …Xn (where n > 2), delete it and create new variables A1, A2, . . . , An−2 and rules:
A → X1A1
A1 → X2A2
⋮
An−3 → Xn−2An−2
An−2 → Xn−1Xn
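The BIN step can be sketched as follows (again assuming rules as (head, body) pairs; the naming scheme for the fresh variables is an assumption):

```python
# BIN: replace A -> X1 X2 ... Xn (n > 2) by a chain of binary rules
# through fresh variables A_1, ..., A_{n-2}.
def binarize(rules):
    new_rules = []
    for head, body in rules:
        if len(body) <= 2:
            new_rules.append((head, body))
            continue
        prev = head
        for i, sym in enumerate(body[:-2]):
            fresh = f"{head}_{i + 1}"        # fresh variable name (assumption)
            new_rules.append((prev, (sym, fresh)))
            prev = fresh
        new_rules.append((prev, (body[-2], body[-1])))
    return new_rules

print(binarize([("S", ("A", "B", "C", "D"))]))
# [('S', ('A', 'S_1')), ('S_1', ('B', 'S_2')), ('S_2', ('C', 'D'))]
```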
7/32
Chomsky Normal Form: algorithm
1. Eliminate the start symbol from the RHS of all rules
2. Eliminate rules with terminals, except for rules A → a
3. Eliminate rules with more than two variables
4. Eliminate epsilon productions
5. Eliminate unit rules
For every rule of the form U → ε (except S0 → ε)
– Remove the rule.
– For each rule A → α containing U, add every possible rule A → α′, where α′ is α with one or more occurrences of U removed; only add the rule A → ε if it has not already been removed.
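A sketch of one DEL round in Python, removing a single nullable variable U per call. The (head, body)-pair representation is an assumption, and the sketch omits the bookkeeping of ε-rules that were removed in earlier rounds.

```python
from itertools import combinations

# DEL (one round): drop U -> eps and, for each rule mentioning U, add every
# variant with one or more occurrences of U deleted.
def remove_epsilon(rules, U):
    new_rules = []
    for head, body in rules:
        if head == U and body == ():
            continue                      # drop U -> eps itself
        positions = [i for i, s in enumerate(body) if s == U]
        new_rules.append((head, body))
        for r in range(1, len(positions) + 1):
            for drop in combinations(positions, r):
                variant = tuple(s for i, s in enumerate(body) if i not in drop)
                if (head, variant) not in new_rules:
                    new_rules.append((head, variant))
    return new_rules

print(remove_epsilon([("S", ("A", "b", "A")), ("A", ())], "A"))
# [('S', ('A', 'b', 'A')), ('S', ('b', 'A')), ('S', ('A', 'b')), ('S', ('b',))]
```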
8/32
Chomsky Normal Form: algorithm
1. Eliminate the start symbol from the RHS of all rules
2. Eliminate rules with terminals, except for rules A → a
3. Eliminate rules with more than two variables
4. Eliminate epsilon productions
5. Eliminate unit rules
For each rule of the form A → B:
– Remove the rule A → B
– For each rule of the form B → α, add the new rule A → α (unless it was previously removed)
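The UNIT step can be sketched like this (a sketch under the assumed (head, body)-pair representation; the `removed` set implements the "unless it was previously removed" check, which also guarantees termination on cycles such as A → B, B → A):

```python
# UNIT: repeatedly pick a unit rule A -> B, discard it, and copy each of
# B's rules over to A.
def eliminate_unit_rules(rules, variables):
    rules = set(rules)
    removed = set()                       # unit rules already handled
    while True:
        unit = next((r for r in rules
                     if len(r[1]) == 1 and r[1][0] in variables), None)
        if unit is None:
            return rules
        A, (B,) = unit
        rules.discard(unit)
        removed.add(unit)
        for head, body in list(rules):
            if head == B:
                new = (A, body)
                if new not in removed:    # don't re-add a removed unit rule
                    rules.add(new)

result = eliminate_unit_rules({("A", ("B",)), ("B", ("b",)), ("B", ("X", "Y"))},
                              {"A", "B", "X", "Y"})
print(sorted(result))
# [('A', ('X', 'Y')), ('A', ('b',)), ('B', ('X', 'Y')), ('B', ('b',))]
```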
9/32
Chomsky Normal Form: example
S → ASA | aB
A → B | S
B → b | ε
Step 1 (START): Eliminate start symbol from the RHS of all rules:
S0 → S
S → ASA | aB
A → B | S
B → b | ε
10/32
Chomsky Normal Form: example
S0 → S
S → ASA | aB
A → B | S
B → b | ε
Step 2 (TERM): Eliminate rules with terminals, except for rules A → a:
S0 → S
S → ASA | NaB
A → B | S
B → b | ε
Na → a
11/32
Chomsky Normal Form: example
S0 → S
S → ASA | NaB
A → B | S
B → b | ε
Na → a
Step 3 (BIN): Eliminate rules with more than two variables:
S0 → S
S → AS1 | NaB
S1 → SA
A → B | S
B → b | ε
Na → a
12/32
Chomsky Normal Form: example
S0 → S
S → AS1 | NaB
S1 → SA
A → B | S
B → b | ε
Na → a
Step 4 (DEL): Eliminate epsilon production B → ε
S0 → S
S → AS1 | NaB | Na
S1 → SA
A → B | S | ε
B → b
Na → a
13/32
Chomsky Normal Form: example
S0 → S
S → AS1 | NaB | Na
S1 → SA
A → B | S | ε
B → b
Na → a
Step 4 (DEL): Eliminate epsilon production A → ε
S0 → S
S → AS1 | NaB | Na | S1
S1 → SA | S
A → B | S
B → b
Na → a
14/32
Chomsky Normal Form: example
S0 → S
S → AS1 | NaB | Na | S1
S1 → SA | S
A → B | S
B → b
Na → a
Step 5 (UNIT): Eliminate unit rules S → Na, A → B, S → S1
S0 → S
S → AS1 | NaB | a | SA (and the useless S → S)
S1 → SA | S
A → b | S
B → b
Na → a
15/32
Chomsky Normal Form: example
S0 → S
S → AS1 | NaB | a | SA
S1 → SA | S
A → b | S
B → b
Na → a
Step 5 (UNIT): Eliminate unit rules S0 → S, S1 → S, A → S
S0 → AS1 | NaB | a | SA
S → AS1 | NaB | a | SA
S1 → AS1 | NaB | a | SA
A → b | AS1 | NaB | a | SA
B → b
Na → a
16/32
Chomsky Normal Form: example
All done!
S0 → AS1 | NaB | a | SA
S → AS1 | NaB | a | SA
S1 → AS1 | NaB | a | SA
A → b | AS1 | NaB | a | SA
B → b
Na → a
17/32
COMP2022|2922
Models of Computation
Lesson 9b: Membership problem for CFGs in CNF
Presented by
Sasha Rubin
School of Computer Science
Membership problem for CFG in CNF
Input: a CFG G that is in CNF, and string w.
Output: 1 if the string w is generated by G, and 0 otherwise.
Table-filling algorithm (aka Dynamic Programming)
– This approach accumulates information about smaller subproblems to solve the larger problem (similar to divide and conquer)
– The table records the solution to the subproblems, so we only need to solve each subproblem once (aka memoisation)
– Main steps: define the subproblems, find the recursion, make sure you solve each subproblem once.
– You will see this again in COMP3027: Algorithm Design
– The algorithm we will see is known as the CYK algorithm (Cocke–Younger–Kasami).
18/32
What are the subproblems?
Idea
– A parse tree for a string w is built from a root, a left subtree, and a right subtree (remember that G is in CNF).
– The left (right) subtree is a parse tree of a prefix (suffix) of w.
– So, the immediate subproblems for w are computing which variables generate prefixes/suffixes of w.
– So, the subproblems for w are computing which variables generate which substrings of w.
19/32
What are the entries in the table?
Given G in CNF, and a non-empty string w = w1w2 · · · wn:
– Write table(i,j) for the set of variables A that generate the
substring wiwi+1 …wj.
– The algorithm will compute table(i, j) for all 1 ≤ i ≤ j ≤ n.
– Once the table is computed, just check if S ∈ table(1, n). If yes, then the grammar generates w, and if no, then it doesn’t.
20/32
Computing the table recursively
Compute table(i,j) using the following recursive procedure:
1. If i = j then table(i, j) is the set of variables A such that
A → wi is a rule of the grammar.
2. If i < j then table(i, j) is the set of variables A for which there is a rule A → BC and an integer k with i ≤ k < j such that B ∈ table(i, k) and C ∈ table(k + 1, j).
Q: Why is the recursion correct?
21/32
Computing the table recursively
Compute table(i,j) using the following recursive procedure:
1. If i = j then table(i, j) is the set of variables A such that
A → wi is a rule of the grammar.
2. If i < j then table(i, j) is the set of variables A for which there is a rule A → BC and an integer k with i ≤ k < j such that B ∈ table(i, k) and C ∈ table(k + 1, j).
Q: Why does this recursion stop?
– At each step, we call the procedure on “smaller” problems.
– In what sense are table(i, k) and table(k + 1, j) smaller than table(i, j)? The intervals [i, k] and [k + 1, j] are shorter than the interval [i, j].
22/32
We want to avoid computing table entries more than once.
– So, before making a recursive call just check if the value has already been computed. If yes, use that value and don’t recurse. If not, recurse.
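The memoised recursion can be sketched in Python as follows. The grammar representation (a dict for the rules A → a and a list of triples for the rules A → B C) and the example grammar, a CNF grammar for {aⁿbⁿ : n ≥ 1}, are assumptions for illustration; `lru_cache` does the "check if the value has already been computed" step for us.

```python
from functools import lru_cache

# Memoised CYK.  unit[a] = set of variables A with a rule A -> a;
# binary = list of triples (A, B, C) meaning A -> B C.
def cyk(w, unit, binary, start="S"):
    @lru_cache(maxsize=None)
    def table(i, j):            # variables deriving w_i ... w_j (1-based)
        if i == j:
            return frozenset(unit.get(w[i - 1], set()))
        found = set()
        for k in range(i, j):   # split into w_i..w_k and w_{k+1}..w_j
            left, right = table(i, k), table(k + 1, j)
            for A, B, C in binary:
                if B in left and C in right:
                    found.add(A)
        return frozenset(found)
    return start in table(1, len(w))

# Assumed example grammar in CNF for {a^n b^n : n >= 1}:
# S -> A T | A B,  T -> S B,  A -> a,  B -> b
unit = {"a": {"A"}, "b": {"B"}}
binary = [("S", "A", "T"), ("S", "A", "B"), ("T", "S", "B")]
print(cyk("aabb", unit, binary))  # True
print(cyk("aab", unit, binary))   # False
```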
23/32
Can we write this with a bunch of loops?
Yes, but it is harder to read. See Sipser (edition 3) Theorem 7.16.
(pseudocode from “Introduction to the theory of computation” by Michael Sipser)
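In the same assumed grammar representation (unit rules as a dict, binary rules as triples), a loop-based version along the lines of Sipser's pseudocode might look like this. It fills the table row by row, by increasing substring length, so every entry a cell depends on is ready before the cell is computed.

```python
# Loop-based CYK.  unit[a] = variables A with A -> a;
# binary = list of triples (A, B, C) meaning A -> B C.
def cyk_iterative(w, unit, binary, start="S"):
    n = len(w)
    # table[i][j] holds the variables deriving w_i ... w_j (1-based)
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):                 # substrings of length 1
        table[i][i] = set(unit.get(w[i - 1], set()))
    for length in range(2, n + 1):            # then lengths 2, 3, ..., n
        for i in range(1, n - length + 2):
            j = i + length - 1
            for k in range(i, j):             # try every split point
                for A, B, C in binary:
                    if B in table[i][k] and C in table[k + 1][j]:
                        table[i][j].add(A)
    return start in table[1][n]

# Assumed example grammar in CNF for {a^n b^n : n >= 1}:
unit = {"a": {"A"}, "b": {"B"}}
binary = [("S", "A", "T"), ("S", "A", "B"), ("T", "S", "B")]
print(cyk_iterative("aabb", unit, binary))  # True
```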
24/32
Can we fill the table by hand?
Yes, but this is best done by a computer!
(1,7)
(1,6) (2,7)
(1,5) (2,6) (3,7)
(1,4) (2,5) (3,6) (4,7)
(1,3) (2,4) (3,5) (4,6) (5,7)
(1,2) (2,3) (3,4) (4,5) (5,6) (6,7)
(1,1) (2,2) (3,3) (4,4) (5,5) (6,6) (7,7)
 w1    w2    w3    w4    w5    w6    w7

Q: Where do I put the entry table(i,j)?
– horizontal co-ordinate = starting position of substring = i
– vertical co-ordinate = length of substring = j − i + 1

Q: What entries are needed to compute table(i,j)?
– You have to look at the pairs table(i, k), table(k + 1, j) for
k = i, . . . , j − 1; it’s the right-angled triangle below table(i, j).
Q: In what order are the entries computed?
– Row by row, bottom to top, left to right
25/32
Example
The nonempty entries of the filled table (rows by substring length, longest on top):

(1,7): S
(2,7): VP
(1,4): S
(2,4): VP    (5,7): PP
(1,2): S    (3,4): NP    (6,7): NP
(1,1): NP    (2,2): VP, V    (3,3): Det    (4,4): N    (5,5): P    (6,6): Det    (7,7): N

S → NP VP
VP → VP PP | V NP
PP → P NP
NP → Det N

she eats a fish with a fork
 1    2  3   4    5   6  7
Q: In what order are the entries computed?
– To compute an entry, you need the entries in the “right-angled triangle below it”.
– Row by row, bottom to top, left to right
26/32
Example
(Same table as on the previous slide.)

S → NP VP
VP → VP PP | V NP
PP → P NP
NP → Det N

she eats a fish with a fork
 1    2  3   4    5   6  7

VP ∈ table(2, 4) because
– the string from position 2 to 4 is "eats a fish",
– which can be split into "eats" from position 2 to 2,
– and "a fish" from position 3 to 4, and
– VP → V NP is a rule, V ∈ table(2, 2), and NP ∈ table(3, 4).
27/32
Example
(Same table as on the previous slides.)

S → NP VP
VP → VP PP | V NP
PP → P NP
NP → Det N

she eats a fish with a fork
 1    2  3   4    5   6  7

VP ∈ table(2, 7) because
– the string from position 2 to 7 is "eats a fish with a fork",
– which can be split into "eats a fish" from position 2 to 4,
– and "with a fork" from position 5 to 7, and
– VP → VP PP is a rule, VP ∈ table(2, 4), and PP ∈ table(5, 7).
28/32
How efficient is this algorithm?
Time complexity
– There are O(n2) entries in the table,
– and each entry requires O(n|G|) work to compute, since one
must check each rule and check at most n splits,
– So the total time is O(n3|G|).
– Here |G|, the size of G, is the number of bits required to write the grammar down, which is polynomial in the number of rules, variables and terminals.
Asides
– For fixed G and varying w, the time is O(n3).
– If the input is large (e.g., compiling a very large program), then this complexity is too high. So, in this case, one uses restricted grammars for which there are faster algorithms, see COMP3109: Programming Languages and Paradigms
29/32
What if I want to compute a derivation?
You can adjust the algorithm to store more information in order to produce a derivation (or parse tree).
– Store in table(i, j) a rule A → BC and splitting point k
(i ≤ k < j) that can be used to derive A ⇒∗ wiwi+1 ...wj.
– You can then deduce a rightmost derivation using a stack.
– Start by pushing the element (S, 1, n) onto the stack, and then repeat the following:
– if (A, i, i) is the top element of the stack then apply the rule A → wi and pop the stack.
– if (A, i, j) with i < j is the top element of the stack, then apply the rule A → BC, pop the stack, and push the element (B, i, k) followed by (C, k + 1, j) onto the stack.
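The stack procedure above can be sketched as follows. The representation of the back-pointers, back[(i, j)][A] = (B, C, k), is an assumption, as is the example: hand-built back-pointers for "aabb" in the CNF grammar S → A T | A B, T → S B, A → a, B → b. Since (C, k + 1, j) is pushed last, it sits on top and is expanded first, which yields the rules of a rightmost derivation in order.

```python
# Recover a rightmost derivation from stored back-pointers.
# back[(i, j)][A] = (B, C, k): rule A -> B C with split point k was used
# to derive w_i ... w_j; at i == j the rule A -> w_i was used.
def rightmost_derivation(w, back, start="S"):
    steps = []
    stack = [(start, 1, len(w))]
    while stack:
        A, i, j = stack.pop()
        if i == j:
            steps.append(f"{A} -> {w[i - 1]}")
        else:
            B, C, k = back[(i, j)][A]
            steps.append(f"{A} -> {B} {C}")
            stack.append((B, i, k))       # pushed first ...
            stack.append((C, k + 1, j))   # ... so C is on top: expanded next
    return steps

# Assumed back-pointers for w = "aabb" in the grammar
# S -> A T | A B,  T -> S B,  A -> a,  B -> b:
back = {
    (1, 4): {"S": ("A", "T", 1)},
    (2, 4): {"T": ("S", "B", 3)},
    (2, 3): {"S": ("A", "B", 2)},
}
print(rightmost_derivation("aabb", back))
# ['S -> A T', 'T -> S B', 'B -> b', 'S -> A B', 'B -> b', 'A -> a', 'A -> a']
```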
30/32
Good to know
– There is a machine-theoretic characterisation of context-free languages (pushdown automaton = NFA + stack).
– Come to COMP2922 to learn more! or see Sipser Chapter 2.2
– Not every language is context-free. E.g., {ww : w ∈ {0, 1}∗} is not context-free.
– The proof of this uses a pumping argument, see Sipser Chapter 2.3.
31/32
Where are we going?
Next week we start learning about an even more powerful model of computation that can even recognise non-context-free languages — the Turing machine (= automaton + unbounded memory).
This is the most powerful model of computation that we know of, and is a model of a general purpose computer.
32/32