LOGIC IN COMPUTER SCIENCE
by Benji MO
Some people are always critical of vague statements. I tend rather to be critical of precise statements. They are the only ones which can
correctly be labeled wrong.
– Raymond Smullyan
August 2020
Supervisor: Professor Hantao Zhang
TABLE OF CONTENTS

LIST OF FIGURES

CHAPTER

1 Introduction to Logic
  1.1 Logic is Everywhere
    1.1.1 Statement or Proposition
    1.1.2 A Brief History of Logic
  1.2 Logical Fallacies in Arguments
    1.2.1 Formal Fallacies
    1.2.2 Informal Fallacies
  1.3 A Brief Review of Mathematical Logic
    1.3.1 Set Theory
    1.3.2 Model Theory
    1.3.3 Proof Theory
    1.3.4 Computability Theory
  1.4 Exercise Problems
2 Propositional Logic
  2.1 Syntax
    2.1.1 Logical Operators
    2.1.2 Formulas
  2.2 Semantics
    2.2.1 Interpretations
    2.2.2 Models, Satisfiability, and Validity
    2.2.3 Equivalence
    2.2.4 Entailment
    2.2.5 Theorem Proving and the SAT Problem
  2.3 Normal Forms
    2.3.1 Negation Normal Form (NNF)
    2.3.2 Conjunctive Normal Form (CNF)
    2.3.3 Disjunctive Normal Form (DNF)
    2.3.4 Full DNF and Full CNF from Truth Table
    2.3.5 Binary Decision Diagram (BDD)
  2.4 Optimization Problems
    2.4.1 Minimum Set of Operators
    2.4.2 Logic Minimization
    2.4.3 Maximum Satisfiability
  2.5 Using Propositional Logic
    2.5.1 Bitwise Operators
    2.5.2 Specify Problems in Propositional Logic
  2.6 Exercises
3 Proof Procedures for Propositional Logic
  3.1 Semantic Tableau
    3.1.1 Tableau: A Tree Structure for DNF
    3.1.2 α-Rules and β-Rules
  3.2 Deductive Systems
    3.2.1 Inference Rules and Proofs
    3.2.2 Hilbert Systems
    3.2.3 Natural Deduction
    3.2.4 Inference Graphs
  3.3 Resolution
    3.3.1 Resolution Rule
    3.3.2 Resolution Strategies
    3.3.3 Preserving Satisfiability
    3.3.4 Completeness of Resolution
    3.3.5 A Resolution-based Decision Procedure
    3.3.6 Clause Deletion Strategies
  3.4 Boolean Constraint Propagation (BCP)
    3.4.1 BCP: a Simplification Procedure
    3.4.2 BCP: a Decision Procedure for Horn Clauses
    3.4.3 Unit Resolution versus Input Resolution
    3.4.4 Head-Tail Literals for BCP
  3.5 Exercises
4 Propositional Satisfiability
  4.1 The DPLL Algorithm
    4.1.1 Recursive Version of DPLL
    4.1.2 All-SAT and Incremental SAT Solvers
    4.1.3 BCPw: Implementation of Watch Literals
    4.1.4 Iterative Implementation of DPLL
  4.2 Conflict-Driven Clause Learning (CDCL)
    4.2.1 Generating Clauses from Conflicting Clauses
    4.2.2 DPLL with CDCL
    4.2.3 Unsatisfiable Cores
    4.2.4 Random Restart
    4.2.5 Branching Heuristics for DPLL
  4.3 Use of SAT Solvers
    4.3.1 Specify SAT Instances in DIMACS Format
    4.3.2 Sudoku Puzzle
    4.3.3 Latin Square Problems
    4.3.4 Graph Problems
  4.4 Local Search Methods and MaxSAT
    4.4.1 Local Search Methods for SAT
    4.4.2 2SAT versus Max2SAT
  4.5 Maximum Satisfiability
    4.5.1 Weight MaxSAT and Hybrid MaxSAT
    4.5.2 The Branch-and-Bound Algorithm
    4.5.3 Simplification Rules and Lower Bound
    4.5.4 Use of Hybrid MaxSAT Solvers
  4.6 Exercise Problems
5 First Order Logic
  5.1 Syntax of First Order Languages
    5.1.1 Terms and Formulas
    5.1.2 The Quantifiers
    5.1.3 Unsorted and Many-Sorted Logics
  5.2 Semantics
    5.2.1 Interpretation
    5.2.2 Models, Satisfiability, and Validity
    5.2.3 Entailment and Equivalence
  5.3 Proof Methods
    5.3.1 Semantic Tableau
    5.3.2 Natural Deduction
  5.4 Conjunctive Normal Form
    5.4.1 Prenex Normal Form and Negation Normal Form
    5.4.2 Skolemization
    5.4.3 Skolemizing Non-Prenex Formulas
    5.4.4 Clausal Form
    5.4.5 Herbrand Models for CNF
  5.5 Exercise Problems
6 Unification and Resolution
  6.1 Unification
    6.1.1 Substitutions and Unifiers
    6.1.2 Combining Substitutions
    6.1.3 Rule-Based Unification
    6.1.4 Almost Linear Time Unification
  6.2 Resolution
    6.2.1 The Resolution and Factoring Rules
    6.2.2 A Refutational Proof Procedure
  6.3 Ordered Resolution
    6.3.1 Simplification Orderings
    6.3.2 The Ordered Resolution Rule
    6.3.3 Completeness of Ordered Resolution
  6.4 Prover9 and Mace4
    6.4.1 Input Formulas to Prover9
    6.4.2 Inference Rules and Options
    6.4.3 Search Heuristics and Limits
    6.4.4 Term Ordering in Prover9
    6.4.5 Mace4
    6.4.6 Finite Model Finding by SAT Solvers
  6.5 Exercise Problems
7 Equational Logic
  7.1 Equality of Terms
    7.1.1 Axioms of Equality
    7.1.2 Semantics of “=”
    7.1.3 Theory of Equations
  7.2 Rewrite Systems
    7.2.1 Rewrite Rules
    7.2.2 Termination of Rewrite Systems
    7.2.3 Confluence of Rewriting
    7.2.4 The Knuth-Bendix Completion Procedure
  7.3 Inductive Theorem Proving
    7.3.1 Inductive Theorems
    7.3.2 Structural Induction
    7.3.3 Induction on Two Variables
    7.3.4 Multi-Sort Algebraic Specifications
  7.4 Resolution with Equality
    7.4.1 Paramodulation
    7.4.2 Simplification Rules
    7.4.3 Prover9
  7.5 Exercise Problems
8 Prolog: Programming in Logic
  8.1 Prolog’s Working Principle
    8.1.1 Horn Clauses in Prolog
    8.1.2 Resolution Proofs in Prolog
    8.1.3 A Goal-Reduction Procedure
  8.2 Prolog’s Data Types
    8.2.1 Atoms, Numbers, and Variables
    8.2.2 Compound Terms and Lists
    8.2.3 Examples of Prolog Programs
  8.3 Beyond Clauses and Logic
    8.3.1 The Cut Operator !
    8.3.2 Negation as Failure
    8.3.3 Beyond Clauses
  8.4 Exercise Problems

9 Undecidable Problems in First-Order Logic
  9.1 Turing Machines
    9.1.1 Formal Definition of Turing Machines
    9.1.2 High-level Description of Turing Machines
    9.1.3 Recognizable vs Decidable
  9.2 Decidability of Problems
    9.2.1 Encoding of Decision Problems
    9.2.2 Decidable Problems
    9.2.3 Undecidable Problems
  9.3 Turing Completeness
    9.3.1 Turing Completeness of Prolog
    9.3.2 Turing Completeness of Rewrite Systems
  9.4 Exercise Problems
10 Hoare Logic
  10.1 Hoare Triples
    10.1.1 Hoare Rules
    10.1.2 Examples of Formal Verification
    10.1.3 Partial and Total Correctness
  10.2 Automated Generation of Assertions
    10.2.1 Verification Conditions
    10.2.2 Implementing vc in Prolog
  10.3 Obtaining Good Loop Invariants
    10.3.1 Invariants from Generalizing Postconditions
    10.3.2 Program Synthesis from Invariants
  10.4 Choosing a Good Language for Assertions
    10.4.1 Domain Theory for Expressing Assertions
    10.4.2 Recursive Programs vs Loops
  10.5 Exercise Problems

11 Conclusion
LIST OF FIGURES

2.1.1 The formula tree for (¬p ∧ q) → (p ∧ (q ∨ ¬r))
2.3.1 The BDD of INF ite(x1, ite(x2, 1, 0), ite(x2, ite(x3, 1, 0), ite(x3, 0, 1))) derived from x̄1x̄2x̄3 ∨ x1x2 ∨ x2x3.
2.3.2 The first BDD uses a1 > b1 > a2 > b2 > a3 > b3 and the second BDD uses a1 > a2 > a3 > b1 > b2 > b3 for the same formula (a1 ∧ b1) ∨ (a2 ∧ b2) ∨ (a3 ∧ b3).
2.4.1 K-map for f(x, y) = xy + xy + xy.
2.4.2 The K-map for obtaining f(A, B, C) = AB + AC.
2.4.3 The K-map for f(P, Q, R, S) = m0 + m2 + m5 + m7 + m8 + m10 + m13 + m15. The simplified formula is f = QS + QS.
2.4.4 The K-map for f(A, B, C) = ABC + ABC + ABC + ABC. The simplified CNF is f = (A + B + C)(A + B)(B + C).
3.1.1 The tableaux for A = p ∧ (¬q ∨ ¬p) and B = (p ∨ q) ∧ (¬p ∧ ¬q)
3.2.1 The proof tree of ¬q
4.1.1 The result of BCP is shown in the box near each node.
4.3.1 A typical example of Sudoku puzzle and its solution.
4.4.1 The implication graph for 2CNF C and the SCC.
4.5.1 A graph for the clique problem.
CHAPTER 1 INTRODUCTION TO LOGIC
When you see the word “logic”, what do you think of? In the Merriam-Webster dictionary, the word logic is defined as follows:
1. (a) a science that deals with the principles and criteria of validity of inference and demonstration: the science of the formal principles of reasoning
       e.g., a professor of logic
   (b) a branch or variety of logic
       e.g., Boolean logic
   (c) a branch of semiotics, especially: syntactics
   (d) the formal principles of a branch of knowledge
       e.g., the logic of grammar

2. (a) a particular mode of reasoning viewed as valid or faulty
       e.g., She spent a long time explaining the situation, but he failed to see her logic.
   (b) relevance, propriety
       e.g., Could not understand the logic of such an action.
   (c) interrelation or sequence of facts or events when seen as inevitable or predictable
       e.g., By the logic of events, anarchy leads to dictatorship.
   (d) the arrangement of circuit elements (as in a computer) needed for computation
In this book, we will take “logic” in the sense of 1.(a): Logic is the science of the formal principles of reasoning. When these principles are introduced, we will use “logic” in the sense of 1.(b) and introduce several branches of logic, such as propositional logic, equational logic, first-order logic, etc. These logics belong to the domain of mathematical logic (formerly known as symbolic logic).
1.1 Logic is Everywhere
Logic has been called the calculus of computer science because it plays a fundamental role throughout the field, from the construction of physical computers to the study of computing devices beyond our ability to construct: computer architecture (digital gates, hardware verification), software engineering (specification, verification), programming languages (semantics, type theory, abstract data types, object-oriented programming), databases (relational algebra), artificial intelligence (automated theorem proving, knowledge representation), algorithms and the theory of computation (complexity, computability), etc.
Logic also plays important roles in other fields, such as mathematics and philosophy. In mathematics, logic includes both the mathematical study of logic and the applications of formal logic to other areas of mathematics. The unifying themes in mathematical logic include the study of the expressive power of formal systems and the deductive power of formal proof systems. Mathematical logic is often divided into the fields of (a) set theory, (b) model theory, (c) proof theory, and (d) computability theory. These areas share basic results on logic.
Symbolic logic in the late nineteenth century and mathematical logic in the twentieth century are also called formal logic. Topics traditionally treated by logic but not belonging to formal logic have tended to be termed either philosophy of logic or philosophical logic. Philosophical logic deals with arguments in the natural languages used by humans.
Example. Given the premises that (a) “all men are mortal” and (b) “Socrates is a man”, we may draw the conclusion that (c) “Socrates is mortal”, by the inference rule of Modus Ponens.
Philosophical logic is the investigation, critical analysis, and intellectual reflection on issues arising in logic; it is the branch that studies questions about reference, predication, identity, truth, quantification, existence, entailment, modality, and necessity.
1.1.1 Statement or Proposition
Logic comes from natural languages as most sentences in natural languages are
statements or propositions. A proposition is a statement or assertion that expresses a judgment or opinion. Here statement, assertion, and proposition are synonyms. Here are some examples:
• That apple is delicious.
• Today is either Monday or Tuesday.
• He has good memory and runs fast.
Every statement is either true or false; “true” and “false” are called the truth values of a statement. Some sentences in natural languages are not statements, such as commands or exclamatory sentences. For example, consider the sentence “Run fast!” Coach shouts. The first part of the sentence, “Run fast!”, is not a statement; the second part, “Coach shouts”, is a statement. In fact, a sentence is a statement if and only if it has a truth value.
In natural languages, we can combine or relate statements with words such as “not” (negation), “and” (conjunction), “or” (disjunction), “implies” (implication), “if-then”, etc. That is, a statement can be obtained from other statements by these words. In logic, these words are called logical operators, or equivalently, logical connectives. A statement is composed if it can be expressed as a composition of several simpler statements; otherwise it is simple. Of the three example statements above, the first is simple, but the other two are composed. That is, “today is either Monday or Tuesday” is the composition of “today is Monday” and “today is Tuesday”, using the logical operator “or”. We often regard “that apple is not delicious” as a composed statement: it is the negation of a simple statement.
Logical connectives are indispensable for expressing the relationship between statements. For example, the following statement is a composition of several simple statements.
(∗) Since I don’t have a good memory, I don’t know that today is either Monday or Tuesday.
To see this clearly, we need to introduce symbols, such as p, q, r, s, and t, to denote them:
1. p: I have a good memory.
2. q: Today is Monday.
3. r: Today is Tuesday.
4. s: q or r.
5. t: I know s.
To express that the statement (∗) is composed, let us use these symbols for logical operators: ¬ for negation, ∨ for “OR”, ∧ for “AND”, and → for implication. Then (∗) becomes the following formula:
(¬p) → (¬t)
Here each of the symbols p, q, r, s, and t is called a propositional variable, which denotes a statement, either simple or composed. Naturally, these propositional variables, like statements, can take on only the truth values, i.e., true and false.
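The truth value of the formula (¬p) → (¬t) under every assignment to p and t can be tabulated mechanically. The following Python sketch (the helper name `implies` is ours, not from the text; it assumes the standard material-implication reading of →) prints the truth table:

```python
from itertools import product

def implies(a, b):
    # Material implication: a -> b is false only when a is true and b is false.
    return (not a) or b

# Tabulate (~p) -> (~t) for all truth values of p and t.
rows = []
for p, t in product([True, False], repeat=2):
    rows.append((p, t, implies(not p, not t)))
    print(f"p={p!s:5}  t={t!s:5}  (~p)->(~t) = {rows[-1][2]}")
```

The table shows that the formula is false in exactly one row: when p is false and t is true, i.e., when I do not have a good memory and yet I do know what day it is. This is the one situation the statement (∗) rules out.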
Propositional variables are also called Boolean variables after their inventor, the nineteenth-century mathematician George Boole. Boolean logic includes any logic in which the considered truth values are true and false. The study of propositional variables with logical operators is called propositional logic, which is the simplest member of the family of Boolean logics. On the other hand, probability logic is not a Boolean logic, because probability values are used to represent various degrees of truth. Questions regarding degrees of truth or the sense-content of sentences are ignored in Boolean logic. Despite this simplification, or indeed because of it, such a method is scientifically successful. One does not even have to know exactly what the truth values true and false actually are.
1.1.2 A Brief History of Logic
Sophisticated theories of logic were developed in many cultures, including
China, India, Greece and the Islamic world. Greek methods, particularly Aristotelian logic (or term logic) as found in the Organon, found wide application and acceptance in Western science and mathematics for millennia.
Aristotle (384–322 BC) was the first logician to attempt a systematic analysis of logical syntax, of the noun (or term), and of the verb. He demonstrated the principles of reasoning by employing variables to show the underlying logical form of an argument. He sought the relations of dependence which characterize necessary inference, and distinguished the validity of these relations from the truth of the premises. He was the first to deal with the principles of contradiction and excluded middle in a systematic way. Aristotle has had an enormous influence on Western thought, and he developed the theory of the syllogism, where three important principles are applied for the first time in history: the use of symbols, a purely formal treatment, and the use of an axiomatic system. Aristotle also developed the theory of fallacies, as a theory of non-formal logic.
Christian and Islamic philosophers such as Boethius (died 524), Ibn Sina (Avicenna, died 1037) and William of Ockham (died 1347) further developed Aristotle’s logic in the Middle Ages, reaching a high point in the mid-fourteenth century with Jean Buridan (1301–1358/62 AD).
In the 18th century, attempts to treat the operations of formal logic in a symbolic or algebraic way had been made by philosophical mathematicians, including Leibniz and Lambert, but their works remained isolated and little known.
In the beginning of the 19th century, logic was studied with rhetoric, through the syllogism, and with philosophy. Mathematical logic emerged in the mid-19th century as a field of mathematics independent of the traditional study of logic. The development of the modern “symbolic” or “mathematical” logic during this period by the likes of Boole, Frege, Russell, and Peano is the most significant in the two-thousand-year history of logic, and is arguably one of the most important and remarkable events in human intellectual history.
1.2 Logical Fallacies in Arguments
In this section, we give a brief introduction to logical fallacies, which are a subject of philosophical logic. This introduction is based on Wikipedia’s page on the topic. Knowledge of fallacies will help us write better reports and papers by avoiding fallacious reasoning.
In philosophical logic, an argument has a similar meaning to a proof in formal logic, as an argument consists of premises and conclusions. A fallacy is the use of invalid or otherwise faulty reasoning in the construction of an argument. A fallacious argument may be deceptive by appearing to be better than it really is. Some fallacies are committed intentionally to manipulate or persuade by deception, while others are committed unintentionally due to carelessness or ignorance.
Fallacies are commonly divided into “formal” and “informal”. A formal fallacy can be expressed neatly in a formal logic, such as propositional logic, while an informal fallacy cannot be expressed in a formal logic.
1.2.1 Formal Fallacies
In philosophical logic, a formal fallacy is also called deductive fallacy, logical
fallacy, or non sequitur (Latin for “it does not follow”). This is a pattern of reasoning rendered invalid by a flaw in its logical structure.
Example. Given the premises that (a) my car is some car, and (b) some cars are red, we draw the conclusion that (c) my car is red.
This is a typical example of a conclusion that does not follow logically from the premises or that is based on irrelevant data. Here are some common logical fallacies:
• Affirming the consequent: Any argument with the invalid structure of: If A then B. B, therefore A.
Example. If I get a B on the test, then I will get the degree. I got the degree, so it follows that I must have received a B. In fact, I got an A.
• Denying the antecedent: Any argument with the invalid structure of: If A then B. Not A, therefore not B.
Example. If it’s a dog then it’s a mammal. It’s not a dog, so it must not be a mammal. In fact, it’s a cat.
• Affirming a disjunct: Any argument with the invalid structure of: A or B. A, therefore not B.
Example. I am working or I am at home. I am working, so I must not be at home. In fact, I am working at home.
• Denying a conjunct: Any argument with the invalid structure of: It is not the case that both A and B. Not A, therefore B.
Example. I cannot be both at work and at home. I am not at work, so I must be at home. In fact, I am at a park.
• Undistributed middle: Any argument with the invalid structure of: Every A has B. C has B, so C is A.
Example. Every bird has a beak. That creature has a beak, so that creature must be a bird. In fact, the creature is a dinosaur.
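Because formal fallacies are purely structural, they can be detected mechanically: an argument form is valid iff its conclusion is true under every truth assignment that makes its premises true. The following Python sketch (the function names are ours, not from the text) checks modus ponens against two of the invalid forms above:

```python
from itertools import product

def implies(a, b):
    # Material implication: a -> b is false only when a is true and b is false.
    return (not a) or b

def entails(premises, conclusion):
    # An argument form is valid iff the conclusion holds in every
    # truth assignment (a, b) that makes the premises true.
    return all(conclusion(a, b)
               for a, b in product([True, False], repeat=2)
               if premises(a, b))

# Modus ponens: from (A -> B) and A, conclude B -- valid.
print(entails(lambda a, b: implies(a, b) and a, lambda a, b: b))          # True

# Affirming the consequent: from (A -> B) and B, conclude A -- invalid.
print(entails(lambda a, b: implies(a, b) and b, lambda a, b: a))          # False

# Denying the antecedent: from (A -> B) and not A, conclude not B -- invalid.
print(entails(lambda a, b: implies(a, b) and not a, lambda a, b: not b))  # False
```

The invalid forms fail because there is a counterexample assignment: for affirming the consequent, A false and B true makes both premises true but the conclusion false.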
1.2.2 Informal Fallacies
A formal fallacy occurs when the structure of the argument is incorrect, regardless of the truth of the premises. A valid argument always has a correct formal structure: if the premises are true, the conclusion must be true. A sound argument is a valid argument whose premises are actually true. When we use false premises, no formal fallacy is committed, but the argument may still be regarded as fallacious because the conclusion is unsupported.
As an application of modus ponens, the following example contains no formal fallacies:
Example. If you took that course on CD player repair right out of high school, you would be doing well and having a vacation on the moon right now.
Even though there is no logical error in the argument, the conclusion is worthless because the premise is contrary to fact. From a false premise, any conclusion can be drawn, so the composed statement is always (vacuously) true. However, an always-true statement has no value in reasoning.
By contrast, an argument with a formal fallacy could still contain all true premises:
Example. (a) If someone owns Fort Knox, then he is rich. (b) Bill Gates is rich. Therefore, (c) Bill Gates owns Fort Knox.
Although (a) and (b) are true statements, (c) does not follow from (a) and (b), because the argument commits the formal fallacy of “affirming the consequent”.

There are numerous kinds of informal fallacies that use an incorrect relation between premises and conclusion. These fallacies can be grouped approximately into four groups:
• fallacies of improper premise;
• fallacies of faulty generalizations;
• fallacies of questionable cause; and
• relevance fallacies.
1.2.2.1 Improper Premise
• Begging the question: Providing what is essentially the conclusion of the
argument as a premise.
Example. Everyone wants the new iPhone because it is the hottest new gadget on the market!
• Circular reasoning: The reasoner begins with what he or she is trying to end up with.
Example. You must obey the law, because it’s illegal to break the law.
• Complex question: Someone asks a question that presupposes something that has not been proven or accepted by all the people involved.
Example. “Who is the King of France?” commits the complex question fallacy: while it assumes there is a place called France (true), it also assumes France currently has a king (false).
1.2.2.2 Faulty Generalizations
Faulty generalizations are arguments in which the premises are related to the conclusion yet only weakly support it.
• Accident: an exception to a generalization is ignored.
Example. Cutting people with knives is a crime. Surgeons cut people with knives. Therefore, surgeons are criminals.
• Cherry picking: Selecting only favorable evidence in order to persuade the audience to accept a position, while withholding evidence that would count against the position. The stronger the withheld evidence, the more fallacious the argument.
Example. My political candidate gives 10% of his income to the needy, and volunteers one day a week at a homeless shelter. There- fore, he is honest and morally straight.
• Weak analogy: The analogy is poorly suited.
Example. To say humans are immortal is like saying a car can run forever.
• Hasty generalization: Basing a broad conclusion on a small sample or the making of a determination without all of the information required to do so.
Example. You send a message back home that everyone in this new country is rude because the first person you meet in the airport is rude.
• Misleading vividness: Describing an occurrence in vivid detail, even if it is an exceptional occurrence, to convince someone that it is a problem.
Example. We have had four days of temperatures that were 100 degrees or more! Global warming is getting dramatically worse!
1.2.2.3 Questionable Cause
This is a general type of error with many variants. Its primary basis is the confusion of association with causation, either by inappropriately deducing (or rejecting) causation or by a broader failure to properly investigate the cause of an observed effect.
• Faulty cause/effect: A false cause/effect fallacy occurs when one cites sequential events as evidence that the first caused the second. The argument generally looks like this: “Event A happened. Event B happened after A. Therefore, A caused B.”
Example: Every day, I eat cereal for breakfast. One time, I had a muffin instead, and there was a major earthquake in my city. I’ve eaten cereal ever since.
• Complex cause: It is assumed that there is one, simple cause of an outcome when in reality it may have been caused by a number of only jointly sufficient causes.
Example. President Trump has been in office for a month and gas prices have been skyrocketing.
• Slippery slope: In a slippery slope argument, one insists that a first step will lead to a chain reaction resulting in an undesirable end or ends, without direct evidence that this course of events will happen.
Example. Today you are ten minutes late, tomorrow you will be an hour late, and someday you will simply cease to show up.
• Furtive fallacy: Outcomes are asserted to have been caused by the hidden misconduct or wrongdoing of decision makers. “Furtive” means secretive, hidden, conspiratorial, sly, and sneaky. Conspiracy theory is a typical example of furtive fallacy.
• Gambler’s fallacy: The incorrect belief that separate, independent events can affect the likelihood of another random event. If a fair coin lands on heads 10 times in a row, the belief that it is “due to land on tails” is incorrect. Similarly, after having three girls, parents might incorrectly believe that the next child will be a son.
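The coin-flip claim can be illustrated with a quick simulation (a hypothetical experiment of our own, not from the text): among two million simulated fair-coin flips, we look at every flip that is preceded by ten heads in a row and check how often it comes up heads.

```python
import random

random.seed(42)  # fixed seed so the experiment is reproducible

# Simulate a fair coin; True means heads.
flips = [random.random() < 0.5 for _ in range(2_000_000)]

# Collect the outcome of every flip that follows a streak of 10 heads.
next_after_streak = []
for i in range(10, len(flips)):
    if all(flips[i - 10:i]):          # the previous 10 flips were all heads
        next_after_streak.append(flips[i])

frequency = sum(next_after_streak) / len(next_after_streak)
print(f"samples: {len(next_after_streak)}, P(heads after 10 heads) ~ {frequency:.3f}")
```

The observed frequency stays near 0.5: a streak of heads does not make tails “due”, because each flip is independent of the ones before it.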
1.2.2.4 Fallacies of Relevance
• Appeal to ignorance: Assuming that a claim is true because it has not been
or cannot be proven false or vice versa.
Example. “You can’t prove that there isn’t a mirror universe of our own, so there must be one out there somewhere!”
• Appeal to common sense: Asserting that your conclusion or facts are just “common sense” when, in fact, they are not.
Example. “I cannot imagine how this could be true; therefore, it must be false.”
• Argument from repetition: Repeating an argument until nobody cares to discuss it any more.
Example. Stupid people are so annoying. They prove their stu- pidity by saying stupid things.
• Argument from silence: Assuming that a claim is true based on the absence of textual or spoken evidence from an authoritative source.
Example. Jay: “Dude, where are my car keys?” Silent Bob says nothing. Jay: “I KNEW you took them!”
• Appeal to motive: Attacking the arguer instead of the argument, dismissing an idea by questioning the motives of its proposer. The word “motive” can be replaced by character, race, status, etc. This type of fallacy is called “ad hominem”.
Example. “This is why a woman shouldn’t do a man’s job!”
• Appeal to emotion: An argument is made due to the manipulation of emotions, rather than the use of valid reasoning. The emotion can be fear, flattery, pity, ridicule, etc.
Example. “There must be objective rights and wrongs in the uni- verse. If not, how can you possibly say that torturing babies for fun could ever be right?”
• Appeal to authority: An argument uses the testimony of an authority in order to warrant their conclusion, but the authority appealed to is not an expert in the field in question. The word authority can be replaced by majority, nature, novelty, tradition, wealth, etc.
Example. “Einstein said ‘God does not play dice with the uni- verse,’ therefore God must exist.”
• Straw man: An argument based on misrepresentation of an opponent’s po- sition, especially to attack a weaker version of it rather than the argument actually presented. Also known as the weak man fallacy.
Example. Parent: “No dessert until you finish your chicken and vegetables!” Child: “You only love me when I eat.”
• Two wrongs make a right: It occurs when it is assumed that if one wrong is committed, another wrong will rectify it.
Example. Jimmy stole Tommy’s lunch in the past. Therefore, it is acceptable for Tommy to steal Jimmy’s lunch today.
Some informal fallacies do not belong to the above four groups. For example, the false dilemma fallacy and the middle ground fallacy.
False Dilemma Fallacy
False dilemma presents a choice between two mutually exclusive options, implying that there are no other options. One option is clearly worse than the other, making the choice seem obvious. Also known as the either/or fallacy, false dilemmas are a type of informal logical fallacy in which a faulty argument is used to persuade an audience to agree. False dilemmas are everywhere.
• Vote for me or live through four more years of higher taxes.
• America: Love it or leave it.
• Either we let every immigrant into our country, or we close the borders for everyone.
• Subscribe to our streaming service or be stuck with cable.
• You can either eat at this restaurant or have a sad TV dinner alone.
• Would you rather invest in your future or enjoy your money now?
The Middle Ground Fallacy
This fallacy assumes that a compromise between two extreme conflicting positions is always true. Arguments of this style ignore the possibility that one or both of the extremes could be completely true or false, rendering any form of compromise between the two invalid as well.
Example. Mary thinks the best way to improve sales is to redesign the entire company website, but John is firmly against making any changes to the website. Therefore, the best approach is to redesign some portions of the website.
An argument could contain both an informal fallacy and a formal fallacy yet lead to a conclusion that happens to be true. For example, again affirming the consequent, now also from an untrue premise:
If a scientist makes a statement about science, it is correct. It is true that quantum mechanics is deterministic. Therefore, a scientist has made a statement about it.
Having an understanding of these basic logical fallacies can help us more confidently parse the arguments we participate in and witness on a daily basis, separating fact from sharply dressed fiction.
1.3 A Brief Review of Mathematical Logic

Mathematical logic is roughly divided into four areas:

• set theory,
• model theory,
• proof theory, and
• computability theory.
Each area has a distinct focus, although many techniques and results are shared between multiple areas. The borderlines between these areas, and the lines between mathematical logic and other fields of mathematics, are not always sharp. Each area contains rich material which can be studied in graduate courses at multiple levels. As an introductory book on logic, we will introduce some basic concepts from these areas which are closely related to this book. This introduction is indeed less than the tip of the iceberg.
1.3.1 Set Theory
A set is a structure representing an unordered collection of zero or more distinct objects; all elements of a set are distinct and unordered. Set theory is a branch of mathematical logic that studies sets and deals with operations between, relations among, and statements about sets. Although objects of any type can be collected into a set, set theory is applied most often to objects that are relevant to mathematics. The language of set theory can be used to define nearly all mathematical objects.
1.3.1.1 Basic Concepts and Notations
Let U be the collection of all objects under consideration. U is often called
the universal set or universe of discourse. For example, U can be the set of natural numbers or the population of the world. Any object x under consideration is a member of U, written as x ∈ U. A set is any collection of objects from U. The empty set is often denoted by ∅ or simply {}. Given two sets A and B, the following are the popular set operations:
• x ∈ A: it is true iff (i.e., if and only if) x is a member of A.
• A ∪ B: the union of A and B. For any object x, x ∈ A ∪ B iff x is in A or B or both.
• A ∩ B: the intersection of A and B. x ∈ A ∩ B iff x ∈ A and x ∈ B.
• A − B: the set difference of A and B. x ∈ A − B iff x ∈ A but not x ∈ B.
• Ā: the complement of A (with respect to the universal set U). Ā = U − A. A − B is also called the complement of B with respect to A: A − B = A ∩ B̄.
• A ⊆ B: A is a subset of B. A ⊆ B is true iff every member of A is a member of B. Naturally, A ⊆ U and B ⊆ U for the universal set U.
• A ⊂ B: A is a proper subset of B. A ⊂ B is true iff A ⊆ B and A ̸= B.
• A = B: A = B iff A ⊆ B and B ⊆ A, that is, A and B contain exactly the same elements.
• A×B: the Cartesian product of A and B. An ordered pair (a,b) is in A×B
iff a ∈ A and b ∈ B.
• A^i: the Cartesian product of i copies of A, where i is a natural number (i.e., a non-negative integer). When i = 0, A^0 denotes the empty sequence; A^1 = A; A^2 = A × A, A^3 = A × A × A, etc.
• |A|: the cardinality of a set. When A is finite, |A| is the number of members in A. |A| is called a cardinal number if A is infinite.
• P(A): the power set of A. P(A) is the set whose members are all of the possible subsets of A. That is, for any set X, X ∈ P(A) iff X ⊆ A. When |A| = n, |P(A)| = 2^n.
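These operations correspond directly to Python's built-in set type; a minimal sketch (the `powerset` helper is our own, not part of the text):

```python
from itertools import chain, combinations

A = {0, 1, 2}
B = {2, 3}
U = {0, 1, 2, 3, 4}                    # a small universal set for the complement

print(A | B)                           # union A ∪ B
print(A & B)                           # intersection A ∩ B
print(A - B)                           # set difference A − B
print(U - A)                           # complement of A with respect to U
print(A <= B, {0, 1} < A)              # subset, proper subset
print({(a, b) for a in A for b in B})  # Cartesian product A × B

def powerset(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    return {frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))}

print(len(powerset(A)))                # |P(A)| = 2^3 = 8
```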
Elementary set theory can be studied informally and intuitively, and so can be taught in primary schools using Venn diagrams. Many mathematical concepts can be defined precisely using only set theoretic concepts, such as relations and functions.
1.3.1.2 Relations and Functions
Given two sets A and B, a relation between A and B is a subset of A×B. In
many applications, A = B. For example, the less-than relation, <, on N, is < = {⟨0, 1⟩, ⟨0, 2⟩, ⟨0, 3⟩, ⟨1, 2⟩, ⟨0, 4⟩, ⟨1, 3⟩, ...}
which is a subset of N × N. In general, given a relation R over a set S, i.e., R ⊆ S × S, we assume that R(a, b) = ⊤ if ⟨a, b⟩ ∈ R and R(a, b) = ⊥ if ⟨a, b⟩ ̸∈ R. Of course, we are used to writing 0 < 1 = ⊤ and 1 < 0 = ⊥, but it is awkward to write <(0, 1) = ⊤ and <(1, 0) = ⊥.
A relation ≤ over the set S is reflexive if a ≤ a for any a ∈ S; ≤ is antisymmetric if a ≤ b and b ≤ a imply a = b; ≤ is transitive if a ≤ b and b ≤ c imply a ≤ c; ≤ is connex (i.e., any two elements are comparable) if either a ≤ b or b ≤ a is true for any a, b ∈ S. We say ≤ is a total order if ≤ is antisymmetric, transitive, and connex. We say ≤ is a partial order if ≤ is antisymmetric, transitive, and reflexive. Since connexity implies reflexivity, a total order is always a partial order, but the converse is not always true (some pairs of elements may not be comparable in a partial order).
The strict part of a partial order ≤ is <: a < b iff a ≤ b and a ̸= b.
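On a finite set, all four properties can be checked exhaustively; a small sketch (function names are ours), using divisibility as a classic partial order that is not total:

```python
from itertools import product

def is_partial_order(S, leq):
    """Check reflexivity, antisymmetry, and transitivity of leq over S."""
    reflexive = all(leq(a, a) for a in S)
    antisymmetric = all(not (leq(a, b) and leq(b, a)) or a == b
                        for a, b in product(S, S))
    transitive = all(not (leq(a, b) and leq(b, c)) or leq(a, c)
                     for a, b, c in product(S, S, S))
    return reflexive and antisymmetric and transitive

def is_total_order(S, leq):
    """A total order is a connex partial order."""
    connex = all(leq(a, b) or leq(b, a) for a, b in product(S, S))
    return connex and is_partial_order(S, leq)

S = {1, 2, 3, 4, 6, 12}
divides = lambda a, b: b % a == 0
print(is_partial_order(S, divides))            # True: divisibility is a partial order
print(is_total_order(S, divides))              # False: 4 and 6 are incomparable
print(is_total_order(S, lambda a, b: a <= b))  # True: <= on numbers is a total order
```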
Definition 1.3.1. A set S is countable if there exists an injective function f : S −→ N, where N is the set of natural numbers. If S is both countable and infinite, S is called countably infinite.
Example 1.3.2. Let E = {0, 2, 4, ...} be the set of all even natural numbers. Let f : E −→ N with f(x) = x, then f is injective, so E is countable and we may conclude that |E| ≤ |N|. Since E is infinite, E is also countably infinite. If we define h : N −→ E with h(x) = 2x, then h is bijective, hence |N| ≤ |E| and we end up with |E| = |N|. This is no exception: the same holds for all countably infinite sets.
Proposition 1.3.3. There is a bijective function between any two countably infinite sets.
Proof. It suffices to show that the proposition holds for the two sets A and N (the set of natural numbers). Since A is countable, there exists an injective function f : A −→ N. List the elements of A as a0, a1, a2, ..., such that i < j iff f(ai) < f(aj). Since A is infinite, define g : N −→ A as g(i) = ai; then g is a bijection between N and A. □
Today, countable sets form the foundation of discrete mathematics, and we first check what sets are countable.
Proposition 1.3.4. The following sets are countable:
1. Any finite set;
2. Any subset of N;
3. The set A × B, if both A and B are countable;
4. The set Z of integers;
5. The set of all binary strings (of finite length).
6. Σ∗, the set of all strings (of finite length) built on the symbols from Σ, a finite alphabet.
Proof. To show (1) is countable, we just need to define an injective function which maps a set of n elements to the first n natural numbers.
(2) is countable because if S ⊆ N, then f : S −→ N, where f(x) = x, is an injective function.
(3) is countable when either A or B is finite and its proof is left as an exercise. When both A and B are countably infinite, without loss of generality, we assume A = B = N. Let g(k) = k(k + 1)/2, the sum of the first k positive integers, and f : N × N −→ N be the function:
f(i, j) = g(i + j) + j
If k = f(i, j), then i = g(m) + m − k and j = k − g(m), where g(m) ≤ k < g(m + 1). That is, the inverse of f exists and f is bijective. It is easy to check that i + j = m. For example, if k = 12, then m = 4, and i = j = 2. Table 1.3.1 shows the first few values of k = f(i, j).
i\j   0   1   2   3   4
 0    0   2   5   9  14
 1    1   4   8  13  19
 2    3   7  12  18  25
 3    6  11  17  24  32
 4   10  16  23  31  40
Table 1.3.1: k = f(i, j) = (i + j)(i + j + 1)/2 + j.
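The pairing function f(i, j) = (i + j)(i + j + 1)/2 + j and its inverse can be checked mechanically; a short sketch (the helper names g, pair, unpair are ours):

```python
def g(k):
    """Sum of the first k positive integers: k(k + 1)/2."""
    return k * (k + 1) // 2

def pair(i, j):
    """f(i, j) = g(i + j) + j."""
    return g(i + j) + j

def unpair(k):
    """Inverse of pair: find m with g(m) <= k < g(m + 1), then recover i, j."""
    m = 0
    while g(m + 1) <= k:
        m += 1
    j = k - g(m)
    i = g(m) + m - k          # equivalently m - j, since i + j = m
    return i, j

print(pair(2, 2))             # 12, matching the table entry at row 2, column 2
print(unpair(12))             # (2, 2)
```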
To show (4) is countable, we can define a bijective function f on the integers: f(n) = if n ≥ 0 then 2n; else −2n − 1. This function maps 0 to 0, positive integers to even natural numbers, and negative integers to odd natural numbers.
To show (5) is countable, let S = {0,1}∗ be all the binary strings, we may sort S by length first, then by the decimal value of each string. Thus, the strings will be listed as follows:
S = {ε, 0, 1, 00, 01, 10, 11, 000, 001, ...},
where ε denotes the empty string. To find an injection from S to N, for any string w in S, if w = ε, let h(ε) = 0; otherwise h(w) replaces every 0 in w by 2 and leaves every 1 intact,
h(S) = {0, 2, 1, 22, 21, 12, 11, 222, 221, ...}.
Since there is a bijection between S and h(S), and h(S) ⊂ N, h : S −→ N is injective. So S is countable.
We can do better by finding a bijection between S and N. For any binary string s, let the length of s be n, i.e., |s| = n. There are 1 + 2 + 2^2 + ... + 2^(n−1) = 2^n − 1 strings shorter than s. Let v(s) be the decimal value of s (assuming v(ε) = 0); then there are v(s) strings of length n before s. So the position of s in the sorted list is 2^n + v(s). Define f : S −→ N by f(s) = 2^|s| + v(s) − 1; then f is a bijection.
To show (6) is countable, let k = ⌈log2(|Σ|)⌉. Since |Σ| ≤ 2k, for each symbol a ∈ Σ, let h(a) be a distinct binary string of length k. For each string w ∈ Σ∗, let h(w) = h(a1)···h(an) if w = a1 ···an. Then h : Σ∗ −→ {0,1}∗ is an injective function, so Σ∗ is countable. P
The last three cases in the above proposition are examples of countably infinite sets.
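The bijection f(s) = 2^|s| + v(s) − 1 from case (5) can be sketched directly; a small example (the function names are ours):

```python
def string_to_nat(s):
    """f(s) = 2**len(s) + v(s) - 1, where v is the binary value and v('') = 0."""
    v = int(s, 2) if s else 0
    return 2 ** len(s) + v - 1

def nat_to_string(n):
    """Inverse of f: recover the length first, then the value."""
    length = 0
    # strings of length < l occupy positions 0 .. 2**l - 2
    while 2 ** (length + 1) - 1 <= n:
        length += 1
    v = n - (2 ** length - 1)
    return format(v, "0{}b".format(length)) if length else ""

# The sorted list ε, 0, 1, 00, 01, 10, 11, 000, ... maps onto 0, 1, 2, 3, ...
strings = ["", "0", "1", "00", "01", "10", "11", "000"]
print([string_to_nat(s) for s in strings])   # [0, 1, 2, 3, 4, 5, 6, 7]
print(all(nat_to_string(string_to_nat(s)) == s for s in strings))  # True
```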
In 1874, in his first set theory article, Georg Cantor introduced the term “countable set” and proved that the set of real numbers is uncountable by the famous diagonalization method, thus showing that not all infinite sets are countable. In 1878, he used bijection to define and compare cardinalities, contrasting sets that are countable with those that are uncountable.
Proposition 1.3.5. The following sets are uncountable:
1. B: the set of binary strings of infinite length;
2. R1: the set of non-negative real numbers less than 1;
3. P(N): the power set of the natural numbers;
4. F : The set of Boolean functions, f : N −→ {0, 1}, over the natural numbers.
Proof. For the first set B, we will use Cantor's diagonalization method to show that B is uncountable by a proof by contradiction. Assume that B is countable; then there exists a one-to-one correspondence between B and N. For each i ∈ N, let the corresponding string in B be si, and the jth symbol of si be sij. Now, we construct a binary string x of infinite length as follows: the jth symbol of x, xj, is the complement of sjj. That is, if sjj = 0, then xj = 1; if sjj = 1, then xj = 0. Clearly, x ∈ B. Suppose x's corresponding number in N is k, i.e., x = sk.
x = 1 1 1 0 1 1 1 0 1 0 ...
Table 1.3.2: Illustration of Cantor's diagonalization method: if the set B can be enumerated as s0, s1, s2, ..., then the string x, whose jth symbol is the complement of sjj, is a binary string that cannot be in B.
However, x differs from sk on the kth symbol, because xk is the complement of skk. The contradiction comes from the assumption that B is countable. Table 1.3.2 illustrates the idea of Cantor's diagonalization method, where the complement of 0 is 1 and the complement of 1 is 0.
Once we know B is uncountable, if we have a bijection from B to R1, then R1 must be uncountable. The bijection f : B −→ R1 can simply be f(s) = 0.s for s ∈ B, as each number of R1 can be represented by an infinite binary fraction.
To show P(N) is uncountable, let f : B −→ P(N) be f(s) = {j | jth symbol of s is 1 }. Obviously f is bijective.
Similarly, to show F is uncountable, let g : B −→ F be g(s) = f, where f(j) = the jth symbol of s for every j. Obviously g is bijective, too. □
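The diagonal construction can be illustrated on any finite prefix of a claimed enumeration; a sketch (the sample rows below are arbitrary, chosen only for illustration):

```python
def diagonal(rows):
    """Given the first n strings of a claimed enumeration (each at least
    n symbols long), build a string that differs from row j at position j."""
    return "".join("1" if row[j] == "0" else "0" for j, row in enumerate(rows))

rows = ["0000", "0101", "1100", "1111"]
x = diagonal(rows)
print(x)                # "1010": differs from row j at position j
print(x not in rows)    # True: x cannot appear anywhere in the list
```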
The result that P(N) is uncountable while N is countable can be generalized: for every set S, the power set of S, i.e., P(S), has a larger cardinality than S itself. This result implies that the notion of the set of all sets is an inconsistent notion. If S were the set of all sets then P(S) would at the same time be bigger than S and a subset of S.
Theorem 1.3.6. (Cantor’s Theorem) |A| < |P(A)| for any set A.
Proof. Let us prove it by contradiction. If |A| = |P(A)|, there is a bijection f : A −→ P(A). Define S = {a ∈ A | a ̸∈ f(a)}; then S ⊆ A and S ∈ P(A). There must exist b ∈ A such that f(b) = S because f is surjective. There are only two possibilities: b ∈ S or b ̸∈ S.
s0 = 0 0 0 0 0 0 0 0 0 0 ...
s1 = 0 0 0 0 0 0 0 0 1 0 ...
s2 = 0 0 0 0 0 1 0 0 0 0 ...
s3 = 0 0 0 1 0 0 0 0 0 0 ...
s4 = 0 0 0 1 0 1 0 1 0 0 ...
s5 = 0 0 1 0 0 0 0 1 0 0 ...
s6 = 0 0 0 0 1 0 0 0 0 0 ...
s7 = 0 0 0 1 0 1 0 0 0 0 ...
s8 = 0 1 0 0 1 0 0 1 0 0 ...
s9 = 0 1 0 0 1 1 0 1 0 1 ...
......
1. If b ∈ S, by definition of S, b ̸∈ f(b) = S.
2. If b ̸∈ S, by definition of S, b ∈ f(b) = S.
Both cases lead to a contradiction, so f cannot exist. It cannot be the case that
|A| > |P(A)| because g(a) = {a} is an injection from A to P(A). P
Cantor’s theorem implies that there is no such thing as the “set of all sets”. Suppose A were the set of all sets. Since every element of P(A) is a set, so P(A) ⊆ A. Thus |P (A)| ≤ |A|, a contradiction to Cantor’s theorem.
Cantor chose the symbol א0 for |N|, the cardinality of N, a cardinal number. א0 is read as aleph-null, after the first letter of the Hebrew alphabet. The cardinality of the reals is often denoted by א1, or c for the continuum of real numbers. Cantor's theorem implies that there are infinitely many infinite cardinal numbers, and that there is no largest cardinal number.
א0 = |N|
א1 = |P(N)| = 2^א0 > א0
א2 = |P(P(N))| = 2^א1 > א1
א3 = |P(P(P(N)))| = 2^א2 > א2
...
The most famous inconsistency related to set theory is the so-called Russell's paradox, discovered by Bertrand Russell in 1901:
Russell’s paradox: Let T = {S | S ̸∈ S}, the set of all sets S that do not contain themselves. Is T ∈ T?
If T ∈ T is false, then T ∈ T by the condition S ̸∈ S. If T ∈ T is true, then T ̸∈ T because every member of T satisfies S ̸∈ S. This paradox showed that some attempted formalizations of the naïve set theory created by Georg Cantor led to a contradiction. Russell's paradox shows that the concept of "a set contains itself" is invalid. If "S ∈ S" is always false, then T = {S | S ̸∈ S} is the "set of all sets", which does not exist, either.
A layman’s version of Russell’s paradox is called the Barber’s paradox.
Barber’s paradox: In a village, there is only one barber who shaves all those, and those only, who do not shave themselves. The question is, does the barber shave himself?
Answering this question results in a contradiction. The barber cannot shave himself as he only shaves those who do not shave themselves. Conversely, if the barber does not shave himself, then he fits into the group of people who would be shaved by the barber, and thus, as the barber, he must shave himself.
The discovery of paradoxes in informal set theory caused mathematicians to wonder whether mathematics itself is inconsistent, and to look for proofs of consistency. Ernst Zermelo (1904) gave a proof that every set could be well-ordered, using the axiom of choice, which drew heated debate and research among mathematicians and the pioneers of set theory. In 1908, Zermelo provided the first set of axioms for set theory. These axioms, together with the additional axiom of replacement proposed by Abraham Fraenkel, are now called Zermelo–Fraenkel set theory (ZF). Besides ZF, many set theories have been proposed since then to rid set theory of paradoxes.
1.3.2 Model Theory
Model theory is the study of mathematical structures (e.g. groups, fields,
algebras, graphs, logics) in a formal language. Every formal language has its syntax and semantics. A model is a semantic structure associated with the syntactic structures of a formal language. Theories are then introduced based on models. Following this approach, every formal logic is defined inside a formal language.
1.3.2.1 Syntax and Semantics
The syntax of a formal language specifies how various components of the
language, such as symbols, words, and sentences, are defined. For example, the language for propositional logic uses only propositional variables and logical operators as its symbols, and well-formed formulas built on these symbols as its sentences. In model theory, a set of sentences in a formal language is one of the components that form a theory. The language for logic often contains two constant symbols, either 1 and 0, or ⊤ and ⊥, which are interpreted as true and false, respectively. They are usually considered to be special logical operators which take no arguments, rather than propositional variables.
The semantics of a language specifies the meaning of various components of the language. For example, if we use the symbol q to stand for the statement "Today is Monday", and r for "Today is Tuesday", then the meaning of q is "Today is Monday"; q can be used to denote a thousand different statements, just like a thousand Hamlets in a thousand people's eyes. On the other hand, a formal meaning of the formula q ∨ r is the truth value of q ∨ r, which can be decided uniquely when the truth values of q and r are given. In model theory, the formal meaning of a sentence is explored: it examines semantical elements (meaning and truth) by means of syntactical elements (formulas and proofs) of a corresponding language.
In model theory, semantics and model are synonyms. A model of a theory is
an interpretation that satisfies the sentences of that theory. Abstract algebras are often used as models. In a summary definition, dating from 1973,
model theory = abstract algebra + logic.
Abstract algebra, also called universal algebra, is a broad field of mathematics, concerned with sets of abstract objects associated with various operations and properties.
1.3.2.2 Boolean Algebra
In abstract algebra, the most relevant algebra related to the logic discussed
in this book is Boolean algebra, which is approximately equivalent to propositional logic. Many syntactic concepts of Boolean algebra carry over to propositional logic with only minor changes in notation and terminology, while the semantics of propositional logic are defined via Boolean algebras in a way that the tautologies (theorems) of propositional logic correspond to equational theorems of Boolean algebra.
In Boolean algebra, the values of the variables are the truth values true and false, usually denoted 1 and 0 respectively. The main operations of Boolean algebra are the multiplication · (conjunction), the addition + (disjunction), and the inverse i (negation). It is thus a formalism for describing logical operations in the same way that elementary algebra describes numerical operations, such as addition and multiplication. In fact, a Boolean algebra is any set with binary operations + and · and a unary operation i thereon satisfying the Boolean laws (equations), which define the properties of the logical operations.
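Over the two-element Boolean algebra with values {0, 1}, these laws can be verified exhaustively; a minimal sketch (the names mul, add, inv are ours, mirroring ·, +, and the inverse):

```python
from itertools import product

B = (0, 1)
mul = lambda x, y: x & y          # multiplication · (conjunction)
add = lambda x, y: x | y          # addition + (disjunction)
inv = lambda x: 1 - x             # inverse (negation)

# Check a few Boolean laws on every combination of truth values.
for x, y, z in product(B, B, B):
    assert mul(x, y) == mul(y, x) and add(x, y) == add(y, x)      # commutativity
    assert mul(x, add(y, z)) == add(mul(x, y), mul(x, z))         # distributivity
    assert add(x, mul(y, z)) == mul(add(x, y), add(x, z))         # its dual
    assert inv(mul(x, y)) == add(inv(x), inv(y))                  # De Morgan
    assert mul(x, inv(x)) == 0 and add(x, inv(x)) == 1            # complements
print("all checked Boolean laws hold on {0, 1}")
```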
Boolean algebra was introduced by George Boole in his first book The Math- ematical Analysis of Logic (1847). According to Huntington, the term “Boolean algebra” was first suggested by Sheffer in 1913. Sentences that can be expressed in propositional logic have an equivalent expression in Boolean algebra. Thus, Boolean algebra is sometimes used to denote propositional logic performed in this way. How- ever, Boolean algebra is not sufficient to capture logic formulas using quantifiers, like those from first order logic.
1.3.3 Proof Theory
In a similar way to model theory, proof theory is situated in an interdisciplinary area among mathematics, philosophy, and computer science. According to Wikipedia, proof theory is a major branch of mathematical logic that represents proofs as formal mathematical objects, facilitating their analysis by mathematical techniques.
1.3.3.1 Axioms and Theorems
In a formal logic, the axioms are a set of sentences which are assumed to be
true. Typically, the axioms contain by default the definitions of all logical operators. For example, the negation operator ¬ is defined by ¬(⊤) = ⊥ and ¬(⊥) = ⊤.
The theorems are the formulas which can be proved to be true from the axioms. For example, for any propositional variable p, ¬(¬(p)) = p is a theorem. Using the case analysis method, if the truth value of p is ⊤, then ¬(¬(⊤)) = ¬(⊥) = ⊤; if the truth value of p is ⊥, then ¬(¬(⊥)) = ¬(⊤) = ⊥. It is easy to see that different sets of axioms would lead to different theorems.
There are two important properties concerning a set of axioms:
• Consistency The axiom set is consistent if all formulas in the axiom set can be true at the same time.
• Independence The axiom set is independent if no axiom is a theorem of the other axioms.
1.3.3.2 Proof Procedures
Given a set A of axioms and a formula B, we need a procedure P that can
answer the question whether B is a theorem of A: if P(A, B) returns "yes", we say B is proved; if P(A, B) returns "no", we say B is disproved; if it returns "unknown", we do not know if B is a theorem or not. P is called a proof procedure. This procedure can be carried out by hand, or executed on a computer.
There are two important properties concerning a proof procedure P:
• Soundness The proof procedure P is sound iff whenever P(A, B) returns "yes", B is a theorem of A, and whenever P(A, B) returns "no", B is not a theorem of A.
• Completeness The proof procedure P is complete iff for any theorem B of A, P(A, B) returns "yes".
If a procedure P(A,B) is sound and always halts on any input A and B with “yes” or “no”, i.e., P is an algorithm, then P is called a decision procedure for the logic. This definition of decision procedure is narrow; in general, any algorithm which can answer a decision question is called a decision procedure.
Theorem 1.3.7. Every decision procedure is sound and complete.
Proof. Given any axiom set A and any formula B, since P is a decision procedure, P(A, B) will return "yes" or "no". If B is a theorem of A, P(A, B) cannot return "no", because P is sound; hence P(A, B) returns "yes", and P is complete. □
Some proof procedures may stop with “yes”, “no”, or “unknown”; they cannot be called a decision procedure. A sound and complete proof procedure may loop on a formula that is not a theorem, thus cannot be a decision procedure by definition. From the above theorem, we see that the soundness and halting properties of a proof procedure implies the completeness. A sound and complete proof procedure is also called a semi-decision procedure for the logic, because it will halt on any formula which is a theorem. For any logic, we always look for its decision procedure. For example, propositional logic has many different decision procedures. If a decision procedure does not exist, we look for a semi-decision procedure. For some logics, even a semi-decision procedure cannot exist. Clearly, these discussions will only make sense once we have formally defined the logic in question.
1.3.3.3 Inference Rules
In structural proof theory, which is a major area of proof theory, inference rules are used to construct a proof. In logic, an inference rule is a pattern which takes formulas as premises, and returns a formula as conclusion. For example, the rule of modus ponens (MP) takes two premises, one in the form "if p then q" (i.e., p → q) and another in the form p, and returns the conclusion q. The rule is sound: if the premises are true under an interpretation, then so is the conclusion.
An inference system S consists of a set of axioms A and a set of inference rules R. The soundness of S comes from the soundness of every inference rule in R. A proof of formula Fn in S = (A, R) is a sequence of formulas F1, F2, …, Fn such that each Fi is either in A or can be generated by an inference rule of R, using the formulas before Fi in the sequence as the premises.
Example 1.3.8. Let S = (A, R), A = {p → (q → r), p, q}, and R = {MP}. A proof of r from A using S is given below:
1. p → (q → r)    axiom
2. p              axiom
3. q → r          MP, 1, 2
4. q              axiom
5. r              MP, 3, 4
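The proof search behind Example 1.3.8 can be mechanized by applying MP until nothing new is derived; a minimal sketch (the representation of formulas as nested tuples is our own):

```python
def modus_ponens_closure(axioms):
    """Formulas are atoms (strings) or implications ('->', lhs, rhs).
    Repeatedly apply MP: from ('->', p, q) and p, derive q."""
    derived = set(axioms)
    changed = True
    while changed:
        changed = False
        for f in list(derived):
            # f is an implication whose premise has already been derived
            if isinstance(f, tuple) and f[0] == "->" and f[1] in derived:
                if f[2] not in derived:
                    derived.add(f[2])
                    changed = True
    return derived

# A = {p -> (q -> r), p, q}, as in Example 1.3.8
A = {("->", "p", ("->", "q", "r")), "p", "q"}
theorems = modus_ponens_closure(A)
print(("->", "q", "r") in theorems)   # True: step 3 of the proof, q -> r
print("r" in theorems)                # True: step 5 of the proof, r
```

The while loop acts as a fair strategy on this tiny rule set: every applicable MP step is eventually performed.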
One obvious advantage of such a proof is that these proofs can be checked either by hand or automatically by computer. Checking formal proofs is usually
simple, whereas finding proofs (automated theorem proving) is generally hard. An informal proof in the mathematics literature, by contrast, requires weeks of peer review to be checked, and may still contain errors. Informal proofs of everyday mathematical practice are unlike the proofs of proof theory. They are rather like high-level sketches that would allow an expert to reconstruct a formal proof at least in principle, given enough time and patience. For most mathematicians, writing a fully formal proof is too pedantic and long-winded to be in common use.
In proof theory, proofs are typically presented as recursively-defined data structures such as lists (as shown here), or trees, which are constructed according to the axioms and inference rules of the logical system. As such, proof theory is syntactic in nature, in contrast to model theory, which is semantic in nature.
Every inference system can be converted to a proof procedure if a fair strategy is adopted to control the use of inference rules. A fair strategy ensures that if an inference rule is applicable at some point to produce a new formula, this new formula will be derived eventually. If every inference rule is sound, the corresponding proof procedure is sound. Extra effort is needed in general to show that the proof procedure is complete, or halting.
We have defined a formal notion of proof as a sequence of formulae, starting with axioms and continuing with theorems derived from earlier members of the sequence by rules of inference.
Besides structural proof theory, some of the major areas of proof theory include ordinal analysis, provability logic, reverse mathematics, proof mining, automated theorem proving, and proof complexity.
1.3.4 Computability Theory
According to Wikipedia, computability theory, formerly called recursion theory, is a branch of mathematical logic, of computer science, and of the theory of computation that originated in the 1930s with the study of computable functions. The field has since expanded to include the study of generalized computability and definability. In these areas, computability theory overlaps with proof theory and set theory.
1.3.4.1 Recursion
The main reason for the name of “recursion theory” is that recursion is used
to construct objects and functions.
Example 1.3.9. Given a constant symbol 0 of type T and a function symbol s : T −→ T, the objects of type T can be recursively constructed as follows:
1. 0 is an object of type T;
2. If n is an object of type T, so is s(n);
3. Nothing else will be an object of T.
If we let s^0(0) = 0, s^1(0) = s(0), s^2(0) = s(s(0)), etc., then T can be expressed as T = {0, s(0), s^2(0), s^3(0), ..., s^i(0), ...}.
Obviously, there exists a bijection f between T and N, the set of natural numbers. That is, for any i ∈ N, f(i) = s^i(0). By giving s the meaning of successor of a natural number, mathematicians use T as a definition of N.
Functions can be recursively defined in a similar way. Let pre, add, sub, mul be the predecessor, addition, subtraction, and multiplication functions over the set of natural numbers:
pre(0) = 0;
pre(s(x)) = x.
add(0, y) = y;
add(s(x), y) = s(add(x, y)).
sub(x, 0) = x;
sub(x, s(y)) = sub(pre(x), y).
mul(0, y) = 0;
mul(s(x), y) = add(mul(x, y), y).
Note that the subtraction defined above is different from the subtraction over the integers: we have sub(s^i(0), s^j(0)) = 0 when i ≤ j. This function has the property that sub(x, y) = 0 iff x ≤ y.
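These recursive definitions transcribe almost verbatim into code; a sketch in Python, where (an assumption of ours) the ordinary integer n stands for the term s^n(0):

```python
def pre(x):
    """pre(0) = 0; pre(s(x)) = x."""
    return 0 if x == 0 else x - 1

def add(x, y):
    """add(0, y) = y; add(s(x), y) = s(add(x, y))."""
    return y if x == 0 else 1 + add(x - 1, y)

def sub(x, y):
    """sub(x, 0) = x; sub(x, s(y)) = sub(pre(x), y). Truncated at 0."""
    return x if y == 0 else sub(pre(x), y - 1)

def mul(x, y):
    """mul(0, y) = 0; mul(s(x), y) = add(mul(x, y), y)."""
    return 0 if x == 0 else add(mul(x - 1, y), y)

print(add(3, 4))   # 7
print(sub(3, 5))   # 0: truncated subtraction, since 3 <= 5
print(sub(5, 3))   # 2
print(mul(3, 4))   # 12
```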
The above functions are examples of so-called primitive recursive functions, which are total and computable functions. The set of general recursive functions includes all primitive recursive functions, and some functions which are not computable.
1.3.4.2 Backus-Naur Form (BNF)
In computer science, Backus–Naur form (BNF) is a notation technique for
context-free grammars, often used to describe the syntax of languages used in computing, such as computer programming languages, document formats, instruction sets and communication protocols. John Backus and Peter Naur were the first to use this notation to describe ALGOL, an early programming language. BNF is applied
wherever formal descriptions of languages are needed. We will use BNF to describe the syntax of logic as a language.
BNF can be extended to define recursively constructed objects in a succinct style. For example, the set of natural numbers is expressed in BNF as follows:
⟨N⟩ ::= 0 | s(⟨N⟩)
The symbol ::= can be understood as "is defined by", and "|" separates alternative parts of a definition. The above formula can be read as follows: a member of N is either 0, or s applied to (another) member of N; nothing else will be in N. From this example, we see that to define a set N, any member of N is denoted by ⟨N⟩ and is defined by the items on the right of ::=. For a recursive definition, we have a base case (such as 0) and a recursive case (such as s(⟨N⟩)). Items like ⟨N⟩ are called variables in a context-free grammar, and items like 0, s, (, and ) are called terminals. A context-free grammar specifies what set of strings of terminals can be derived for each variable. For the above example, the set of strings derived from ⟨N⟩ is
{0, s(0), s^2(0), s^3(0), ..., s^i(0), ...}
This set is the same as the objects defined in Example 1.3.9, but BNF gives us a
formal and compact definition.
Example 1.3.10. To define the set Σ∗ of all strings (of finite length) over the set Σ of symbols, called the alphabet, we may use BNF. For instance, if Σ = {a, b, c}, then Σ∗ is defined by:
⟨symbol⟩ ::= a|b|c
⟨Σ∗⟩ ::= ε | ⟨symbol⟩⟨Σ∗⟩
That is, the empty string ε ∈ Σ∗; if w ∈ Σ∗, for every symbol a ∈ Σ, aw ∈ Σ∗; nothing else will be in Σ∗. Thus, Σ∗ contains exactly all the strings built on Σ:
Σ∗ = {ε,a,b,c,aa,ab,ac,ba,bb,bc,ca,cb,cc,aaa,…}.
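The BNF rule suggests enumerating Σ∗ level by level, shortest strings first; a small sketch (the generator name is ours):

```python
from itertools import islice

def all_strings(alphabet):
    """Generate Sigma* in the order: empty string, length 1, length 2, ..."""
    layer = [""]                   # base case of the BNF rule: epsilon
    while True:
        yield from layer
        # recursive case: prepend each symbol to every shorter string
        layer = [a + w for a in alphabet for w in layer]

first = list(islice(all_strings("abc"), 13))
print(first)  # ['', 'a', 'b', 'c', 'aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc']
```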
Since there is a one-to-one correspondence between strings and sequences, sequences can also be recursively defined in a similar way. In fact, all recursively constructed objects can be defined this way.
1.3.4.3 Computable Functions
One of the basic questions addressed by computability theory is the following:
• What does it mean for a function on the natural numbers to be computable?
The answer to the question is that the function is computable if there exists an algorithm which computes the function. Logicians and computer scientists, including Turing and Church, gave different definitions of algorithms based on different computing models. These computing models can be Turing machines, Church's λ-calculus, general recursive functions, Markov algorithms, Chomsky's grammars, Minsky's counter machine, or the von Neumann model. One point in common is that, by algorithm, we mean that the computing process will halt in a finite number of steps.
In Section 1.3.1.3, we saw that the set of functions, which take a natural number and return a Boolean value, is uncountable. However, the set of algorithms is countable, because every algorithm can be recorded by a text file of finite length (which can be regarded as a binary string), and the set of binary strings is countable. In other words, we have only a countable set of algorithms but the functions on the natural numbers are uncountable. As a result, there exists a massive number of functions on the natural numbers which are not computable.
One notable non-computable problem is to decide if a Turing machine T will halt on an input w. If we encode T and w as a single positive integer ⟨T, w⟩, and define the function f(⟨T, w⟩) = 1 if T halts on w and 0 otherwise, then f is a non-computable function on the natural numbers. Note that the Turing machine T can be replaced by any other equivalent computing model and the result still holds. This follows from the Church–Turing thesis (also known as the computability thesis, or the Church–Turing conjecture), which states that a function on the natural numbers can be calculated by an effective method if and only if it is computable by a Turing machine.
A computing model is Turing complete if it has the ability to simulate a Turing machine. A computing model that is Turing complete is theoretically capable of performing all tasks done by computers; on the other hand, nearly all computers are Turing complete if the limitation of finite memory is ignored. Some logics are also Turing complete, as they can be used to simulate a Turing machine. As a result, some problems, such as the halting problem, are undecidable for such logics. Computability theory will help us decide whether decision procedures exist for some logics, and we will see some examples in Chapter 9.
One of the main aims of this book is to present algorithms invented for logic and the software tools built up based on these algorithms. That is why we will invest a disproportionate amount of effort (by the standards of conventional logic textbooks) in the algorithmic aspect of logic. Set theory and model theory help us to understand and specify the problems; proof theory helps us to create algorithms; and computability theory helps us to know what problems have algorithms and
what problems do not have algorithms at all.
1.4 Exercise Problems
1. Decide if the following sentences are statements or not. If they are, use propositional symbols and logic connectives to express them.
(a) If I win the lottery, I’ll be poor.
(b) The man asked, “shut the door behind you.”
(c) Today is the first day of school.
(d) Today is either sunny or cloudy.
2. Answer the questions in the following examples regarding logical fallacies.
(a) During their argument, Anthony tells Marie that he hates her and wished that she would get hit by a car. Later that evening, Anthony receives a call from a friend who tells him that Marie is in the hospital because she was struck by a car. Anthony immediately blames himself and reasons that if he hadn’t made that comment during their fight, Marie would not have been hit. What logical fallacy has Anthony committed?
(b) Louise is running for class president. In her campaign speech she says, “My opponent does not deserve to win. She is a smoker and she cheated on her boyfriend last year.” What fallacy has Louise committed?
(c) Bill: “I think capital punishment is wrong.” Adam: “No it isn’t. If it was wrong, it wouldn’t be legal.” Of which fallacy is this an example?
(d) Ricky is watching television when he sees a commercial for foot cream. The commercial announcer says, “This is the best foot cream on the market because no other foot cream makers have been able to prove otherwise!” What fallacy has the announcer committed?
(e) Maria has been working at her current job for more than 30 years at the same wage. She desperately wants a raise so she approaches her boss to ask for one. She says, “You are one of the kindest people I know. You are smart and good-looking and I really love your shoes.” What type of fallacy is this?
(f) Jeff’s mom is concerned when she finds out that he skipped class one day. She tells him that she is concerned that since he skipped one class, he will start skipping more frequently. Then he will drop out altogether,
never graduate or get into college, and end up unemployed and living at home for the rest of his life. What type of fallacy has Jeff’s mom committed?
(g) Dana is trying to raise money for her university’s library. In her address to the board of trustees, she says, “We must raise tuition to cover the cost of new books. Otherwise the library will be forced to close.” Of what fallacy is this an example?
(h) Jeff is preparing to create a commercial for a new energy drink. He visits a local high school and surveys students in an English class about their beverage preferences. The majority of the class says they prefer grape flavored drinks, so Jeff tells his superiors that grape is the flavor favored most by high school students. What error in reasoning has Jeff made?
3. Let A = {0, 1}. P(A) = ? P(P(A)) = ?
4. Let a,b be real numbers and a < b. Define a function f : (0,1) −→ (a,b) by f(x) = a+x(b−a), where (a,b) denotes the interval {x | a < x < b}. Show that f is bijective.
5. Answer the following questions with justification.
(a) Is the set of all odd integers countable?
(b) Is the set of real numbers in the interval (0, 0.1) countable?
(c) Is the set of angles in the interval (0°, 90°) countable?
(d) Is the set of all points in the plane with rational coordinates countable?
(e) Is the set of all Java programs countable?
(f) Is the set of all words using an English alphabet countable?
(g) Is the set of grains of sand on the Earth countable?
(h) Is the set of atoms in the solar system countable?
6. Barber’s paradox: In a village, there is only one barber who shaves all those, and those only, who do not shave themselves. The question is, does the barber shave himself? Let the relation s(x,y) be that “x shaves y” and b denote the barber.
(a) Define A to be the set of the people shaved by the barber using s(x,y) and b.
(b) Define B to be the set of people who do not shave themselves.
(c) Use A and B to show that the answer to the question "does the barber shave himself?" leads to a contradiction.
7. Prove that any subset of a countable set is countable.
8. Prove that the set of rational numbers is countable.
9. Prove that the set N × {a, b, c} is countable, where N is the set of natural numbers. Your proof should not use the property that countable sets are closed under Cartesian product.
10. Decide, with proof, whether the set N^k is countable, where N is the set of natural numbers and k ∈ N.
11. Decide, with proof, whether the set of all functions f : {0, 1} −→ N is countable.
12. Prove that if both A and B are countable, so is A ∪ B.
13. Prove that if A is uncountable and B is countable, then A − B is uncountable.
14. Prove that the set {a, b, c}∗ of all strings (of finite length) over {a, b, c} is countable.
15. If a language is any set of strings in Σ∗, where Σ ≠ ∅, prove that the set of all languages is uncountable.
16. Suppose the set of natural numbers is constructed from the constant 0 and the successor function s. Provide a definition of < and ≤ over the natural numbers. You are not allowed to use any functions.
17. Provide a definition of the power function pow(x, y), which computes x^y, over the natural numbers built up from 0 and the successor function s. You are allowed to use the functions add (addition) and mul (multiplication).
18. Provide a BNF (Backus–Naur form) grammar for the set of binary trees, where each non-empty node contains a natural number. You may use null : Tree for the empty tree and node : Nat, Tree, Tree −→ Tree for a node of the tree.
CHAPTER 2 PROPOSITIONAL LOGIC
Following model theory, we will present propositional logic as a language. We first introduce the syntax of this language and then its semantics.
2.1 Syntax
The syntax of propositional logic specifies what symbols and what sentences are used in the logic. Propositional logic uses propositional variables and logical operators as the only symbols. In the first chapter, we said that propositional variables are symbols for denoting statements and take only true and false as truth values. For convenience, we will use 1 for true and 0 for false. We will treat ⊤ and ⊥ as the two nullary logical operators such that the truth value of ⊤ is always 1 and the truth value of ⊥ is always 0. Note that ⊤ and ⊥ are syntactic symbols while 1 and 0 are semantic symbols.
2.1.1 Logical Operators
Mathematicians use the words “not”, “and”, “or”, etc., for operators that
change or combine propositions. The meaning of each logical operator can be specified as a function that takes Boolean values and returns a Boolean value. These functions are also called Boolean functions. For example, if p is a proposition, then so is ¬p, and the truth value of ¬p is determined by the truth value of p according to the meaning of ¬: ¬(1) = 0 and ¬(0) = 1. That is, if the value of p is 1, then the value of ¬p is 0; if the value of p is 0, then the value of ¬p is 1.
In general, the definition of a Boolean function is displayed by a table that gives the output of the function for each possible combination of truth values of the input variables. Such a table is called a truth table. For example, the meanings of ∧, ∨, and → are defined by the truth table in Table 2.1.1. The table has four lines, since there are four pairs of Boolean values for the two variables.
According to this table, the truth value of p ∧ q is 1 when both p and q are 1. The truth value of p ∨ q is 1 when either p or q, or both are true. This is not always the intended meaning of “or” in everyday dialog, but this is the standard definition in logic. So if a logician says, “You may have cake, or you may have ice cream,” he means that you could have both. If you want to exclude the possibility of having
p  q | p∧q  p∨q  p→q  p⊕q  p↔q
-----+-------------------------
1  1 |  1    1    1    0    1
1  0 |  0    1    0    1    0
0  1 |  0    1    1    1    0
0  0 |  0    0    1    0    1

Table 2.1.1: Definitions of ∧, ∨, →, ⊕, and ↔
both cake and ice cream, you should combine them with the exclusive-or operator, ⊕, which is also defined in Table 2.1.1. The exclusive disjunction ⊕ corresponds to addition modulo 2 and is therefore given the symbol + when the Boolean values are assumed to be 0 and 1.
In Table 2.1.1, we also give the definition of ↔ (if and only if). If p denotes "Tom wears blue jeans" and q denotes "Sam wears blue jeans", then the formula p ↔ q asserts that "Tom and Sam always wear blue jeans at the same time". That is, the value of p ↔ q is 1 when p and q have the same truth value: either both are 1 or both are 0. Note that p ⊕ q and p ↔ q always have opposite values; that is, ¬(p ⊕ q) and p ↔ q always have the same value.
The truth table for implication is also given in Table 2.1.1. This operator is worth remembering, because a large fraction of all mathematical statements are of the if-then form. Now let us figure out the truth value of the following example: "If elephants fly, I'll be on Mars." Let p denote "elephants fly" and q denote "I'll be on Mars". Elephants do not fly, so the truth value of p is 0 and we fall into the last two lines of Table 2.1.1: in both lines, p → q is 1. In Section 1.2.2 on informal fallacies, we pointed out that from a false premise you can draw any conclusion, so the composed statement is always true. However, an always-true statement has no value in reasoning, and it does not imply a causal connection between the premise and the conclusion of an implication. In logic, it is important to accept the fact that logical implications ignore causal connections.
Let B = {0, 1}; then any logical operator is a function f : B^n −→ B. There are two nullary operators (n = 0): ⊤ = 1 and ⊥ = 0. There are four unary operators (n = 1): f1(x) = 0; f2(x) = 1; f3(x) = x; and f4(x) = ¬x. There are 16 binary operators (n = 2), of which ∧, ∨, →, ⊕, and ↔ are some. We may also use operators with n > 2. For example, let ite : B^3 −→ B be the if-then-else operator; if we use a truth table for the meaning of ite(p, q, r), the truth table will have eight lines. In general, there are 2^(2^n) n-ary Boolean functions, as the number of functions f : A −→ B is |B|^|A|, where |A| = |B^n| = |B|^n = 2^n.
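The counting argument above can be checked mechanically. The sketch below (the function names are our own, not the book's) represents an n-ary Boolean function as a mapping from input rows to output bits and enumerates all of them:

```python
from itertools import product

def all_boolean_functions(n):
    """Enumerate every n-ary Boolean function.  A function is encoded as a
    dictionary mapping each of the 2**n input rows to an output bit, so
    there are 2**(2**n) of them in total."""
    rows = list(product((0, 1), repeat=n))        # the 2**n input rows
    return [dict(zip(rows, outputs))              # one output bit per row
            for outputs in product((0, 1), repeat=len(rows))]

# 2**(2**n) n-ary operators: 2 nullary, 4 unary, 16 binary.
assert len(all_boolean_functions(0)) == 2
assert len(all_boolean_functions(1)) == 4
assert len(all_boolean_functions(2)) == 16

# ite is one of the 2**(2**3) = 256 ternary Boolean functions.
def ite(p, q, r):
    """The if-then-else operator: ite(1, q, r) = q and ite(0, q, r) = r."""
    return q if p == 1 else r
```
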
2.1.2 Formulas
Definition 2.1.1. The propositional formulas are the set of strings built up from propositional variables and logical operators. Let op be the set of binary operators and Vp the set of propositional variables used in the current application; then the formulas for this application can be defined by the following BNF grammar:
⟨op⟩ ::= ∧|∨|→|⊕|↔
⟨Vp⟩ ::= p|q|r|s|t
⟨Formulas⟩ ::= ⊤ | ⊥ | ⟨Vp⟩ | ¬⟨Formulas⟩ | (⟨Formulas⟩ ⟨op⟩ ⟨Formulas⟩)
We may add or delete operators or propositional variables at will in the above definition. According to the definition, every member of Formulas is one of the constants ⊤ and ⊥, a propositional variable of Vp, the negation of a formula, or a binary operator applied to two formulas. Here are some examples:
⊤, p, ¬q, (p∧¬q), (r → (p∧¬q)), ((p∧¬q)∨r)
Throughout this chapter, we will use p, q, …, for propositional variables, and A, B, …, for formulas.
The grammar introduces a pair of parentheses for every binary operator; some of them may be unnecessary, and we will use a precedence relation over the operators to remove them. The typical precedence is given by the following list, where operators with higher precedence bind first. Note that we assign ⊕ and ↔ the same precedence.
¬, ∧, ∨, →, {⊕, ↔}.
Thus (r → (p∧¬q)) can be simply written as r → p∧¬q, and we will not take it as (r → p)∧¬q, because ∧ has higher precedence than →. Similarly, ((p∧¬q)∨r) is written as p ∧ ¬q ∨ r without ambiguity.
When a well-formed expression, like the formulas defined above, is represented by a string (a sequence of symbols), we use parentheses and commas to identify the structures of the expression. Such a structure can be easily identified by a tree, where we do not need parentheses and commas.
Definition 2.1.2. A tree for formula A is defined as follows:
1. If A is p, ⊤, or ⊥, the tree is a single node with A as its label.
2. If A = ¬B, then the root node is labeled with ¬ and has one branch pointing to the formula tree of B.
Figure 2.1.1: The formula tree for (¬p ∧ q) → (p ∧ (q ∨ ¬r))
3. If A = B op C, then the root node is labeled with op and has two branches
pointing to the formula trees of B and C, respectively.
For example, the formula tree of (¬p ∧ q) → (p ∧ (q ∨ ¬r)) is displayed in Figure 2.1.1. The tree representation of a formula does not need parentheses.
A formula and its formula tree are the two representations of the same object. We are free to choose one of the representations for the convenience of discussion.
Definition 2.1.3. Given a formula A, we say B is a subformula of A if either B = A; or A = (A1 op A2), where op is a binary operator, and B is a subformula of A1 or A2; or A = ¬A1 and B is a subformula of A1. If B is a subformula of A and B ≠ A, then B is a proper subformula of A.
For example, ¬p∨¬(q∨r) contains p, q, r, ¬p, q∨r, and ¬(q∨r) as proper subformulas. Intuitively, a subformula is just a formula represented by any subtree in the formula tree.
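Viewing formulas as trees makes the subformulas easy to collect: they are exactly the subtrees. A sketch, using a nested-tuple encoding of formula trees that is our own convention (variables as strings, ('¬', B) and (op, B, C) for compound formulas):

```python
def subformulas(a):
    """All subformulas of a: the formula itself plus, recursively, the
    subformulas of its children in the formula tree."""
    subs = {a}
    if isinstance(a, tuple):          # compound formula: ('¬', b) or (op, b, c)
        for child in a[1:]:
            subs |= subformulas(child)
    return subs

# ¬p ∨ ¬(q ∨ r), encoded as a nested tuple
a = ('∨', ('¬', 'p'), ('¬', ('∨', 'q', 'r')))

# Its proper subformulas are p, q, r, ¬p, q∨r, and ¬(q∨r), as in the text.
proper = subformulas(a) - {a}
assert proper == {'p', 'q', 'r',
                  ('¬', 'p'),
                  ('∨', 'q', 'r'),
                  ('¬', ('∨', 'q', 'r'))}
```
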
Definition 2.1.4. Given a subformula B of A and another formula C, we use A[B ↼ C] to denote the formula obtained by substituting all occurrences of B by C in A. If B is a propositional variable, then A[B ↼ C] is called an instance of A.
For example, if A is ¬p ∨ ¬(q ∨ r), B is ¬(q ∨ r), and C is ¬q ∧ ¬r, then A[B ↼ C] is ¬p∨(¬q∧¬r). A[p ↼ p∧q] = ¬(p∧q)∨¬(q∨r) is an instance of A.
2.2 Semantics
In the previous section, we talked about Boolean values, truth tables, and the meanings of logical operators, which are, strictly speaking, semantic concepts of the logic. Now, let us talk about the meaning of propositional variables and formulas.
The informal meaning of a propositional variable is the statement that the symbol represents. In different applications, the same symbol can represent different statements and thus have different meanings. However, each statement can take only two truth values, true or false. We will use these truth values of propositional variables, extended to the formulas that contain them, as the semantics of propositional logic.
2.2.1 Interpretations
A propositional variable may be assigned true or false. This truth-value assignment is considered to be the semantics of the variable. For a formula, an assignment of truth values to every propositional variable is said to be an interpretation of the formula. If A is a formula built on the propositional variables Vp, then an interpretation of A is a function σ : Vp −→ {1, 0}. It is easy to check that there are 2^|Vp| distinct interpretations.
Suppose σ is an interpretation over Vp = {p, q} such that σ(p) = 1 and σ(q) = 0. We may write σ = {p → 1, q → 0}. We may also use (1, 0) for the same σ, assuming Vp is the list (p, q) and σ(p, q) = (1, 0).
An alternative representation of an interpretation σ is a subset of Vp. Given σ, let Xσ = {x ∈ Vp | σ(x) = 1}. Then there is a bijection between the interpretations σ and the subsets Xσ, so we may use Xσ to represent σ. For example, if Vp = {p, q}, then the four interpretations over Vp correspond to the power set of Vp:
2Vp = {∅, {p}, {q}, {p, q}}
Another alternative representation of σ is to use a set of literals. A literal is
either a propositional variable or the negation of a variable. Let

Yσ = Xσ ∪ {¬y | y ∈ Vp, y ∉ Xσ};

then Yσ is a set of literals such that every variable of Vp appears in Yσ exactly once. If Vp = {p, q} and Xσ = ∅, then Yσ = {¬p, ¬q}; if Xσ = {p}, then Yσ = {p, ¬q}, etc. That is, by adding the negations of the variables missing from Xσ, we obtain the representation Yσ of σ.
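The three representations of an interpretation are easy to convert between. A small sketch (the dictionary encoding of σ and the '¬'-prefixed strings for negative literals are our own conventions):

```python
def as_set(sigma):
    """X_sigma: the set of variables that sigma assigns 1."""
    return {x for x, v in sigma.items() if v == 1}

def as_literals(sigma):
    """Y_sigma: every variable appears exactly once, negated iff it is
    assigned 0.  A negative literal is written here as '¬' + name."""
    return {x if v == 1 else '¬' + x for x, v in sigma.items()}

sigma = {'p': 1, 'q': 0}        # sigma = {p -> 1, q -> 0}
assert as_set(sigma) == {'p'}            # X_sigma
assert as_literals(sigma) == {'p', '¬q'} # Y_sigma
```
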
Given σ, we can check the truth value of the formula under σ, denoted by σ(A). This notation means we treat σ as a function from the formulas to {0, 1}, as
the unique extension of σ : Vp −→ {0, 1}. On the other hand, the notation Aσ is used to denote the formula obtained by substituting each propositional variable p by σ(p).
Example 2.2.1. Given the formula A = p ∧ ¬q ∨ r and an interpretation σ = {p → 1, q → 1, r → 0}, we represent σ by σ = {p, q, ¬r}. Replacing p by 1, q by 1, and r by 0 in A, we have Aσ = 1 ∧ ¬1 ∨ 0. Applying the definitions of ¬, ∧, and ∨, we obtain the truth value 0. In this case, we say the formula is evaluated to 0 under σ, denoted by σ(A) = 0. Given another interpretation σ′ = {p, q, r}, the same formula will be evaluated to 1, i.e., σ′(A) = 1.
Recall that we use A[B ↼ C] to denote an instance of A where every occur- rence of B is replaced by C. An interpretation σ is a special case of the substitution of formulas where B is a variable and C is either 1 or 0. For example, if σ = {p, ¬q}, then Aσ = A[p ↼ 1][q ↼ 0]. Strictly speaking, Aσ is not a propositional formula: it is the meaning of A under σ. To obtain σ(A), we may use the algorithm eval.
Algorithm 2.2.2. The algorithm eval will take a formula A and an interpretation σ, and returns a Boolean value.
proc eval(A, σ)
    if A = ⊤ return 1;
    if A = ⊥ return 0;
    if A ∈ Vp return σ(A);
    if A is ¬B return ¬eval(B, σ);
    else A is (B op C): return eval(B, σ) op eval(C, σ);
Example 2.2.3. Let σ = {p, ¬q}. Then the execution of eval on p → p ∧ ¬q and σ will be as follows:
eval(p → p ∧ ¬q, σ) calls
    eval(p, σ), which returns 1; and
    eval(p ∧ ¬q, σ), which calls
        eval(p, σ), which returns 1; and
        eval(¬q, σ), which calls
            eval(q, σ), which returns 0; and
        returns ¬0, i.e., 1;
    returns 1 ∧ 1, i.e., 1;
returns 1 → 1, i.e., 1.
What eval does is traverse the formula tree of A bottom-up: if a node is labeled with a variable, it uses σ to get the truth value; otherwise, it computes the truth value by applying the operator at that node to the truth values of its children. In this case, we say this is the truth value of the node under σ. The process of running eval is exactly what we do when constructing a truth table for p → p ∧ ¬q. The truth values under p and q in the truth table give us all the interpretations σ, and the truth values under A = ¬q, p ∧ ¬q, or p → p ∧ ¬q are the values of eval(A, σ).
Theorem 2.2.4. Algorithm eval(A, σ) returns 1 iff σ(A) = 1, and runs in O(|A|) time, where |A| denotes the number of symbols, excluding parentheses, in A.
|A| is also the number of nodes in the formula tree of A. The proof is left as an exercise.
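Algorithm 2.2.2 can be transcribed directly into a runnable sketch; the nested-tuple encoding of formulas used below (variables as strings, 1/0 for ⊤/⊥, ('¬', B) and (op, B, C) for compound formulas) is our own convention, not the book's. Each node of the formula tree triggers exactly one recursive call, which is the O(|A|) running time of Theorem 2.2.4:

```python
def eval_formula(a, sigma):
    """Recursive evaluation following Algorithm 2.2.2.  sigma maps variable
    names to 1 or 0."""
    if a in (1, 0):                      # the constants ⊤ and ⊥
        return a
    if isinstance(a, str):               # a propositional variable
        return sigma[a]
    if a[0] == '¬':                      # negation
        return 1 - eval_formula(a[1], sigma)
    op, b, c = a                         # a binary operator
    vb, vc = eval_formula(b, sigma), eval_formula(c, sigma)
    return {'∧': vb & vc,
            '∨': vb | vc,
            '→': (1 - vb) | vc,
            '⊕': vb ^ vc,
            '↔': 1 - (vb ^ vc)}[op]

# p → p ∧ ¬q under σ = {p, ¬q}, as in Example 2.2.3
a = ('→', 'p', ('∧', 'p', ('¬', 'q')))
assert eval_formula(a, {'p': 1, 'q': 0}) == 1
assert eval_formula(a, {'p': 1, 'q': 1}) == 0
```
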
Corollary 2.2.5. For any formulas A and B, and any interpretation σ, the following equations hold.
σ(A ∨ B) = σ(A) ∨ σ(B)
σ(A ∧ B) = σ(A) ∧ σ(B)
σ(¬A) = ¬σ(A)
The truth value of a formula in propositional logic reflects the two foundational principles of Boolean logic: the principle of bivalence, which allows only two truth values, and the principle of extensionality, by which the truth value of a compound formula depends only on the truth values of its parts, not on their informal meaning.
2.2.2 Models, Satisfiability, and Validity
Interpretations play a very important role in propositional logic and introduce
many important concepts.
Definition 2.2.6. Given a formula A and an interpretation σ, σ is said to be a
model of A if eval(A, σ) = 1. If A has a model, A is said to be satisfiable.
Given a set Vp of propositional variables, we will use IVp to denote the set of all interpretations over Vp. For example, if Vp = {p, q}, then IVp = {{p →
p  q | ¬q  p∧¬q  p→p∧¬q
0  0 |  1    0      1
0  1 |  0    0      1
1  0 |  1    1      1
1  1 |  0    0      0
1, q → 1}, {p → 1, q → 0}, {p → 0, q → 1}, {p → 0, q → 0}}, abbreviated as IVp = {(1, 1), (1, 0), (0, 1), (0, 0)} when p and q are understood from the context. In general, |IVp| = 2^|Vp|, as each variable has two values. We may use eval to look for a model of A by examining every interpretation in IVp.
Definition 2.2.7. Given a formula A, let M(A) be the set of all models of A. If M(A) = ∅, then A is said to be unsatisfiable; if M(A) = IVp, i.e., every interpretation is a model of A, then A is said to be valid, or a tautology. We write |= A if A is valid.
If there is a truth table for a formula A, M(A) can be easily obtained from the truth table, as the truth values under the variables of A are all the interpretations and the models of A are those rows in the table where the truth value of A is 1.
Example 2.2.8. Let Vp = {p, q}. If A is ¬p → ¬q, then M(A) = {(1, 1), (1, 0), (0, 0)}. If B is p∨(¬q → ¬p), then M(B) = IVp. From M(A) and M(B), we know that both A and B are satisfiable, and B is also valid, i.e., |= B is true.
A valid formula is one which is true under every interpretation, no matter what truth values its variables take; that is, every interpretation is its model. The simplest examples are ⊤ and p ∨ ¬p. For many formulas we want to know whether they are valid; establishing validity, either by hand or automatically, is called theorem proving.
We may think about valid formulas as capturing fundamental logical truths. For example, the transitivity property of implication states that if one statement implies a second one, and the second one implies a third, then the first implies the third. In logic, the transitivity can be expressed as the formula
(p → q) ∧ (q → r) → (p → r)
The validity of the above formula confirms the truth of this property of implication. There are many properties like this for other logical operators, such as the commutativity and associativity of ∧ and ∨, and they can all be stated as tautologies.
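Validity claims such as this one can be verified by brute force over all interpretations, exactly as the truth table method does. A sketch (the helper names are our own):

```python
from itertools import product

def is_valid(f, variables):
    """A formula, given as a Python predicate on Boolean arguments, is valid
    iff it is true under every interpretation of its variables."""
    return all(f(*vals) for vals in product((True, False),
                                            repeat=len(variables)))

def implies(a, b):
    """A → B, defined as ¬A ∨ B."""
    return (not a) or b

# (p → q) ∧ (q → r) → (p → r): the transitivity of implication
transitivity = lambda p, q, r: implies(implies(p, q) and implies(q, r),
                                       implies(p, r))
assert is_valid(transitivity, 'pqr')

# p → q alone is not valid (it fails when p is true and q is false).
assert not is_valid(lambda p, q: implies(p, q), 'pq')
```
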
Theorem 2.2.9. (Substitution) If A is a tautology, then so is any instance of A. That is, if p is a propositional variable in A and B is any formula, then A[p ↼ B] is a tautology.
Proof. For any interpretation σ of A[p ↼ B], let σ(p) = σ(B). Applying σ to the formula trees of A and A[p ↼ B], then the truth values of all the nodes of A must be identical to those of the corresponding nodes of A[p ↼ B]. Since A is a tautology,
the root node of A must have truth value 1 under σ, so the root node of A[p ↼ B] must have the same truth value, i.e., 1. Since σ is arbitrary, A[p ↼ B] must be a tautology. □
The significance of the above theorem is that if we have a tautology for one variable, the tautology holds when the variable is substituted by any formula. For example, from p ∨ ¬p, we have A ∨ ¬A for any A. On the other hand, when we try to prove a tautology involving symbols A, B, …, we may treat each of these symbols as a propositional variable. For example, to prove (A ∧ B) ↔ (B ∧ A), we may prove instead (p ∧ q) ↔ (q ∧ p).
An unsatisfiable formula is one which does not have any model; that is, no interpretation is its model. The simplest examples are ⊥ and p ∧ ¬p. Validity and unsatisfiability are dual concepts, as stated by the following proposition.
Proposition 2.2.10. A formula A is valid iff ¬A is unsatisfiable.
From the view of theorem proving, proving the validity of a formula is as hard (or as easy) as proving its unsatisfiability.
The following theorem shows the relationship between logic and set theory.
Theorem 2.2.11. For any formulas A and B,
(a) M(A ∨ B) = M(A) ∪ M(B);
(b) M(A ∧ B) = M(A) ∩ M(B);
(c) M(¬A) = IVp − M(A).
Proof. To show M(A ∨ B) = M(A) ∪ M(B), we need to show that, for any interpretation σ, σ ∈ M(A ∨ B) iff σ ∈ M(A) ∪ M(B). By definition, σ ∈ M(A ∨ B) iff σ(A ∨ B) = 1. By Corollary 2.2.5, σ(A ∨ B) = σ(A) ∨ σ(B), so σ(A ∨ B) = 1 iff σ(A) = 1 or σ(B) = 1, which holds iff σ ∈ M(A) ∪ M(B).
The proofs of the other two are left as exercises. □
Example 2.2.12. Let Vp = {p, q}. Then M(p) = {(1, 1), (1, 0)} and M(q) = {(1, 1), (0, 1)}; M(¬p) = IVp − M(p) = {(0, 1), (0, 0)}, and M(¬p ∨ q) = M(¬p) ∪ M(q) = {(1, 1), (0, 1), (0, 0)}.
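Theorem 2.2.11 and this example can be checked mechanically by computing the model sets as ordinary Python sets of (p, q) pairs; the helper names below are our own:

```python
from itertools import product

VP = ('p', 'q')
INTERPS = set(product((1, 0), repeat=len(VP)))   # I_Vp, as (p, q) pairs

def models(f):
    """M(f) for a predicate f(p, q) on truth values 1/0."""
    return {i for i in INTERPS if f(*i)}

M_p  = models(lambda p, q: p)                    # M(p)
M_q  = models(lambda p, q: q)                    # M(q)
M_np = models(lambda p, q: not p)                # M(¬p)

# Theorem 2.2.11 (c): M(¬p) = I_Vp − M(p)
assert M_np == INTERPS - M_p
# Theorem 2.2.11 (a): M(¬p ∨ q) = M(¬p) ∪ M(q)
assert models(lambda p, q: (not p) or q) == M_np | M_q
# The values from Example 2.2.12
assert M_np | M_q == {(1, 1), (0, 1), (0, 0)}
```
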
2.2.3 Equivalence
In natural language, one statement can be stated in different forms with the
same meaning. For example, "If I won the lottery, I must be rich now." The same meaning can also be expressed as "Since I am not rich now, I must not have won the lottery."
Introducing p for "I won the lottery" and q for "I'm rich", the first statement is p → q and the second statement is ¬q → ¬p. It happens that M(p → q) = M(¬q → ¬p), as Table 2.2.1 shows: both formulas have the same set of models. Each line of the truth table defines an interpretation; if a formula is true in that line, the interpretation is a model of it. The table also reveals that M(p → q) = M(¬p ∨ q). In logic, these formulas are equivalent.
Table 2.2.1: The equivalence of the implication p → q and its contrapositive ¬q → ¬p.
The formula ¬q → ¬p is called the contrapositive of the implication p → q. The truth table shows that an implication and its contrapositive are equivalent: they are just different ways of saying the same thing. In contrast, the converse of p → q is the formula q → p, which has a different set of models (as shown by the second and third lines in Table 2.2.1).
Definition 2.2.13. Given two formulas A and B, A and B are said to be logically equivalent if M(A) = M(B), denoted by A ≡ B.
A ≡ B means that, for every interpretation σ, σ(A) = σ(B), so σ ∈ M(A) iff σ ∈ M(B). The relation ≡ over formulas is an equivalence relation as ≡ is reflexive, symmetric, and transitive. The relation ≡ is also a congruence relation as A ≡ C and B ≡ D imply that ¬A ≡ ¬C and A o B ≡ C o D for any binary operator o. This property allows us to obtain an equivalent formula by replacing a part of the formula by an equivalent one. The relation ≡ plays a very important role in logic as it is used to simplify formulas, to convert formulas into equivalent normal forms, or to provide alternative definitions for logical operators.
In arithmetic one writes simply s = t to express that the terms s, t represent the same function. For example, (x + y)2 = x2 + 2xy + y2 expresses the equality of the values of the left- and right-hand terms for all x, y. In propositional logic, however, we use the equality sign, as in A = B, only for the syntactic identity of the formulas A and B. Therefore, the equivalence of formulas must be denoted differently, such as A ≡ B.
p  q | ¬p  ¬q | p→q  ¬p∨q  ¬q→¬p  q→p
1  1 |  0   0 |  1     1      1     1
1  0 |  0   1 |  0     0      0     1
0  1 |  1   0 |  1     1      1     0
0  0 |  1   1 |  1     1      1     1
Theorem 2.2.14. For any formulas A, B, and C, if B is a subformula of A and B ≡ C, then A ≡ A[B ↼ C].
Proof. For any interpretation σ, σ(B) = σ(C), since B ≡ C. Apply σ to the formula trees of A and A[B ↼ C] and compare the truth values of all corresponding nodes of the two trees, ignoring the proper subtrees of B and C. Since σ(B) = σ(C), the corresponding nodes must have the same truth values; in particular, σ(A) = σ(A[B ↼ C]). □
The equivalence relation is widely used to simplify formulas and has real prac- tical importance in computer science. Formula simplification in software can make a program easier to read and understand. Simplified programs may also run faster, since they require fewer operators. In hardware, simplifying formulas can decrease the number of logic gates on a chip because digital circuits can be described by log- ical formulas. Minimizing the logical formulas corresponds to reducing the number of gates in the circuit. The payoff of gate minimization is potentially enormous: a chip with fewer gates is smaller, consumes less power, has a lower defect rate, and is cheaper to manufacture.
Suppose a formula A contains k propositional variables; then A can be viewed as a Boolean function f : {1, 0}^k −→ {1, 0}. For example, ¬p ∨ q contains two variables and can be regarded as a Boolean function f(p, q). The truth table (Table 2.2.1) reveals that f(p, q), i.e., ¬p ∨ q, always has the same truth value as p → q, so f and → are the same function. As a result, we may use ¬p ∨ q as the definition of →. As another example, the if-then-else function ite : {1, 0}^3 −→ {1, 0} can be defined by ite(1, B, C) = B and ite(0, B, C) = C, instead of by a truth table of eight lines.
The following proposition lists many pairs of equivalent formulas.
Proposition 2.2.15. Assume A, B, and C are any formulas.
A ∨ A ≡ A;                          A ∧ A ≡ A;
A ∨ B ≡ B ∨ A;                      A ∧ B ≡ B ∧ A;
(A ∨ B) ∨ C ≡ A ∨ (B ∨ C);          (A ∧ B) ∧ C ≡ A ∧ (B ∧ C);
A ∨ ⊥ ≡ A;                          A ∧ ⊤ ≡ A;
A ∨ ¬A ≡ ⊤;                         A ∧ ¬A ≡ ⊥;
A ∨ ⊤ ≡ ⊤;                          A ∧ ⊥ ≡ ⊥;
A ∨ (B ∧ C) ≡ (A ∨ B) ∧ (A ∨ C);    A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C);
¬⊤ ≡ ⊥;  ¬⊥ ≡ ⊤;                    ¬¬A ≡ A;
¬(A ∨ B) ≡ ¬A ∧ ¬B;                 ¬(A ∧ B) ≡ ¬A ∨ ¬B;
A → B ≡ ¬A ∨ B;                     A → B ≡ ¬B → ¬A;
A ⊕ B ≡ (A ∨ B) ∧ (¬A ∨ ¬B);        A ↔ B ≡ (A ∨ ¬B) ∧ (¬A ∨ B).
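Each of these equivalences can be verified by enumerating all interpretations, since A ≡ B means both sides agree under every interpretation. A sketch checking a few of them (the helper names are our own):

```python
from itertools import product

def equivalent(f, g, n):
    """A ≡ B iff the two sides have the same truth value under every
    interpretation of their n variables."""
    return all(bool(f(*v)) == bool(g(*v))
               for v in product((True, False), repeat=n))

def implies(a, b):
    """A → B, defined as ¬A ∨ B."""
    return (not a) or b

# De Morgan: ¬(A ∨ B) ≡ ¬A ∧ ¬B
assert equivalent(lambda a, b: not (a or b),
                  lambda a, b: (not a) and (not b), 2)
# Contraposition: A → B ≡ ¬B → ¬A
assert equivalent(lambda a, b: implies(a, b),
                  lambda a, b: implies(not b, not a), 2)
# Distribution: A ∨ (B ∧ C) ≡ (A ∨ B) ∧ (A ∨ C)
assert equivalent(lambda a, b, c: a or (b and c),
                  lambda a, b, c: (a or b) and (a or c), 3)
```
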
One way to show that two formulas are equivalent is to prove the validity of a formula, as given by the following proposition:
Proposition 2.2.16. For any formulas A and B, A ≡ B iff |= A ↔ B; and |= A iff
A ≡ ⊤.
Proof. A ≡ B means that, for any interpretation σ, σ(A) = σ(B), which means σ(A ↔ B) = 1, i.e., any σ is a model of A ↔ B. The proof of the second part is omitted. □
The above proposition shows the relationship between ≡, which is a semantic notation, and ↔, which is a syntactical symbol. Moreover, Theorem 2.2.9 allows us to prove equivalences using the truth table method, treating A, B, and C as propositional variables. For example, from (p → q) ≡ (¬p ∨ q), we know (p → q) ↔ (¬p ∨ q) is a tautology; from (p → q) ↔ (¬p ∨ q), we know (A → B) ↔ (¬A ∨ B) is a tautology for any formulas A and B, thus (A → B) ≡ (¬A ∨ B).
2.2.4 Entailment
In natural language, given a set of premises, we would like to know what conclusions
can be drawn from the premises. If we represent these premises by a single formula, it will be the conjunction of the premises. Since ∧ is commutative and associative, a conjunction of formulas can be conveniently written as a set S of formulas. That is, we assume that A1 ∧ A2 ∧ ··· ∧ An is equivalent to S = {A1,A2,··· ,An}. For example, {A, B} ≡ A ∧ B.
To capture the relation between the premises and the conclusion in logic, we have the notion of entailment.
Definition 2.2.17. Given two formulas A and B, we say A entails B, or B is a logical consequence of A, denoted by A |= B, if M(A) ⊆ M(B).
Thus, |= A and ⊤ |= A have the same meaning, i.e., A is valid.
The above definition allows many irrelevant formulas to be logical consequences of A, including all tautologies and all formulas logically equivalent to A. Despite this, the concept of entailment is indispensable in logic. For instance, an inference rule is a pattern which takes formulas as premises and returns a formula as the conclusion. To check the soundness of the inference rule, we let the premises be represented by a formula A and the conclusion by B, and check whether A |= B, i.e., B is a logical consequence of A. If the number of variables is small, we may use the truth table method to check whether every model of A is also a model of B.
Example 2.2.18. The modus ponens rule says that given the premises p → q and p, then we draw the conclusion q. Let A be (p → q)∧p, then M(A) = {(1,1)}, and
M(q) = {(1, 1), (0, 1)}. Since M(A) ⊆ M(q), q is a logical consequence of A, i.e., A |= q.
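The entailment test M(A) ⊆ M(B) can likewise be carried out by enumeration. A sketch (the helper names are our own) confirming modus ponens and refuting its converse:

```python
from itertools import product

def entails(f, g, n):
    """A |= B iff every model of A is a model of B, i.e., M(A) ⊆ M(B)."""
    return all(bool(g(*v))
               for v in product((True, False), repeat=n)
               if f(*v))

def implies(a, b):
    """A → B, defined as ¬A ∨ B."""
    return (not a) or b

# Modus ponens as an entailment: (p → q) ∧ p |= q
assert entails(lambda p, q: implies(p, q) and p, lambda p, q: q, 2)

# The converse fails: q does not entail (p → q) ∧ p
# (the model p = 0, q = 1 of q is not a model of the right-hand side).
assert not entails(lambda p, q: q, lambda p, q: implies(p, q) and p, 2)
```
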
Proposition 2.2.19. The relation |= is transitive, that is, if A |= B and B |= C,
then A |= C.
The proof is left as an exercise.
Definition 2.2.20. Given a formula A, the set {B | A |= B} is called the theory of A and is denoted by T (A). Every member of T (A) is called a theorem of A.
In the above definition, A can be a set of formulas to represent the conjunction of its members.
Proposition 2.2.21. The following three statements are equivalent: (a) A |= B; (b) M(A) ⊆ M(B); and (c) T(B) ⊆ T(A).
Proof. By definition, A |= B iff M(A) ⊆ M(B). Also by definition, B |= C for any C ∈ T(B). If A |= B, then A |= C by the transitivity of |=, so C ∈ T(A). If T(B) ⊆ T(A), then since B ∈ T(B), we have B ∈ T(A), thus A |= B. □
Corollary 2.2.22. For any formula A, T (⊤) ⊆ T (A) ⊆ T (⊥).
Proof. This holds because ⊥ |= A and A |= ⊤. Note that T(⊤) contains every tautology and T(⊥) contains every formula. □

Corollary 2.2.23. For any formulas A and B, the following statements are equivalent: (a) A ≡ B; (b) M(A) = M(B); and (c) T(A) = T(B).
Proof. Since A ≡ B iff A |= B and B |= A, this corollary holds by applying
Proposition 2.2.21 twice. □
Intuitively, we may regard a formula A as a constraint that marks some interpretations as "no-model", and M(A) records those interpretations which are models of A. The stronger the constraint, the smaller the set of models. The strongest constraint is ⊥ (no interpretations left) and the weakest constraint is ⊤ (no constraints). On the other hand, T(A) collects all constraints which do not mark any model in M(A) as "no-model".
Proposition 2.2.24. Let A and B be sets of formulas. If A ⊆ B ⊆ T (A), then M(A) = M(B).
Proof. Let C = B − A. Since B ⊆ T(A), we have C ⊆ T(A), i.e., A |= C, which implies M(A) ⊆ M(C). Thus M(B) = M(A ∪ C) = M(A ∧ C) = M(A) ∩ M(C) = M(A), because M(A) ⊆ M(C). □
Corollary 2.2.25. If C ∈ T(A), then T(A) = T(A ∧ C).
Proof. Let B = A∪{C}, then A ⊆ B ⊆ T(A), so M(A) = M(B) = M(A∧C).
Hence T(A) = T(A ∧ C) by Corollary 2.2.23. □

Corollary 2.2.26. If ⊥ ∈ T(A), then A is unsatisfiable.
Proof. Let B = A ∪ {⊥}; then A ⊆ B ⊆ T(A), so M(A) = M(B) = M(A ∧ ⊥) = M(⊥) = ∅. □
One task of theorem proving is to show that, given a set A of axioms, B is a theorem of A, that is, B ∈ T(A). Using the operator →, we may show that B is a logical consequence of A by showing (A → B) ∈ T(⊤), as justified by the following proposition.
Proposition 2.2.27. For any formulas A and B, A |= B iff |= A → B.
Proof. A |= B means that, for any interpretation σ, if σ(A) = 1, then σ(B) must be 1; if σ(A) = 0, we do not care about the value of σ(B). The two cases can be combined as σ(A) → σ(B) = 1, i.e., σ(A → B) = 1, which means A → B is valid, because σ is an arbitrary interpretation. □
The above proposition shows the relationship between |= (a semantic notation) and → (a syntactical symbol). Like the equivalence relation, the entailment relation is closed under substitution: for example, from p ∧ q |= p, every entailment of the form A ∧ B |= A arises by substituting A for p and B for q.
2.2.5 Theorem Proving and the SAT Problem
The concepts of validity, unsatisfiability, and entailment give rise to difficult computational problems. These three problems have the same degree of difficulty and can be represented by theorem proving, which asks to check if a formula A ∈ T(⊤) (A is valid), ⊥ ∈ T(A) (A is unsatisfiable), or A ∈ T(B) (B entails A). Theorem proving by truth table runs out of steam pretty quickly: a formula with n variables has a truth table of 2^n interpretations, so the effort grows exponentially with the number of variables. For a formula with just 30 variables, that is already over a billion interpretations to check!
The general problem of deciding whether a formula A is satisfiable is called the SAT problem. One approach to SAT is to construct a truth table and check whether or not a ⊤ ever appears; but as with testing validity, this approach quickly bogs down for formulas with many variables, because truth tables grow exponentially with the number of variables.
The good news is that SAT belongs to the class of NP problems, whose solutions can be checked efficiently. For example, given an interpretation σ, we can check in linear time (in terms of the size of A) whether σ is a model of A, thus showing that A is satisfiable. This can be done using eval(A, σ). If a problem can be solved in polynomial time (with respect to the size of inputs), it belongs to the class P. SAT is not known to be in P, and theorem proving is not known to be in NP.
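The linear-time check is a single recursive pass over the formula. A minimal sketch of eval(A, σ), assuming formulas are nested tuples (a representation of our own choosing, not from the text):

```python
# eval(A, sigma): formulas are nested tuples such as
# ("and", ("var", "p"), ("not", ("var", "q"))); sigma maps variables to booleans.
def eval_formula(a, sigma):
    op = a[0]
    if op == "var":
        return sigma[a[1]]
    if op == "not":
        return not eval_formula(a[1], sigma)
    if op == "and":
        return eval_formula(a[1], sigma) and eval_formula(a[2], sigma)
    if op == "or":
        return eval_formula(a[1], sigma) or eval_formula(a[2], sigma)
    if op == "imp":  # A -> B  ==  ~A \/ B
        return (not eval_formula(a[1], sigma)) or eval_formula(a[2], sigma)
    raise ValueError("unknown operator: " + str(op))

# (p ∨ q) ∧ ¬r under sigma = {p → 1, q → 0, r → 0}: a model, verified in one pass.
A = ("and", ("or", ("var", "p"), ("var", "q")), ("not", ("var", "r")))
print(eval_formula(A, {"p": True, "q": False, "r": False}))  # True
```

Each subformula is visited once, so the running time is linear in the size of A: exactly the efficient certificate check that places SAT in NP.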
The bad news is that SAT is known to be the hardest problem in NP. That is, if there exists a polynomial-time algorithm to solve SAT, then every problem in NP can be solved in polynomial time. These problems include many other important problems involving scheduling, routing, resource allocation, and circuit verification across multiple disciplines including programming, algebra, finance, and political theory. This is the famous result of Cook and Levin, who proved that SAT is the first NP-complete problem. Does this mean that we cannot find an algorithm which takes O(n^100) time, where n is the size of A, to decide whether A is satisfiable? No one knows. This is the most famous open problem in computer science: P = NP? It is also one of the seven Millennium Problems: the Clay Institute will award you $1,000,000 if you solve the P = NP problem. If P = NP, then SAT will be in P, and theorem proving will also be in P. If P ≠ NP, then theorem proving won’t be in NP: it belongs to co-NP, the class of complements of NP problems. An awful lot hangs on the answer, and the general consensus is that P ≠ NP.
In the last twenty years, there has been exciting progress on SAT solvers for practical applications like digital circuit verification. These programs find satisfying assignments with amazing efficiency, even for formulas with thousands of variables. In the next two chapters, we will study the methods for theorem proving and SAT solvers.
2.3 Normal Forms
By “normal form”, we mean that every formula can be converted into an equivalent formula obeying certain syntactical restrictions on the use and position of Boolean operators. If all equivalent formulas have a unique normal form, that normal form is also called a “canonical form”.
Since it is expensive to use truth tables to prove equivalence when
the number of variables is big, an alternative approach is to use algebra to prove equivalence. A lot of different operators may appear in a propositional formula, so a useful first step is to get rid of all but three: ∧, ∨, and ¬. This is easy because each of the other operators is equivalent to a simple formula using only these three. For example, A → B is equivalent to ¬A ∨ B.
Using the equivalence relations discussed in the previous section, any propositional formula can be proved equivalent to a canonical form. What has this got to do with equivalence? That’s easy: to prove that two formulas are equivalent, convert them both to canonical forms over the set of variables that appear in either formula. If the two formulas convert to the same canonical form, then they are certainly equivalent. Conversely, if two formulas are equivalent, they will have the same canonical form. We can also use a canonical form to show that a formula is valid or unsatisfiable: it is valid if its canonical form is ⊤, and unsatisfiable if its canonical form is ⊥.
In this section, we present four normal forms:
• negation normal form (NNF);
• conjunctive normal form (CNF);
• disjunctive normal form (DNF); and
• ITE normal form (INF).
Of the four normal forms, canonical forms exist for the last three. However, the canonical forms derived from DNF or CNF are essentially copies of truth tables, and thus not effective for showing the equivalence relation. Only the last one, INF, may have a compact size, so that the equivalence of two formulas can be conveniently established in this canonical form.
We will define a normal form by giving a set of transformation rules, that is, a set of pairs of formulas, where one formula is the left-hand side and the other the right-hand side of the rule. Most importantly, the left side of each rule is logically equivalent to its right side. Each rule deals with a formula which is not in normal form, and applying the rule to the formula results in a new formula closer to normal form. To convert any formula into normal form, we repeatedly pick a rule, match its left side to a subformula, and replace the subformula by the corresponding instance of the right side.
A general procedure for obtaining normal forms goes as follows:
1. If the formula is already in normal form, then stop with success.
2. Otherwise it contains a subformula violating the normal form criteria.
3. Pick such a subformula and find the left hand side of a rule that matches.
4. Replace the subformula by the right hand side of the rule and continue.
2.3.1 Negation Normal Form (NNF)
A formula A is said to be a literal if A is a variable (positive literal) or A is
¬p (negative literal), where p is a variable.
Definition 2.3.1. A propositional formula A is in negation normal form (NNF) if the negation symbol ¬ appears only in literals (as subformulas of A).
In a NNF, if we replace every negative literal by a fresh variable, there will be no ¬ in the resulting formula.
Proposition 2.3.2. Every propositional formula can be transformed into an equiv- alent NNF.
The proof is done by the following algorithm, which uses only equivalence relations in the transformation.
Equivalence rules for obtaining NNF
Algorithm 2.3.3. The algorithm to convert a formula into NNF takes the following three groups of rules:
1. Use the following equivalences to remove →, ⊕, and ↔ from the formula.
A→B ≡ ¬A∨B;
A⊕B ≡ (A∨B)∧(¬A∨¬B);
A↔B ≡ (A∨¬B)∧(¬A∨B).
2. Use the de Morgan laws to push ¬ down to variables:
¬¬A ≡ A;
¬(A∨B) ≡ ¬A∧¬B; ¬(A∧B) ≡ ¬A∨¬B.
3. Use A ∨ ⊥ ≡ A, A ∨ ⊤ ≡ ⊤, A ∧ ⊥ ≡ ⊥, and A ∧ ⊤ ≡ A to simplify the formula.
For example, ⊤ and ¬p ∨ q are in NNF. ¬(¬p ∨ q) is not in NNF; we may transform it into p ∧ ¬q, which is. If the input formula contains only ¬, ∨, ∧, and →, it takes only linear time to obtain a NNF by working recursively top-down. If ↔ or ⊕ are present, the computing time and the size of the output formula may be exponential in terms of the original size, because the arguments of ↔ and ⊕ are duplicated during the transformation.
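The top-down transformation of Algorithm 2.3.3 can be sketched in Python as follows, covering ¬, ∧, ∨, and → (the ⊕ and ↔ rules of group 1 are analogous; the tuple representation is our own):

```python
def nnf(a):
    """Push negations down to literals (groups 1 and 2 of Algorithm 2.3.3).
    Formulas are nested tuples; "var" leaves carry variable names."""
    op = a[0]
    if op == "var":
        return a
    if op == "imp":                        # A -> B  ==  ~A \/ B
        return ("or", nnf(("not", a[1])), nnf(a[2]))
    if op == "not":
        b = a[1]
        if b[0] == "var":
            return a                       # already a negative literal
        if b[0] == "not":                  # ~~A == A
            return nnf(b[1])
        if b[0] == "and":                  # de Morgan
            return ("or", nnf(("not", b[1])), nnf(("not", b[2])))
        if b[0] == "or":
            return ("and", nnf(("not", b[1])), nnf(("not", b[2])))
        if b[0] == "imp":                  # ~(A -> B) == A /\ ~B
            return ("and", nnf(b[1]), nnf(("not", b[2])))
    if op in ("and", "or"):
        return (op, nnf(a[1]), nnf(a[2]))
    raise ValueError("unknown operator")

# ~(~p \/ q) becomes p /\ ~q, as in the example above.
print(nnf(("not", ("or", ("not", ("var", "p")), ("var", "q")))))
```

Each connective is handled once per occurrence, which is why the running time is linear for inputs without ↔ or ⊕.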
Definition 2.3.4. Given a NNF A, if we replace every occurrence of ∧ by ∨, ∨ by ∧, positive literal by its negation, and negative literal ¬p by its counterpart p, we obtain a formula B, which is called the dual of A, denoted by B = dual(A).
For example, if A is p ∧ (q ∨ (¬p ∧ ¬r)), then dual(A) = ¬p ∨ (¬q ∧ (p ∨ r)). Both A and dual(A) are NNF.
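Using the same tuple representation as before (our own convention, not from the text), dual(A) is a one-pass recursion over a NNF:

```python
def dual(a):
    """dual(A) for a NNF A: swap /\ with \/ and each literal with its negation."""
    op = a[0]
    if op == "var":
        return ("not", a)                  # positive literal p -> ~p
    if op == "not":
        return a[1]                        # negative literal ~p -> p
    if op == "and":
        return ("or", dual(a[1]), dual(a[2]))
    if op == "or":
        return ("and", dual(a[1]), dual(a[2]))
    raise ValueError("not a NNF")

# A = p /\ (q \/ (~p /\ ~r));  dual(A) = ~p \/ (~q /\ (p \/ r))
A = ("and", ("var", "p"),
     ("or", ("var", "q"), ("and", ("not", ("var", "p")), ("not", ("var", "r")))))
print(dual(A))
```

Note that dual is an involution: applying it twice returns the original NNF, consistent with dual(A) ≡ ¬A and ¬¬A ≡ A.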
Proposition 2.3.5. If A is in NNF, then dual(A) is also in NNF and dual(A) ≡ ¬A.
The proposition can be easily proved using de Morgan’s laws.
2.3.2 Conjunctive Normal Form (CNF)
Definition 2.3.6. In propositional logic, a clause is either ⊥ (the empty clause), a literal (unit clause) or a disjunction of literals.
A clause is also called a maxterm in the circuit design community. Since ∨ is commutative and associative, we may represent a clause by a list of literals. We may use A ∨ A ≡ A to remove duplicate literals in a clause.
Definition 2.3.7. A formula is said to be a conjunctive normal form (CNF) if it is a conjunction of clauses.
A CNF is also called a product of sums (POS) expression. The idea of con- junctive normal form is that each type of connective appears at a distinct height in the formula. The negation signs, ¬, are lowest, with only propositions as their arguments. The disjunction signs, ∨, are in the middle, with only literals as their arguments. And the conjunction signs, ∧, are at the top, taking the disjunctions as their arguments. Of course, some operators may be missing in CNF. For example, p ∧ q is a CNF, which contains two unit clauses: p and q.
Equivalence rules for obtaining CNF
• stage 1: The equivalence rules for obtaining NNF.
• stage 2: A∨(B ∧C) ≡ (A∨B)∧(A∨C), where the arguments of ∧ and ∨ can be switched.
Proposition 2.3.8. Every propositional formula can be transformed into an equiv- alent CNF.
To show the proposition is true, we first convert the formula into NNF, and then apply the distribution law to move ∨ under ∧. This process terminates, though not simply because there are finitely many ∨ and ∧: the distribution rule duplicates A, so a decreasing measure on formulas (multiplicative in ∨ and additive in ∧) is needed to justify termination.
Example 2.3.9. To convert p1 ↔ (p2 ↔ p3) into CNF, we first use A ↔ B ≡ (¬A∨B)∧(A∨¬B) to remove the second ↔:
p1 ↔ ((¬p2 ∨p3)∧(p2 ∨¬p3)), and then remove the first ↔:
(¬p1 ∨((¬p2 ∨p3)∧(p2 ∨¬p3)))∧(p1 ∨¬((¬p2 ∨p3)∧(p2 ∨¬p3))). Its NNF is
(¬p1 ∨((¬p2 ∨p3)∧(p2 ∨¬p3)))∧(p1 ∨((p2 ∧¬p3)∨(¬p2 ∧p3))). We then obtain the following non-tautology clauses:
¬p1 ∨¬p2 ∨p3, ¬p1 ∨p2 ∨¬p3, p1 ∨p2 ∨p3, p1 ∨¬p2 ∨¬p3.
Since ∨ is commutative and associative and X ∨ X ≡ X, a clause can be represented by a set of literals. In this case, we use | for ∨ and p ̄ for ¬p in a clause. For example, the above clauses will be displayed as
(p̄1 | p̄2 | p3), (p̄1 | p2 | p̄3), (p1 | p2 | p3), (p1 | p̄2 | p̄3).
Definition 2.3.10. If a clause includes every variable exactly once, the clause is said to be full. A CNF is called a full CNF if every clause of the CNF is full.
If a clause C misses a variable r, we may replace C by two clauses: (C | r) and (C | r̄). Repeating this process for every missing variable, we will obtain an equivalent full CNF. Transforming a set of clauses into an equivalent full CNF is sound because for any formulas A and B, the following relation holds:
A ≡ (A∨B)∧(A∨¬B), which can be proved by truth table.
Equivalence rules for obtaining full CNF
• stage 1: The equivalence rules for NNF.
• stage 2: A∨(B ∧C) ≡ (A∨B)∧(A∨C), where the arguments of ∧ and ∨ can be switched.
• stage 3: A ≡ (A∨p)∧(A∨¬p), if A is a clause and p does not appear in A.
Proposition 2.3.11. Every propositional formula can be transformed into an equiv- alent and unique full CNF (up to the commutativity and associativity of ∧ and ∨). That is, full CNF is a canonical form.
Given a CNF A, each clause can be regarded as a constraint on which interpretations can be models of A. For example, if A contains the unit clause (¬p), then no interpretation with p → 1 can be a model of A, and these are half of all interpretations. If A contains a binary clause (p | q), then no interpretation with p → 0 and q → 0 (a quarter of all interpretations) can be a model of A. If A has a clause C containing every variable exactly once, then this clause removes exactly one interpretation, i.e., the one satisfying ¬C, as a model of A. If A has m models over Vp, i.e., |M(A)| = m, then a full CNF of A contains 2^|Vp| − m clauses, as each clause removes exactly one interpretation as a model of A. That is exactly the idea behind a proof of the above proposition.
The concept of full CNF is useful for theoretic proofs. Unfortunately, the size of a full CNF is exponential in terms of number of variables, too big for any practical application. CNFs are useful to specify many problems in practice, because practical problems are often specified by a set of constraints. Each constraint can be specified by a set of clauses, and the conjunction of all clauses from each constraint gives us a complete specification of the problem.
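The stage-3 expansion above is mechanical. A minimal sketch, assuming clauses are represented as frozensets of (variable, sign) pairs (a representation of our own choosing):

```python
def full_cnf(clauses, variables):
    """Expand each clause over its missing variables using
    A == (A \/ p) /\ (A \/ ~p). Clauses are frozensets of (variable, sign)."""
    result = set()
    work = list(clauses)
    while work:
        c = work.pop()
        missing = [v for v in variables
                   if (v, True) not in c and (v, False) not in c]
        if not missing:
            result.add(c)                 # clause is already full
        else:
            v = missing[0]                # split on one missing variable
            work.append(c | {(v, True)})
            work.append(c | {(v, False)})
    return result

# The single unit clause (p) over {p, q} expands to (p | q) /\ (p | q̄).
print(full_cnf([frozenset({("p", True)})], ["p", "q"]))
```

Since every expansion step doubles a clause, a clause missing k variables produces 2^k full clauses; this is the exponential blow-up noted above.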
2.3.3 Disjunctive Normal Form (DNF)
If we regard ∧ as multiplication and ∨ as addition, then a CNF is a “product of sums.” On the other hand, a “sum of products” is a disjunctive normal form (DNF). A formula A is said to be a product if A is either ⊤, a literal, or a conjunction of literals. A product is also called a minterm or a product term in the circuit design community.
Like clauses, we remove duplicate literals in a product and represent it by a set of literals.
Definition 2.3.12. A formula A is a disjunctive normal form (DNF) if A is a disjunction of products.
A DNF is also called a sum-of-products (SOP) expression. A formula like p ∧ ¬q or p ∨ ¬q is both a CNF and a DNF.
Equivalence rules for obtaining DNF
• stage 1: The equivalence rules for NNF.
• stage 2: A∧(B ∨C) ≡ (A∧B)∨(A∧C), where the arguments of ∧ and ∨
can be switched.
Proposition 2.3.13. Every propositional formula can be transformed into an equiv- alent DNF.
To show the proposition is true, we first convert the formula into NNF in the first stage, and then apply the distribution law in the second stage to move ∧ under ∨.
Once a formula A is transformed into DNF, it is very easy to check if A is satisfiable: if the DNF contains a product in which each variable appears at most once (so no variable occurs both positively and negatively), then A is satisfiable, because we can simply assign 1 to each literal of this product to make the product true. Since it is a hard problem to decide if a formula is satisfiable, it must also be hard to obtain a DNF, whose size can be exponentially large.
Like CNF, DNF is not a canonical form. For example, the following equiva- lence is true:
p ∧ ¬q ∨ ¬p ∧ ¬r ∨ q ∧ r ≡ ¬p ∧ q ∨ p ∧ r ∨ ¬q ∧ ¬r
However, we cannot convert both sides to the same formula.
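The claimed equivalence can be confirmed by brute force over all eight interpretations; a quick check of our own:

```python
from itertools import product

# The two DNFs above, written as Python predicates (∧ binds tighter than ∨).
lhs = lambda p, q, r: (p and not q) or (not p and not r) or (q and r)
rhs = lambda p, q, r: (not p and q) or (p and r) or (not q and not r)

print(all(lhs(p, q, r) == rhs(p, q, r)
          for p, q, r in product([False, True], repeat=3)))  # True
```

So the two DNFs are equivalent, even though no sequence of the DNF transformation rules rewrites one into the other: DNF is not canonical.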
Definition 2.3.14. A product is full if it contains every variable exactly once. A DNF is full if every product in the DNF is full.
A full product in a DNF A corresponds to a model of A. For example, if Vp = {p, q, r}, then the product p ∧ q ∧ ¬r specifies the model {p → 1, q → 1, r → 0}. Obviously, a full DNF A has m full products iff A has m models. Like full CNF, every propositional formula is equivalent to a unique full DNF; that is, full DNF is a canonical form.
Equivalence Rules for obtaining full DNF
• stage 1: The equivalence rules for NNF.
• stage 2: A∧(B ∨C) ≡ (A∧B)∨(A∧C), where the arguments of ∧ and ∨ can be switched.
• stage 3: A ≡ (A ∧ p) ∨ (A ∧ ¬p), if A is a product and p does not appear in A.
Proposition 2.3.15. Every propositional formula can be transformed into an equiv- alent and unique full DNF (up to the commutativity and associativity of ∧ and ∨).
2.3.4 Full DNF and Full CNF from Truth Table
A truth table is often used to define a Boolean function. In the truth table,
the truth value of the function is given for every interpretation of the function. Each interpretation corresponds to a full minterm and the function can be defined as a sum (disjunction) of all the minterms for which the function is true in the corresponding interpretation.
Example 2.3.16. Suppose we want to define a Boolean function f : B3 −→ B, where B = {0,1} and the truth value of f(a,b,c) is given in the following truth table. Following the convention of EDA, we will use + for ∨, · for ∧, and A for ¬A.
a b c | f | minterms  | clauses
0 0 0 | 0 | m0 = āb̄c̄ | M0 = a + b + c
0 0 1 | 1 | m1 = āb̄c | M1 = a + b + c̄
0 1 0 | 0 | m2 = ābc̄ | M2 = a + b̄ + c
0 1 1 | 1 | m3 = ābc | M3 = a + b̄ + c̄
1 0 0 | 0 | m4 = ab̄c̄ | M4 = ā + b + c
1 0 1 | 1 | m5 = ab̄c | M5 = ā + b + c̄
1 1 0 | 0 | m6 = abc̄ | M6 = ā + b̄ + c
1 1 1 | 0 | m7 = abc | M7 = ā + b̄ + c̄
For each interpretation σ of (a, b, c), we define a minterm mi, where i is the decimal value of σ when σ is viewed as a binary number. For each mi, we also define a clause Mi, and the relation between mi and Mi is that ¬mi ≡ Mi. For the function f defined in the truth table, f = m1 + m3 + m5, i.e., f = āb̄c + ābc + ab̄c.
In the above example, for each model of f, we create a minterm (product term) in its full DNF: if a variable is true in the model, the positive literal of this variable is in the product; if a variable is false, the negative literal of the variable is in the product. f is then defined by a full DNF consisting of these minterms. In other words, every Boolean function can be defined by a DNF. We can also define the same function by a full CNF. This can be done as follows: first, we define the negation of f by a full DNF from the truth table:
f̄ = m0 + m2 + m4 + m6 + m7
Applying negation to both sides of the above equation, we have
f = M0 · M2 · M4 · M6 · M7
where · stands for ∧ and ¬mi ≡ Mi. In other words, f can be defined by a full CNF. Note that M0 · M2 · M4 · M6 · M7 is the dual of m0 + m2 + m4 + m6 + m7. In general, the dual of a DNF is a CNF and the dual of a CNF is a DNF.
Once we know the principle, we can construct a full CNF directly from the truth table of a formula. That is, for each non-model interpretation in the truth table, we create one full clause: If a variable is true in the interpretation, the negative literal of this variable is in the clause; if a variable is false, the positive literal of the variable is in the clause.
Example 2.3.17. The truth table of (p ∨ q) → ¬r has eight lines and three in- terpretations of ⟨p, q, r⟩ are non-models: ⟨1, 1, 1⟩, ⟨1, 0, 1⟩, and ⟨0, 1, 1⟩. From these three interpretations, we obtain three full clauses:
(¬p ∨ ¬q ∨ ¬r) ∧ (¬p ∨ q ∨ ¬r) ∧ (p ∨ ¬q ∨ ¬r)
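The two truth-table constructions of Section 2.3.4 can be sketched together; the following helper (names of our own choosing) reads off the full DNF from the models and the full CNF from the non-models:

```python
from itertools import product

def full_dnf_and_cnf(f, variables):
    """Read a full DNF (one product per model) and a full CNF
    (one clause per non-model) off the truth table of f."""
    minterms, clauses = [], []
    for values in product([False, True], repeat=len(variables)):
        sigma = dict(zip(variables, values))
        if f(sigma):
            # model: keep each literal as it is true in the row
            minterms.append("&".join(v if sigma[v] else "~" + v
                                     for v in variables))
        else:
            # non-model: negate every literal of the row
            clauses.append("|".join("~" + v if sigma[v] else v
                                    for v in variables))
    return minterms, clauses

g = lambda s: (not (s["p"] or s["q"])) or (not s["r"])   # (p ∨ q) → ¬r
dnf, cnf = full_dnf_and_cnf(g, ["p", "q", "r"])
print(cnf)   # the three full clauses of Example 2.3.17
```

The three clauses produced are exactly (p | ¬q | ¬r), (¬p | q | ¬r), and (¬p | ¬q | ¬r), matching Example 2.3.17; the remaining five rows are models and yield the five products of the full DNF.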
2.3.5 Binary Decision Diagram (BDD)
Let ite : {1,0}³ −→ {1,0} be the “if-then-else” operator such that, for any formulas A and B, ite(1, A, B) ≡ A and ite(0, A, B) ≡ B. In fact, ite, 1, and 0 can be used to represent any logical operator. For example, ¬y ≡ ite(y, 0, 1). The following table shows that every binary logical operator can be represented by ite. Following the convention on BDDs, we will use 1 for ⊤ and 0 for ⊥ in every propositional formula.
Definition 2.3.18. A formula is said to be in ite normal form (INF) if the formula uses no logical operators other than ite, 1, and 0, and, for each occurrence of ite(C, A, B) in the formula, C is a propositional variable and A and B are INFs.
By definition, 1 and 0 are INFs, but p is not; ite(p, 1, 0) is.
Proposition 2.3.19. Every propositional formula can be transformed into an equiv-
alent INF.
For example, let A be ¬p ∨ (q ∧ ¬r); its equivalent INF is ite(p, ite(q, ite(r, 0, 1), 0), 1). To prove the above proposition, we provide an algorithm below to do the transformation.
[Table omitted in this copy: each of the sixteen binary Boolean functions, listed by output pattern 0000 through 1111 (e.g., the constant 0, x ∧ y, x ∧ ¬y (x > y), x, ¬x ∧ y (x < y), …), together with an equivalent ite formula.]
Since ite(C, A, A) ≡ A, we may use this equivalence repeatedly to simplify an INF, until no more simplification can be done. The resulting INF is called reduced. The corresponding BDD is called reduced if, additionally, identical subformulas of an INF are represented by a unique node.
Figure 2.3.1: The BDD of INF ite(x1, ite(x2, 1, 0), ite(x2, ite(x3, 1, 0), ite(x3, 0, 1))) derived from x̄1x̄2x̄3 ∨ x1x2 ∨ x2x3.
Figure 2.3.2: The first BDD uses a1 > b1 > a2 > b2 > a3 > b3 and the second BDD uses a1 > a2 > a3 > b1 > b2 > b3 for the same formula (a1 ∧ b1) ∨ (a2 ∧ b2) ∨ (a3 ∧ b3).
Proposition 2.3.22. All logically equivalent INFs using the same order on propositional variables have a unique reduced OBDD (ROBDD).
Proof. Since a ROBDD is a graph representing an INF A, we may do an induction proof on the number of variables in A. If A contains no variables, then A is 1 or 0, each represented by a unique node. Let pn > pn−1 > · · · > p2 > p1 be the ordering on the n variables of A. Then A ≡ ite(pn, B, C) for INFs B and C with B ≢ C. If n = 1, then {B, C} = {0, 1}, and A is represented by a unique node. If n > 1, because B and C are INFs without the variable pn, by the induction hypothesis B and C are represented by unique ROBDDs, respectively. Thus A = ite(pn, B, C) has a unique ROBDD representation. ∎
It is easy to apply ite(p, A, A) ≡ A as a reduction rule on INF. To implement the sharing of common subformulas, we may use a hash table which contains the triple ⟨v,x,y⟩ for each node ite(v,x,y) in a ROBDD, where v is a propositional variable and both x and y, x ̸= y, are nodes in the ROBDD. The hash table provides two operations: lookupHashTable(v, x, y) checks if ⟨v, x, y⟩ exists in the hash table: if yes, the node is returned; otherwise, null is returned. The other operation is saveCreateNode(v, x, y), which creates a new node for the triple ⟨v, x, y⟩ and saves the node in the hash table. The pseudo-code of the algorithm for creating ROBDD is given below.
Algorithm 2.3.23. The algorithm ROBDD takes a formula A and returns a ROBDD node for A. vars(A) returns the set of propositional variables of A, topVariable(S) returns the maximal variable of S by the given variable order. simplify(A) returns the truth value of a formula without variables.
proc ROBDD(A)
if vars(A) = ∅ return simplify(A);
v := topVariable(vars(A));
x := ROBDD(A[v ↼ 1]);
y := ROBDD(A[v ↼ 0]);
if (x = y) return x; // reduction
p := lookupHashTable(v, x, y);
if (p ̸= null) return p; // sharing
return saveCreateNode(v, x, y); // a new node is created
The input A to ROBDD can contain the nodes of a ROBDD, so that we can perform various logical operations on ROBDDs.
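A runnable Python sketch of Algorithm 2.3.23, using the nested-tuple formulas from earlier (the helper names `assign`, `variables`, and `ORDER` are ours; a dictionary plays the role of the hash table):

```python
ORDER = {"a": 3, "b": 2, "c": 1}   # variable order a > b > c
TABLE = {}                          # hash table: (v, x, y) -> unique shared node

def variables(a):
    if a in (0, 1):
        return set()
    if a[0] == "var":
        return {a[1]}
    return set().union(*(variables(s) for s in a[1:]))

def assign(a, v, value):
    """A[v <- value], simplifying constants along the way (group 3 rules)."""
    if a in (0, 1):
        return a
    if a[0] == "var":
        return value if a[1] == v else a
    if a[0] == "not":
        s = assign(a[1], v, value)
        return 1 - s if s in (0, 1) else ("not", s)
    l, r = assign(a[1], v, value), assign(a[2], v, value)
    if a[0] == "and":
        if 0 in (l, r): return 0
        if l == 1: return r
        if r == 1: return l
        return ("and", l, r)
    if 1 in (l, r): return 1            # remaining case: "or"
    if l == 0: return r
    if r == 0: return l
    return ("or", l, r)

def robdd(a):
    if a in (0, 1):
        return a                        # simplify(A) for variable-free formulas
    v = max(variables(a), key=ORDER.get)            # topVariable(vars(A))
    x, y = robdd(assign(a, v, 1)), robdd(assign(a, v, 0))
    if x == y:
        return x                        # reduction: ite(v, x, x) == x
    return TABLE.setdefault((v, x, y), (v, x, y))   # lookup + saveCreateNode

# ~a \/ (b /\ ~c): the ROBDD tests a first, then b, then c.
A = ("or", ("not", ("var", "a")), ("and", ("var", "b"), ("not", ("var", "c"))))
print(robdd(A))  # ('a', ('b', ('c', 0, 1), 0), 1)
```

`TABLE.setdefault` implements both lookupHashTable and saveCreateNode in one step: an existing node is returned, and a new one is stored otherwise, so structurally identical subgraphs are shared.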
58
Example 2.3.24. Assume a > b > c > d, and let A be ite(F, G, H), where F = ite(a, 1, b), G = ite(a, c, 0), and H = ite(b, 1, d). Then ROBDD(A) will call ROBDD(A[a ↼ 1]) and ROBDD(A[a ↼ 0]), which return c and J = ite(b, 0, d), respectively. Finally, ROBDD(A) returns ite(a, c, J).
ROBDDs have been used for representing Boolean functions, symbolic simulation of combinational circuits, equivalence checking and verification of Boolean functions, and finding and counting models of propositional formulas.
2.4 Optimization Problems
There exist many optimization problems in propositional logic. For example, one of them is to find a variable ordering for a formula, so that the resulting ROBDD is minimal. In this section, we introduce three optimization problems.
2.4.1 Minimum Set of Operators
To represent sentences like “A provided B”, which express the converse implication, we do not need a new formula A ⇐ B, because we may simply use B → A. This and similar reasons explain why only a few of the sixteen binary Boolean functions require their own notation. In CNF or DNF, we need only three Boolean operators: ¬, ∨, and ∧; all other Boolean operators can be defined in terms of them.
Definition 2.4.1. A set S of Boolean operators is said to be sufficient if every other Boolean operators can be defined using only the operators in S. S is said to be minimally sufficient if S is sufficient and no proper subset of S is a sufficient set of operators.
Obviously, {¬, ∨, ∧} is a sufficient set of operators. For example, A ⊕ B ≡ (A ∨ B) ∧ (¬A ∨ ¬B) and ite(A, B, C) ≡ (A ∧ B) ∨ (¬A ∧ C). Is the set {¬, ∨, ∧} minimally sufficient? The answer is no, because A ∨ B ≡ ¬(¬A ∧ ¬B). Thus, {¬, ∧} is a sufficient set of operators. {¬, ∨} is another sufficient set, because A ∧ B ≡ ¬(¬A ∨ ¬B). To show that {¬, ∧} is minimally sufficient, we need to show that neither {¬} nor {∧} is sufficient. Since ¬ takes only one argument and cannot be used alone to define ∧, {¬} cannot be sufficient. We will see later that {∧} is not sufficient.
From INF (ite normal form), we might think {ite} is minimally sufficient. That is not true, because we also use 1 and 0 (standing for ⊤ and ⊥), which are nullary operators. For the set {¬, ∧}, no such constants are needed: ⊥ ≡ p ∧ ¬p for any propositional variable p.
Do we have a minimally sufficient set of Boolean operators which contains a single operator? The answer is yes, and there are two binary operators as candidates: ↑ (nand, the negation of and) and ↓ (nor, the negation of or):
A ↑ B ≡ ¬(A ∧ B), A ↓ B ≡ ¬(A ∨ B).
Proposition 2.4.2. Both {↑} and {↓} are minimally sufficient.
Proof. To show {↑} is sufficient, we just need to define ¬ and ∧ using ↑: ¬A ≡ A ↑ A and A ∧ B ≡ ¬(A ↑ B). The proof that {↓} is sufficient is left as an exercise. ∎
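The two defining equivalences in the proof can be confirmed exhaustively over all interpretations; a quick check of our own:

```python
from itertools import product

nand = lambda a, b: not (a and b)        # A ↑ B

neg = lambda a: nand(a, a)               # ~A  ==  A ↑ A
conj = lambda a, b: neg(nand(a, b))      # A /\ B  ==  ~(A ↑ B)

print(all(neg(a) == (not a) for a in [False, True]),
      all(conj(a, b) == (a and b)
          for a, b in product([False, True], repeat=2)))  # True True
```

Since {¬, ∧} is sufficient and both of its operators are definable from ↑ alone, {↑} is sufficient; minimality is immediate because the empty set defines nothing.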
In fact, among all sixteen binary operators, only ↑ and ↓ have this property.
Theorem 2.4.3. For any binary operator which can be used to define both ¬ and ∧, that operator must be either ↑ or ↓.
Proof. Let o be any binary operator. If ¬ can be defined by o, then ¬A ≡ t(o, A), where t(o, A) is the formula represented by a binary tree t whose leaf nodes are labeled with A and whose internal nodes are labeled with o. Furthermore, we assume that t is such a binary tree with the least number of nodes. If we replace A by 1, then t(o, 1) should have the truth value 0, because ¬1 = t(o, 1) = 0. If any internal node of t(o, 1) has the truth value 1, we may replace that node by the leaf node 1; if any internal node other than the root of t has truth value 0, we may use the subtree at this internal node as t. In both cases, this violates the assumption that t is a tree with the least number of nodes. As a result, t must have only one internal node, i.e., 1 o 1 = ¬1 = 0. Similarly, we must have 0 o 0 = ¬0 = 1.
There are four cases regarding the truth values of 1 o 0 and 0 o 1.
• Case 1: 1 o 0 = 0 and 0 o 1 = 1. Then A o B = ¬A, and we cannot use this o to define A ∧ B, as the output of o on A and B will always be one of A, ¬A, B, or ¬B.
• Case 2: 1 o 0 = 1 and 0 o 1 = 0. Then A o B = ¬B, and we still cannot use this o to define A ∧ B, for the same reason as in Case 1.
• Case 3: 1 o 0 = 0 o 1 = 1. Then o is ↑.
• Case 4: 1 o 0 = 0 o 1 = 0. Then o is ↓.
p q | p o1 q | p o2 q | p o3 q | p o4 q
1 1 |   0    |   0    |   0    |   0
0 1 |   1    |   0    |   1    |   0
1 0 |   0    |   1    |   1    |   0
0 0 |   1    |   1    |   1    |   1
Table 2.4.1: p o1 q ≡ ¬p (the negation of the projection on the first argument); p o2 q ≡ ¬q (the negation of the projection on the second argument); p o3 q ≡ p ↑ q (NAND); and p o4 q ≡ p ↓ q (NOR).
The last two cases give us the definitions of ↑ and ↓, as shown in Table 2.4.1. ∎
Corollary 2.4.4. Both {¬, ∧} and {¬, ∨} are minimally sufficient.
In Boolean algebra, the properties of any sufficient set of operators can be specified using equations over variables and the operators. To specify the properties of ↑, we may use three equations: 0 ↑ Y = 1, X ↑ 0 = 1, and 1 ↑ 1 = 0. For ↑, what is the minimal number of equations needed? And what is the minimal number of symbols in the equations? These questions have been answered firmly: only one equation is needed, using only three variables (for a total of eight occurrences) and six occurrences of ↑. We state the result below without proof.
Proposition 2.4.5. The Boolean algebra can be generated by the following equation: ((x ↑ y) ↑ z) ↑ (x ↑ ((x ↑ z) ↑ x)) = z
The above equation contains three variables and 14 symbols (excluding paren- theses). For the set {¬, ∨}, a single axiom also exists to specify the Boolean algebra:
Proposition 2.4.6. The Boolean algebra can be generated by the following equation: ¬(¬(¬(x ∨ y) ∨ z) ∨ ¬(x ∨ ¬(¬z ∨ ¬(z ∨ u)))) = z.
The above equation contains four variables and 21 symbols (excluding paren- theses).
2.4.2 Logic Minimization
Logic minimization, also called logic optimization, is part of the circuit design process; its goal is to obtain the smallest combinational circuit that realizes a given Boolean formula. Logic minimization seeks an equivalent representation of the specified logic circuit under one or more constraints. Generally, the circuit is constrained to minimum chip area while meeting a pre-specified delay. Decreasing the number of gates reduces the power consumption of the circuit. Choosing gates with fewer transistors reduces the circuit area. Decreasing the number of nested levels of gates reduces the delay of the circuit. Logic minimization can substantially reduce the cost of a circuit and improve its quality.
2.4.2.1 Karnaugh Maps for Minimizing DNF and CNF
In the early days of electronic design automation (EDA), a Boolean function was often defined by a truth table, and we can derive a full DNF (sum of products) or full CNF (product of sums) directly from the truth table, as shown in Example 2.3.16. The obtained DNF or CNF needs to be minimized in order to implement the function with the least number of gates.
Following the convention of EDA, we will use + for ∨, · for ∧, and A for ¬A.
Example 2.4.7. Let f : {0,1}² −→ {0,1} be defined by f(A,B) = ĀB + AB̄ + AB. We need three AND gates and one 3-input OR gate. However, f(A,B) ≡ A + B. Using A + B, we need only one OR gate.
The equivalence relations used in the above simplification process are AB + AB̄ ≡ A and A + ĀB ≡ A + B (or B + B̄A ≡ B + A). For the above example, ĀB + AB ≡ B and B + AB̄ ≡ B + A.
We want to find an equivalent circuit of the minimum size possible, and this is a difficult computational problem. There are many tools, such as Karnaugh maps, available to achieve this goal.
The Karnaugh map (K-map) is a popular method of simplifying DNF or CNF formulas, originally proposed by Maurice Karnaugh in 1953. The Karnaugh map reduces the need for extensive calculations by displaying the output of the function graphically and taking advantage of humans’ pattern-recognition capability. It is suitable for Boolean functions with 2 to 4 variables. When the number of variables is greater than 4, it is better to use an automated tool.
The Karnaugh map uses a two-dimensional grid where each cell represents an interpretation (or equivalently, a full minterm) and the cell contains the truth value of the function for that interpretation. The position of each cell contains all the information of an interpretation and the cells are arranged such that adjacent cells differ by exactly one truth value in the interpretation. Adjacent 1s in the Karnaugh map represent opportunities to simplify the formula. The minterms for the final formula are found by encircling groups of 1s in the map. Minterm groups must be rectangular and must have an area that is a power of two (i.e., 1, 2, 4, 8, …).
62
Figure 2.4.1: K-map for f(x, y) = x̄ȳ + x̄y + xȳ.
Minterm rectangles should be as large as possible without containing any 0s (it may contain don’t-care cells). Groups may overlap in order to make each one larger. A least number of groups that cover all 1s will represent the minterms of a DNF of the function. These minterms can be used to write a minimal DNF representing the required function and thus implemented by the least number of AND gates feeding into an OR gate. The map can also be used to obtain a minimal CNF that is implemented by OR gates feeding into an AND gate.
Let f(x, y) = x̄ȳ + x̄y + xȳ. There are four minterms over x and y, so the K-map has four cells, corresponding to the four interpretations of x and y. As the output of f, three cells have value 1 and one cell (i.e., xy) has value 0. Initially, each cell belongs to a distinct group of one member (Figure 2.4.1).
The merge operation on K-map: Merge two adjacent and disjoint groups of the same size and of the same truth value into one larger group.
This operation creates groups of size 2, 4, 8, etc., and cells are allowed to appear in different groups at the same time. For example, cells x̄ȳ and xȳ in Figure 2.4.1 are adjacent; they are merged into one group {x̄ȳ, xȳ}, which can be represented by ȳ as a shorthand for the group. Similarly, cells x̄ȳ and x̄y can be merged into {x̄ȳ, x̄y}, which can be represented by x̄. Note that x̄ȳ is used in both merge operations. No more merge operations can be performed, and the final result is f = x̄ + ȳ (f is the NAND function). This result can be proved logically as follows:
f = x̄ȳ + x̄y + xȳ
= (x̄ȳ + xȳ) + (x̄ȳ + x̄y)
= (x̄ + x)ȳ + x̄(ȳ + y)
= ȳ + x̄
Figure 2.4.2: The K-map for obtaining f(A, B, C) = ĀB + AC.
Example 2.4.8. Consider the function f defined by the truth table in Figure 2.4.2. The corresponding K-map has four cells with truth value 1, as f has four models shown in the truth table. There are three possible merge operations: (i) AB̄C and ABC merge into AC; (ii) ĀBC̄ and ĀBC merge into ĀB; and (iii) ĀBC and ABC merge into BC. Since all the cells with value 1 are in the groups represented by AC and ĀB, BC is redundant. So the simplified function is f = AC + ĀB, instead of f = AC + ĀB + BC. On the other hand, if g = AB + AC + BC, then none of the three minterms in g is redundant, because if you delete one, you will miss some 1s, even though the full minterm ABC is covered by all three minterms.
From the above examples, it is clear that after each merge operation, two minterms are merged into one with one less variable. That is, the larger the group, the fewer the variables in the minterm that represents the group. This is possible because adjacent cells represent interpretations which differ by exactly one truth value.
Because K-maps are two-dimensional, some adjacency relations of cells are not shown on K-maps. For example, in Figure 2.4.2, m0 = ĀB̄C̄ and m4 = AB̄C̄ are adjacent, and m1 = ĀB̄C and m5 = AB̄C are adjacent. Strictly speaking, K-maps are toroidally connected, which means that rectangular groups can wrap across the edges. Cells on the extreme right are actually “adjacent” to those on the far left, in the sense that the corresponding interpretations differ by only one truth value. We would need a 3-dimensional graph to completely show the adjacency relation of 3 variables, a 4-dimensional graph for 4 variables, etc.
Based on the idea of K-maps, an automated tool can create a graph of 2^n nodes for n variables, where each node represents one interpretation and two nodes have an edge iff their interpretations differ by one truth value. The merge operation is then implemented on this graph to generate the minterms of a minimum-size DNF. The same graph is also used for generating Gray code (by finding a Hamiltonian path in the graph).

Figure 2.4.3: The K-map for f(P,Q,R,S) = m0 + m2 + m5 + m7 + m8 + m10 + m13 + m15. The simplified formula is f = Q̄S̄ + QS.
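The graph just described is the n-dimensional hypercube. A small sketch (function names are ours) builds it and produces the reflected Gray code, whose consecutive codes follow hypercube edges, i.e., form a Hamiltonian path:

```python
from itertools import product

def hypercube(n):
    """One node per interpretation (a 0/1 vector of length n);
    an edge iff two vectors differ in exactly one bit."""
    nodes = list(product([0, 1], repeat=n))
    edges = {v: [] for v in nodes}
    for v in nodes:
        for i in range(n):
            w = list(v)
            w[i] ^= 1                      # flip one bit to get a neighbor
            edges[v].append(tuple(w))
    return edges

def gray_code(n):
    """The reflected Gray code: a Hamiltonian path of the hypercube."""
    if n == 0:
        return [()]
    prev = gray_code(n - 1)
    return [(0,) + v for v in prev] + [(1,) + v for v in reversed(prev)]

g = hypercube(3)
path = gray_code(3)
assert len(path) == 8 and len(set(path)) == 8   # visits every node once
for v, w in zip(path, path[1:]):
    assert w in g[v]                            # each step is a hypercube edge
```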
Example 2.4.9. The K-map for f(P,Q,R,S) = m0 + m2 + m5 + m7 + m8 + m10 + m13 + m15 is given in Figure 2.4.3, where m0 = P̄Q̄R̄S̄, m2 = P̄Q̄RS̄, m5 = P̄QR̄S, m7 = P̄QRS, m8 = PQ̄R̄S̄, m10 = PQ̄RS̄, m13 = PQR̄S, and m15 = PQRS (i in mi is the decimal value of PQRS in the corresponding interpretation). At first, m0 and m2 are adjacent and can be merged into P̄Q̄S̄; m8 and m10 are also adjacent and can be merged into PQ̄S̄. Then P̄Q̄S̄ and PQ̄S̄ can be further merged into Q̄S̄. Similarly, m5 and m7 merge into P̄QS; m13 and m15 merge into PQS. Then P̄QS and PQS merge into QS. The final result is f = Q̄S̄ + QS. It would also have been possible to derive this simplification by carefully applying the equivalence relations, but the time that takes grows exponentially with the number of minterms.
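The result of Example 2.4.9 can be verified by enumerating all 16 interpretations; a quick check (names are ours):

```python
from itertools import product

# f as the sum of minterms m0, m2, m5, m7, m8, m10, m13, m15 of (P, Q, R, S)
MODELS = {0, 2, 5, 7, 8, 10, 13, 15}

def f(p, q, r, s):
    return (p * 8 + q * 4 + r * 2 + s) in MODELS

# the simplified formula from the K-map: f = Q'S' + QS
def g(p, q, r, s):
    return (not q and not s) or (q and s)

for bits in product([0, 1], repeat=4):
    assert f(*bits) == bool(g(*bits))
```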
Note that the formula obtained from a K-map is not unique in general. For example, g(P,Q,R,S) = PQ+RS +PQS or g(P,Q,R,S) = PQ+RS +QRS are possible outputs for g from K-maps. This means the outputs of K-maps are not canonical.
K-maps can also be used to generate a simplified CNF for a function. In Example 2.3.16, we have shown how to obtain a full CNF from a full DNF. Using the same idea, if we first simplify the full DNF of the negated function, the CNF obtained by negating this simplified DNF will be a simplified one.
Example 2.4.10. Let f(A,B,C) = ABC +ABC +ABC +ABC and its K-map
Figure 2.4.4: The K-map for f(A,B,C) = ABC + ABC + ABC + ABC. The simplified CNF is f = (A + B + C)(A + B)(B + C).
is given in Figure 2.4.4. The negation of f is
¬f = ABC + ABC + ABC + ABC
Since ABC and ABC can merge into BC (shown in the figure), and ABC and ABC can merge into AB, the simplified negation is ¬f = ABC + AB + BC. The simplified CNF for f is thus f = (A + B + C)(A + B)(B + C).
Karnaugh maps can also be used to handle “don’t care” values and can simplify logic expressions in software design. Boolean conditions, as used for example in conditional statements, can get very complicated, which makes the code difficult to read and to maintain. Once minimized, canonical sum-of-products and product-of- sums expressions can be implemented directly using AND and OR logic operators.
2.4.2.2 Other Issues in Logic Minimization
Simplified Boolean functions obtained from K-maps are DNFs or CNFs. To implement them in a circuit using CMOS (complementary metal–oxide–semiconductor), we need further optimization. One of the most important characteristics of CMOS devices is low static power consumption. CMOS devices draw significant power only momentarily, during switching between on and off states. Consequently, CMOS devices do not produce as much waste heat as other forms of logic, like transistor–transistor logic (TTL), which normally has some standing current even when not changing state. These characteristics allow CMOS to integrate a high density of logic functions on a chip. It was primarily for this reason that CMOS became the most widely used technology implemented in VLSI (very large scale integration) chips.
Using CMOS, the NAND/NOR/NOT gates are cheaper to implement than the AND/OR gates. For a basic unbuffered NAND/NOR/NOT gate, we need only two transistors per input. This works when the gate is one among many others, driving a few similar gates. Many practical discrete gates, however, will be buffered: they need additional transistors to improve the output strength and to prevent the output strength from depending on the active inputs. Typically, AND/OR gates are implemented using the NAND/NOR/NOT gates and therefore need more transistors than NAND/NOR gates.
Example 2.4.11. If we directly implement f(A,B,C) = AC̄ + B̄ in CMOS, we need two inverters (for the two negations), one OR gate (as the negation of ↓) and one AND gate (as the negation of ↑). That is, we need four inverters (each uses two transistors), one NOR gate and one NAND gate (each uses four transistors), for a total of 16 transistors. We may convert f(A,B,C) = AC̄ + B̄ into the simplified and equivalent formula using only ↑ and ¬: f(A,B,C) = (A ↑ C̄) ↑ B, which uses a total of 10 transistors (two for the inverter; four each for the two NAND gates).
To convert DNF or CNF into a formula using only NAND/NOR/NOT, we may use the algorithm NNF2NOR.
Algorithm 2.4.12. The algorithm NNF2NOR takes a formula A in NNF as input and returns a formula B which is equivalent to A and uses only NOR/NAND/NOT operators. It calls a mutually recursive algorithm neg2NOR, which takes a formula A in NNF as input and returns a formula B which is equivalent to ¬A and uses only NOR/NAND/NOT operators.
proc NNF2NOR(A)
  if A ∈ Vp or A = ¬B return A;
  if A = B ∨ C return neg2NOR(B) ↑ neg2NOR(C);
  if A = B ∧ C return neg2NOR(B) ↓ neg2NOR(C);

proc neg2NOR(A)
  if A ∈ Vp return ¬A;
  if A = ¬B return B;
  if A = B ∨ C return NNF2NOR(B) ↓ NNF2NOR(C);
  if A = B ∧ C return NNF2NOR(B) ↑ NNF2NOR(C);
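The pseudo-code can be sketched in Python over a tuple representation of formulas (the representation and the evaluator ev are ours, not part of the algorithm):

```python
# A variable is a string; ('not', A), ('and', A, B), ('or', A, B),
# ('nand', A, B), ('nor', A, B) are compound formulas.

def NNF2NOR(A):                      # returns a formula equivalent to A
    if isinstance(A, str) or A[0] == 'not':
        return A
    if A[0] == 'or':
        return ('nand', neg2NOR(A[1]), neg2NOR(A[2]))
    if A[0] == 'and':
        return ('nor', neg2NOR(A[1]), neg2NOR(A[2]))

def neg2NOR(A):                      # returns a formula equivalent to ¬A
    if isinstance(A, str):
        return ('not', A)
    if A[0] == 'not':
        return A[1]
    if A[0] == 'or':
        return ('nor', NNF2NOR(A[1]), NNF2NOR(A[2]))
    if A[0] == 'and':
        return ('nand', NNF2NOR(A[1]), NNF2NOR(A[2]))

def ev(A, v):                        # evaluate A under interpretation v
    if isinstance(A, str):
        return v[A]
    op, args = A[0], [ev(B, v) for B in A[1:]]
    return {'not':  lambda a: not a,
            'and':  lambda a, b: a and b,
            'or':   lambda a, b: a or b,
            'nand': lambda a, b: not (a and b),
            'nor':  lambda a, b: not (a or b)}[op](*args)

# A = (¬p ∨ q) ∨ r is translated to (¬p ↓ q) ↑ ¬r
A = ('or', ('or', ('not', 'p'), 'q'), 'r')
B = NNF2NOR(A)
assert B == ('nand', ('nor', ('not', 'p'), 'q'), ('not', 'r'))

from itertools import product
for p, q, r in product([False, True], repeat=3):
    v = {'p': p, 'q': q, 'r': r}
    assert bool(ev(A, v)) == bool(ev(B, v))     # output is equivalent to input
```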
For input A = (¬p ∨ q) ∨ r, NNF2NOR(A) returns (¬p ↓ q) ↑ ¬r, which needs 12 transistors to implement. However, this result is not optimal, because another equivalent formula, i.e., p ↑ (q ↓ r), needs only eight transistors to implement in CMOS. Relatively speaking, finding optimal formulas from DNF/CNF is an easy task. The
trick is to get rid of as many negations in literals as possible, by switching between
↑ and ↓, using the relation
(¬A ↑ B) ↓ ¬C ≡ A ↓ (B ↑ C);
(¬A ↓ B) ↑ ¬C ≡ A ↑ (B ↓ C).
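Both identities can be confirmed by enumerating all eight interpretations; a quick check (the helper names are ours):

```python
from itertools import product

nand = lambda a, b: not (a and b)   # A ↑ B
nor  = lambda a, b: not (a or b)    # A ↓ B

# (¬A ↑ B) ↓ ¬C ≡ A ↓ (B ↑ C)  and  (¬A ↓ B) ↑ ¬C ≡ A ↑ (B ↓ C)
for a, b, c in product([False, True], repeat=3):
    assert nor(nand(not a, b), not c) == nor(a, nand(b, c))
    assert nand(nor(not a, b), not c) == nand(a, nor(b, c))
```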
If a 3-input NAND gate is available, the same A can be represented by ↑(p, ¬q, ¬r), which needs 10 transistors to implement. Note that ↑(A,B,C), (A ↑ B) ↑ C, and A ↑ (B ↑ C) are all logically different.
DNFs (sum of products) and CNFs (product of sums) are called 2-level for- mulas. A 3-level formula is a product of DNFs or a sum of CNFs. A 4-level formula is a product of 3-level formulas or a sum of 3-level formulas, and so on. Finding equivalent formulas of high levels can possibly reduce the size of formulas.
Example 2.4.13. Let h(a,b,c,d,e,f,g) = adf + aef + bdf + bef + cdf + cef + g. h is a DNF, and we need 6 AND gates with 3 inputs and 3 OR gates with 3 inputs (to implement the 7-input OR), for a total of 9 gates. If we allow 3-level formulas, then h(a,b,c,d,e,f,g) ≡ (a + b + c)(d + e)f + g, which needs one OR gate with 3 inputs, two OR gates with 2 inputs, and one AND gate with 3 inputs, for a total of 4 gates.
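The equivalence of the 2-level DNF and the factored 3-level form can be checked over all 2^7 interpretations; a quick sketch (function names are ours):

```python
from itertools import product

def dnf(a, b, c, d, e, f, g):
    # the 2-level sum of products
    return bool(a*d*f + a*e*f + b*d*f + b*e*f + c*d*f + c*e*f + g)

def three_level(a, b, c, d, e, f, g):
    # the factored 3-level form
    return bool((a + b + c) * (d + e) * f + g)

# check equivalence on every interpretation of the seven variables
assert all(dnf(*v) == three_level(*v) for v in product([0, 1], repeat=7))
```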
Besides using higher-level formulas, CMOS technology allows the use of other types of gates besides OR, AND, and inverters (for negation). For example, if we want to implement the function f(a,b) = āb + ab̄, since f(a,b) ≡ a ⊕ b, one XOR gate will suffice. Many circuit design systems provide libraries with a large number of gates. For example, the EE271 library allows the use of the following gates for combinational circuits:
inv, nand2, nand3, nand4, nor2, nor3, xor, xnor, aoi2, aoi22, oai2, oai22
where “xnor” is the negation of “xor”, i.e., the logical equal (↔), “aoi” stands for “AND-OR-Invert”, “oai” stands for “OR-AND-Invert”, and they define the following functions:
aoi2:  F = ¬((A ∧ B) ∨ C)
aoi22: F = ¬((A ∧ B) ∨ (C ∧ D))
oai2:  F = ¬((A ∨ B) ∧ C)
oai22: F = ¬((A ∨ B) ∧ (C ∨ D))
AOI and OAI gates can be readily implemented in CMOS circuitry. AOI gates are particularly advantageous in that the total number of transistors is less than if the AND, NOT, and OR functions were implemented separately. This results in increased speed, reduced power, smaller area, and potentially lower fabrication cost. For example, an AOI2 gate can be constructed with 6 transistors in CMOS, compared to 10 transistors using a 2-input NAND gate (4 transistors), an inverter (2 transistors), and a 2-input NOR gate (4 transistors), because F = ¬((A ∧ B) ∨ C) ≡ ¬(A ↑ B) ↓ C.
Finding an equivalent and minimal circuit with high-level formulas over a rich set of gates becomes even more challenging and is an ongoing research area in EDA (electronic design automation).
2.4.3 Maximum Satisfiability
When a CNF is unsatisfiable, we are still interested in finding an interpretation
which makes as many clauses true as possible. This is the so-called maximum satisfiability (MAX-SAT) problem: the problem of determining the maximum number of clauses of a given Boolean formula in CNF that can be made true by an interpretation (an assignment of truth values to the variables of the formula). It is a generalization of the Boolean satisfiability problem, which asks whether there exists a truth assignment that makes all clauses true.
Example 2.4.14. Let A be a set of unsatisfiable clauses {(p̄), (p ∨ q̄), (p ∨ q), (p ∨ r̄), (p ∨ r)}. If we assign 0 to p, two clauses will be false no matter what values are assigned to q and r. On the other hand, if p is assigned 1, only the clause (p̄) will be false. Therefore, if A is given as an instance of the MAX-SAT problem, the solution to the problem is 4 (four clauses are true under some interpretation).
More generally, one can define a weighted version of MAX-SAT as follows: given a conjunctive normal form formula with non-negative weights assigned to each clause, find truth values for its variables that maximize the combined weight of the satisfied clauses. The MAX-SAT problem is an instance of weighted MAX- SAT where all weights are 1.
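For small instances, MAX-SAT can be solved by enumerating all interpretations; a brute-force sketch (the clause representation and names are ours), tried on the clauses of Example 2.4.14:

```python
from itertools import product

# a clause is a list of literals; a positive literal is a variable name,
# a negative literal is the name prefixed with '-'
CLAUSES = [['-p'], ['p', '-q'], ['p', 'q'], ['p', '-r'], ['p', 'r']]
VARS = ['p', 'q', 'r']

def satisfied(clause, v):
    for lit in clause:
        if lit.startswith('-'):
            if not v[lit[1:]]:
                return True
        elif v[lit]:
            return True
    return False

def max_sat(clauses, names):
    best = 0
    for bits in product([False, True], repeat=len(names)):
        v = dict(zip(names, bits))
        best = max(best, sum(satisfied(c, v) for c in clauses))
    return best

assert max_sat(CLAUSES, VARS) == 4   # at most 4 of the 5 clauses can hold
```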
Finding the final exam schedule for a big university is a challenging job. There are hard constraints, such as that no student can take two exams at the same time and no classroom can hold two exams at the same time. There are also soft constraints, such as that a student should take at most two exams in one day, or that two exams should be separated by two hours. To specify such a problem as an instance of weighted MAX-SAT is easy: we just give a very large weight to hard constraints and a small weight to soft constraints.
Various methods have been developed to solve MAX-SAT or weighted MAX-SAT problems. These methods are related to solving SAT problems and we will present
them later in the book.
2.5 Using Propositional Logic
Propositions and logical operators arise all the time in computer programs. Programming languages such as C, C++, and Java use symbols like “&&”, “||”, and “!” in place of ∧, ∨, and ¬. Consider the Boolean expression appearing in an if-statement
if ( x > 0 || (x <= 0 && y > 100 )) …
If we let p denote x > 0 and q denote y > 100, then the Boolean expression is abstracted as p ∨ (¬p ∧ q). Since (p ∨ (¬p ∧ q)) ≡ (p ∨ q), the original code can be rewritten as follows:
if ( x > 0 || y > 100 ) …
In other words, knowledge about the logic may help us write neater code. Consider another piece of code:
if ( x >= 0 && A[x] == 0 ) …
Let p be x >= 0 and q be A[x] == 0; then the Boolean expression is p ∧ q. Since p ∧ q ≡ q ∧ p, it is natural to think that ( x >= 0 && A[x] == 0 ) can be replaced by ( A[x] == 0 && x >= 0 ) and the program will behave the same. Unfortunately, this is not the case: the execution of ( A[x] == 0 && x >= 0 ) will abort when x < 0. This example shows that “&&” in programming languages is not commutative in general. Neither is “||”, as ( x < 0 || A[x] == 0 ) is different from ( A[x] == 0 || x < 0 ). These abnormalities of the logic appear not only with array indices, but also when assignments are allowed in Boolean expressions. For example, x > 0 || (x = c) > 10 is different from (x = c) > 10 || x > 0 in C, where x = c is an assignment command. As programmers we need to be careful about these abnormalities.
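The same phenomenon can be demonstrated in Python, whose `and` also short-circuits; a small sketch (the array A and the bound check are ours, adapted so that the out-of-range index fails in Python too):

```python
A = [0, 7, 0]

def safe(x):
    # guard first: short-circuiting skips A[x] when the guard fails
    return 0 <= x < len(A) and A[x] == 0

def unsafe(x):
    # the same two tests in the other order: crashes when x is out of range
    return A[x] == 0 and 0 <= x < len(A)

assert safe(5) is False          # guard fails, A[5] is never evaluated
try:
    unsafe(5)                    # evaluates A[5] first
    raised = False
except IndexError:
    raised = True
assert raised                    # swapping the conjuncts is not harmless
```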
2.5.1 Bitwise Operators
In programming languages like C, C++, and Java, an integer is represented
by a list of 16, 32, or 64 bits (in C, a char is regarded as an 8-bit integer, or a byte). A bitwise operation operates on one or two integers at the level of their individual bits, and several bitwise operations are logical operations on bits. They are fast and simple actions, directly supported by the processor, and should be used whenever possible.
(a)2 = (0,1,0,0,0,0,1,1)    (b)2 = (0,0,0,1,0,0,1,0)

name         symbol   example
NOT          ∼        (∼a)2 = (1,0,1,1,1,1,0,0)
OR           |        (a | b)2 = (0,1,0,1,0,0,1,1)
AND          &        (a & b)2 = (0,0,0,0,0,0,1,0)
XOR          ^        (a ^ b)2 = (0,1,0,1,0,0,0,1)
shift right  >>       (a >> 2)2 = (0,0,0,1,0,0,0,0)
shift left   <<       (b << 2)2 = (0,1,0,0,1,0,0,0)

Table 2.5.1: Examples of bitwise operators.
For illustration purposes, we will use 8-bit integers as examples. In general, for any 8-bit integer x, the binary representation of x is an 8-bit vector (x)2 = (b7, b6, b5, b4, b3, b2, b1, b0), where bi ∈ {0, 1} and x = Σ_{i=0}^{7} bi · 2^i. For example, for a = 67, its bit vector is (a)2 = (0,1,0,0,0,0,1,1), because 67 = 64 + 2 + 1 = 2^6 + 2^1 + 2^0, and for b = 18, (b)2 = (0,0,0,1,0,0,1,0), since 18 = 16 + 2 = 2^4 + 2^1. In fact, (a)2 and (b)2 are the internal representations of a and b in a computer.
The following bitwise operators are provided, treating each bit as a Boolean value and operating on all bits simultaneously. We will use the bit vectors of a = 67 and b = 18 as examples.
If necessary, other bitwise operations can be implemented using the above four operations. For example a bitwise implication on a and b can be implemented as ∼a | b.
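The entries of Table 2.5.1 can be checked directly in Python, whose bitwise operators match those of C and Java (the mask 0xFF is ours: Python integers are unbounded, so NOT must be truncated to 8 bits):

```python
a, b = 67, 18                         # (a)2 = 01000011, (b)2 = 00010010

assert ~a & 0xFF == 0b10111100        # NOT, masked to 8 bits
assert a | b     == 0b01010011        # OR
assert a & b     == 0b00000010        # AND
assert a ^ b     == 0b01010001        # XOR
assert a >> 2    == 0b00010000        # shift right
assert b << 2    == 0b01001000        # shift left

# bitwise implication a -> b, implemented as ~a | b
assert (~a | b) & 0xFF == 0b10111110
```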
The bit shifts are also considered bitwise operations, because they treat an integer as a bit vector rather than as a numerical value. In these operations the bits are moved, or shifted, to the left or right. Registers in a computer processor have a fixed width, so some bits will be “shifted out” of the register at one end, while the same number of bits are “shifted in” from the other end. Typically, the “shifted out” bits are discarded and the “shifted in” bits are all 0’s. When the leading bit is the sign of an integer, the left shift operation becomes more complicated and its discussion is out of the scope of this book. In the table, we see that the shift operators take two integers: x << y will shift (x)2 left by y bits; x >> y will shift (x)2 right by y bits.
The value of x << y is equal to x · 2^y, provided that this multiplication does not cause an overflow. E.g., for b = 18, b << 2 = 18 · 4 = 72. Similarly, the value of x >> y is equal to the integer division ⌊x/2^y⌋. E.g., for a = 67,
a >> 2 = ⌊67/4⌋ = 16. Shifting provides an efficient way of multiplying and dividing when the second operand is a power of two. Typically, bitwise operations are substantially faster than division and several times faster than multiplication.
Suppose you are the manager of a social club with 64 members. For each gathering, you need to record the attendance. If each member has a unique member id ∈ {0, 1, …, 63}, you may use a set of member ids to record the attendance. Instead of a list of integers, you may use a single 64-bit integer, say s, to store the set. That is, if a member id is i, then the ith bit of s is 1. For this purpose, we need a way to set a bit of s to be 1. Supposing s = 0 initially, we may use
s = s | (1 << i)
to set the ith bit of s to be 1. To set the value of the ith bit of s to be 0, use
s = s & ∼(1 << i).
To get the value of the ith bit of s, use s & (1 << i). For another gathering, we use another integer t. Now it is easy to compute the union or intersection of these sets: s | t is the union and s & t is the intersection. For example, if you want to check that nobody attended both gatherings, you can check if s & t = 0, which is more efficient than using lists for sets.
2.5.2 Specify Problems in Propositional Logic
Example 2.5.1. The so-called n-queens problem is stated as follows: given a chessboard of size n × n, where n > 0, how can we place n queens on the board such that no two queens attack each other? For n = 2, 3, there is no solution; for every other n, there exists a solution. To specify this problem in logic, we need n^2 propositional variables. Let the variables be pi,j, where 1 ≤ i, j ≤ n, with the meaning that pi,j is true iff there is a queen at row i and column j. The following properties ensure that there are n queens on the board and that no two queens attack each other:
1. There is at least one queen in each row.
∧1≤i≤n(pi,1 ∨ pi,2 ∨ · · · ∨ pi,n)
2. No more than one queen in each row.
∧1≤i≤n ∧1≤j<k≤n (¬pi,j ∨ ¬pi,k)
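The two families of row clauses can be generated mechanically; a small sketch (the literal representation, 0-based indices, and names are ours, and a literal (i, j, sign) stands for pi,j when sign is True and ¬pi,j otherwise):

```python
from itertools import combinations

def row_clauses(n):
    clauses = []
    for i in range(n):
        # at least one queen in row i:  p(i,0) ∨ ... ∨ p(i,n-1)
        clauses.append([(i, j, True) for j in range(n)])
        # no more than one queen in row i:  ¬p(i,j) ∨ ¬p(i,k) for j < k
        for j, k in combinations(range(n), 2):
            clauses.append([(i, j, False), (i, k, False)])
    return clauses

cs = row_clauses(4)
assert len(cs) == 4 + 4 * 6     # n at-least-one clauses plus n·C(n,2) others

# a placement with one queen per row satisfies every row clause
queens = {(0, 1), (1, 3), (2, 0), (3, 2)}      # a known 4-queens solution
val = lambda i, j: (i, j) in queens
assert all(any(val(i, j) == s for (i, j, s) in c) for c in cs)
```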
(a) F = abc+abc+abc; (b) G = abc+abc+abc;
(c) H = a ⊕ b ⊕ c;
(d) I =(a∧b)→(b→c).
19. Let F be the set of all formulas built on a set of n propositional variables. Prove that there are exactly 2^(2^n) classes of equivalent formulas in F.
20. (a) Prove that {→, ¬} is a minimally sufficient set of Boolean operators. (b) Prove that {→, ⊥} is a minimally sufficient set of Boolean operators.
21. Prove that {∧, ∨, →, ⊤} is not a sufficient set of Boolean operators.
22. Prove that (a) {∧, ∨, ⊤, ⊥} is not a sufficient set of Boolean operators; (b) if we add any function o which cannot be expressed in terms of ∧, ∨, ⊤, and ⊥, then {∧, ∨, ⊤, ⊥, o} is sufficient.
23. Identify the pairs of equivalent formulas from the following candidates:
(a) p↑(q↑r), (b) (p↑q)↑r, (c) ↑(p,q,r), (d) p↓(q↓r), (e) (p↓q)↓r, (f) ↓(p,q,r),
(g) p↑(q↓r), (h) (p↑q)↓r, (i) p↓(q↑r), (j) (p↓q)↑r
24. Try to find an equivalent DNF of the minimal size for the following DNF formulas using K-maps:
(a) f(A,B,C) = ABC +AB +ABC +AC;
(b) f(A,B,C) = AB +BC +BC +ABC;
(c) f(A,B,C,D)=ABCD+ABC+BCD+BCD;
(d) f(A,B,C,D) = m0 +m1 +m5 +m7 +m8 +m10 +m14 +m15, where mi is a full minterm of A, B, C and D, and i is the decimal value of the binary string ABCD.
25. Using the K-map method and the procedure NNF2NOR(A), find an equivalent formula that uses only NAND/NOR/NOT operators for each of the formulas in the previous question.
26. Try to find an equivalent CNF of the minimal size for the following Boolean functions using K-maps:
(a) f(A,B,C) = AB +AB +ABC +AC;
(b) f(A,B,C) = ABC +ABC +ABC +ABC;
(c) f(A,B,C,D) = m0 +m1 +m3 +m4 +m5 +m7 +m12 +m13 +m15; (d) f(A,B,C,D) = m0+m1+m2+m4+m5+m6+m8+m9+m12+m13+m14.
27. Try to find a minimal and equivalent formula, which uses only NAND/NOR/NOT operators, for each of the formulas in the previous question.
28. A function f : B^n → B, where B = {0,1}, is called linear if f(x1,x2,…,xn) = a0 + a1x1 + · · · + anxn for suitable coefficients a0, …, an ∈ B. Here + denotes addition modulo 2, and juxtaposition denotes multiplication over the integers.
(a) Show that the above representation of a linear function f is unique. (b) Determine the number of n-ary linear functions.
29. Suppose x is a 32-bit positive integer. Provide a Java program for each of the following programming tasks, using as many bitwise operations as possible:
(a) Check whether x is even or odd.
(b) Get the position of the highest bit of 1 in x.
(c) Get the position of the lowest bit of 1 in x.
(d) Count trailing zeros in x as a binary number.
(e) Shift x right by 3 bits and put the last three bits of x at the beginning of the result from the shifting. The whole process is called “rotate x right” by 3 bits.
(f) Convert a string of decimal digits into a binary number without using multiplication by 10.
30. Suppose a formula A contains 15 variables, each represented by an integer from 0 to 14. We use the last 15 bits of an integer variable x to represent an interpretation of A, such that the ith bit is 1 iff we assign 1 to variable i. Modify the algorithm eval(A,σ), where σ is replaced by x using bitwise operations. Design an algorithm which lists all the interpretations of A and applies each interpretation to A using eval(A,x). The pseudo-code of the algorithm should read like a Java program.
31. The meaning of the following propositional variables is given below:
t: “taxes are increased”,
e: “expenditures rise”,
d: “the debt ceiling is raised”,
c: “the cost of collecting taxes rises”,
g: “the government borrows more money”,
i: “interest rates increase”.
(a) express the following statements as propositional formulas; (b) convert the formulas into CNF.
Either taxes are increased or if expenditures rise then the debt ceil- ing is raised. If taxes are increased, then the cost of collecting taxes rises. If a rise in expenditures implies that the government borrows more money, then if the debt ceiling is raised, then interest rates increase. If taxes are not increased and the cost of collecting taxes does not increase then if the debt ceiling is raised, then the govern- ment borrows more money. The cost of collecting taxes does not
increase. Either interest rates do not increase or the government does not borrow more money.
32. Specify the following puzzle in propositional logic and convert the specification into CNF. Find the model of your CNF and use this model to construct a solution of the puzzle.
Four kids, Abel, Bob, Carol, and David, are eating lunch on a hot summer day. Each has a big glass of water, a sandwich, and a different type of fruit (apple, banana, orange, and grapes). Which fruit did each child have? The given clues are:
(a) Abel and Bob have to peel their fruit before eating.
(b) Carol doesn’t like grapes.
(c) Abel has a napkin to wipe the juice from his fingers.
33. Specify the following puzzle in propositional logic and convert the specification into CNF. Find the model of your CNF and use this model to construct a solution of the puzzle.
Four electric cars are parked at a charge station and their mileages are 10K, 20K, 30K, and 40K, respectively. Their charge costs are $15, $30, $45, and $60, respectively. Figure out the charge cost for each car from the following hints:
(a) The German car has 30K miles.
(b) The Japanese car has 20K miles and it charged $15.
(c) The French car has fewer miles than the Italian car.
(d) The vehicle that charged $30 has 10K more mileage than the Italian car.
(e) The car that charged $45 has 20K less mileage than the German car.
34. Specify the following puzzle in propositional logic and convert the specification into CNF. Find the model of your CNF and use this model to construct a solution of the puzzle.
Four boys are waiting for Thanksgiving dinner and each of them has a unique favorite food. Their names are Larry, Nick, Philip, and Tom. Their ages are all different and fall in the set of {8, 9, 10, 11}. Find out which food they are expecting to eat and how old they are.
(a) Larry is looking forward to eating turkey.
(b) The boy who likes pumpkin pie is one year younger than Philip.
(c) Tom is younger than the boy that loves mashed potato.
(d) The boy who likes ham is two years older than Philip.
35. A very special island is inhabited only by knights and knaves. Knights always tell the truth, and knaves always lie. You meet two inhabitants: Sally and Zippy. Sally claims, “I and Zippy are not the same.” Zippy says, “Of I and Sally, exactly one is a knight.” Can you determine who is a knight and who is a knave?
36. A very special island is inhabited only by knights and knaves. Knights always tell the truth, and knaves always lie. You meet two inhabitants: Homer and Bozo. Homer tells you, “At least one of the following is true: that I am a knight or that Bozo is a knight.” Bozo claims, “Homer could say that I am a knave.” Can you determine who is a knight and who is a knave?
37. A very special island is inhabited only by knights and knaves. Knights always tell the truth, and knaves always lie. You meet two inhabitants: Bart and Ted. Bart claims, “I and Ted are both knights or both knaves.” Ted tells you, “Bart would tell you that I am a knave.” Can you determine who is a knight and who is a knave?
CHAPTER 3
PROOF PROCEDURES FOR PROPOSITIONAL LOGIC
In Chapter 1, we introduced the concept of a “proof procedure”: given a set A of axioms (i.e., the formulas assumed to be true) and a formula B, a proof procedure P(A, B) answers the question of whether B is a theorem of A. If P(A, B) returns true, we say B is proved; if P(A, B) returns false, we say B is disproved. For propositional logic, the axiom set A is regarded as equivalent to the conjunction of its formulas, and a theorem is any formula B such that A |= B, i.e., B ∈ T(A). Hence a proof procedure is also called a theorem prover. The same procedure P can be used to show that B is valid (i.e., B is a tautology) by calling P(⊤, B), or that C is unsatisfiable by calling P(C, ⊥). That is, proving the entailment relation is no harder than proving validity or proving unsatisfiability, as the theorem below shows. The three problems are equivalent and belong to the class of co-NP-complete problems. In practice, a proof procedure is designed to solve one of the three problems as its direct goal. A proof procedure is called a
• theorem prover P (X, Y ) if it is to show that formula Y is a theorem (an entailment) of X (assuming a set of formulas is equivalent to the conjunction of all its members);
• tautology prover T (A) if it is to show formula A is valid (a tautology);
• refutation prover R(B) if it is to show formula B is unsatisfiable.
If we have one of the above three procedures, i.e., P (X, Y ), T (A), and R(B), we may implement the other two as follows.
• If P(X,Y) is available, implement T(A) as P(⊤, A), and R(B) as P(B, ⊥).
• If T(A) is available, implement P(X,Y) as T(X → Y), and R(B) as T(¬B).
• If R(B) is available, implement P(X,Y) as R(X ∧ ¬Y), and T(A) as R(¬A).
The correctness of the above implementations is based on the following theo- rem: Given formulas A and B, the following three statements are equivalent:
1. A |= B;
2. A → B is valid;
3. A ∧ ¬B is unsatisfiable.
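These reductions can be made concrete with a tiny enumeration-style prover; a sketch (the tuple representation of formulas and the helper names are ours) implements R by exhaustive search and derives P and T from it, following the bullets above:

```python
from itertools import product

def ev(A, v):    # formulas as nested tuples over variables named by strings
    if isinstance(A, str): return v[A]
    if A[0] == 'not': return not ev(A[1], v)
    if A[0] == 'and': return ev(A[1], v) and ev(A[2], v)
    if A[0] == 'or':  return ev(A[1], v) or ev(A[2], v)

def variables(A):
    return {A} if isinstance(A, str) else set().union(*map(variables, A[1:]))

def R(B):        # refutation prover: is B unsatisfiable? (enumeration)
    vs = sorted(variables(B))
    return not any(ev(B, dict(zip(vs, bits)))
                   for bits in product([False, True], repeat=len(vs)))

def P(X, Y):     # theorem prover via R:  X |= Y  iff  X ∧ ¬Y is unsatisfiable
    return R(('and', X, ('not', Y)))

def T(A):        # tautology prover via R:  A is valid  iff  ¬A is unsatisfiable
    return R(('not', A))

p, q = 'p', 'q'
assert P(('and', p, q), ('or', p, q))     # p ∧ q |= p ∨ q
assert T(('or', p, ('not', p)))           # p ∨ ¬p is valid
assert R(('and', p, ('not', p)))          # p ∧ ¬p is unsatisfiable
```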
Thus, one prover can do all three, either directly or indirectly. In Chapter 2, we discussed several concepts, some of which can be used to construct proof procedures, as illustrated below.
• Truth table: It can be used to construct a theorem prover, a tautology prover, and a refutation prover.
• CNF: It can be used to construct a tautology prover, as a CNF is valid iff every clause is valid and it is very easy to check if a clause is valid.
• DNF: It can be used to construct a refutation prover, as a DNF is unsatisfiable iff every minterm is unsatisfiable and it is very easy to check if a minterm is unsatisfiable.
• ROBDD or INF: It can be used to construct a tautology prover or a refutation prover, because ROBDD is a canonical form.
• Algebraic substitution: It can be used to construct a tautology prover or a refutation prover, as substituting “equal by equal” preserves the equivalence relation. If we arrive at ⊤ or ⊥, we can claim the original formula is valid or unsatisfiable.
In terms of its working principle, a proof procedure may have one of three styles: enumeration, reduction, and deduction.
• enumeration-style: enumerate all involved interpretations;
• reduction-style: use the equivalence relations to transform a set of formulas
into a simpler set;
• deduction-style: use inference rules to deduce new formulas.
By definition, A |= B iff M(A) ⊆ M(B), that is, every model of A is a model of B. We may use a truth table to enumerate all the interpretations involving A and B, to see if every model of A is a model of B. A proof procedure based on truth tables is an enumeration-style proof procedure which uses exhaustive search. This type of proof procedure is easy to describe but highly inefficient. For example, if A contains 30 variables, we need to construct a truth table of 2^30 rows.
Reduction-style proof procedures use the equivalence relations to transform a set of formulas into desired, simplified formulas, such as reduced ordered binary decision diagrams (ROBDD). Since ROBDD is canonical, we decide that a formula is valid if its ROBDD is ⊤ (or 1), or unsatisfiable, if its ROBDD is ⊥ (or 0). We may also use disjunctive normal forms (DNF) to show a formula is unsatisfiable: Its simplified DNF is ⊥. In the following, we will introduce the semantic tableau method, which is a reduction-style proof procedure and works in the same way as obtaining DNF from the original formula.
Deduction-style proof procedures use inference rules to generate new formulas until the desired formula is generated. An inference rule typically takes one or two formulas as input and generates one or two new formulas as output. Some of these rules come from the equivalence relations between the input and the output formulas. In general, we require that the output of an inference rule be entailed by the input. There are many deduction-style proof procedures, and in this chapter we will present several of them. The goal of this presentation is not to introduce efficient proof procedures, but to introduce a formalism that other logics can use. This comment applies to the semantic tableau method, too.
Many proof procedures are not clearly classified as enumeration, reduction, or deduction styles. Yet, typically their working principles can be understood as performing enumeration, or reduction, or deduction, or all actions implicitly. For example, converting a formula into DNF can be a reduction, however, converting a formula into full DNF can be an enumeration, as every term in a full DNF cor- responds to one model of the formula. In practice, reduction is often used in most deduction-style proof procedures for efficiency.
In Chapter 1, we said that a proof procedure is a decision procedure if the procedure terminates for every input, that is, if the procedure is an algorithm. Fortunately for propositional logic, every proof procedure introduced in this chapter is a decision procedure.
3.1 Semantic Tableau
The semantic tableau method, also called the “truth tree” method, is a proof procedure not just for propositional logic, but also for many other logics, including first-order logic and temporal and modal logics. A tableau for a formula is a tree structure in which each node is associated with a list of formulas derived from the original formula. The tableau method constructs this tree and uses it to prove or refute the satisfiability of the original formula. The tableau method can also determine the satisfiability of finite sets of formulas of various logics. It is the most popular
proof procedure for modal logics.
3.1.1 Tableau: A Tree Structure for DNF
For propositional logic, the semantic tableau method checks whether a formula is satisfiable or not by “breaking” complex formulas into smaller ones, as we do when transforming the formula into DNF (disjunctive normal form). The method works on a tree whose nodes are labeled with lists of formulas. At each step, we may extend a leaf node by adding one or two successors to this node.
• Each node of the tree is associated with a list of formulas, and the comma “,” in the list is a synonym for ∧.
• A node is called closed if it contains a complementary pair of literals (or formulas), or ⊥, or ¬⊤.
• A node is called a model node if it is a set of consistent literals, which is equivalent to a minterm in DNF.
• If a node is neither closed nor a model node, then it is expandable by the transformation rules for DNF, which apply only to the top logical operator of a formula in the list. The expansion stops when the tree has no expandable nodes.
• The tree links indicate the equivalence relation among formulas. If the result of a transformation rule is a disjunction, two successors of the current node are created, and the formulas of the current node are equivalent to the disjunction of the formulas in the two successors. Transformation rules of this type are called β-rules; the other rules are called α-rules, which generate only one child, and the formulas of the current node are equivalent to those of the child node.
To prove the unsatisfiability of a formula, the method generates the tree and checks that every leaf node is closed, that is, there are no model nodes and no expandable nodes among the leaves. In this case, we say the tree is closed.
Example 3.1.1. For the formulas A = p ∧ (¬q ∨ ¬p) and B = (p ∨ q) ∧ (¬p ∧ ¬q), their tableaux are shown in Figure 3.1.1. For A, the tableau has one model node (marked by ⊙) and one closed node (marked by ×). For B, it has only two closed nodes. Thus A is satisfiable as it has a model node; and B is unsatisfiable as its tableau is closed.
Figure 3.1.1: The tableaux for A = p ∧ (¬q ∨ ¬p) and B = (p ∨ q) ∧ (¬p ∧ ¬q): (a) proof of A; (b) proof of B.

The tableaux can also be drawn linearly, using proper labels for the parent-child relation: the root node has the empty label; if a node has label x, its first child has label x1, its second child has label x2, and so on. The labels of nodes uniquely identify the positions (or addresses) of these nodes in a tableau. For example, we may display the tableaux of A and B in Figure 3.1.1 as follows:

(a) Proof of A:
1 :   p ∧ (¬q ∨ ¬p)
11 :  p, (¬q ∨ ¬p)
111 : p, ¬q   ⊙
112 : p, ¬p   ×

(b) Proof of B:
1 :    (p ∨ q) ∧ (¬p ∧ ¬q)
11 :   (p ∨ q), (¬p ∧ ¬q)
111 :  (p ∨ q), ¬p, ¬q
1111 : p, ¬p, ¬q   ×
1112 : q, ¬p, ¬q   ×

3.1.2 α-Rules and β-Rules

Here is the listing of α-rules and β-rules, which apply to the top logical operators in a formula.
α            α1, α2
A ∧ B        A, B
¬(A ∨ B)     ¬A, ¬B
¬(A → B)     A, ¬B
¬(A ⊕ B)     (A ∨ ¬B), (¬A ∨ B)
A ↔ B        (A ∨ ¬B), (¬A ∨ B)
¬(A ↑ B)     A, B
A ↓ B        ¬A, ¬B
¬¬A          A

β            β1           β2
¬(A ∧ B)     ¬A           ¬B
A ∨ B        A            B
A → B        ¬A           B
A ⊕ B        A, ¬B        ¬A, B
¬(A ↔ B)     A, ¬B        ¬A, B
A ↑ B        ¬A           ¬B
¬(A ↓ B)     A            B
If we start with a formula in NNF (negation normal form), the α-rules and β-rules whose patterns have a leading ¬ are not needed.
For every α-rule, we transform formula α into α1, α2; for every β-rule, we transform formula β into β1 and β2. It is easy to check that α ≡ α1 ∧ α2 and β ≡ β1 ∨ β2. These rules are what we used for transforming a formula into DNF. Since a tableau maintains the property that the formulas of the current node are equivalent to the disjunction of the formulas in its children nodes (the formulas in the same node are considered in conjunction), a tableau is a tree-like representation of a formula that is a disjunction of conjunctions, i.e., a DNF.
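These rules can be run directly as a satisfiability test. The following Python sketch (the tuple encoding of formulas is ours, not the text's; only ¬, ∧, ∨ and → are covered) expands a node until it is closed or a model node:

```python
# Formulas as nested tuples: ('var','p'), ('not',F), ('and',A,B),
# ('or',A,B), ('imp',A,B).  α-rules keep one child; β-rules branch.

def is_literal(f):
    return f[0] == 'var' or (f[0] == 'not' and f[1][0] == 'var')

def expand(f):
    """Return ('alpha', parts) or ('beta', left, right); None for literals."""
    op = f[0]
    if op == 'and':                        # α: A ∧ B ⇒ A, B
        return ('alpha', [f[1], f[2]])
    if op == 'or':                         # β: A ∨ B ⇒ A | B
        return ('beta', [f[1]], [f[2]])
    if op == 'imp':                        # β: A → B ⇒ ¬A | B
        return ('beta', [('not', f[1])], [f[2]])
    if op == 'not':
        g = f[1]
        if g[0] == 'not':                  # α: ¬¬A ⇒ A
            return ('alpha', [g[1]])
        if g[0] == 'or':                   # α: ¬(A ∨ B) ⇒ ¬A, ¬B
            return ('alpha', [('not', g[1]), ('not', g[2])])
        if g[0] == 'and':                  # β: ¬(A ∧ B) ⇒ ¬A | ¬B
            return ('beta', [('not', g[1])], [('not', g[2])])
        if g[0] == 'imp':                  # α: ¬(A → B) ⇒ A, ¬B
            return ('alpha', [g[1], ('not', g[2])])
    return None                            # f is a literal

def satisfiable(node):
    """node: list of formulas (implicit conjunction).  True iff some
    branch of the tableau ends in a model node."""
    lits = [f for f in node if is_literal(f)]
    for l in lits:                         # closed: complementary pair
        if ('not', l) in lits:
            return False
    for f in node:
        r = expand(f)
        if r is None:
            continue
        rest = [g for g in node if g != f]
        if r[0] == 'alpha':
            return satisfiable(rest + r[1])
        return satisfiable(rest + r[1]) or satisfiable(rest + r[2])
    return True                            # model node: consistent literals

# A = p ∧ (¬q ∨ ¬p) is satisfiable; B = (p ∨ q) ∧ (¬p ∧ ¬q) is not.
p, q = ('var', 'p'), ('var', 'q')
A = ('and', p, ('or', ('not', q), ('not', p)))
B = ('and', ('or', p, q), ('and', ('not', p), ('not', q)))
```

Running `satisfiable([A])` and `satisfiable([B])` reproduces the verdicts of Example 3.1.1.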
Example 3.1.2. To show that Frege’s formula, (p → (q → r)) → ((p → q) → (p → r)), is valid, we show that the semantic tableau of its negation is closed.

1 :       ¬((p → (q → r)) → ((p → q) → (p → r)))     α¬→
11 :      (p → (q → r)), ¬((p → q) → (p → r))        α¬→
111 :     (p → (q → r)), (p → q), ¬(p → r)           α¬→
1111 :    (p → (q → r)), (p → q), p, ¬r              β→
11111 :   ¬p, (p → q), p, ¬r                         ×
11112 :   (q → r), (p → q), p, ¬r                    β→
111121 :  (q → r), ¬p, p, ¬r                         ×
111122 :  (q → r), q, p, ¬r                          β→
1111221 : ¬q, q, p, ¬r                               ×
1111222 : r, q, p, ¬r                                ×

Since every leaf node is closed, the tableau is closed, so the negation is unsatisfiable and Frege’s formula is valid.
In Chapter 1, we pointed out that there are two important properties concerning a proof procedure: soundness and completeness. For a refutation prover like the semantic tableau, soundness means that if the procedure says the formula is unsatisfiable, then the formula must be unsatisfiable. Completeness means that if the formula is unsatisfiable, then the procedure should be able to give us this result.
Theorem 3.1.3. The semantic tableau is a decision procedure for propositional logic.
Proof. Suppose the list of formulas at one node is α, β, and γ, and the formula represented by this node is A = α ∧ β ∧ γ. If α is transformed by an α-rule to α1 and α2, then from α ≡ α1 ∧ α2, A ≡ α1 ∧ α2 ∧ β ∧ γ. If β is transformed by a β-rule to β1 and β2, then from β ≡ β1 ∨ β2, A ≡ (α ∧ β1 ∧ γ) ∨ (α ∧ β2 ∧ γ). In other words, the logical equivalence is maintained for every parent-child link in the semantic tableau. By a simple induction on the structure of the tree, the formula at the root of the tree is equivalent to the disjunction of all the formulas represented at the leaf nodes.
That is, the procedure modifies the tableau in such a way that the disjunction of the formulas represented by the leaf nodes of the resulting tableau is equivalent to the original one. One of these conjunctions may contain a pair of complementary literals, in which case that conjunction is equivalent to false. If all conjunctions are equivalent to false, the original formula is unsatisfiable. This shows the soundness of the procedure.
If a formula is unsatisfiable, then every term in its DNF must be equivalent to false and the corresponding node in the tableau is closed. Thus we will have a closed tableau. This shows the completeness of the procedure.
Since neither α-rules nor β-rules can be applied forever (each rule application removes either one occurrence of a binary operator or two negations), the procedure must terminate. □
Of course, semantic tableau can be used to show the satisfiability of a formula. If there is a model node in the tableau, we may assign 1 to every literal in this node so that the formula represented by this node is true under this interpretation. The original formula will be true under this interpretation because it is equivalent to the disjunction of the formulas of the leaf nodes. Thus, this interpretation is a model of the original formula. This shows that the procedure can be used to find a model when the formula is satisfiable.
The semantic tableau method specifies what rules can be used, but it does not specify which rule should be used first or which node should be expanded first. In general, it is better to apply α-rules before β-rules, so that the number of nodes in a tree can be reduced. In practice, the order of formulas in a list can be changed, and the order of β1 and β2 can be switched. This gives the user the freedom to construct trees of different shapes, or to find a model node more quickly. The procedure may terminate once a model node is found, if the goal is to show that the original formula is satisfiable.
Semantic tableaux are much more expressive and easier to use than truth tables, though that is not the reason for their introduction. Sometimes, a tableau uses more space than a truth table. For example, an unsatisfiable full CNF of n variables will generate a tableau of more than n! nodes, which is larger than a truth table of 2^n lines. The beauty of semantic tableaux lies in the simplicity of presenting a proof procedure by a set of α- and β-rules. We will see in later chapters how new rules are added into this procedure so that proof procedures for other logics can be obtained.
3.2 Deductive Systems
In logic, a deductive system S consists of a set of inference rules, where each rule takes premises as input and returns a conclusion (or conclusions) as output. Popular inference rules in propositional logic include modus ponens (MP), modus tollens (MT), and contraposition (CP), which can be displayed, respectively, as follows:
  A → B    A            A → B    ¬B            A → B
  ------------ (MP)     ------------- (MT)     ----------- (CP)
       B                     ¬A                  ¬B → ¬A

3.2.1 Inference Rules and Proofs
In general, an inference rule can be specified as
P1    P2    ···    Pk
---------------------
          C
where P1, P2, …, Pk are the premises and C is the conclusion. We may also write it as
P1, P2, ···, Pk ⊢ C
If the premises are empty (k = 0), we say this inference rule is an axiom rule.
An inference rule is sound if its premises entail the conclusion, that is, P1, P2, ···, Pk |= C
for every inference rule {P1, P2, · · · , Pk} ⊢ C (or equivalently, (P1∧P2∧· · ·∧Pk) → C is a tautology). For example, MP (modus ponens) is sound because {A → B, A} |= B; modus tollens is sound because {A → B, ¬B} |= ¬A.
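Such soundness claims can be checked mechanically over all truth assignments. A small Python sketch (the function names are illustrative, not from the text):

```python
# Check that premises entail the conclusion over every truth assignment:
# (P1 ∧ ... ∧ Pk) → C must be a tautology.
from itertools import product

def sound(premises, conclusion, nvars=2):
    """premises/conclusion: functions of a truth assignment (tuple of bools)."""
    return all(conclusion(v)
               for v in product([False, True], repeat=nvars)
               if all(p(v) for p in premises))

imp = lambda a, b: (not a) or b          # truth function of A → B

# MP: {A → B, A} |= B
mp_ok = sound([lambda v: imp(v[0], v[1]), lambda v: v[0]], lambda v: v[1])
# MT: {A → B, ¬B} |= ¬A
mt_ok = sound([lambda v: imp(v[0], v[1]), lambda v: not v[1]], lambda v: not v[0])
```

Both `mp_ok` and `mt_ok` evaluate to `True`, confirming the two entailments above.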
Given a set A of axioms, which are the formulas assumed to be true, a proof of formula B in S is a sequence of formulas F1, F2, ..., Fn such that Fn = B and each Fi is either a formula in A or can be generated by an inference rule of S, using the formulas before Fi in the sequence as the premises. If such a proof exists, we denote it by A ⊢S B; the subscript S can be dropped if S is understood from the context. The proof procedure P(A, B) based on S is simply to show A ⊢S B. Obviously, this is a deduction-style proof procedure.
Example 3.2.1. Let A = {p → (q → r), p, ¬r}, and S = {MP, MT}. A proof of A ⊢S ¬q is given below:

1. p → (q → r)    assumed
2. p              assumed
3. q → r          MP, 1, 2
4. ¬r             assumed
5. ¬q             MT, 3, 4

The formulas in a proof can be rearranged so that all axioms appear at the beginning of the sequence. For example, the above proof can be rewritten as the following:

1. p → (q → r)    assumed
2. p              assumed
3. ¬r             assumed
4. q → r          MP, 1, 2
5. ¬q             MT, 3, 4

Figure 3.2.1: The proof tree of ¬q
Every proof can be represented by a directed graph where each node is a formula in the proof and, if a formula B is derived from A1, A2, ..., Ak by an inference rule, then there are edges (Ai, B), 1 ≤ i ≤ k, in the graph. This proof graph must be acyclic, as we assume that no formula can be removed from a proof without destroying the proof. Every topological sort of the nodes of this graph gives us a linear proof. Since every directed acyclic graph (DAG) can be converted into a tree (by duplicating some nodes) while preserving the parent-child relation, we may present a proof by a tree as well. The proof tree for the above proof example is given in Figure 3.2.1. It is clear that all assumed formulas (axioms) have no incoming edges. The final formula in the proof has no outgoing edges. All intermediate formulas have both incoming and outgoing edges.
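The relation between proof DAGs and linear proofs can be sketched with Python's standard graphlib module (Python 3.9+); the string node names below are illustrative:

```python
# A proof as a DAG: edges go from premises to conclusions, so any
# topological sort yields a linear proof with premises before conclusions.
from graphlib import TopologicalSorter

# The proof of ¬q from Example 3.2.1: 'p→(q→r)' and 'p' feed 'q→r' via MP;
# 'q→r' and '¬r' feed '¬q' via MT.  The map sends a node to its premises.
deps = {
    'q→r': {'p→(q→r)', 'p'},   # MP
    '¬q':  {'q→r', '¬r'},      # MT
}
order = list(TopologicalSorter(deps).static_order())
# Axioms (no incoming edges) appear before formulas derived from them.
```

Any of the valid orders produced here corresponds to one of the rearranged proofs discussed above.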
The soundness of an inference system comes from the soundness of all its inference rules. In general, we require all inference rules to be sound, to preserve the truth of all derived formulas. For a sound inference system S, every derived formula B is in T(A). If A is a set of tautologies, then B is a tautology, too.
Theorem 3.2.2. If an inference system S is sound and A ⊢S B, then A |= B.
Proof. A ⊢S B means there is a proof of B in S. Suppose the proof is F1, F2, ..., Fn = B. By induction on n, we show that A |= Fi for all i = 1, ..., n. If Fi is in A, then A |= Fi. If Fi is derived from an inference rule P1, P2, ..., Pm ⊢ C, using Fj1, Fj2, ..., Fjm as premises, then by the induction hypothesis, A |= Fjk, because jk < i, for all 1 ≤ k ≤ m. That is, A |= Fj1 ∧ · · · ∧ Fjm.

Since the inference rule is sound, P1 ∧ · · · ∧ Pm |= C. Applying the substitution theorem, Fj1 ∧ · · · ∧ Fjm |= Fi. Because |= is transitive, A |= Fi. □
Different inference systems are obtained by changing the axioms or the infer- ence rules. In propositional logic, all these systems are equivalent in the sense that they are sound and complete.
3.2.2 Hilbert Systems
Hilbert systems, sometimes called Hilbert calculus, or Hilbert-style inference
systems, are a type of deduction system attributed to Gottlob Frege and David Hilbert. Variants of Hilbert systems for propositional logic exist and these deductive systems are extended to first-order logic and other logics.
Definition 3.2.3. The Hilbert system H contains three axiom rules plus MP (modus ponens):
H1 : ⊢ A → (B → A),
H2 : ⊢ (A → (B → C)) → ((A → B) → (A → C)),
H3 : ⊢ (¬B → ¬A) → (A → B),
MP : A, A → B ⊢ B.
The axiom rule H3 is not needed if → is the only operator in all formulas. Other logical operators can be defined using → and ¬. For example, A ∧ B ≡ ¬(A → ¬B) and A ∨ B ≡ ¬A → B. ¬A can be replaced by A → ⊥ in H3. It will turn out that all tautologies are derivable from the three axiom rules H1, H2, and H3 by MP.
Example 3.2.4. Assume A = ∅, to show that A ⊢H (p → p), we have the following proof in H:
1. p → ((p → p) → p)                                    H1, A = p, B = (p → p)
2. (p → ((p → p) → p)) → ((p → (p → p)) → (p → p))      H2, A = C = p, B = p → p
3. (p → (p → p)) → (p → p)                              MP from 1, 2
4. p → (p → p)                                          H1, A = p, B = p
5. p → p                                                MP from 3, 4
Proposition 3.2.5. H is sound: If A ⊢H B, then A |= B.
Proof. It is easy to check that every inference rule of H is sound. That is, for every axiom rule Hi, i = 1, 2, 3, the result of the rule is a tautology. For MP, (p ∧ (p → q)) → q is a tautology. By Theorem 3.2.2, the proposition holds. □
Hilbert systems originally assumed that A = ∅ (or A = ⊤), thus every derived formula is a tautology in a proof. This restriction is relaxed today so that we may take a set of formulas as axioms. The truth of the formulas becomes conditional on the truth of the axioms.
Example 3.2.6. Assume A = {p → q, q → r}; to show that A ⊢H (p → r), we have the following proof in H:

1. p → q                                     assumed
2. q → r                                     assumed
3. (q → r) → (p → (q → r))                   H1
4. p → (q → r)                               MP from 2, 3
5. (p → (q → r)) → ((p → q) → (p → r))       H2
6. (p → q) → (p → r)                         MP from 4, 5
7. p → r                                     MP from 1, 6

Since H is sound, from A ⊢H (p → r), we have (p → q) ∧ (q → r) |= (p → r), or ((p → q) ∧ (q → r)) → (p → r) is valid.

The Hilbert system is one of the earliest inference systems and has had a great impact on the creation of many later inference systems. However, theorem proving in this system has always been considered a challenge because of its high complexity. Since Hilbert systems can hardly serve as a practical tool for problem solving, we state without proof the completeness result of the Hilbert system.

Theorem 3.2.7. The Hilbert system is sound and complete, that is, A |= B iff A ⊢H B.
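The notion of a proof in H can be made concrete by a minimal checker: every line must be a listed axiom instance or assumption, or follow from two earlier lines by MP. A Python sketch (the tuple encoding of formulas is ours), applied to the proof of p → p from Example 3.2.4:

```python
# A minimal checker for Hilbert-style proofs.  Formulas are nested
# tuples; ('imp', A, B) stands for A → B.  Axiom instances are supplied
# explicitly rather than recognized by pattern matching.

def check_proof(lines, axioms):
    """lines: list of formulas; axioms: set of assumed formulas or
    axiom instances.  True iff every line is justified."""
    for i, f in enumerate(lines):
        if f in axioms:
            continue
        earlier = lines[:i]
        # modus ponens: some earlier g and ('imp', g, f) must both exist
        if not any(('imp', g, f) in earlier for g in earlier):
            return False
    return True

p = ('var', 'p')
pp = ('imp', p, p)
# the five-line proof of p → p, with its H1/H2 instances spelled out
h1a = ('imp', p, ('imp', pp, p))                      # H1, A=p, B=(p→p)
h2  = ('imp', h1a, ('imp', ('imp', p, pp), pp))       # H2, A=C=p, B=p→p
h1b = ('imp', p, pp)                                  # H1, A=p, B=p
proof = [h1a, h2, ('imp', h1b, pp), h1b, pp]
```

Here `check_proof(proof, {h1a, h2, h1b})` returns `True`: lines 3 and 5 are justified by MP from lines 1–2 and 3–4, respectively.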
3.2.3 Natural Deduction
Both semantic tableaux and Hilbert systems have their disadvantages for constructing proofs. Hilbert systems are difficult to construct proofs in; their main uses are metalogical, the small number of rules making it easier to prove results about logic. The tableau method, on the other hand, is easy to use mechanically but, because of the form of the connective rules and the fact that a tableau starts from the negation of the formula to be proved, the proofs that result are not a natural sequence of easily justifiable steps. Likewise, very few proofs in mathematics proceed directly from axioms. Mathematicians in practice usually reason in a more flexible way.
Natural deduction is a deductive system in which logical reasoning is expressed by inference rules closely related to the “natural” way of reasoning. This contrasts with Hilbert systems, which instead use axiom rules as much as possible to express the logical laws of deductive reasoning. Natural deduction in its modern form was independently proposed by the German mathematician Gerhard Gentzen in 1934. In Gentzen’s natural deductive system, a formula is represented by a set A of formulas, A = {A1, A2, ..., An}, where the commas in A are understood as ∨. To avoid confusion, we will write A as
(A1 |A2 |···|An).
A is called a sequent by Gentzen. If n = 0, i.e., A = ( ), then A is equivalent to ⊥, and we also write it as (⊥). Since we assume that A is a set, duplicated formulas are removed and the order of formulas in A is irrelevant.
Definition 3.2.8. A natural deductive system G consists of the following three inference rules, plus the logical rules below, which guide the introduction and elimination of logical operators in sequents.
name       inference rule
axiom      ⊢ (A | ¬A | α)
cut        (A | α), (¬A | β) ⊢ (α | β)
thinning   (α) ⊢ (A | α)
op    introduction                             elimination
¬     (A | α) ⊢ (¬¬A | α)                      (¬¬A | α) ⊢ (A | α)
∨     (A | B | α) ⊢ (A ∨ B | α)                (A ∨ B | α) ⊢ (A | B | α)
∧     (A | α), (B | α) ⊢ (A ∧ B | α)           (a) (A ∧ B | α) ⊢ (A | α); (b) (A ∧ B | α) ⊢ (B | α)
→     (¬A | B | α) ⊢ (A → B | α)               (A → B | α) ⊢ (¬A | B | α)
¬∨    (¬A | α), (¬B | α) ⊢ (¬(A ∨ B) | α)      (a) (¬(A ∨ B) | α) ⊢ (¬A | α); (b) (¬(A ∨ B) | α) ⊢ (¬B | α)
¬∧    (¬A | ¬B | α) ⊢ (¬(A ∧ B) | α)           (¬(A ∧ B) | α) ⊢ (¬A | ¬B | α)
¬→    (A | α), (¬B | α) ⊢ (¬(A → B) | α)       (a) (¬(A → B) | α) ⊢ (A | α); (b) (¬(A → B) | α) ⊢ (¬B | α)
where α and β, possibly empty, denote the remaining formulas in a sequent.
In natural deduction, a sequent is deduced from a set of axioms by applying inference rules repeatedly. Each sequent is inferred from other sequents on earlier lines in a proof according to inference rules, giving a better approximation to the
style of proofs used by mathematicians than Hilbert systems.
Example 3.2.9. The following is a proof of ¬(p ∧ q) → (¬p ∨ ¬q) in G:

1. (p | ¬p | ¬q)               axiom, A = p, α = ¬q
2. (q | ¬q | ¬p)               axiom, A = q, α = ¬p
3. (p ∧ q | ¬p | ¬q)           ∧I from 1, 2, A = p, B = q, α = ¬p | ¬q
4. (¬¬(p ∧ q) | ¬p | ¬q)       ¬I from 3, A = (p ∧ q), α = ¬p | ¬q
5. (¬¬(p ∧ q) | (¬p ∨ ¬q))     ∨I from 4, A = ¬p, B = ¬q, α = ¬¬(p ∧ q)
6. (¬(p ∧ q) → (¬p ∨ ¬q))      →I from 5, A = ¬(p ∧ q), B = (¬p ∨ ¬q), α = ⊥
Proposition 3.2.10. The deductive system G is sound, i.e., if A ⊢G B, then A |= B.
Proof. It is easy to check that every inference rule of G is sound. In particular, the rules of operator introduction preserve the logical equivalence. In the rules of operator elimination, some sequents generate two results. For example, (A ∧ B | α) infers both (a) (A | α) and (b) (B | α). If we bind the two results together by ∧, these rules preserve the logical equivalence, too. On the other hand, the cut rule and the thinning rule do not preserve the logical equivalence; only the entailment relation holds. That is, for the cut rule, (A | α), (¬A | β) ⊢ (α | β), we have
(A ∨ α) ∧ (¬A ∨ β) |= α ∨ β,
and for the thinning rule, (α) ⊢ (A | α), we have α |= A ∨ α. □
Note that the proof in the example above uses only the axiom rule (twice) and the introduction rules for logical operators. This is the case when we are looking for a proof of a tautology. Recall how we would prove that ¬(p ∧ q) → (¬p ∨ ¬q) is a tautology by semantic tableau. Since semantic tableau is a refutation prover, we need to show that the negation of a tautology has a closed tableau.
a : ¬(¬(p ∧ q) → (¬p ∨ ¬q))    α¬→
b : ¬(p ∧ q), ¬(¬p ∨ ¬q)       α¬∨
c : ¬(p ∧ q), ¬¬p, ¬¬q         α¬
d : ¬(p ∧ q), p, ¬¬q           α¬
e : ¬(p ∧ q), p, q             β¬∧
f : ¬p, p, q                   ×
g : ¬q, p, q                   ×
Comparing the proof in G and the tableau above, we can see a clear correspondence: closed nodes (f) and (g) correspond to (1) and (2), which are axioms; (e) to (3), (c) and (d) to (4), (b) to (5), and finally (a) to (6), respectively. That is, the formula represented by each sequent in a proof of G always has its negation in the corresponding node of the semantic tableau.
We show below that G can be used as a refutation prover on the same example.

Example 3.2.11. Let A = {¬(¬(p ∧ q) → (¬p ∨ ¬q))}. A proof of A ⊢G ⊥ in G is given below.
1.  (¬(¬(p ∧ q) → (¬p ∨ ¬q)))    assumed
2.  (¬(p ∧ q))                   ¬→E (a) from 1, A = ¬(p ∧ q)
3.  (¬(¬p ∨ ¬q))                 ¬→E (b) from 1, B = (¬p ∨ ¬q)
4.  (¬p | ¬q)                    ¬∧E from 2, A = p, B = q
5.  (¬¬p)                        ¬∨E (a) from 3, A = ¬p
6.  (¬¬q)                        ¬∨E (b) from 3, B = ¬q
7.  (p)                          ¬E from 5, A = p
8.  (q)                          ¬E from 6, A = q
9.  (¬q)                         cut from 4, 7, A = p, α = ⊥, β = ¬q
10. ( )                          cut from 8, 9, A = q, α = ⊥, β = ⊥
The above is indeed a proof of A ⊢G ⊥, where ⊥ is used for the empty sequent. In this proof, the inference rules used are the cut rule and the rules for eliminating logical operators. If we apply repeatedly the rules of operator elimination to a formula, we will obtain a list of sequents, each of which represents a clause. In other words, these rules transform the formula into CNF; the cut rule then works on the clauses. The cut rule for clauses has another name, resolution, which is the topic of the next section.
Indeed, G contains more rules than necessary: using only the axiom rule and the rules for operator introduction, it can imitate the dual of a semantic tableau, upside down. Using only the cut rule and the rules for operator elimination, it can do what a resolution prover can do. Both the semantic tableau and resolution provers are decision procedures for propositional logic. The implication is that natural deduction G is a decision procedure, too. We give the main result without a proof.
Theorem 3.2.12. The natural deduction G is sound and complete, that is, A |= B iff A ⊢G B.
3.2.4 Inference Graphs
We have seen two examples of deductive systems, the Hilbert system H and Gentzen's natural deduction G. To show the completeness of these deductive systems, we need strategies for how to use their inference rules, and the completeness depends on these strategies. To facilitate the discussion, let us introduce a few concepts.
Definition 3.2.13. Given an inference system S, the inference graph of S over the axiom set A is defined as a directed graph G = (V, E), where each vertex of V is a set x of formulas with x ⊆ T(A), A ∈ V, and (x, y) ∈ E iff y = x ∪ {C}, where C is the conclusion of an inference rule r ∈ S using some formulas in x as the premises of r.
The graph G = (V, E) defines the search space for A ⊢S B, which becomes a search problem: to find a proof of A ⊢S B is the same as to find a path from A to a node of V containing B. That is, if all the axioms of A in a proof appear at the beginning of the proof sequence, then this proof presents a path in the inference graph. Suppose the proof is a sequence of formulas F1, F2, ..., Fn such that the first k formulas are from A, i.e., Fi ∈ A for 1 ≤ i ≤ k and Fi ∉ A for i > k. Then this proof represents a directed path x0, x1, ..., xn−k in the inference graph such that x0 = A, Fi ∈ A for 1 ≤ i ≤ k, and Fj ∈ xj−k for k < j ≤ n.
(a) 5. (¬p)        (2, 4)
    6. (p | ¬q)    (3, 4)
    7. (¬q)        (5, 6)
    8. (q)         (1, 5)
    9. ( )         (7, 8)

(b) 5. (p | r)     (1, 3)
    6. (r)         (2, 5)
    7. ( )         (4, 6)

(c) 5. (q | r)     (1, 2)
    6. (¬q | r)    (2, 3)
    7. (r)         (5, 6)
    8. ( )         (4, 7)

The length of (a) is 5; the length of (b) is 3; and the length of (c) is 4. It is easy to check that (a) is not an input resolution proof, because clause 7 is not obtained by input resolution. (b) is not a unit resolution proof. (c) is neither a unit nor an input resolution proof; it is neither a positive nor a negative resolution proof.
Efficiency is an important consideration in automated reasoning, and one may sometimes be willing to trade completeness for speed. In unit resolution, one of the parent clauses is always a unit clause (a single literal); in input resolution, one of the parent clauses is always selected from the original set. Albeit efficient, neither strategy is complete. For example, C = {(p | q), (p | ¬q), (¬p | q), (¬p | ¬q)} is unsatisfiable, though no unit resolutions are available. We may use input resolutions to obtain p and ¬p, but cannot obtain ( ) because p and ¬p are not input clauses.
Ordered resolution imposes a total ordering on the propositional variables and treats clauses not as sets of literals but as sorted lists of literals. Ordered resolution is extremely efficient and complete.
Set-of-support resolution is one of the most powerful strategies, employed by (Wos, Carson, Robinson 1965), and its completeness depends on the choice of the set of support. To prove A |= B by resolution, we convert A ∧ ¬B into clauses. The clauses obtained from ¬B serve in general as a set of support S, and the clauses derived from A are in T. Set-of-support resolution dictates that the resolved clauses are not both from T. The motivation behind set-of-support is that, since A is usually satisfiable, it might be wise not to resolve two clauses from T against each other.
Linear resolution always resolves a clause against the most recently derived resolvent. This gives the deduction a simple “linear” structure, and it appears that we can have a straightforward implementation. Because linear resolution imposes a restriction on time, we may add a time stamp to every clause and assume that the time stamps of all input clauses are 0. The resolution rule needs to be modified as follows:
⟨(A | α), i⟩    ⟨(¬A | β), j⟩
-----------------------------  (i is maximal)
       ⟨(α | β), i + 1⟩

Now consider the inference graph G = (V, E) for linear resolution: (x, y) ∈ E if ⟨(A | α), i⟩, ⟨(¬A | β), j⟩ ∈ x, i is maximal in x, and y = x ∪ {⟨(α | β), i + 1⟩}. If
a clause with the same time stamp appears in two different nodes of V , it means that we have different sequences (of the same length) to derive this clause by linear resolution. Since the sequences are different, one cannot be replaced by the other. For example, in x, we may obtain ⟨C3, i + 1⟩ from ⟨C1, i⟩ and ⟨C2, j⟩, but cannot do the same in x′, even if ⟨C1, i⟩ ∈ x′, because ⟨C2, j⟩ may not be in x′ for any j. That means that the inference graph has to be a tree structure. The completeness of linear resolution ensures that there exists a path from the root (the input clauses) to a solution node (containing the empty clause) in G = (V,E). If we do not use the depth-first search in G, multiple copies of the same clause may have to be kept in the search. This leaves us the depth-first search as our best option and we have to inherit all advantages and disadvantages of the depth-first search, such as generating the same resolvent many times, or wasting many futile steps before finding a solution node.
It must be noted that some strategies improve certain aspects of the deduction process at the expense of others. For instance, a strategy may reduce the size of the proof search space at the expense of increasing, say, the length of the shortest proofs.
3.3.3 Preserving Satisfiability
The resolution rule does not preserve logical equivalence; that is, if D is a resolvent of C1 and C2, then C1 ∧ C2 |= D, or C1 ∧ C2 ≡ C1 ∧ C2 ∧ D, but not C1 ∧ C2 ≡ D. For example, q is a resolvent of p and (¬p | q); M(q) contains two models over {p, q}, while M(p ∧ (¬p | q)) contains only one. Thus q ≢ p ∧ (¬p | q). However, both of them are satisfiable.
Definition 3.3.6. Given two formulas A and B, A and B are said to be equally satisfiable, if whenever A is satisfiable, so is B, and vice versa. We denote this relation by A ≈ B.
Obviously, A ≈ B means M(A) = ∅ iff M(B) = ∅ and is weaker than the logical equivalence, which requires M(A) = M(B).
Proposition 3.3.7. Let S = C ∪ {(α | β)} and S′ = C ∪ {(α | x), (¬x | β)} be two sets of clauses, where x is a variable not appearing in S. Then S ≈ S′.

Proof. If S is satisfiable and σ is a model of S, then σ(C) = 1 and σ(α | β) = 1. If σ(α) = 0, we define σ(x) = 1; if σ(β) = 0, we define σ(x) = 0; otherwise, σ(x) can be either value. Then both σ(α | x) = 1 and σ(¬x | β) = 1, so σ is a model of S′.

On the other hand, if σ is a model of S′, then σ is also a model of S without modification, because (α | β) is a resolvent of (α | x) and (¬x | β). □
In the above proposition, the condition that x does not appear in C is neces- sary. Without it, for example, if C = {(¬p)}, α is empty, β = q and x = p, then S = {(¬p), (q)} and S′ = {(¬p), (p), (¬p | q)}. Then S is satisfiable but S′ is not.
Using the above proposition, we may break a long clause into a set of shorter clauses by introducing some new variables. For example, the clause (l1 | l2 | l3 | l4 | l5) can be transformed into three clauses: (l1 | l2 | x), (¬x | l3 | y), and (¬y | l4 | l5). This transformation does not preserve the logical equivalence, but preserves the satisfiability.
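The splitting just described can be sketched in a few lines of Python (literals are strings, with a leading '-' marking negation; both conventions are ours):

```python
# Split a long clause into 3-literal clauses with fresh variables.
# This preserves satisfiability (Proposition 3.3.7), not equivalence.

def split_clause(clause, fresh):
    """clause: list of literals; fresh: iterator of unused variable names."""
    out = []
    while len(clause) > 3:
        x = next(fresh)
        out.append(clause[:2] + [x])        # (l1 | l2 | x)
        clause = ['-' + x] + clause[2:]     # continue with (¬x | l3 | ...)
    out.append(clause)
    return out

result = split_clause(['l1', 'l2', 'l3', 'l4', 'l5'], iter(['x', 'y']))
# result: [['l1', 'l2', 'x'], ['-x', 'l3', 'y'], ['-y', 'l4', 'l5']]
```

This reproduces the three clauses given in the text for (l1 | l2 | l3 | l4 | l5).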
Another usage of the above proposition is to use resolution to remove x from a clause set. If we exhaust all resolutions on x, keep all the resolvents and re- move all the clauses containing x, then the resulting set of clauses will preserve the satisfiability, as stated in Lemma 3.3.11.
Proposition 3.3.8. Let B be a subformula of A and x be a new variable. Then A ≈ (A[B ← x] ∧ (x ↔ B)).
Proof. If A is satisfiable and σ is a model of A, then we define σ(x) = σ(B). When applying σ to the formula trees of A and A[B ← x], all the common nodes have the same truth value, so σ(A) = σ(A[B ← x]). Since σ(x) = σ(B), so σ(x ↔ B) = 1, thus σ is a model of A[B ← x] ∧ (x ↔ B).
On the other hand, if σ is a model of A[B ← x] ∧ (x ↔ B), then σ is also a model of A. □
The above proposition plays an important role when we convert a formula into CNF and keep the size of CNF as a linear function of the size of the original formula. When we use the distribution law
(A∧B)∨C ≡ (A∨C)∧(B ∨C)
to transform a formula into CNF, C becomes duplicated in the result. Because of this, there exist examples where the resulting CNF is exponentially larger than the original formula. Using the above proposition and introducing new variables, we may keep the size of the CNF within four times the original size, with every clause containing at most three literals. The CNF obtained this way is not logically equivalent to the original formula, but it preserves the satisfiability of the original formula: the CNF is satisfiable iff the original formula is satisfiable. This way of transforming a formula into CNF is called the Tseitin transformation.
Example 3.3.9. Let A be (p∧(q∨¬p))∨¬(¬q∧(p∨q)), we introduce a new variable for every proper subformula whose node is a binary operator, and convert each of them into at most three clauses.
x1 ↔ (q ∨ ¬p)     (¬x1 | q | ¬p), (¬q | x1), (p | x1)
x2 ↔ (p ∧ x1)     (¬x2 | p), (¬x2 | x1), (¬p | ¬x1 | x2)
x3 ↔ (p ∨ q)      (¬x3 | p | q), (¬p | x3), (¬q | x3)
x4 ↔ (¬q ∧ x3)    (¬x4 | ¬q), (¬x4 | x3), (q | ¬x3 | x4)
x2 ∨ ¬x4          (x2 | ¬x4)
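A sketch of the Tseitin transformation in Python (the encoding of formulas and literals is ours; only ¬, ∧, ∨ are handled):

```python
# Introduce a fresh variable for each binary subformula and emit at
# most three definitional clauses per variable.
from itertools import count

def tseitin(f, clauses, fresh):
    """f: ('var',v) | ('not',g) | ('and'|'or', g, h).  Returns a literal
    (name, polarity) standing for f; clauses are appended in place."""
    if f[0] == 'var':
        return (f[1], True)
    if f[0] == 'not':
        v, pol = tseitin(f[1], clauses, fresh)
        return (v, not pol)
    a = tseitin(f[1], clauses, fresh)
    b = tseitin(f[2], clauses, fresh)
    x = ('x%d' % next(fresh), True)
    na, nb, nx = (a[0], not a[1]), (b[0], not b[1]), (x[0], False)
    if f[0] == 'and':   # x ↔ a ∧ b: (¬x|a), (¬x|b), (¬a|¬b|x)
        clauses += [[nx, a], [nx, b], [na, nb, x]]
    else:               # x ↔ a ∨ b: (¬x|a|b), (¬a|x), (¬b|x)
        clauses += [[nx, a, b], [na, x], [nb, x]]
    return x

clauses = []
p, q = ('var', 'p'), ('var', 'q')
top = tseitin(('or', ('and', p, q), ('not', p)), clauses, count(1))
clauses.append([top])        # assert the formula itself
```

For (p ∧ q) ∨ ¬p, this yields three clauses per introduced variable plus the unit clause for the top literal, seven clauses in total.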
Example 3.3.10. In Example 2.3.9, we converted p1 ↔ (p2 ↔ p3) into four clauses:

(p1 | p2 | p3), (p1 | ¬p2 | ¬p3), (¬p1 | p2 | ¬p3), (¬p1 | ¬p2 | p3).

If we would like to convert p1 ↔ (p2 ↔ (p3 ↔ p4)) into clauses, we may replace p3 by (p3 ↔ p4) in the above clauses and then convert each of them into clauses. That is, from (p1 | p2 | (p3 ↔ p4)), (p1 | ¬p2 | ¬(p3 ↔ p4)), (¬p1 | p2 | ¬(p3 ↔ p4)), and (¬p1 | ¬p2 | (p3 ↔ p4)), we obtain eight clauses, twice as many as before:

(p1 | p2 | ¬p3 | p4), (p1 | p2 | p3 | ¬p4), (p1 | ¬p2 | p3 | p4), (p1 | ¬p2 | ¬p3 | ¬p4), (¬p1 | p2 | p3 | p4), (¬p1 | p2 | ¬p3 | ¬p4), (¬p1 | ¬p2 | ¬p3 | p4), (¬p1 | ¬p2 | p3 | ¬p4).

If we replace p4 by (p4 ↔ p5), we will get a set of 16 clauses of length 5. In general, p1 ↔ (p2 ↔ (p3 ↔ (... (pn−1 ↔ pn) ...))) will produce 2^(n−1) clauses of length n. Instead, if we introduce n − 3 new variables for this formula of n variables, we obtain only 4(n − 2) clauses of length 3. For example, for n = 5, we introduce two new variables, q1 and q2:

(p1 ↔ (p2 ↔ q1)) ∧ (q1 ↔ (p3 ↔ q2)) ∧ (q2 ↔ (p4 ↔ p5)),

which will generate 3 × 4 = 12 clauses of length 3.
Lemma 3.3.11. Let C be a set of clauses and x be a variable in C. Let D ⊂ C be the set of clauses of C that contain neither x nor ¬x. Let X = {(α | β) | (x | α), (¬x | β) ∈ C}; then C ≈ D ∪ X.

Proof. If C is satisfiable and σ is a model of C, then σ(C) = 1 and σ(D) = 1. If σ(X) = 0, then there exists a clause (α | β) ∈ X such that σ(α) = σ(β) = 0. It means either σ(x | α) = 0 or σ(¬x | β) = 0, a contradiction to the fact that σ(C) = 1, because both (x | α) and (¬x | β) are in C. So σ(X) = 1. It means σ is a model of D ∪ X.

On the other hand, if D ∪ X is satisfiable and σ is its model, we may define σ(x) = 1 if σ(α) = 0 for some (x | α) ∈ C; in this case, we must have σ(β) = 1 for all (¬x | β) ∈ C: otherwise, if σ(β) = 0 for some (¬x | β) ∈ C, then σ(α | β) = 0, a contradiction to the fact that σ(X) = 1. Similarly, if σ(α) = 1 for all (x | α) ∈ C, we define σ(x) = 0. Then σ becomes a model of C. □
This lemma plays an important role in the completeness proof of resolution: the set X can be obtained from C by resolution. From C to D ∪ X, we have eliminated one variable from C. Replacing C by D ∪ X, we repeat the same process to remove another variable, until no variable remains in C: either C ≡ ⊤ or C ≡ ⊥. From the final result, we can tell whether the original set of clauses is satisfiable or not.
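The elimination process of Lemma 3.3.11 can be sketched as the classical variable-elimination procedure (clauses as frozensets of (variable, polarity) pairs, a representation of ours):

```python
# Replace C by D ∪ X for each variable in turn; the input is
# unsatisfiable iff the empty clause eventually appears.

def eliminate(clauses, x):
    pos = [c for c in clauses if (x, True) in c]
    neg = [c for c in clauses if (x, False) in c]
    rest = [c for c in clauses
            if (x, True) not in c and (x, False) not in c]   # the set D
    resolvents = set()                                       # the set X
    for a in pos:
        for b in neg:
            r = (a - {(x, True)}) | (b - {(x, False)})
            if not any((v, not s) in r for (v, s) in r):     # drop tautologies
                resolvents.add(frozenset(r))
    return set(rest) | resolvents

def dp_satisfiable(clauses, variables):
    clauses = {frozenset(c) for c in clauses}
    for x in variables:
        clauses = eliminate(clauses, x)
        if frozenset() in clauses:
            return False
    return True

# {p, ¬p∨q, ¬q} is unsatisfiable; {p, ¬p∨q} is satisfiable.
u = [[('p', True)], [('p', False), ('q', True)], [('q', False)]]
s = [[('p', True)], [('p', False), ('q', True)]]
```

Dropping tautological resolvents is an optimization not mentioned in the lemma; it does not affect satisfiability.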
3.3.4 Completeness of Resolution
Theorem 3.3.12. Ordered resolution is complete with any total ordering on the variables.
Proof. Suppose there are n variables in the input clauses Sn with the order xn > xn−1 > ··· > x2 > x1. For i from n to 1, we apply Lemma 3.3.11 with C = Si and x = xi, and obtain Si−1 = D ∪ X. Every resolvent in X is obtained by ordered resolution on xi, which is the maximal variable in Si. Finally, we obtain S0, which contains no variables. Since Sn ≈ Sn−1 ≈ ··· ≈ S1 ≈ S0, if Sn is unsatisfiable, then S0 must contain ⊥; if Sn is satisfiable, we arrive at S0 = ∅, i.e., S0 ≡ ⊤. In this case, we may use the idea in the proof of Lemma 3.3.11 to assign a truth value to each variable, from x1 to xn, to obtain a model of Sn. □
Example 3.3.13. Let S = {c1. (¬a | b | c), c2. (a | ¬e), c3. (¬b | c), c4. (b | d), c5. (¬c | d | ¬e), c6. (¬d | e)} and a > b > c > d > e. By ordered resolution, we get c7. (b | c | ¬e) from c1 and c2; c8. (c | d) from c3 and c4; c9. (c | ¬e) from c3 and c7; and c10. (d | ¬e) from c5 and c9. No other clauses can be generated by ordered resolution, that is, c1–c10 are saturated by ordered resolution. Since the empty clause is not generated, we may construct a model from c1–c10 by assigning truth values to the variables from the least to the greatest. Let Sx be the set of clauses from c1–c10 such that x is the maximal variable.

• Se = ∅ and we can assign either 0 or 1 to e, say σe = {e}.
• Sd = {c6, c10}. From c10, d has to be 1, so σd = σe ∪ {d} = {e, d}.
• Sc = {c5, c8, c9}. From c9, c has to be 1, so σc = σd ∪ {c} = {e, d, c}.
• Sb = {c3, c4, c7}. All of them are satisfied by σc, so we may assign 0 or 1 to b, say b = 1, and σb = σc ∪ {b} = {e, d, c, b}.
• Sa = {c1, c2}. From c2, a has to be 1, so σa = σb ∪ {a} = {e, d, c, b, a}.
The above example illustrates that if a set of clauses is saturated by ordered resolution and the empty clause is not there, we may construct a model from these clauses. In other words, we may use ordered resolution to construct a decision procedure for deciding if a formula A is satisfiable or not.
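The model construction illustrated above can be sketched as follows (the representation is ours: clauses are lists of (variable, polarity) pairs, and the input is assumed to be saturated by ordered resolution without the empty clause):

```python
# Assign variables from least to greatest; set x true exactly when some
# clause whose maximal variable is x is falsified on the smaller
# variables and contains x positively.

def build_model(clauses, order):
    """clauses: lists of (var, polarity); order: variables, least first."""
    rank = {v: i for i, v in enumerate(order)}
    sigma = {}
    for x in order:
        # S_x: clauses whose maximal variable is x
        sx = [c for c in clauses if max(rank[v] for v, _ in c) == rank[x]]
        def satisfied(c):
            return any(v in sigma and sigma[v] == pol for v, pol in c)
        sigma[x] = any(not satisfied(c) and (x, True) in c for c in sx)
    return sigma

cs = [[('p', True), ('q', True)], [('p', False), ('q', True)]]
model = build_model(cs, ['p', 'q'])
```

Saturation guarantees that no clause in S_x forces x both ways, so the assignment never runs into a conflict.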
Algorithm 3.3.14. The algorithm orderedResolution will take a set C of clauses and an order V of variables, V = (p1
If we remove all the occurrences of q from P1, we obtain a resolution proof P1′ and core(P1′) contains clauses from core(P) with q being removed. Since |P1′| < |P|, by induction hypothesis, we have a positive resolution proof of core(P1′) ⊢R ⊥. If we add q back to core(P1′), we obtain a positive resolution proof P”1 of core(P) ⊢R q.
We then remove all the occurrences of ¬q from P2, and obtain a resolution proof P2′, where core(P2′) contains the clauses of core(P) with ¬q removed. By the induction hypothesis, there exists a positive resolution proof P2″ of core(P2′) ⊢R ⊥. For each clause α ∈ core(P2′), if α is obtained from (α | ¬q) ∈ core(P), we add α as the resolvent of (α | ¬q) and (q) at the beginning of P2″. Finally, we append P1″, which generates (q) by positive resolution, and P2″ into one proof; the result is a positive resolution proof of C ⊢R ⊥. P
The following theorem can be proved using a similar approach.
Theorem 3.3.17. Negative resolution is complete for refutation.
Completeness of a resolution strategy is important to ensure that if the input clauses are unsatisfiable, the empty clause will eventually be found. In that case, one strategy may generate the empty clause more quickly than another. When the input clauses are satisfiable, we have to compute all possible resolutions and verify that no new resolvents can be generated before claiming that the input clauses are satisfiable. In this case, we say the set of clauses, old and new, is saturated by resolution. Restricted resolutions, such as ordered resolution, positive resolution, and set-of-support resolution, work much better than unrestricted resolution when saturation is needed to show the satisfiability of a set of clauses.
3.3.5 A Resolution-based Decision Procedure
If we would like to employ heuristics in a decision procedure based on resolution, the following procedure fits the purpose.
Algorithm 3.3.18. The algorithm resolution takes a set C of clauses and returns true iff C is satisfiable. It uses the procedure resolvable(A, B), which decides if clauses A and B are allowed to do restricted resolution; if yes, resolve(A, B) will return their resolvent. It uses pickClause(C) to pick out a clause according to a heuristic.
proc resolution(C)
  G := C; // G: given clauses
  K := { }; // K: kept clauses
  while G ≠ { } do
    A := pickClause(G);
    G := G − {A};
    for B ∈ K if resolvable(A, B) do
      res := resolve(A, B);
      if res = ( ) return false;
      if res ∉ (G ∪ K) G := G ∪ {res};
    K := K ∪ {A};
  return true;
In the procedure, we move each clause from G to K and compute the resolutions between this clause and all the clauses in K. When G is empty, resolution between any two clauses has been done.
With the exception of linear resolution, the above procedure can be used to implement all the resolution strategies introduced in this section; the restriction is implemented in resolvable. For set-of-support resolution, a better implementation is to initialize G with the set of support and K with the rest of the input clauses.
The termination of this procedure is guaranteed because only a finite number of clauses exist. When the procedure returns false, the answer is correct because resolution is sound. When it returns true, the correctness of the answer depends on the completeness of the resolution strategy implemented in resolvable.
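The given-clause loop of Algorithm 3.3.18 can be sketched in Python. This is an illustrative sketch, not the book's code: clauses are encoded as frozensets of signed integers (+v for a variable, -v for its negation), resolvable is taken to always answer yes (unrestricted resolution), and resolve_all is a hypothetical helper returning all resolvents of a pair of clauses.

```python
def resolve_all(A, B):
    """Return all resolvents of clauses A and B, one per complementary pair."""
    return {frozenset((A - {l}) | (B - {-l})) for l in A if -l in B}

def resolution(C):
    """Return True iff the clause set C is satisfiable, by saturation."""
    G = set(C)   # given clauses
    K = set()    # kept clauses
    while G:
        A = G.pop()              # pickClause: here, an arbitrary choice
        for B in list(K):
            for res in resolve_all(A, B):
                if not res:      # the empty clause ( ) was derived
                    return False
                if res not in G and res not in K:
                    G.add(res)
        K.add(A)                 # all resolutions with A are done
    return True
```

For example, resolution({frozenset({1}), frozenset({-1})}) returns False, while resolution({frozenset({1, 2}), frozenset({-1, 2})}) returns True.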
In pickClause, we implement various heuristics for selecting a clause in G to do resolutions with clauses in K. Popular heuristics include preferring shorter clauses or older clauses. Shorter clauses are preferred because they are stronger constraints on the acceptance of interpretations as models: an empty clause removes all interpretations as models; a unit clause removes half of the interpretations; a binary clause removes a quarter; and so on. Preferring older clauses alone gives us the breadth-first strategy. If we wish to find a shorter proof, we may mix this preference with other heuristics.
Despite the result that complete resolution strategies can serve as a decision procedure for propositional logic, and some strategies can even build a model when the input clauses are satisfiable, resolution provers are in general not efficient for propositional logic, because the number of possible clauses is exponential in the number of variables. The computer may quickly run out of memory before finding a solution to a real application problem.
3.3.6 Clause Deletion Strategies
There are several strategies which can safely remove redundant clauses during resolution: tautology deletion, subsumption deletion, and pure-literal deletion.
A clause is a tautology if it contains a complementary pair of literals of the same variable, e.g., (p | ¬p). A tautology is true in every interpretation and is not needed in any resolution proof.
Proposition 3.3.19. Given any set S of clauses, let S be partitioned into S′ ∪T, where T is the set of tautology clauses in S. Then S ≡ S′.
Proof. For any interpretation σ, σ(S) = σ(S′) ∧ σ(T) = σ(S′) as σ(T) = 1. P

Definition 3.3.20. A clause A subsumes another clause B if B = A | α for some α. That is, every literal of A appears in B.
For example, p | q subsumes p | q | r. Intuitively, resolution tries to delete literals one by one to obtain an empty clause. If we use p | q instead of p | q | r, we avoid the job of deleting r.
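Under this definition, subsumption is just a subset test on literal sets. A minimal sketch, assuming clauses are encoded as frozensets of signed integers (our encoding, not the book's):

```python
def subsumes(A, B):
    """A subsumes B iff every literal of A appears in B."""
    return A <= B  # subset test on the literal sets

p, q, r = 1, 2, 3
assert subsumes(frozenset({p, q}), frozenset({p, q, r}))
assert not subsumes(frozenset({p, q, r}), frozenset({p, q}))
```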
Proposition 3.3.21. If A subsumes B, then S ∪ {A, B} ≡ S ∪ {A}.
Proof. Since A subsumes B, M(A) ⊆ M(A) ∪ M(α) = M(B). Hence M(S ∪ {A, B}) = M(S) ∩ (M(A) ∩ M(B)) = M(S) ∩ M(A) = M(S ∪ {A}). P
The above proposition shows that if A subsumes B, then B can be dropped as a constraint in the presence of A. The following proposition shows that B is not needed in a resolution proof.
Proposition 3.3.22. If A and B are two clauses appearing in the resolution proof of S ⊢R ⊥ such that A subsumes B, then B is no longer needed once A is present.
Proof. If B is used in a resolution proof after the presence of A, we may replace B by A to obtain a proof of shorter or equal length, by checking all the resolutions which remove every literal from B: Suppose the first resolution is between B and γ and the resolvent is B′. There are two cases to consider: (1) the resolved variable is not in A. This resolution is unnecessary, because A subsumes B′ and we replace B′ by A. (2) the resolved variable is in A, then let the resolvent of A and γ be A′. It is easy to check that A′ subsumes B′, and we replace B′ by A′. We replace the descendants of B by the corresponding descendants of A, until the last resolution. The modified resolution proof will have shorter or equal length. P
Proposition 3.3.23. If there exists a resolution proof using a tautology clause, then there exists a resolution proof without the tautology clause.
Proof. Suppose the tautology clause appearing in the proof is (p | ¬p | α). We need another clause (¬p | β) (or (p | β)) to resolve off p (or ¬p). The resolvent of (p | ¬p | α) and (¬p | β) is (¬p | α | β), which is subsumed by (¬p | β). By Proposition 3.3.22, (¬p | α | β) is not needed in the resolution proof. Since every resolvent of a tautology is not needed, we do not need any tautology in a resolution proof. P
Definition 3.3.24. A literal l is said to be pure in a set C of clauses if its comple- ment does not appear in C.
Proposition 3.3.25. If l is pure in C and D is the set of clauses obtained from C by removing all clauses containing l, then C ≈ D.
Proof. If σ is a model of C, then σ is also a model of D. If σ is a model of D, define σ(l) = 1; then σ is a model of C. P
For example, if C = {1. (p | q), 2. (p | r), 3. (p | q | r), 4. (p | q)}, then r is pure. Removing 2. and 3. from C makes q pure. Removing 1. and 4., we obtain an empty set of clauses, which is equivalent to ⊤. By the proposition, C is satisfiable, and a model can be obtained by setting q = r = 1.
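The iterated deletion in this example, where removing clauses makes further literals pure, can be sketched in Python (clauses as frozensets of signed integers; an assumed encoding, not the book's):

```python
def pure_literal_deletion(C):
    """Repeatedly drop clauses containing a pure literal; the result is
    satisfiable iff the input is (Proposition 3.3.25)."""
    C = set(C)
    while True:
        lits = {l for c in C for l in c}
        pure = {l for l in lits if -l not in lits}  # complement absent
        if not pure:
            return C
        C = {c for c in C if not (c & pure)}  # drop clauses with a pure literal
```

If every clause is eventually deleted, the input is satisfiable; e.g., pure_literal_deletion({frozenset({1, 2}), frozenset({-1, 3})}) returns the empty set.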
The above propositions allow us to use the following clause deletion strategies without worrying about missing a proof:
• tautology deletion: Tautology clauses are discarded.
• subsumption deletion: Subsumed clauses are discarded.
• pure-literal deletion: Clauses containing pure literals are discarded.
These deletion strategies can be integrated into the algorithm resolution as follows.
Algorithm 3.3.26. The algorithm resolution takes a set S of clauses and returns true iff S is satisfiable. In preprocessing(S), we may implement the pure-literal check or subsumption to simplify the input clauses as a preprocessing step. subsumedBy(A, S) checks if clause A is subsumed by a clause in S.
proc resolution(S)
1:  G := preprocessing(S); // G: given clauses
2:  K := { }; // K: kept clauses
3:  while G ≠ { } do
4:    A := pickClause(G);
5:    G := G − {A};
6:    N := { }; // N: new clauses from A and K
7:    for B ∈ K if resolvable(A, B) do
8:      res := resolve(A, B);
9:      if res = ( ) return false;
10:     if subsumedBy(res, G ∪ K) continue;
11:     N := N ∪ {res};
12:   K := K ∪ {A};
13:   for A ∈ G if subsumedBy(A, N) do
14:     G := G − {A};
15:   for B ∈ K if subsumedBy(B, N) do
16:     K := K − {B};
17:   G := G ∪ N;
18:  return true; // K is saturated by resolution
A life cycle of a clause A goes as follows in the algorithm resolution.
1. If A is an input clause, it passes the preprocessing check, such as the pure-literal check and subsumption check, and goes into G (line 1).
2. If A is picked out of G (lines 4,5), it will pair with every clause B in K to try resolution on A and B. Once this is done, A will go to K as a kept clause (line 12).
3. A may be subsumed by a new clause when it is in G (line 13) or in K (line 15), and will be thrown away.
4. If A is a new clause (line 8), it has to pass the subsumption check (line 10) and parks in N (line 11).
5. All new clauses in N are used to check whether they subsume some clauses in G (line 13) or in K (line 15); they are then merged into G (line 17) and start a life cycle the same way as an input clause.
6. If the algorithm stops without finding the empty clause (line 9), the set G will be empty (line 3), and all the resolutions among the clauses in K are done. That is, K is saturated by resolution.
In this algorithm, unit clauses are given high priority for unit resolution and subsumption. Dealing with unit clauses can be separated from the main loop of the algorithm, as we will see in the next section.
3.4 Boolean Constraint Propagation (BCP)
A unit clause (A) subsumes every clause (A | α) in which A appears as a literal. Unit resolution between (A) and (¬A | α) generates the resolvent (α), which subsumes (¬A | α). The rule for replacing (¬A | α) by (α) in the presence of (A) is called unit deletion. That is, when subsumption deletion is used with resolution, a unit clause (A) allows us to remove all other occurrences of the variable of A from the clause set, with the unit clause itself as the only exception. New unit clauses may be generated by unit resolution, and we can continue to simplify the clauses with each new unit clause, and so on. This process is traditionally called Boolean constraint propagation (BCP), or unit propagation.
3.4.1 BCP: a Simplification Procedure
Clause deletion strategies allow us to remove unnecessary clauses during resolution. BCP also allows us to simplify clauses, both by removing unnecessary clauses and by shortening clauses. In the following, we describe BCP as a procedure which takes a set C of clauses as input and returns a pair (U, S), where U is a set of unit clauses and S is a set of non-unit clauses such that U and S share no variables and C ≡ U ∪ S. Typically, S contains a subset of C or clauses obtained from C by removing some literals.
Algorithm 3.4.1. The algorithm BCP will take a set C of clauses and apply unit resolution and subsumption deletion repeatedly, until no more new clauses can be generated by unit resolution. It will return ⊥ if an empty clause is found; otherwise, it returns a simplified set of clauses equivalent to C.
proc BCP(C)
  S := C; // S: simplified clauses
  U := { }; // U: unit clauses
  while S has a unit clause (A) do
    S := S − {(A)};
    U := U ∪ {A};
    for (A | α) ∈ S do // clauses subsumed by (A)
      S := S − {(A | α)};
    for (¬A | α) ∈ S do // unit resolution with (A)
      if α = ( ) return ⊥;
      S := S − {(¬A | α)} ∪ {(α)};
  return (U, S);
Example 3.4.2. Let C = {1: (x2 | x5), 2: (¬x1 | ¬x4), 3: (¬x2 | x4), 4: (x1 | x2 | ¬x3), 5: (¬x5)}. Then BCP(C) will return

({¬x5, x2, x4, ¬x1}, ∅)

because we first add ¬x5 into U; clause 1. becomes (x2), so x2 is then added into U; clause 4. is deleted because x2 ∈ U; clause 3. becomes (x4) and x4 is added into U; clause 2. becomes (¬x1) and ¬x1 is added into U. The set U gives us two models of C:

{x1 → 0, x2 → 1, x3 → v, x4 → 1, x5 → 0},

where v ∈ {0, 1}, as x3 can take either value.
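The run in this example can be reproduced with a naive Python sketch of Algorithm 3.4.1 (clauses as frozensets of signed integers, with xi encoded as i and ¬xi as -i; an illustrative sketch with our own names, not the book's implementation):

```python
def bcp(C):
    """Unit propagation with subsumption deletion: return None if an
    empty clause is derived, else the pair (U, S) of Algorithm 3.4.1."""
    S = set(C)
    U = set()                      # unit literals found so far
    while True:
        unit = next((c for c in S if len(c) == 1), None)
        if unit is None:
            return (U, S)
        (a,) = unit                # the literal of the unit clause
        U.add(a)
        S.remove(unit)
        new_S = set()
        for c in S:
            if a in c:             # c is subsumed by the unit clause (a)
                continue
            c = c - {-a}           # unit resolution with (a)
            if not c:
                return None        # the empty clause: C is unsatisfiable
            new_S.add(c)
        S = new_S
```

On the clauses of this example (x1, ..., x5 encoded as 1, ..., 5), bcp returns ({-5, 2, 4, -1}, set()), matching the hand computation.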
Proposition 3.4.3. (a) If BCP(C) returns ⊥, then C is unsatisfiable. (b) If BCP(C) returns (U, S), then C ≡ U ∪ S, where U is a set of unit clauses and S is a set of non-unit clauses, and U and S share no variables.
Proof. In BCP(C), S is initialized with C and updated in three ways: (1) a unit clause (A) is moved from S into U; (2) a clause (A | α) is removed from S; (3) a clause (¬A | α) is replaced by (α). Since (α) is the resolvent of (A) and (¬A | α), C |= α.

(a): If α = ( ) in (3), ⊥ is returned, and C ≡ ⊥ because C |= ⊥.

(b): Initially, C ≡ U ∪ S. Since (A) subsumes (A | α) and (α) subsumes (¬A | α), Proposition 3.3.21 justifies that both (A | α) and (¬A | α) can be safely removed, so C ≡ U ∪ S is maintained by each update of U and S. When (A) is moved from S to U, every occurrence of A or ¬A is removed from S, so the variable of A does not appear in S. The procedure stops only when S does not have any unit clauses. P
The procedure BCP can serve as a powerful simplification procedure when unit clauses are present in the input. BCP plays an important role in deciding if a formula is satisfiable, which is the topic of the next chapter. We will show later that BCP(C) can be implemented in O(n) time, where n is the number of literals in C.
3.4.2 BCP: a Decision Procedure for Horn Clauses
A Horn clause is a clause with at most one positive literal. Horn clauses are named after the logician Alfred Horn, who first pointed out their significance in 1951. A Horn clause is called a definite clause if it has exactly one positive literal; thus Horn clauses are divided into definite clauses and negative clauses. Definite clauses can be further divided into facts (positive unit clauses) and rules (non-unit definite clauses). Horn clauses have important applications in logic programming, formal specification, and model theory, as they admit very efficient decision procedures based on unit resolution or input resolution. That is, unit and input resolution are incomplete in general, but they are complete for Horn clauses.
Proposition 3.4.4. BCP is a decision procedure for Horn clauses.
Proof. Suppose H is a set of Horn clauses and ⊥ ∉ H. It suffices to show that BCP(H) returns ⊥ iff H ≡ ⊥. If BCP(H) returns ⊥, by Proposition 3.4.3, H ≡ ⊥. If BCP(H) returns (U, S), by Proposition 3.4.3, H ≡ U ∪ S, where U is a set of unit clauses, S is a set of non-unit clauses, and U and S share no variables. In this case, we create an assignment σ in which every literal in U is true and every variable in S is false. Since each variable in U appears only once and U and S share no variables, this assignment is consistent, and it is a model of U ∪ S: every unit clause in U is true under σ, and every clause in S, being a non-unit Horn clause, must have a negative literal, which is true under σ. P
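The model construction in this proof amounts to forward chaining: starting from the facts, derive the head of any rule whose body is already true, and fail only if a negative clause becomes violated. A Python sketch, under our own representation (a definite clause p1 ∧ ... ∧ pk → q is the pair (frozenset({p1, ..., pk}), q); a negative clause has head None; neither the encoding nor the name horn_sat comes from the text):

```python
def horn_sat(clauses):
    """Return the minimal model (set of true variables) of a Horn clause
    set, or None if the set is unsatisfiable."""
    true_vars = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if body <= true_vars and head not in true_vars:
                if head is None:
                    return None      # a negative clause is falsified
                true_vars.add(head)  # the head is forced to be true
                changed = True
    return true_vars
```

For example, horn_sat([(frozenset(), 1), (frozenset({1}), 2), (frozenset({1, 2}), None)]) returns None, while dropping the negative clause yields the model {1, 2}.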
Theorem 3.4.5. Unit resolution is sound and complete for Horn clauses.
Proof. Unit resolution is sound because resolution is sound. For completeness, note that the only inference rule used in BCP is unit resolution, so every clause generated by BCP can be generated by unit resolution; subsumption only removes unnecessary clauses. If the input clauses are unsatisfiable, BCP generates the empty clause, which can therefore also be generated by unit resolution. If the input clauses are satisfiable, unit resolution terminates with a set of clauses saturated by unit resolution, without generating the empty clause, because unit resolution is sound. P
3.4.3 Unit Resolution versus Input Resolution
In the following, we show that input resolution and unit resolution are equivalent, in the sense that if there exists a unit resolution proof, then there exists an input resolution proof, and vice versa.
Theorem 3.4.6. For any set C of clauses, if there exists an input resolution proof from C, then there exists a unit resolution proof from C.
Proof. We prove this theorem by induction on the length of proofs. Let P be an input resolution proof from C of minimal length. The last resolvent is ⊥, both its parents are unit clauses, and one of them must be an input clause, say (A). For simplicity, assume all the input clauses appear at the beginning of P and (A) is the first in P. If |P| = 1, then P must be a unit resolution proof and we are done. If |P| > 1, then remove (A) and the last clause from P, and remove all occurrences of ¬A from P, where ¬A = ¬p if A = p and ¬A = p if A = ¬p. The result is an input resolution proof P′ from C′, where C′ is obtained from C by removing all occurrences of ¬A in C. That is, (¬A | α) ∈ C iff (α) ∈ C′. Because |P′| < |P|, by the induction hypothesis, there exists a unit resolution proof Q′ from C′. If (α) ∈ C′ comes from (¬A | α) ∈ C, then change the reason of (α) ∈ Q′ from "assumed" to "resolution from (A) and (¬A | α)". Let the modified Q′ be Q; then Q is a unit resolution proof from C. P
Example 3.4.7. We illustrate the proof of this theorem with Example 3.3.5, where C = {1. (p | q), 2. (¬p | r), 3. (p | ¬q | r), 4. (¬r)}.

Let P be (b) of Example 3.3.5. Then (A) = (¬r), C′ = {1. (p | q), 2′. (¬p), 3′. (p | ¬q)}, and the two input resolutions in P′ are 5. (p) (1, 3′) and 6. ⊥ (2′, 5). A unit resolution proof Q′ from C′ contains three unit resolutions:

5. (q) (1, 2′)   6. (¬q) (2′, 3′)   7. ( ) (5, 6).
Adding two unit resolutions before Q′, we obtain Q:

2′. (¬p) (2, 4)   3′. (p | ¬q) (3, 4)   5. (q) (1, 2′)   6. (¬q) (2′, 3′)   7. ( ) (5, 6),
which is a unit resolution proof from C.
Theorem 3.4.8. For any set C of clauses, if there exists a unit resolution proof
from C, then there exists an input resolution proof from C.
The proof of this theorem is left as an exercise.
The above two theorems establish the equivalence of input resolution and unit resolution: A unit resolution proof exists iff an input resolution proof exists. Since unit resolution is complete for Horn clauses, we have the following result.
Corollary 3.4.9. Input resolution is sound and complete for Horn clauses.
Theorem 3.4.10. For any set H of Horn clauses, let H = S ∪ T , where S is a set of negative clauses and T = H − S. Then input resolution is sound and complete for H, when S is used as the set of support.
Proof. Resolution is not possible among the clauses in S, as resolution needs a pair of complementary literals from the two parent clauses, and set-of-support resolution does not allow resolution among the clauses in T. Thus resolution is possible only between one clause from S and one from T, and the resolvent is always a negative clause, because the only positive literal, which comes from the clause in T, is resolved away. For each resolution between S and T, we add the resolvent into S, until either the empty clause ⊥ is found or no new resolvents can be generated. If ⊥ is generated, then H ≡ ⊥ because resolution is sound. If no new resolvents can be generated, we create an assignment σ as follows: since T is satisfiable, call BCP on T and let (U, T′) = BCP(T). For every unit clause (p) in U, let σ(p) = 1; for every other variable q of H, let σ(q) = 0. We claim that σ is a model of H. First, σ(U ∪ T′) = 1, because U and T′ share no variables and the assignment makes every clause in U ∪ T′ true. Let S∗ be the set of negative clauses of H plus all the resolvents between S and T.

If σ(S∗) = 0, we look for a clause C ∈ S∗ such that σ(C) = 0 and every variable of C appears as a unit clause in T. Since C is a negative clause, let C = (¬p1 | ¬p2 | ··· | ¬pk). Since σ(C) = 0, we have pi ∈ U and σ(pi) = 1 for 1 ≤ i ≤ k. If (pk) ∉ T, then pk is obtained by unit resolution in BCP(T), say from (¬q1 | ¬q2 | pk) ∈ T with q1, q2 ∈ U; this unit resolution can be replaced by an input resolution between (¬q1 | ¬q2 | pk) ∈ T and C ∈ S∗, generating

C′ = (¬p1 | ¬p2 | ··· | ¬pk−1 | ¬q1 | ¬q2) ∈ S∗.

Obviously, σ(C′) = 0. Let C be C′ and check again whether every variable of C appears as a unit clause in T. Eventually, we find a clause C ∈ S∗ such that every variable of C appears as a unit clause in T. In this case, input resolutions between C and the unit clauses (p1), (p2), ..., (pk) generate the empty clause ( ), contradicting the assumption that no new resolvents can be generated. So σ(S∗) = 1. P
Note that input resolution with the negative clauses as the set of support on Horn clauses is also negative resolution. If the set of support initially contains a single clause, this input resolution is also linear resolution. If the positive literal is maximal in each clause, then this resolution is also ordered resolution.
Corollary 3.4.11. If H is a set of Horn clauses which contains a single negative clause, then negative, input, and linear resolution are complete for H.
Proof. If C is the only negative clause, let S = {C} and T = H − S. Using S as the set of support, we obtain an input resolution proof. Since every negative clause in the proof derives from C, the proof can be arranged as a linear resolution proof. P
3.4.4 Head-Tail Literals for BCP
For a unit clause (a) to be true, the truth value of the literal a needs to be 1. We can use a partial interpretation σ to remember which literals appear in unit clauses.
Definition 3.4.12. A partial interpretation is a set σ of literals in which no variable appears more than once, with the understanding that if a ∈ σ, then a is true and ¬a is false in σ. When σ contains every variable exactly once, it is called a full interpretation.
It is easy to see that a full interpretation σ is equivalent to the interpretation σ′ defined by: for every variable p, σ′(p) = 1 if p ∈ σ and σ′(p) = 0 if ¬p ∈ σ. By the above definition, it is convenient to regard a set of unit clauses as a partial interpretation. For example, if U = {p, ¬q, r}, it represents the interpretation σ = {p → 1, q → 0, r → 1}. From now on, we do not distinguish a full interpretation from the earlier definition of interpretation.
We can evaluate the truth value of a clause or formula in a partial interpreta- tion as we do in a full interpretation.
Definition 3.4.13. Given a partial interpretation σ and a clause c,

• c is satisfied if one or more of its literals are true in σ;
• c is conflicting if all the literals of c are false in σ;
• c is unit if all but one of its literals are false in σ and the remaining literal is unassigned;
• c is unresolved if c is none of the above three cases.
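The four cases of this definition can be sketched directly in Python, with a partial interpretation represented as a set of true literals encoded as signed integers (an encoding we assume; the name status is ours):

```python
def status(clause, sigma):
    """Classify a clause under a partial interpretation sigma, a set of
    literals assumed true (so -l in sigma means literal l is false)."""
    if any(l in sigma for l in clause):
        return "satisfied"
    unassigned = [l for l in clause if -l not in sigma]
    if not unassigned:
        return "conflicting"
    if len(unassigned) == 1:
        return "unit"        # the sole unassigned literal is implied
    return "unresolved"
```

With sigma = {1, -2} (p true, q false), the clause (q | r) is unit, (p | r) is satisfied, and (q) is conflicting.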
When c becomes unit, we have to assign 1 to the unassigned literal, say a, in c, so that c becomes true. That is, c is the reason for literal a being true, and we record this as reason(a) = c. We also say the literal a is implied by c.
Example 3.4.14. Given the clauses
{c1 : (x1), c2 : (¬x3), c3 : (¬x1 | x2 | x3)},

reason(x1) = c1, reason(¬x3) = c2, and reason(x2) = c3. x2 is implied to be true by c3 because c3 is equivalent to (x1 ∧ ¬x3) → x2.
In BCP, we start with an empty partial interpretation σ. When we find a unit clause (a), we extend σ by adding a into σ, so that the unit clause becomes satisfied. To save time, we do not remove the clauses containing a from the clause set, nor do we remove ¬a from any clause to create a new clause. Instead, we just remember that ¬a is now false in σ.
We need to find out which clauses become unit when some of their literals become false. A clause c of k literals becomes unit when k − 1 of its literals are false. If we know that two literals of c are not false, we are certain that c is not unit. Initially, let the first and last literals play this role, designated as the head and tail literals of c, which are two distinct literals of c. When the head literal becomes false, we scan right to find the next literal in c which is not false and make it the new head literal; similarly, when the tail literal becomes false, we scan left to find the new tail literal. If we cannot find a new head or tail literal, we must have found a unit clause or an empty clause.
To implement the above idea, we use the following data structures. For each non-unit clause c:

• lits(c): the array storing the literals of c;
• head(c): the index of the first non-false literal of c in lits(c);
• tail(c): the index of the last non-false literal of c in lits(c).

Let |c| denote the number of literals in c; then the valid indices for lits(c) are {0, 1, ..., |c| − 1}, head(c) is initialized with 0, and tail(c) with |c| − 1.
The data structures associated with each literal A are:

• val(A): the truth value of A under the partial interpretation, val(A) ∈ {1, 0, ×}, where × ("unknown") is the initial value;
• cls(A): the list of clauses c such that A is in c and is pointed to by either head(c) or tail(c) in lits(c). We use insert(c, cls(A)) and remove(c, cls(A)) to insert and remove clause c into/from cls(A), respectively.
Assuming global variables are used for clauses and literals, the procedure BCP can be implemented as follows:
proc BCP(C)
// initialization
1:  U := { }; // U: stack of unit clauses
2:  for c ∈ C do
3:    if c = ( ) return ⊥;
4:    if (|c| > 1)
5:      lits(c) := makeArray(c); // create an array of literals
6:      head(c) := 0; insert(c, cls(lits(c)[0]));
7:      tail(c) := |c| − 1; insert(c, cls(lits(c)[|c| − 1]));
8:    else push(c, U);
9:  for all literals l, val(l) := ×;
// major work
10: res := BCPht(U);
// finishing
11: if (res ≠ "SAT") return ⊥;
12: U := {l : val(l) = 1};
13: S := {clean(c) : val(lits(c)[head(c)]) = ×, val(lits(c)[tail(c)]) = ×};
14: return (U, S);
Note that the procedure clean(c) at line 13 removes from c those literals which are false (under val) and returns ⊤ if one of the literals is true. The procedure BCPht does the major work of BCP.
Algorithm 3.4.15. BCPht(U) assumes the input clauses C = U ∪ S, where U is a stack of unit clauses and S is a set of non-unit clauses stored using the head-tail data structure, and can be accessed through cls(A). BCPht returns a conflict clause if an empty clause is found during unit propagation; returns “SAT” if no empty clause is found.
proc BCPht(U)
1:  while U ≠ { } do
2:    A := pop(U);
3:    if val(A) = 0 return reason(A); // an empty clause is found
4:    else if (val(A) = ×)
5:      val(A) := 1; val(¬A) := 0;
6:      for c ∈ cls(¬A) do // ¬A is either head or tail of c
7:        if (¬A = lits(c)[head(c)])
8:          e1 := head(c); e2 := tail(c); step := 1; // scan from left to right
9:        else
10:         e1 := tail(c); e2 := head(c); step := −1; // scan from right to left
11:       x := e1 + step; // x takes values from e1 + step to e2
12:       while x ≠ e2 do
13:         B := lits(c)[x];
14:         if (val(B) = 0) x := x + step; // go to the next literal
15:         else
16:           remove(c, cls(¬A)); insert(c, cls(B));
17:           if (step = 1) head(c) := x; else tail(c) := x;
18:           break; // exit the inner while loop
19:       if (x = e2) // no new head or tail; a unit clause is found
20:         A := lits(c)[e2]; // the remaining literal of c
21:         if val(A) = 0 return c; // c is conflicting
22:         else if (val(A) = ×)
23:           push(A, U); reason(A) := c; val(A) := 1; val(¬A) := 0;
24: return "SAT"; // no empty clauses are found
Example 3.4.16. For the clauses in Example 3.4.2, we have the following data structures after the initialization (the head and tail indices are given after the literal list of each clause):

C = {c1 : ⟨[x2, x5], 0, 1⟩, c2 : ⟨[¬x1, ¬x4], 0, 1⟩, c3 : ⟨[¬x2, x4], 0, 1⟩, c4 : ⟨[x1, x2, ¬x3], 0, 2⟩},

U = {(¬x5)}, and the non-unit clauses are indexed as follows:

cls(x1) = {c4}, cls(¬x1) = {c2}, cls(x2) = {c1}, cls(¬x2) = {c3}, cls(x3) = {}, cls(¬x3) = {c4}, cls(x4) = {c3}, cls(¬x4) = {c2}, cls(x5) = {c1}, cls(¬x5) = {},

and val(A) = × for every literal A. Note that each clause appears twice in the collection of the cls(A).

Inside BCPht(U), when ¬x5 is popped off U, we assign val(¬x5) = 1 and val(x5) = 0 (line 5) and check c1 ∈ cls(x5) (line 6). c1 generates the unit clause (x2), which is pushed into U (line 23) with the assignments val(x2) = 1, val(¬x2) = 0. When x2 is popped off U, we check c3 ∈ cls(¬x2). c3 generates the unit clause (x4); we push x4 into U and assign val(x4) = 1, val(¬x4) = 0. When x4 is popped off U, we check c2 ∈ cls(¬x4). c2 generates the unit clause (¬x1); we push ¬x1 into U and assign val(¬x1) = 1, val(x1) = 0. When ¬x1 is popped off U, we check c4 ∈ cls(x1). c4 is true because val(x2) = 1. Now the head index of c4 is updated to 1, and we add c4 = ⟨[x1, x2, ¬x3], 1, 2⟩ into cls(x2). Since no empty clause is found, BCPht returns "SAT". Finally, BCP returns

(U, S) = ({¬x5, x2, x4, ¬x1}, ∅).
Note that the two assignments val(A) := 1; val(¬A) := 0 at line 23 of BCPht could be removed, as the same work is done at line 5; however, having them here often improves the performance of BCPht. BCPht can also be improved slightly by checking whether val(lits(c)[e2]) = 1 after line 10: if so, clause c is already satisfied and can be skipped.
In this implementation of BCP, once all the clauses of C are read in, each occurrence of a literal in a non-unit clause is visited at most once in BCPht.
Proposition 3.4.17. BCP(C) runs in O(n) time, where n is the sum of the lengths of all clauses in C.
Proof. Obviously, the initialization of BCP takes O(n) time. In BCPht, we access a non-unit clause c when its head (or tail) literal becomes false (line 6). We skip the head (tail) literal (line 11) and check the next literal B (line 13): if it is false, we continue (line 14); otherwise, we move c from cls(A) to cls(B) (line 16) and update the head (tail) index (line 17). If we cannot find a new head (tail) literal (line 19), we have found a new unit clause (line 20). No occurrence of a literal of c is visited more than once by BCPht. In particular, we never access the clauses in cls(A) while literal A is true. P
Combining the above proposition with Proposition 3.4.4, we have the following result:
Theorem 3.4.18. BCP is a linear time decision procedure for Horn clauses.
3.5 Exercises
1. Prove formally that the following statements are equivalent: For any two formulas A and B, (a) A |= B; (b) A → B is valid; (c) A∧¬B is unsatisfiable.
2. Use the semantic tableaux method to decide if the following formulas are valid or not.
(a) (A↔(A∧B))→(A→B)
(b) (B↔(A∨B))→(A→B)
(c) (A ⊕ B) → ((A ∨ B) ∧ (¬A ∨ ¬B))
(d) (A ↔ B) → ((A ∨ ¬B) ∧ (¬A ∨ B))
3. Use the semantic tableaux method to decide if the following entailment rela- tions are true or not.
(a) (A ∧ B) → C |= (A → C) ∨ (B → C)
(b) (A ∧ B) → C |= A → C
(c) A → (B → C) |= (A → B) → (A → C)
(d) (A ∨ B) ∧ (¬A ∨ C) |= B ∨ C
4. Prove by the semantic tableau method that the following statement, “either the debt ceiling isn’t raised or expenditures don’t rise” (¬d ∨ ¬e), follows logically from the statements in Problem 31, Chapter 2.
5. Prove that if all the axioms are tautologies and all the inference rules are sound, then all the derived formulas from the axioms and the inference rules are tautologies.
6. Specify the introduction and elimination rules for the operators ⊕ and ↔ in the natural deduction system.
7. Estimate the upper bound of the size of the inference graph for resolution, if the input clauses contain n variables.
8. Use the Tseitin encoding to convert the following formulas into 3CNF (CNF in which every clause has at most three literals):
(a) (A↔(A∧B))→(A→B)
(b) (A⊕B)→(A∨B)
(c) ((A∧B)→C)→(A→C)
(d) ((A∨B)∧(¬A∨C))→(B∨C)
9. Use the resolution method to decide if the following formulas are valid or not. If the formula is valid, provide a resolution proof; if it is not valid, provide an interpretation which falsifies the formula.
(a) (A↔(A∧B))→(A→B)
(b) (B↔(A∨B))→(A→B)
(c) (A ⊕ B) → ((A ∨ B) ∧ (¬A ∨ ¬B))
(d) ((A ↓ B) ↑ C) → (A ↑ (B ↓ C))
10. Use the resolution method to decide if the following entailment relations are true or not. If the relation holds, provide a resolution proof; if it doesn’t, provide an interpretation such that the premises are true, but the consequence of the entailment is false.
(a) (A → C) ∧ (B → D) |= (A ∨ B) → (C ∨ D)
(b) (A ∧ B) → C |= (A → C)
(c) A→(B→C)|=(A→B)→(A→C)
(d) (A∨B)∧(¬A∨C)|=B∨C
11. Apply clause deletion strategies to each of the following sets of clauses, and obtain a simpler, equisatisfiable set.
S1 = {(p | q | r), (p | q | r | s), (p | q | s), (q | r), (q | r)}
S2 = {(p | q | r | s), (p | q | r | s), (p | q | s), (q | r), (q | s | r)}
S3 = {(p | q | s), (p | q | r | s), (p | r | s), (p | r), (p | q | r)}
12. Given the clause set S, where S = {1. (t | e | d), 2. (t | c), 3. (e | d | i), 4. (g | d | i), 5. (t | c | d | g), 6. (c), 7. (i | g), 8. (c | d), 9. (c | e)}. Find the following resolution proofs for S: (a) unit resolution; (b) input resolution; (c) positive resolution; and (d) ordered resolution using the order t > i > g > e > d > c.
13. Apply various resolution strategies to prove that the following clause set S entails ¬s: unit resolution, input resolution, negative resolution, positive resolution, and ordered resolution using the order p > q > r > s.
S = {1.(p | q | r), 2.(p | q | s), 3.(p | q | r), 4.(p | q | r), 5.(p | q | r), 6.(q | r), 7.(r | s)}
Please show the proofs of these resolution strategies.
14. Decide by ordered resolution if the clause set S is satisfiable or not, where S = {1. (a | c | d | f), 2. (a | d | e), 3. (a | c), 4. (b | d | e), 5. (b | d | f), 6. (b | f), 7. (c | e), 8. (c | e), 9. (c | d) }, and the order is a > b > c > d > e > f. If the clause set is satisfiable, please display all the resolvents from ordered resolution and construct a model from these clauses. If the clause is unsatisfiable, please display an ordered resolution tree of the proof.
15. Prove by resolution that S |= ¬f, where S is the set of clauses given in the previous problem. Please display the resolution tree of this proof.
16. If head-tail literals are used for implementing BCP, which literals of the clause set S in the previous problem will be assigned a truth value during unit resolution before an empty clause is found? In what order?
17. A set C of clauses is said to be a 2CNF if every clause contains at most two literals. The 2SAT problem is to decide whether a 2CNF formula is satisfiable. Show that orderedResolution takes O(n3) time and O(n2) space to solve 2SAT, where n is the number of propositional variables in the 2CNF.
18. Prove Theorem 3.4.8: For any set C of clauses, if there exists a unit resolution proof from C, then there exists an input resolution proof from C.
CHAPTER 4 PROPOSITIONAL SATISFIABILITY
Propositional satisfiability (SAT) is the problem of deciding if a propositional formula is satisfiable, i.e., whether the formula has a model. Software programs for solving SAT are called SAT solvers. SAT has long enjoyed a special status in computer science. On the theoretical side, it was the first NP-complete problem ever discovered. NP-complete problems are notorious for being hard to solve; in particular, in the worst case, the computation time of every known solution for a problem in this class increases exponentially with the input size. On the practical side, since every NP problem can be transformed into SAT, an efficient SAT solver can be used to solve many practical problems. SAT has found several important applications in the design and verification of hardware and software systems, and in many areas of artificial intelligence and operations research. Thus, there is strong motivation to develop practically useful SAT solvers. However, NP-completeness is cause for pessimism, since it is unlikely that we will be able to scale the solutions to all large practical instances.
In theory, the decision procedures presented in the previous chapter, such as semantic tableau and resolution, can be used as SAT solvers. However, they have served largely as academic exercises with little hope of seeing practical use. Resolution, as a refutation prover, searches for an empty clause. Semantic tableau imitates the transformation of a formula into DNF. Neither of these decision procedures aims at searching for a model.
Fortunately, in the last two decades or so, several research developments have enabled us to tackle instances with thousands of variables and millions of clauses. SAT researchers introduced techniques such as conflict-driven clause learning, novel branching heuristics, and efficient unit propagation. These techniques form the basis of many modern SAT solvers. Using these ideas, contemporary SAT solvers can often handle very large practical SAT instances.
Annual competitions of SAT solvers promote their continuing advances. SAT solvers are the engines of modern model checking tools. Contemporary automated verification techniques, such as bounded model checking, proof-based abstraction, and interpolation-based model checking, are all based on SAT solvers and
their extensions. These tools are used to check the correctness of hardware designs.
4.1 The DPLL Algorithm
In 1960, Davis and Putnam proposed a resolution-based procedure for propositional logic. For realistic problems, the number of clauses generated by resolution grows quickly. To avoid this explosion, Davis, Logemann, and Loveland suggested replacing the resolution rule with a case split: pick a variable p and consider the two cases p → 1 and p → 0. The modified algorithm is commonly referred to as the DPLL algorithm, where DPLL stands for Davis, Putnam, Logemann, and Loveland.
DPLL can be regarded as a search procedure: the search space is the set of all interpretations, partial or full. A partial interpretation can be represented by a set of literals, where each literal in the set is assigned true and each variable not mentioned in the set is unassigned. DPLL starts with the empty interpretation and tries to extend it by adding either p or ¬p through the case split. DPLL stops when every variable is assigned without contradiction, and the partial interpretation it maintains becomes a model.
4.1.1 Recursive Version of DPLL
In the following, we give a recursive version of DPLL; it is one of the oldest backtracking procedures. The first call to DPLL is with the input clauses C and σ = {}. DPLL relies heavily on the algorithm BCP for unit propagation. From the previous chapter, we know that BCP implements unit resolution and subsumption deletion, and can be done in linear time. Given a set C of clauses, BCP(C) returns ⊥ if an empty clause is found by unit resolution; otherwise, BCP(C) returns (U, S) such that C ≡ (U ∪ S), U is a set of unit clauses, S does not contain any unit clauses, and U and S share no variables.
Algorithm 4.1.1. DPLL(C, σ) takes as input a set C of clauses and a partial interpretation σ, where C and σ do not share any variables. It returns a model of C if C is satisfiable, and returns ⊥ if C is unsatisfiable. Procedure pickLiteral(S) picks a literal appearing in S.
proc DPLL(C, σ)
1  res := BCP(C);
2  if (res = ⊥) return ⊥; // no models for C
3  // It must be the case that res = (U, S)
4  σ := σ ∪ U;
5  if (S = ∅) return σ; // a model is found
6  A := pickLiteral(S);
7  res := DPLL(S ∪ {(A)}, σ);
8  if (res ≠ ⊥) return res; // a model is found
9  return DPLL(S ∪ {(¬A)}, σ); // return either ⊥ or a model

Figure 4.1.1: The result of BCP is shown in the box near each node.
In the above procedure, we write ¬A for the complement of A: ¬A is ¬p if A is the literal p, and ¬A is p if A is ¬p. If BCP(C) does not return ⊥, it simplifies C to U ∪ S, where U is a set of literals, S is a set of non-unit clauses, and U and S do not share any variables. By Proposition 3.4.3, C ≡ U ∪ S. We identify two types of literals in U.
Definition 4.1.2. In DPLL, a literal A ∈ U is said to be a decision literal if A is the one returned by pickLiteral(S) at line 6 of DPLL; otherwise, A is said to be an implied literal.
Since U is a set of literals without complementary literals, U can be regarded as a partial interpretation σ such that σ(U) = 1. Since U is satisfiable and C ≡ U ∪S, we conclude that C ≈ S. Furthermore, checking the satisfiability of S is reduced recursively to testing S ∧ A and S ∧ ¬A separately for a literal A appearing in S. This transformation, applied recursively, yields a complete decision procedure.
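Algorithm 4.1.1 can be sketched in Python as follows. This is only an illustrative fragment, not the book's implementation: clauses are sets of integer literals (−v standing for ¬v), `bcp` plays the role of BCP, and `pickLiteral` is simply the first literal of the first remaining clause.

```python
def bcp(clauses):
    """Unit propagation: returns (units, rest), or None on conflict.
    `rest` shares no variables with `units` at the fixpoint."""
    units, rest = set(), [set(c) for c in clauses]
    changed = True
    while changed:
        changed = False
        new_rest = []
        for c in rest:
            c = {l for l in c if -l not in units}   # drop false literals
            if not c:
                return None                          # empty clause: conflict
            if any(l in units for l in c):
                continue                             # clause satisfied
            if len(c) == 1:
                units.add(next(iter(c)))             # new unit clause
                changed = True
            else:
                new_rest.append(c)
        rest = new_rest
    return units, rest

def dpll(clauses, sigma=frozenset()):
    """Recursive DPLL: returns a (partial) model or None."""
    res = bcp(clauses)
    if res is None:
        return None                                  # no models on this branch
    units, rest = res
    sigma = sigma | units
    if not rest:
        return sigma                                 # a model is found
    a = next(iter(rest[0]))                          # pickLiteral
    return dpll(rest + [{a}], sigma) or dpll(rest + [{-a}], sigma)
```

For instance, `dpll([{1, 2}, {-1, 2}, {-2, 3}])` returns a partial model containing 2 and 3, while `dpll([{1}, {-1}])` returns `None`.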
Example 4.1.3. Let C = {1: (x2 | x3 | x5), 2: (¬x1 | x4), 3: (¬x2 | ¬x4), 4: (¬x1 | x2 | ¬x3), 5: (¬x5)}. If pickLiteral picks a variable in the order of x1, x2, ..., x5, then
DPLL(C, {}) calls
  BCP(C), which returns ({¬x5}, C1 = {1′: (x2 | x3), 2, 3, 4}); x1 is picked as the decision literal of level 1;
  DPLL(C1 ∪ {(x1)}, {¬x5}), which calls
    BCP(C1 ∪ {(x1)}), which returns ⊥;
  and returns ⊥;
  DPLL(C1 ∪ {(¬x1)}, {¬x5}), which calls
    BCP(C1 ∪ {(¬x1)}), which returns ({¬x1}, C2 = {1′, 3}); x2 is picked as the decision literal of level 2;
    DPLL(C2 ∪ {(x2)}, {¬x1, ¬x5}), which calls
      BCP(C2 ∪ {(x2)}), which returns ({x2, ¬x4}, ∅);
    and returns {¬x1, x2, ¬x4, ¬x5};
  and returns {¬x1, x2, ¬x4, ¬x5};
and returns {¬x1, x2, ¬x4, ¬x5}.
Thus a model of C can be constructed from {¬x1, x2, ¬x4, ¬x5} by adding x3 → v for any value v.
In the above example, the height of the recursion tree (also called the decision tree) is two; x1 and x2 are decision variables, while x4 and x5 are implied. We called BCP four times: once at the root (level 0), twice at the nodes of level one, and once at the node of level two. If we want to search for all models, we may choose to try the branch with x2 → 0, as shown in Figure 4.1.1.
Definition 4.1.4. The level of an assigned variable is the depth of the node of the recursion tree of DPLL(C, {}) at which the variable is assigned a truth value. We write x@l if x → 1 at level l, and ¬x@l if x → 0 at level l.
In the above example, the model returned by DPLL may be written as {¬x1@1, x2@2, ¬x4@2, ¬x5@0}.
Theorem 4.1.5. DPLL(C,{}) is a decision procedure for the satisfiability of C.
Proof. Assume ⊥ ∉ C. To prove the theorem, we just need to show that DPLL is sound and terminating. DPLL(C, σ) terminates because each recursive call decreases the number of variables in C by at least one. In DPLL(C, σ), since σ and C share no variables, (C ∪ σ) ≈ C. So in the soundness proof of DPLL, we ignore the presence of σ.
We prove soundness by induction on the structure of the recursion tree of DPLL. As a base case, if DPLL(C, σ) returns ⊥, then ⊥ is generated by unit resolution
inside BCP(C). By Proposition 3.4.3, C ≡ ⊥. If BCP(C) returns (U, S), by Proposition 3.4.3, C ≡ U ∪ S. As another base case, if S = ∅, then C ≡ U, and U, regarded as an interpretation, is a model of the unit clause set U, so C is satisfiable. If S ≠ ∅, then since
S ≡ (S ∧A)∨(S ∧¬A)
for any literal A, if either DPLL(S ∪ {(A)}, σ) or DPLL(S ∪ {(¬A)}, σ) returns a model, then by the induction hypothesis (S ∧ A) ∨ (S ∧ ¬A) is satisfiable, and so is S. If both DPLL(S ∪ {(A)}, σ) and DPLL(S ∪ {(¬A)}, σ) return ⊥, then by the induction hypothesis (S ∧ A) ∨ (S ∧ ¬A) is unsatisfiable, and so is S. □
4.1.2 All-SAT and Incremental SAT Solvers
Given a satisfiable set of clauses, DPLL returns a single satisfying assignment.
Some applications, however, require us to enumerate all satisfying assignments of a formula; this problem is called All-SAT. It is easy to modify DPLL so that all the models can be found: we replace line 5 of DPLL as follows:
5: if (S = ∅) save(σ); return ⊥; // save model and continue
When the modified DPLL finishes, all the models have been saved. Note that these models are partial models. One partial model represents 2^k full models, where k is the number of unassigned variables in the partial model.
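The 2^k count can be illustrated with a small sketch (the function name and the dict representation of a model are ours, purely for illustration):

```python
from itertools import product

def expand(partial, all_vars):
    """Expand a partial model to all full models: each unassigned
    variable can take either truth value, giving 2**k full models."""
    assigned = dict(partial)
    free = [v for v in all_vars if v not in assigned]
    for values in product([False, True], repeat=len(free)):
        model = dict(assigned)
        model.update(zip(free, values))   # one choice per free variable
        yield model

# one assigned variable, two free ones: 2**2 = 4 full models
models = list(expand({"x1": True}, ["x1", "x2", "x3"]))
```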
During the execution of DPLL, the list of decision literals (labeled with 1) and their complements (labeled with 2) determines the current position in the recursion tree of DPLL. This list of literals, called a guiding path, can be very useful for designing an incremental SAT solver: the solver takes the guiding path as input, uses the literals in the guiding path to skip the branches already explored, and explores the new branches of the recursion tree for a new model.
Example 4.1.6. In Example 4.1.3, a model is found with the decision literals ¬x1 and x2. The guiding path for this model is [x1 : 2, x2 : 1], where the label 2 after x1 means this is the second branch on x1, and the label 1 after x2 means this is the first branch on x2. If the next model is needed, it must be in the subtree with the initial guiding path [x1 : 2, x2 : 2]. Using [x1 : 2, x2 : 2], the SAT solver avoids the search space it has traversed before.
Since modern SAT solvers are often designed as a black box, it is cumbersome to use them as an incremental solver for enumerating each model. We can force
the SAT solver to find a new model by blocking all previously found models. A model can be viewed as a conjunction of true literals, one for every variable. The negation of this conjunction can be represented as a clause, called a blocking clause. If this clause is added to the set of input clauses, the model is blocked and cannot be generated again by the SAT solver. Since a blocking clause contains every variable, it is unlikely to become unit, and it is expensive to keep all the blocking clauses. Therefore, it is desirable to reduce the size of blocking clauses, i.e., to construct a smaller clause which still blocks the model. If we know the guiding path of this model, the negation of the conjunction of all the decision literals in the guiding path will block the model.
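The blocking-clause loop can be sketched as follows. This is only an illustration: any complete SAT procedure could be plugged in; a brute-force one is used here solely to keep the sketch self-contained.

```python
from itertools import product

def brute_solve(clauses, variables):
    """A placeholder for a real SAT solver: returns a full model
    (a frozenset of true literals) or None."""
    for bits in product([False, True], repeat=len(variables)):
        model = frozenset(v if b else -v for v, b in zip(variables, bits))
        if all(any(l in model for l in c) for c in clauses):
            return model
    return None

def all_models(clauses, variables):
    """All-SAT via blocking clauses: each found model is blocked by
    the clause that negates it, then the solver is called again."""
    found = []
    while (m := brute_solve(clauses, variables)) is not None:
        found.append(m)
        clauses = clauses + [frozenset(-l for l in m)]  # blocking clause
    return found
```

For instance, `all_models([frozenset({1, 2})], [1, 2])` returns the three models of (x1 ∨ x2); each blocking clause removes exactly one full model, so the loop terminates.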
Many applications of SAT solvers require solving a sequence of instances obtained by adding more clauses. Incremental SAT solvers with the guiding path support this use without any modification, because the search space skipped by the guiding path of the first model contains no models.
In many application scenarios, it is beneficial to be able to make several SAT checks on the same input clauses under different forced partial interpretations, called “assumptions”. For instance, people may be interested in questions like “Is the formula F satisfiable under the assumptions x → 1 and y → 0?” In such applications, the input formula is read in only once, and the user implements an iterative loop that calls the same solver instance under different sets of assumptions. The calls can be adaptive, i.e., the assumptions of future SAT solver calls can depend on the results of previous calls. The SAT solver can keep its internal state from one call to the next. DPLL with a guiding path can easily be modified to support such applications, treating the assumptions as a special set of unit clauses which can be added to or removed from the solver. The guiding path allows us to backtrack to the new search space but never revisit the search space visited before.
4.1.3 BCPw: Implementation of Watch Literals
To obtain a high-performance SAT solver, we may use the deletion techniques
presented in Section 3.3.6 as pre-processing techniques which enable us to reduce the size of the formula before passing it to DPLL.
Inside its body, DPLL calls itself twice, once with S ∪ {(A)} and once with S ∪ {(¬A)}. A persistent data structure for S would quickly run out of memory, as S is very large and the recursion tree is huge for practical SAT instances. A better approach is to use a destructive data structure for S: we remember all the operations performed on S inside the first call, and then undo them when we enter the second call.
In Algorithm 3.4.15, we presented an efficient implementation of BCP as a decision procedure for Horn clauses. This version of BCP, called the procedure BCPht, uses the head/tail data structure for each non-unit clause. Of course, BCPht can be used to support DPLL. Better yet, with a minor modification, BCPht can implement the idea of watch literals, which is based on head/tail literals and was used in the solver zchaff by Moskewicz et al.
The function of watch literals is the same as that of head/tail literals: use two literals per clause to decide whether the clause has become unit. Head and tail literals are watch literals; when head/tail literals are allowed to appear anywhere in a clause, they are exactly watch literals. Since the idea of watch literals is based on that of head/tail literals, implementing watch literals is easy if we have an implementation of head/tail literals.
In BCPht, when a head or tail literal of c becomes false, we look for its replacement at index x, where head(c) < x < tail(c). There are two alternative ways to modify BCPht. One solution is to allow x ∈ {0, 1, ..., |c| − 1} − {head(c), tail(c)}. The other solution is to fix head(c) = 0 and tail(c) = |c| − 1, and swap the elements of lits(c) if necessary.
The second solution saves the space for head(c) and tail(c), but may cost more when visiting the literals of lits(c). For example, if a clause has k literals which become false from left to right during BCP, the second solution may take O(k2) time to visit the literals of the clause, while the first solution takes O(k) time. As a result, BCP cannot be a linear-time decision procedure for Horn clauses if the second solution is used. Of course, we may introduce a pointer in the clause to remember the position of the last visited literal, so that the second solution still gives us a linear-time algorithm.
Algorithm 4.1.7. BCPw(U) is an implementation of BCP(C) based on the head/tail data structure. BCPw(U) works the same way as BCPht(U), where U is a stack of unit clauses and the non-unit clauses are stored using the head/tail data structure. Like BCPht, BCPw returns a conflicting clause if an empty clause is generated, and returns “SAT” otherwise. Unlike BCPht, the values of head(c) and tail(c) can be any valid indices of lits(c).
proc BCPw(U)
1    while U ≠ {} do
2      A := pop(U);
3      if val(A) = 0 return reason(A); // an empty clause is found
4      else if (val(A) ≠ 1)
5        val(A) := 1; val(¬A) := 0;
6        for c ∈ cls(¬A) do // ¬A is either head or tail of c
7          if (¬A = lits(c)[head(c)])
8            e1 := head(c); e2 := tail(c); step := 1; // scan from left to right
9          else
10           e1 := tail(c); e2 := head(c); step := −1; // scan from right to left
11         x := (e1 + step) mod |c|; // x takes any valid index of lits(c)
12         while x ≠ e1 do
12′          if (x = e2) continue; // skip the literal watched by e2
13           B := lits(c)[x];
14           if (val(B) = 0) x := (x + step) mod |c|; // go to the next literal
15           else
16             remove(c, cls(¬A)); insert(c, cls(B));
17             if (step = 1) head(c) := x; else tail(c) := x;
18             break; // exit the inner while loop
19         if (x = e1) // no new head or tail found
20           A := lits(c)[e2]; // c is likely unit
21           if val(A) = 0 return c; // c is conflicting
22           else if (val(A) ≠ 1) // c is not satisfied, indeed unit
23             push(A, U);
24             reason(A) := c; // record the reason for this assignment
25   return “SAT”; // no empty clauses are found
The above algorithm is obtained from BCPht (Algorithm 3.4.15) by the following modifications, which allow the values of head(c) and tail(c) to be anywhere in lits(c).
• In lines 11 and 14, x := x + step is replaced by x := (x + step) mod |c|.
• In line 12, the condition (x ̸= e2) is replaced by (x ̸= e1).
• Line 12’ is added to skip the literal watched by e2.
• In line 19, the condition (x = e2) is replaced by (x = e1).
• In line 24, we record the reason for the assignment of a literal; this information is used at line 3 when a clause becomes conflicting.
In DPLL, the book-keeping required to detect when a clause becomes unit can involve a high computational overhead if implemented naively. Since it is sufficient to watch in each clause two literals that have not been assigned yet, assignments to the non-watched literals can be safely ignored. When a variable p is assigned 1, the
SAT solver only needs to visit the clauses in which ¬p is watched. Each time one of the watched literals becomes false, the solver chooses one of the remaining unassigned literals to watch. If this is not possible, the clause is necessarily unit, has become a conflicting clause, or is already satisfied under the current partial assignment. Any sequence of assignments that makes a clause unit will include an assignment of one of the watched literals. The computational overhead of this strategy is relatively low: in a formula with m clauses and n variables, 2m literals need to be watched, and m/n clauses are visited per assignment on average. This advantage is inherited from the idea of head/tail literals.
The key advantage of watch literals over head/tail literals is that the watched literals do not need to be updated upon backtracking in DPLL. That is, the key advantage of BCPw over BCPht is its use inside DPLL: when we backtrack from a recursive call of DPLL, we do not need to undo the operations on head(c) and tail(c), because any two positions of c suffice for checking whether c becomes unit.
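The watch-list bookkeeping described above can be sketched as follows. This is an illustrative Python fragment, not zchaff's code; the data structures are assumptions of the sketch: `watch[ci]` holds the two watched literals of clause `ci`, `watched[l]` holds the indices of the clauses in which `l` is watched, and `val` maps literals to truth values.

```python
from collections import defaultdict

def init_watches(clauses):
    """Watch the first two literals of every (non-unit) clause."""
    watch = {i: [c[0], c[1]] for i, c in enumerate(clauses)}
    watched = defaultdict(set)
    for i, (a, b) in watch.items():
        watched[a].add(i)
        watched[b].add(i)
    return watch, watched

def unit_propagate(clauses, watch, watched, val, lit):
    """Assign `lit` true and propagate, visiting only the clauses in
    which the complement of a newly true literal is watched.
    Returns the trail of assigned literals, or None on conflict."""
    trail = [lit]
    val[lit], val[-lit] = True, False
    i = 0
    while i < len(trail):
        l = trail[i]; i += 1
        for ci in list(watched[-l]):           # only clauses watching -l
            other = watch[ci][1] if watch[ci][0] == -l else watch[ci][0]
            if val.get(other) is True:
                continue                        # clause already satisfied
            # look for a non-false replacement watch
            repl = next((x for x in clauses[ci]
                         if x not in (other, -l) and val.get(x) is not False),
                        None)
            if repl is not None:                # move the watch to repl
                watched[-l].discard(ci)
                watched[repl].add(ci)
                watch[ci] = [other, repl]
            elif val.get(other) is False:
                return None                     # conflicting clause found
            else:                               # clause is unit: imply other
                val[other], val[-other] = True, False
                trail.append(other)
    return trail
```

On the implication chain `[[-1, 2], [-2, 3]]`, assigning 1 propagates 2 and then 3; on `[[-1, 2], [-1, -2]]`, assigning 1 produces a conflict.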
4.1.4 Iterative Implementation of DPLL
Since a compiler converts a recursive procedure into an iterative one using a stack, to avoid the expense of this conversion, in practice the procedure DPLL is implemented not by means of recursion but in an iterative manner, using a stack for the partial interpretation σ. If the head/tail data structure is used, σ can be obtained from val(a), but we still need a stack of literals to record when each literal was assigned. This stack is called the trail, and it can also serve as a partial interpretation. We keep track of the recursive case splits and their implications using the trail. Every time a literal is assigned a value, either by a decision or by an implication inside BCP, the literal is pushed onto the trail and its level in the recursion tree is recorded.
A trail may lead to a dead end, i.e., result in a conflicting clause, in which case we have to explore the alternative branch of one of the case splits previously made. This corresponds to backtracking, which reverts one of the decisions. When we backtrack from a recursive call, we pop those literals off the trail and mark them “unassigned” (denoted by ×). By backtracking enough times, the search eventually either finds a satisfying assignment or exhausts all branches.
To implement the above idea, we use the trail σ which is a stack of literals and serves as a partial interpretation. σ will keep all the assigned literals. For each literal A, we use lvl(A) to record the level at which A is assigned a truth value.
This is actually done at line 5 of BCPw, which now reads as follows:
5 : val(A) := 1; val(¬A) := 0; lvl(A) := lvl(¬A) := level; push(A, σ);
where val, lvl, σ, and level are global variables. From now on, we refer to the modified BCPw simply as BCP.
In practice, it is more convenient to set the level of a right child to be the same as the level of its parent in the recursion tree, to indicate that the right child has no alternatives to consider. The following implementation of DPLL uses this idea.
Algorithm 4.1.8. The non-recursive DPLL(C) takes as input a set C of clauses and returns a model of C if C is satisfiable; it returns ⊥ if C is unsatisfiable. Procedure initialize(C) returns the unit clauses of C and stores the remaining clauses in the head/tail data structure.
proc DPLL(C)
1   U := initialize(C); // initialize the head/tail data structure
2   σ := {}; level := 0; // initialize σ and level
3   while (true) do
4     if (BCP(U) = ⊥)
5       if (level = 0) return ⊥; // C is unsatisfiable
6       level := level − 1; // level of the second child
7       A := undo(); // A is the last literal picked at line 10
8       U := {¬A};
9     else
10      A := pickLiteral(); // pick an unassigned literal
11      if (A = nil) return σ; // all literals assigned, a model is found
12      level := level + 1; // level of the first child
13      U := {A};
proc undo()
// Backtrack to level, unassigning all literals of higher levels.
// Return the last literal of level + 1 in the trail, or nil.
1   A := nil;
2   while (σ ≠ {}) ∧ (lvl(top(σ)) > level) do
3     A := pop(σ);
4     val(A) := val(¬A) := ×; // × means “unassigned”
5   return A;
The procedure undo() will undo all the assignments (made at line 5 of BCP) to the literals of levels higher than level (including their complements), and return the last literal of level + 1 in σ, which must be the last decision literal A picked at line 10 of DPLL.
The procedure pickLiteral() (line 10) is the place for implementing various search heuristics. It leaves open the question of which literal to assign next. As we know from Theorem 4.1.5, this choice has no impact on the completeness of the search algorithm. It has, however, a significant impact on the performance of the solver, since this choice is instrumental in pruning the search space. We will address this issue later.
4.2 Conflict-Driven Clause Learning (CDCL)
This section introduces conflicting clauses as a source of information to prevent the repeated exploration of futile search space. We start with a motivating example.
Example 4.2.1. Suppose C is a set of clauses over the variable set {x1, x2, …, xn} and C has more than a million models. Let C′ = C ∪ S, where S = {(y1 | y2), (y1 | ¬y2), (¬y1 | y2), (¬y1 | ¬y2)}. Then C′ has no models, because the added four clauses are unsatisfiable. Now we feed C′ to DPLL and assume that pickLiteral prefers the xi over y1 and y2. Every time DPLL picks y1 for a case split, a model of C has already been found at that point. If y1 is true, one of (¬y1 | y2) and (¬y1 | ¬y2) will become conflicting. If y1 is false, one of (y1 | y2) and (y1 | ¬y2) will become conflicting. Either value of y1 leads to a conflicting clause in S, so DPLL will backtrack. Since C has more than a million models, DPLL will try to show that S is unsatisfiable more than a million times.
If we can learn from the conflicting clauses of S, we may avoid repeatedly doing the same work. This is the motivation for the technique of conflict-driven clause learning (CDCL): learn new information from conflicting clauses and use it to avoid futile search effort. Using CDCL, the new information is obtained in the form of clauses which are then added back into the search procedure. For the above example, when y1 is set true by a case split, (¬y1 | y2) becomes unit, so y2 must be set to 1. Now (¬y1 | ¬y2) becomes conflicting. A resolution between these two clauses on y2 generates a new clause, (¬y1). The new clause has nothing to do with the assignments of the xi, so we can jump back to level 0 of DPLL; the subsequent BCP will find the empty clause from S ∧ ¬y1 and DPLL returns ⊥, thus avoiding finding models of C a million times.
4.2.1 Generating Clauses from Conflicting Clauses
Recall that in the last DPLL algorithm, when a non-unit clause c becomes unit with a single unassigned literal A, we set reason(A) := c, i.e., c is the reason for the unassigned literal A to be true. When a conflicting clause is found in DPLL, all of its literals are false. We pick the literal A assigned last in the clause and perform a resolution on A between this clause and reason(¬A), hoping the resolvent will be useful for cutting the search space. Since resolution is sound, new clauses generated this way are logical consequences of the input, and it is safe to keep them. For practical reasons, we must perform resolution selectively and keep at most one clause per conflicting clause.
Algorithm 4.2.2. conflictAnalysis(c) takes as input a clause c which is conflicting in the current partial interpretation at level level, generates a new clause by resolution, and returns it. Procedure latestAssignedLiteral(c) returns the literal A of c whose variable was assigned last. Procedure countLevel(c, lvl) returns the number of literals in c which are assigned at level lvl. Procedure resolve(c1, c2) returns the resolvent of c1 and c2.
proc conflictAnalysis(c)
// c is conflicting in the partial interpretation val()
1   α := c;
2   while (|α| > 1) do // while α has more than one literal
3     A := latestAssignedLiteral(α); // A is assigned last in α
4     α := resolve(α, reason(¬A)); // resolution on A
5     if (countLevel(α, level) ≤ 1) return α;
6   return α;
Example 4.2.3. Let the input clauses be C = {c1: (x2 | x3), c2: (¬x1 | x4), c3: (¬x2 | ¬x4), c4: (¬x1 | x2 | ¬x3)}. If we pick x1 as the first decision literal in DPLL, it results in the following assignments:

assignment   reason
x1@1         (x1)
x4@1         c2: (¬x1 | x4)
¬x2@1        c3: (¬x2 | ¬x4)
x3@1         c1: (x2 | x3)

Clause c4 becomes conflicting; it contains three literals of level 1. Calling conflictAnalysis(c4), we have the following resolutions:
1. α0 is c4;
2. α1: (¬x1 | x2) = resolve(α0, c1);
3. α2: (¬x1 | ¬x4) = resolve(α1, c3);
4. α3: (¬x1) = resolve(α2, c2), which is returned.
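Algorithm 4.2.2 and the resolutions above can be sketched as follows. This is an illustrative fragment with assumed data structures: clauses are sets of integer literals (−v for ¬v), `reason` maps an implied literal to its reason clause, `order` gives the assignment order of the variables, and `lvl` their levels.

```python
def resolve(c1, c2, v):
    """Resolvent of clauses c1 and c2 on variable v."""
    return (set(c1) | set(c2)) - {v, -v}

def conflict_analysis(conflict, reason, order, lvl, level):
    """Resolve the conflicting clause against the reason clauses of its
    literals, latest-assigned first, until at most one literal of the
    current decision level remains."""
    alpha = set(conflict)
    while len(alpha) > 1:
        # the literal of alpha whose variable was assigned last
        a = max(alpha, key=lambda l: order[abs(l)])
        alpha = resolve(alpha, reason[-a], abs(a))  # -a is the true literal
        if sum(1 for l in alpha if lvl[abs(l)] == level) <= 1:
            return alpha
    return alpha
```

Replaying Example 4.2.3 (x1 decided, then x4, ¬x2, x3 implied, all at level 1) on the conflicting clause c4 yields the learned clause (¬x1).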
Proposition 4.2.4. Assume c is conflicting in the current interpretation.
• (a) The procedure conflictAnalysis(c) will terminate and return a clause which is entailed by the input clauses.
• (b) conflictAnalysis(c) will return a clause which is conflicting in the current interpretation.
• (c) The new clause returned by conflictAnalysis(c) contains at most one literal of the current level; if we backtrack off the current level, this clause has at most one unassigned literal.
Proof. (a): Only a finite number of literals are assigned a truth value at the current level. The literals we resolve off are chosen in the reverse order of their assignment. Once resolved off, these literals can never come back into the new clause, so there is no chance of a loop. Since the new clause is generated by resolution from the input clauses, it must be entailed by the input clauses.
(b): In fact, every resolvent in conflictAnalysis(c) is a conflicting clause in the current interpretation. Since c is a conflicting clause, every literal of c is false in the current interpretation. c is resolved with a reason clause, which has exactly one true literal in the interpretation, and this true literal is resolved off during the resolution. The resulting resolvent therefore contains only false literals in the current interpretation.
(c): The procedure conflictAnalysis(c) terminates when at most one literal, say B, of the current level remains in the new clause. If we backtrack off the current level, B becomes unassigned and the remaining literals are false by (b). □
4.2.2 DPLL with CDCL
The new DPLL procedure with CDCL (conflict-driven clause learning) can be
described as follows:
Algorithm 4.2.5. DPLL(C) takes as input a set C of clauses and returns a model of C if C is satisfiable; it returns ⊥ if C is unsatisfiable. Procedure initialize(C) returns the unit clauses of C and stores the remaining clauses in the head/tail data structure.
proc DPLL(C)
1   U := initialize(C); // initialize the head/tail data structure
2   σ := {}; level := 0; // initialize σ and level
3   while (true) do
4     res := BCP(U);
5     if (res ≠ “SAT”) // res contains a conflicting clause
6       if (level = 0) return ⊥; // C is unsatisfiable
7       U := insertNewClause(conflictAnalysis(res)); // the new level is set
8       undo(); // undo to the new level
9     else
10      A := pickLiteral(); // pick an unassigned literal
11      if (A = nil) return σ; // all literals assigned, a model is found
12      level := level + 1; // level of the first child
13      U := {A};
proc insertNewClause(α)
1   if (|α| = 1)
2     level := 0;
3     A := literal(α); reason(A) := α; // assume α = (A)
4     return {A};
5   else // insert α into the clause set
6     lits(α) := makeArray(α); // create an array of literals
7     secondHigh := 0; // look for the second highest level in α
8     for x := 0 to |α| − 1 do
9       if (lvl(lits(α)[x]) = level) head(α) := x; H := lits(α)[x]; // lvl(H) = level
10      else if (lvl(lits(α)[x]) > secondHigh) secondHigh := lvl(lits(α)[x]); t := x;
11    tail(α) := t;
12    level := secondHigh;
13    reason(H) := α;
14    return {H}; // the head literal of α
The procedure insertNewClause returns a singleton set of literals which serves as a new unit clause, and resets the level variable of DPLL. If the new clause α generated by conflictAnalysis is a unit clause, this clause is returned and the level is set to 0. If the new clause is not unit, then by Proposition 4.2.4 there is only one literal of the current level in α, and this literal becomes the head literal of α when the level is decreased. We then pick a literal of the highest remaining level, other than the head literal, as the tail literal. The level is reset to that of the tail literal. When we backtrack to this level, by Proposition 4.2.4, α becomes unit and is the reason for the head literal, unassigned at this point, to be assigned true.
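The choice of the backjump level can be sketched as a small helper (illustrative only; `lvl` maps variables to their levels, and the function name is ours):

```python
def backjump_level(alpha, lvl, current):
    """Level to backjump to: the highest level among the literals of the
    learned clause alpha other than the current level, or 0 if alpha is
    unit (no such literal exists)."""
    others = [lvl[abs(l)] for l in alpha if lvl[abs(l)] != current]
    return max(others, default=0)
```

For instance, for a learned clause (¬x2 | ¬x4) with lvl(x2) = 4 and lvl(x4) = 3, the conflict occurred at level 4 and the backjump level is 3; for a unit learned clause the level becomes 0.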
Example 4.2.6. Consider the clause set C = {c1 = (x1 | x2), c2 = (x1 | x3 | x7), c3 =(x2 |x3 |x4), c4 =(x4 |x5 |x6), c5 =(x3 |x4), c6 =(x5 |x6), c7 = (x2 |x3), c8 =(x1 |x3), c9 =(x1 |x2 |x3)}. Ifthefirstthreeliteralschosenby pickLiteral in DPLL are x7, x6, and x5, in that order, then the only implied literal is x4@3 by c4. The next decision literal is x1, which implies the following assignments: x2@4 by c1 and x3@4 by c3. c7 becomes conflicting at level 4.
Calling conflictAnalysis(c7), we obtain (x2 | x4) by resolution between c7 and
c3, which contains x2 as the only literal at level 4. We add (x2 | x4) into DPLL as
c10 .
The level of DPLL is set to 3 as lvl(x4) = 3. So we backtrack to level 3, and
c10 = (x2 | x4) becomes unit. The next call to BCP will make the following assignments at level 3: x2@3 by c10, x1@3 by c1, x3@3 by c8, and c9 becomes conflicting.
Calling conflictAnalysis(c9), we obtain c11 = (x2) by two resolutions among c8, c9, and c1. The new clause is unit and we set level = 0. Back jumping to level 0 by undo, the next call to BCP will imply the following assignments: x2@0 by c11, x3@0 by c7, x4@0 by c10, and c5 becomes conflicting.
Calling conflictAnalysis(c5), we obtain c12 = (x2) by two resolutions among c5, c10, and c7. Clauses c11 and c12 generate the empty clause. Thus, the input clauses are unsatisfiable.
Clause learning with conflict analysis does not impair the completeness of the search algorithm: even if the learned clauses are dropped at a later point during the search, the trail guarantees that the solver never repeatedly enters a decision level with the same partial assignment. We have shown the correctness of clause learning by demonstrating that each conflicting clause is implied by the original formula.
The idea of CDCL was first used in the solver grasp of Marques-Silva and Sakallah, and the solver relsat of Bayardo and Schrag. In these solvers, they independently introduced a novel mechanism to analyze the conflicts encountered during the search for a satisfying assignment. There are many ways to generate new clauses from conflicting clauses and most of them are based on implication graphs. The method of learning new clauses through resolution is the easiest one to present.
CDCL brings a revolutionary change to DPLL: DPLL is no longer a simple backtrack search procedure, because the level reset by insertNewClause may not be level − 1, and DPLL will backjump to a higher level, thus avoiding unnecessary search space. If the learned clause contains a single literal, the new level is set to 0. DPLL with CDCL can learn from failure and backjump to the level where the failure originated. Several new techniques have been proposed based on CDCL, including generating a resolution proof when the input
clauses are unsatisfiable, or randomly restarting the search.
4.2.3 Unsatisfiable Cores
The existence of a resolution proof for a set of unsatisfiable clauses is guaranteed by the completeness of resolution, but finding such a proof is a difficult job. It has been shown that CDCL as practiced in today’s SAT solvers corresponds to a proof system exponentially more powerful than tree-like resolution. Specifically, each learning step is in fact a sequence of resolution steps, of which the learned clause is the final resolvent; conversely, a resolution proof can be obtained by regarding the learning procedure as a guided resolution procedure. In Example 4.2.6, we have seen how the empty clause is generated. The resolution proof leading to the empty clause can be displayed as follows:
c1   (x1 | x2)          assumed
c3   (x2 | x3 | x4)     assumed
c5   (x3 | x4)          assumed
c7   (x2 | x3)          assumed
c8   (x1 | x3)          assumed
c9   (x1 | x2 | x3)     assumed
c10  (x2 | x4)          resolvent of c7, c3
c11  (x1 | x2)          resolvent of c8, c9
c12  (x2)               resolvent of c11, c1
c13  (x2 | x3)          resolvent of c5, c10
c14  (x2)               resolvent of c13, c7
c15  ()                 resolvent of c12, c14
This example shows the general idea of obtaining a resolution proof when DPLL finds a set of clauses unsatisfiable. During the search inside DPLL, conflicting clauses are generated due to decision literals. The procedure conflictAnalysis produces a new clause to show why the previous decisions were wrong and we go up in the decision tree according to this new clause. DPLL returns ⊥ when it finally finds a conflicting clause at level 0. In this case, a resolution proof is generated by conflictAnalysis and this proof uses either the input clauses or the clauses generated previously by conflictAnalysis.
Given an unsatisfiable set C of clauses, we can use all the resolution steps inside the procedure conflictAnalysis to construct a resolution proof. Obviously, such a proof provides evidence for the unsatisfiability of C. The clauses used as the premises of the proof are a subset of the clauses of C and are called the unsatisfiable core of the proof. Note that a formula typically does not have a unique unsatisfiable core: any unsatisfiable subset of C is an unsatisfiable core. Resolution proofs and unsatisfiable cores have applications in hardware verification.
An unsatisfiable core is minimal if removing any clause from the core makes the remaining clauses satisfiable. We may use this definition to check whether every clause in the core is necessary, which is obviously a difficult job when the core is huge. The core of the above example contains every input clause except c2, c4, and c6; it is a minimal core.
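The deletion-based check behind this definition can be sketched as follows; here is_unsat stands in for a call to a SAT solver on a clause set (a hypothetical oracle, not part of the text's pseudocode).

```python
def minimize_core(core, is_unsat):
    """Repeatedly try to drop a clause; if the remaining clauses are
    still unsatisfiable, the dropped clause was unnecessary."""
    core = list(core)
    i = 0
    while i < len(core):
        trial = core[:i] + core[i + 1:]
        if is_unsat(trial):
            core = trial      # clause i was redundant; keep it removed
        else:
            i += 1            # clause i is necessary; move on
    return core
```

Every surviving clause witnesses its own necessity, so the result is a minimal (not necessarily minimum) core; each clause costs at least one oracle call, which is why minimization is expensive on huge cores.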
4.2.4 Random Restart
Suppose we are looking for a model of C by DPLL(C) and the first decision literal is A. Unfortunately, C ∧ A is a hard unsatisfiable instance and DPLL is stuck there without success. Randomly restarting DPLL may be your only option. By “restart”, we mean that DPLL throws away all the previous decision literals (this can be easily done by level := 0; undo() in DPLL). By “random”, we mean that when you restart the search, the first decision literal will most likely be a literal different from A. Intuitively, random restarting means there is a chance of avoiding bad luck and getting luckier with guessing the right literal assignments that would lead to a quick solution.
Without CDCL, random restart makes DPLL incomplete. With CDCL, the input and generated clauses keep DPLL from choosing the same sequence of decision literals, and DPLL cannot generate the same clause again. Since the number of possible clauses is finite, DPLL cannot run forever.
The same intuition suggests that random restart should be much more effective when the problem instance is in fact satisfiable than when it is not. Experimental results showed that random restart also helps when the input clauses are unsatisfiable. If restarts are sufficiently frequent (a deviation from standard practice), CDCL can be as powerful as general resolution, while DPLL without restart has been known to correspond to the weaker tree-like resolution. The theoretical justification for the speedup is that a restart allows the search to benefit from the knowledge gained about persistently troublesome conflicting variables sooner than backjumping would otherwise allow the partial assignment to be similarly reset. In effect, restarts may allow the discovery of shorter proofs of unsatisfiability.
To implement the restart strategy, when DPLL has found a certain number of failures (empty clauses) without success (a model), it is time for a restart. Today’s SAT solvers often use the following restart policy: let ri = r0γ^(i−1) be the number of failures for the ith restart, where r0 is a given integer and 1 ≤ γ < 2. If r0 = 300 and γ = 1.2, the first restart happens when DPLL has 300 failures, the second restart happens after another 360 failures, and so on. If γ = 1, DPLL restarts after a fixed number of failures.
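The geometric policy ri = r0γ^(i−1) can be tabulated directly; a small sketch:

```python
def restart_schedule(r0, gamma, k):
    """Number of failures allotted before each of the first k restarts,
    following r_i = r0 * gamma**(i-1), rounded to whole failures."""
    return [round(r0 * gamma ** (i - 1)) for i in range(1, k + 1)]
```

For example, restart_schedule(300, 1.2, 3) yields the counts 300, 360, 432 of the example above; with gamma = 1 the schedule is constant.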
Given the common use of restarts in today’s clause learning SAT solvers, the task of choosing a good restart policy appears appealing. While experimental results showed that no single restart policy is best for a wide range of problems, a clause learning SAT solver can benefit substantially from a carefully designed restart policy. Specifically, experiments show that nontrivial restart policies did significantly better than disabling restarts, and exhibited considerably different performance among themselves. This provides motivation for the design of better restart policies, particularly dynamic ones based on problem types and search statistics.
4.2.5 Branching Heuristics for DPLL
Whether CDCL is used or not, every implementation of DPLL needs a function for selecting literals for case-split. We assume that when a literal is chosen for case-split, the literal will be assigned true in the first branch of DPLL. Prior to the development of the CDCL techniques, branching heuristics were the primary method used to reduce the size of the search space. It seems likely, therefore, that the role of branching heuristics will change significantly for algorithms that prune the search space using clause learning.
It is conventional wisdom that it is advantageous to assign the most tightly constrained variables, i.e., variables that occur in a large number of clauses. One representative of such a selection strategy is known as the MOMS rule, which branches on a literal that has the Maximum Occurrences in clauses of Minimum Size. If the clauses contain any binary clauses, then the MOMS rule will choose a literal in a binary clause. By assigning true to this literal, it is likely that a maximal number of binary clauses will become satisfied. On the other hand, if the primary criterion for the selection of a branch literal is to pick one that would enable a cascade of unit propagation (the result of such a cascade is a smaller subproblem), we would assign false to the literal chosen by the MOMS rule: in the presence of binary clauses, assigning this literal false will likely create a maximal number of unit clauses from the binary clauses. MOMS provides a rough but easily computed approximation to the number of unit propagations that a particular variable assignment might cause.
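A minimal sketch of the MOMS rule, assuming clauses are given as tuples of DIMACS-style nonzero integers and ties are broken arbitrarily:

```python
from collections import Counter

def moms_literal(clauses):
    """Pick the literal with the Maximum number of Occurrences
    in clauses of Minimum Size."""
    min_size = min(len(c) for c in clauses)
    counts = Counter(l for c in clauses if len(c) == min_size for l in c)
    return max(counts, key=counts.get)
```

On {(1, 2), (1, 3), (1, 2, 3)}, the minimum clause size is 2 and literal 1 occurs in both binary clauses, so it is chosen.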
Alternatively, one can call BCP multiple times on a set of promising variables in turn, and compute the exact number of unit clauses that would be caused by a branching choice. Each chosen variable is separately fixed to true and to false and BCP is executed for each choice. The precise number of unit propagations caused is then used to evaluate possible branching choices. Unlike the MOMS heuristic, this rule is exact in its attempt to judge the number of unit clauses caused by a potential variable assignment. Unfortunately, it is also considerably more expensive
to compute because of the multiple executions of BCP and undoing them. It has been shown that using MOMS to choose a small number of promising candidates, each of which is then evaluated exactly using BCP, outperforms other heuristics on randomly generated problems.
Another strategy is to branch on variables that are likely to be backbone literals. A backbone literal is one that must be true in all models of the input clauses. The likelihood that any particular literal is a backbone literal is approximated by counting the appearances of that literal in the satisfied clauses during the execution of DPLL. This heuristic outperforms those discussed in the previous paragraphs on many examples.
The development of CDCL techniques enabled solvers to attack more structured, realistic problems. There are no formal studies comparing the previously discussed heuristics on structured problems when CDCL techniques are used. CDCL techniques create many new clauses and make the occurrence counts of each literal difficult to maintain or less significant. Branching techniques and learning are deeply related, and the addition of learning to a DPLL implementation will have a significant effect on the effectiveness of any of these branching strategies. As new clauses are learned, the number of unit clauses an assignment will cause can be expected to vary; the reverse is also true in that the choice of branch literal can affect which clauses the algorithm learns.
Branching heuristics that are designed to function well in the context of clause learning generally try to branch on literals among the new clauses which have been learned recently. This tends to allow the execution of DPLL to keep “making progress” on a single section of the search space as opposed to moving from one area to another; an additional benefit is that existing learned clauses tend to remain relevant, avoiding the inefficiencies associated with losing the information present in learned clauses that become irrelevant and are discarded.
One of the popular branching heuristics for DPLL with clause learning is the “dynamic largest individual sum” (DLIS) heuristic, which behaves similarly to the MOMS rule. At each decision point, it chooses the assignment that satisfies the most unsatisfied clauses. Formally, let px be the number of unresolved clauses containing x and nx be the number of unresolved clauses containing ¬x. Moreover, let x be the variable for which px is maximal, and let y be the variable for which ny is maximal. If px > ny, choose x as the next branching literal; otherwise, choose ¬y. The disadvantage of this strategy is that the computational overhead is high: the algorithm needs to visit all clauses that contain a literal that has been set to true in order to update the values px and nx for all variables contained in these clauses. Moreover, the process needs to be reversed upon backtracking.
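A DLIS-style choice can be sketched as follows, assuming clauses are tuples of nonzero integers and assigned is the set of literals currently true; picking the most frequent unassigned literal among unresolved clauses is the same as picking the assignment that satisfies the most unsatisfied clauses:

```python
from collections import Counter

def dlis_literal(clauses, assigned):
    """Branch on the unassigned literal occurring most often
    in clauses not yet satisfied by `assigned`."""
    counts = Counter()
    for c in clauses:
        if any(l in assigned for l in c):
            continue                      # clause already satisfied
        for l in c:
            if l not in assigned and -l not in assigned:
                counts[l] += 1
    return max(counts, key=counts.get) if counts else None
```

The sketch recomputes the counts from scratch; a real implementation maintains px and nx incrementally, which is exactly the overhead discussed above.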
A heuristic commonly used in contemporary SAT solvers favors literals in recently added conflict clauses. Each literal is associated with a count, which is initialized with the number of times the literal occurs in the clause set. When a learned clause is added, the count associated with each literal in the clause is incremented. Periodically, all counters are divided by a constant greater than 1, resulting in a decay that biases branching towards variables that appear in recently learned clauses. At each decision point, the solver then chooses the unassigned literal with the highest count (where ties are broken randomly by default). This approach, known as the variable state independent decaying sum (VSIDS) heuristic, was first implemented in zchaff. Zchaff maintains a list of unassigned literals sorted by count. This list is only updated when learned clauses are added, resulting in a very low overhead. Decisions can be made in constant time. The heuristic used in berkmin builds on this idea but responds more dynamically to recently learned clauses. The berkmin heuristic prefers to branch on literals that are unassigned in the most recently learned clause that is not yet satisfied.
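The counting-and-decay scheme can be sketched as a small class; the names and the decay constant are illustrative, not zchaff's actual data structures:

```python
class VSIDS:
    """Bump the count of every literal in a learned clause, decay all
    counts periodically, and branch on the highest-count literal."""
    def __init__(self, literals, decay=2.0):
        self.act = {l: 0.0 for l in literals}  # floating point, as in minisat
        self.decay = decay
    def on_learned_clause(self, clause):
        for l in clause:
            self.act[l] += 1.0
    def rescale(self):                         # called periodically
        for l in self.act:
            self.act[l] /= self.decay
    def pick(self, unassigned):
        return max(unassigned, key=lambda l: self.act[l])
```

After a rescale, older bumps weigh half as much as fresh ones, so a literal bumped after the decay wins over one bumped only before it.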
The emphasis on variables that are involved in recent conflicting clauses leads to a locality-based search, effectively focusing on a subspace. The subspaces induced by this decision strategy tend to coalesce, resulting in more opportunities for resolution of conflicting clauses, since most of the variables are common. Representing counts using integer variables leads to a large number of ties. The solver minisat avoids this problem by using a floating point number to represent the weight. Another possible (but significantly more complex) strategy is to concentrate only on unresolved conflicting clauses by maintaining a stack of conflicting clauses.
When restart is supported, the branching heuristic should be a combination of random selection and other heuristics, so that DPLL will select a different branch literal after each restart. All told, there are many competing branching heuristics for satisfiability solvers, and there is still much to be done in evaluating their relative effectiveness with clause learning on realistic, structured problems.
4.3 Use of SAT Solvers
There exist excellent implementations of DPLL, and you can find these SAT solvers at the International SAT competition web page (satcompetition.org). Most of them are available free of charge, such as minisat, glucose, lingeling, and maple LCM, to name a few. These SAT solvers are also called general-purpose model generators, because they accept CNF as input and many problems can be specified in CNF, as SAT is NP-complete.
If you have a hard search problem to solve, either solving a puzzle or finding
a solution under certain constraints, you may either write a special-purpose program or describe your problem in CNF and use one of these general-purpose model generators. While both special-purpose tools and general-purpose model generators rely on exhaustive search, there are fundamental differences. For the latter, every problem has a uniform internal representation (i.e., clauses for SAT solvers). This uniform representation may introduce redundancies and inefficiencies. However, since a single search engine is used for all the problems, any improvement to the search engine is significant. Moreover, we have three decades of accumulated knowledge on how to make such a uniform search engine efficient. Using general-purpose model generators to solve hard combinatorial problems deserves attention for at least two reasons.
• It is much easier to specify a problem in CNF than to write a special program to solve it. Similarly, fine-tuning a specification is much easier than fine-tuning a special program.
• General-purpose model generators can provide competitive and complementary results for certain combinatorial problems. Several examples will be provided in the following.
4.3.1 Specify SAT Instances in DIMACS Format
Today’s SAT solvers use the format suggested by DIMACS (the Center for Discrete Mathematics and Theoretical Computer Science) many years ago for CNF formulas. The DIMACS format uses a text file: if the first character of a line is “c”, the line is a comment. The CNF starts with a line “p cnf n m”, where n is the number of propositional variables and m is the number of clauses. Most SAT solvers only require m to be an upper bound for the number of clauses. Each literal is represented by an integer: positive integers are positive literals and negative integers are negative literals. A clause is a sequence of integers ending with 0.
Example 4.3.1. To represent C = {(x1 | x2 | x3), (x1 | x2), (x2 | x3), (x2 | x3)} in the DIMACS format, the text file will look as follows:
c a tiny example
c
p cnf 3 4
1 2 -3 0
-1 2 0
-2 -3 0
2 3 0

Figure 4.3.1: A typical example of Sudoku puzzle and its solution.
To use today’s SAT solvers, you need to prepare your SAT instances in the DIMACS format. This is done typically by writing a small program called an encoder. Once a model is found by a SAT solver, you may need another program, called a decoder, which translates the model into the desired format. For example, the encoder for a Sudoku puzzle will generate a formula A in CNF from the puzzle and the decoder will take the model of A and generate the solution of the Sudoku.
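A toy encoder helper for the DIMACS format described above might look like this (an illustrative sketch; clauses are lists of nonzero integers):

```python
def dimacs_cnf(clauses, num_vars, comment="a tiny example"):
    """Return the DIMACS CNF text for the given clauses."""
    lines = [f"c {comment}", f"p cnf {num_vars} {len(clauses)}"]
    for c in clauses:
        lines.append(" ".join(map(str, c)) + " 0")
    return "\n".join(lines) + "\n"
```

Applied to the four clauses of Example 4.3.1, dimacs_cnf([[1, 2, -3], [-1, 2], [-2, -3], [2, 3]], 3) reproduces the text file shown there.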
4.3.2 Sudoku Puzzle
The standard Sudoku puzzle is to fill a 9 × 9 board with the digits 1 through 9, such that each row, each column, and each of the nine 3 × 3 blocks contain all the digits from 1 to 9. A typical example of a Sudoku puzzle and its solution is shown in Figure 4.3.1.
To solve this problem by a SAT solver, we need to specify the puzzle in CNF. At first, we need to define the propositional variables. Let pi,j,k be the propositional variable such that pi,j,k = 1 iff digit k is at row i and column j of the Sudoku board. Since 1 ≤ i, j, k ≤ 9, there are 9³ = 729 variables.
We may encode pi,j,k as 81(i − 1) + 9(j − 1) + k; then p1,1,1 = 1 and p9,9,9 = 729. Another encoding is pi,j,k = 100i + 10j + k; then p1,1,1 = 111 and p9,9,9 = 999. The latter wastes some variables, but it allows us to see the values of i, j, and k from the integer. We will use the latter: since this is a very easy SAT problem, a tiny waste does not hurt. The first lines of the CNF file may read as
c Sudoku puzzle
p cnf 999 1000000
where 1000000 is an estimate for the number of clauses from the constraints on the puzzle.
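The encoding and its inverse (used by the decoder) are one-liners; a sketch:

```python
def var(i, j, k):
    """p(i,j,k) encoded as the readable integer 100i + 10j + k."""
    return 100 * i + 10 * j + k

def unvar(v):
    """Decoder side: recover (i, j, k) from the variable number."""
    return v // 100, (v // 10) % 10, v % 10
```

For example, var(9, 9, 9) is 999, and unvar(347) recovers row 3, column 4, digit 7.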
The general constraints of the puzzle can be stated as follows.
• Each cell contains exactly one digit of any value.
• Each row contains every digit exactly once, i.e., there are no duplicate copies of a digit in a row.
• Each column contains every digit exactly once, i.e., there are no duplicate copies of a digit in a column.
• Each 3 × 3 grid contains every digit exactly once, i.e., there are no duplicate copies of a digit in the grid.
A solution to a given board will say which digit should be placed in which cell. The first constraint above can be divided into two parts: (1) each cell contains at least one digit; and (2) each cell contains at most one digit. If every cell in a row contributes one digit and each digit appears at most once in the row, then every digit appears exactly once in this row. The same reasoning applies to columns and grids, so the last three constraints can be simplified.
1. Each cell contains at least one digit.
2. Each cell contains at most one digit.
3. Each row contains every digit at most once.
4. Each column contains every digit at most once.
5. Each grid contains every digit at most once.
Now it is easy to convert the above constraints into clauses:
1. For 1 ≤ i,j ≤ 9,(pi,j,1 | pi,j,2 | ··· | pi,j,9).
2. For 1 ≤ i, j ≤ 9, 1 ≤ k < l ≤ 9, (¬pi,j,k | ¬pi,j,l).
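The first two constraints translate directly into clause-generating loops; a sketch using the 100i + 10j + k variable encoding (negative integers denote negated variables, as in the DIMACS format):

```python
def sudoku_cell_clauses():
    """Clauses for "each cell contains at least one digit" and
    "each cell contains at most one digit"."""
    p = lambda i, j, k: 100 * i + 10 * j + k
    clauses = []
    for i in range(1, 10):
        for j in range(1, 10):
            # at least one digit in cell (i, j)
            clauses.append([p(i, j, k) for k in range(1, 10)])
            # at most one digit: no two digits k < l both present
            for k in range(1, 10):
                for l in range(k + 1, 10):
                    clauses.append([-p(i, j, k), -p(i, j, l)])
    return clauses
```

This yields 81 "at least" clauses and 81 · 36 "at most" clauses; the remaining constraints are generated analogously over rows, columns, and grids.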
It is easy to check that all of the above simplification rules are sound for weighted clauses, as they preserve optimal solutions.
To efficiently implement simplifySoft(F, σ), F is simplified first by σ, as σ represents a set of hard unit clauses. Every clause true under σ is removed from F and every literal false under σ is removed from the clauses in F. After this step, F and σ do not share any variable. The second step is to apply the above simplification rules. Some rules, like zero elimination and tautology elimination, are applied eagerly. Some rules, like weighted resolution, are applied selectively.
To implement lowerBound(F, soln), we first assume that F is simplified and count the total weight of all false clauses. If the sum exceeds the cost of the current solution, we exit.
For the non-false clauses of F, there exist various techniques to estimate the lower bound. One technique is to formulate F as an instance of Integer Linear Programming (ILP). ILP solvers are common optimization tools in operations research. ILP solvers solve problems with linear constraints and a linear objective function where
some variables are integers. State-of-the-art ILP solvers are very powerful and effective and can attack MaxSAT directly. In practice, ILP solvers are effective on many standard optimization problems, e.g., vertex cover, but for problems with many Boolean constraints ILP is not as effective.
To obtain a lower bound from ILP, the constraints on integer solutions are relaxed so that the ILP becomes an instance of Linear Programming (LP), which is known to have polynomial-time solutions.
If we view ⊤ as 1 and ⊥ as 0, it is well known that a clause can be converted into a linear inequality; for example, (¬x1 | x2 | ¬x3) becomes (1 − x1) + x2 + (1 − x3) ≥ 1. The clauses are satisfiable iff the set of inequalities has a 0/1 solution for all variables.
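The conversion can be sketched as follows, with clauses given as lists of nonzero integers; a negative literal −x contributes (1 − x), which shifts the right-hand side:

```python
def clause_to_inequality(clause):
    """Return (coef, rhs) such that a 0/1 assignment satisfies the
    clause iff sum(coef[v] * x_v) >= rhs."""
    coef, rhs = {}, 1
    for lit in clause:
        v = abs(lit)
        if lit > 0:
            coef[v] = coef.get(v, 0) + 1
        else:
            coef[v] = coef.get(v, 0) - 1
            rhs -= 1          # (1 - x) moves a constant 1 to the left side
    return coef, rhs
```

For the clause whose translation is (1 − x1) + x2 + (1 − x3) ≥ 1, the result is x2 − x1 − x3 ≥ −1, the same inequality rearranged.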
For any clause c, let P(c) and N(c) be the sets of variables occurring in positive literals and negative literals of c, respectively. Let F = {(ci; wi) : 1 ≤ i ≤ m} be a set of weighted clauses over n variables, say {x1, x2, …, xn}. F can be formulated as the following instance of ILP:
minimize ∑_{1≤i≤m} wi(1 − yi)   // minimize the weights of falsified clauses
subject to ∑_{x∈P(ci)} x + ∑_{x∈N(ci)} (1 − x) ≥ yi, 1 ≤ i ≤ m,   // ci is true if yi = 1
yi ∈ {0, 1}, 1 ≤ i ≤ m,   // every clause is falsified or satisfied
xj ∈ {0, 1}, 1 ≤ j ≤ n,   // every variable is false or true
where yi is a new variable for each clause ci in F.
Intuitively, the more yi’s with yi = 1, the smaller the value of ∑_{1≤i≤m} wi(1 − yi). And yi = 1 implies that ∑_{x∈P(ci)} x + ∑_{x∈N(ci)} (1 − x) ≥ 1, i.e., ci is true. The above ILP instance can be relaxed to an instance L of LP:
minimize ∑_{1≤i≤m} wi(1 − yi)
subject to ∑_{x∈P(ci)} x + ∑_{x∈N(ci)} (1 − x) ≥ yi, 1 ≤ i ≤ m,
0 ≤ yi ≤ 1, 1 ≤ i ≤ m, and
0 ≤ xj ≤ 1, 1 ≤ j ≤ n.
If L has a solution, then ∑_{1≤i≤m} wi(1 − yi) is a lower bound for the solutions of F. The LP solution can also be used to obtain an approximate solution for F when we round up/down the values of xi from reals to integers.
Other techniques for lower bounds look for inconsistencies that force some soft clause to be falsified. Some researchers suggested learning falsified soft clauses; others use the concepts of clones, mini-buckets, or width-restricted BDDs in relaxation. For instance, we may treat the soft clauses as if they were hard and then run BCP to locate falsified clauses. This can help us find unsatisfiable subsets of clauses quickly and use them to estimate the total weight of falsified clauses. These techniques can be effective on small combinatorial problems. Once the number of variables in F gets to 1,000 or more, these lower bound techniques become weak or too expensive. Strategies for turning these techniques on and off in HyMaxSATBB should be considered.
Besides the algorithms based on branch-and-bound, some MaxSAT solvers convert a MaxSAT instance into a sequence of SAT instances, where each instance encodes a decision problem of the form, for different values of k: “is there an interpretation that falsifies soft clauses of total weight at most k?”
For example, if we start with a very small value of k, the SAT instance will be unsatisfiable. By increasing k gradually, the first k for which the SAT instance becomes satisfiable gives the value of an optimal solution. Of course, this linear strategy on k does not necessarily provide an effective MaxSAT solver.
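The linear search can be sketched end to end; here a brute-force check over all assignments stands in for the SAT oracle (a real solver would encode each question “weight of falsified soft clauses ≤ k” as a SAT instance):

```python
from itertools import product

def maxsat_linear(hard, soft, n):
    """Smallest k such that some assignment over variables 1..n satisfies
    all hard clauses and falsifies soft clauses of total weight <= k.
    `soft` is a list of (clause, weight) pairs; returns None if the hard
    clauses are unsatisfiable."""
    def true_in(clause, asg):
        return any(asg[abs(l)] == (l > 0) for l in clause)
    def oracle(k):                       # stand-in for one SAT call
        for bits in product([False, True], repeat=n):
            asg = {v: bits[v - 1] for v in range(1, n + 1)}
            if all(true_in(c, asg) for c in hard) and \
               sum(w for c, w in soft if not true_in(c, asg)) <= k:
                return True
        return False
    total = sum(w for _, w in soft)
    for k in range(total + 1):
        if oracle(k):
            return k                     # weight of an optimal solution
    return None
```

For instance, with the hard clause (x1) and soft clauses (¬x1) of weight 1 and (x2) of weight 2, the search stops at k = 1.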
There is much ongoing research in this direction. The focus of this approach is mainly on two things:
1. Develop economic ways to encode the decision problem at each stage of the sequence.
2. Exploit information obtained from the SAT solver at each stage in the next stage.
Some MaxSAT solvers use innovative techniques to obtain more efficient ways to encode and solve the individual SAT decision problems. Some solvers use unsatisfiable cores to improve SAT solving efficiency. These solvers appear promising as they are effective on some large MaxSAT problems, especially those with many hard clauses.
4.5.4 Use of Hybrid MaxSAT Solvers
Most real-world problems involve an optimization component and there is high demand for automated approaches to finding good solutions to computationally hard optimization problems. We have seen how to encode the clique problem easily and compactly into a MaxSAT instance. MaxSAT allows for compactly encoding various types of high-level finite-domain soft constraints due to the Cook–Levin Theorem: any NP constraint can be polynomially represented as a weighted SAT instance.
There exists an extension of the DIMACS format for WCNF. To specify an instance of hybrid MaxSAT, the line starting with p contains the keyword wcnf, followed by the number of variables, the number of clauses, and the label for hard clauses. That is,
p wcnf <# of variables> <# of clauses> <label for hard clauses>
The hard clause label is an integer larger than any weight of the soft clauses. Each clause starts with its weight, followed by its literals, and ends with 0. For the small clique example of Example 4.5.4, the input file to a MaxSAT solver will look like:
c small clique example
c xa = 1, xb = 2, xc = 3, xd = 4, xe = 5
p wcnf 5 8 9
9 -1 -5 0
9 -2 -3 0
9 -3 -5 0
1 1 0
1 2 0
1 3 0
1 4 0
1 5 0
A clause is hard if its weight equals the label for hard clauses.
The standard format for MaxSAT instances makes the annual evaluation of MaxSAT solvers possible. The latest website for the MaxSAT evaluation can be found on the internet; the site for the year 2020 is given below:
https://maxsat-evaluations.github.io/2020/
Researchers on MaxSAT can assess the state of the art in the field of MaxSAT solvers, create a collection of publicly available MaxSAT benchmark instances, and use tens of solvers from various research groups.
These MaxSAT solvers are built on the successful techniques for SAT, evolve constantly with practical solver technology, and offer an alternative to traditional approaches, e.g., integer programming. MaxSAT solvers use propositional logic as the underlying declarative language and are especially suited for inherently “very Boolean” optimization problems. The performance of MaxSAT solvers has surpassed that of specialized algorithms on several problem domains:
• correlation clustering
• probabilistic inference
• maximum quartet consistency
• software package management
• fault localization
• reasoning over bionetworks
• optimal covering arrays
• treewidth computation
• Bayesian network structure learning
• causal discovery
• cutting planes for IPs
• argumentation dynamics
The advances of contemporary SAT and MaxSAT solvers have transformed the way we think about NP-complete problems. They have shown that, while these problems are still unmanageable in the worst case, many instances can be successfully tackled.
4.6 Exercise Problems
1. Prove that all derived formulas are tautologies if all the axioms are tautologies and all the inference rules are sound.
2. What are the inference rules for the operators ⊕ and ↔ in a natural deductive system?
3. What are the logical rules for the operator ↔ in Gentzen’s natural deductive system?
4. Estimate the upper bound of the size of the inference graph for resolution, if the input clauses contain n variables.
5. Provide the pseudo-code for implementing BCPw based on BCPht such that the space for head(c) and tail(c) is freed by assuming head(c) = 0 and tail(c) = |c| − 1 for any non-unit clause c, and no additional space can be used for c. In your code, you may call swap(lits(c), i, j), which swaps the two elements at indices i and j in the array lits(c).
6. Given a set H of m + 1 Horn clauses over m variables:
H = {c1 = (x1 | x2 | ··· | xm−1 | xm), c2 = (x1 | x2), c3 = (x2 | x3), …, cm = (xm−1 | xm), cm+1 = (x1)}
Show that the size of H is O(m), and the algorithm BCPw in the previous problem will take Ω(m²) time on H, if all the clauses in cls(A) must be processed before considering cls(B), where ¬A is assigned true before ¬B.
7. We apply DPLL (without CDCL) to the following set S of clauses: S = {1. (a | c | d | f), 2. (a | d | e), 3. (a | c), 4. (b | d | e), 5. (b | d | f), 6. (b | f), 7. (c | e), 8. (c | e), 9. (c | d)}, assuming that pickLiteral picks a literal in the following order a, b, c, d, e, f (false is tried first for each variable). Please (a) draw the decision tree of DPLL; (b) provide the unit clauses derived by BCP at each node of the decision tree; and (c) identify the head/tail literals of each clause at the end of DPLL (assuming initially the first and last literals are the head/tail literals).
8. We apply GSAT to the set S of clauses in the previous problem. Assume that the initial node (i.e., interpretation) is σ0 = {a, b, c, d, e, f}, and the strategy for picking a neighbor to replace the current node is (a) the best among all the neighbors; (b) the first neighbor which is better than the current node (flipping variables in the order of a, b, …, f). Sideways moves are allowed if no neighbors are better than the current node. If no neighbors are better than the current node, a local optimum is found and you need to start with the node σ0 = {a, b, c, d, e, f}.
9. We apply DPLL with CDCL to the clause set C: {c1 : (x1 | x2 | x5), c2 : (x1 | x3 | x4 | x5), c3 : (x1 | x2 | x3 | x4), c4 : (x3 | x4 | x5), c5 : (x2 | x3 | x5), c6 : (x3 | x4 | x5), c7 : (x3 | x4 | x5)}. If pickLiteral picks a literal for case-split
in the following order x5, x4, …, x1 (xi = 0 is tried first), please answer the following questions:
(a) For each conflicting clause found in DPLL(C), what new clause will be generated by conflictAnalysis?
(b) What will be the head and tail literals if the new clause generated in (a) is not unit? And what will be the value of level at the end of the call to insertNewClause?
(c) Draw the DPLL decision tree until either a model is found or DPLL(C) returns false.
10. The pigeon hole problem is to place n + 1 pigeons into n holes such that every pigeon must be in a hole and no hole can hold more than one pigeon. Please write an encoder in any programming language that reads in an integer n and generates the CNF formula for the pigeon hole problem with n + 1 pigeons and n holes in the DIMACS format. Store the output of your encoder in a file named pigeonN.cnf and feed pigeonN.cnf to a SAT solver which accepts CNF in the DIMACS format. Turn in the encoder, the result of the SAT solver for n = 4, 6, 8, 10, and the CNF file for n = 4 (i.e., pigeon4.cnf), plus a summary of the outputs of the SAT solver on the sizes of the input, the computing times, and the numbers of conflicts. What is the relation between the size of the input and the computing time?
11. The n-queen problem is to place n queens on an n × n board such that no two queens are in the same row, the same column, or the same diagonal. Please write a small program called encoder in any programming language that reads in an integer n and generates the CNF formula for the n-queen problem in the DIMACS format. Store the output of your encoder in a file named queenN.cnf and feed queenN.cnf to a SAT solver which accepts CNF in the DIMACS format. Write another small program called decoder which reads the output of the SAT solver and displays a solution of the n-queen problem on your computer screen. Turn in the encoder and decoder, the results of the SAT solver for n = 5, 10, 15, the CNF file for n = 5 (i.e., queen5.cnf), and the output of your decoder for n = 10.
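One possible encoding can be sketched as below; it is an illustration under assumptions (variable q(i, j) = (i − 1)·n + j for "a queen on row i, column j", and the pairwise at-most-one scheme), not the only correct encoding.

```python
# n-queens CNF encoder sketch.
# Variable q(i,j) = (i-1)*n + j means "a queen stands on row i, column j".
def queens_cnf(n):
    def var(i, j):
        return (i - 1) * n + j
    clauses = []
    # at least one queen per row (n queens on n rows forces exactly one)
    for i in range(1, n + 1):
        clauses.append([var(i, j) for j in range(1, n + 1)])
    # pairwise conflicts: no two queens share a row, column, or diagonal
    cells = [(i, j) for i in range(1, n + 1) for j in range(1, n + 1)]
    for a in range(len(cells)):
        for b in range(a + 1, len(cells)):
            (i1, j1), (i2, j2) = cells[a], cells[b]
            if i1 == i2 or j1 == j2 or abs(i1 - i2) == abs(j1 - j2):
                clauses.append([-var(i1, j1), -var(i2, j2)])
    return clauses
```

The decoder is then the inverse map: for each positive literal v in the solver's model, place a queen at row (v − 1) // n + 1, column (v − 1) % n + 1.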
12. You are asked to write two small programs, one called encoder and the other called decoder. The encoder will read a Sudoku puzzle from a file (some examples are provided online) and generate the CNF formula for this puzzle in the DIMACS format. Store the output of your encoder in a file named sudokuN.cnf and feed sudokuN.cnf to a SAT solver which accepts CNF in the DIMACS format. The decoder will read the output of the SAT solver and display a solution of the Sudoku puzzle on your computer screen. Turn in the encoder and decoder, the results of the SAT solver, and the results of your decoder for the provided Sudoku puzzles.
13. A Latin square over N = {0, 1, …, n−1} is an n × n square where each row and each column of the square is a permutation of the numbers from N. Every Latin square defines a binary function ∗ : N × N −→ N such that (x ∗ y) = z iff the entry at the xth row and the yth column is z. You are asked to write a small program called encoder. The encoder will read an integer n and generate the CNF formula for the Latin square of size n satisfying the constraints x ∗ x = x
and (x ∗ y) ∗ (y ∗ x) = y (called Stein's third law) in the DIMACS format. Store the output of your encoder in a file named latinN.cnf and feed latinN.cnf to a SAT solver which accepts CNF in the DIMACS format. Turn in the encoder and a summary of the results of the SAT solver for n = 4, 5, …, 12.
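The constraints above can be turned into clauses as sketched below. The numbering p(x, y, z) = x·n² + y·n + z + 1 for "x ∗ y = z" is an assumption; the last constraint family encodes Stein's third law as the implication p(x, y, z) ∧ p(y, x, w) → p(z, w, y).

```python
# Latin-square (Stein quasigroup) CNF encoder sketch.
# Variable p(x,y,z) = x*n*n + y*n + z + 1 means "x * y = z", x,y,z in 0..n-1.
def latin_cnf(n):
    def var(x, y, z):
        return x * n * n + y * n + z + 1
    R = range(n)
    clauses = []
    for x in R:
        for y in R:
            clauses.append([var(x, y, z) for z in R])          # some value
            for z1 in R:
                for z2 in R:
                    if z1 < z2:                                # unique value
                        clauses.append([-var(x, y, z1), -var(x, y, z2)])
    # Latin constraints: each value z occurs at most once per row and column
    for x in R:
        for z in R:
            for y1 in R:
                for y2 in R:
                    if y1 < y2:
                        clauses.append([-var(x, y1, z), -var(x, y2, z)])
                        clauses.append([-var(y1, x, z), -var(y2, x, z)])
    for x in R:
        clauses.append([var(x, x, x)])                         # x * x = x
    # Stein's third law: p(x,y,z) and p(y,x,w) imply p(z,w,y)
    for x in R:
        for y in R:
            for z in R:
                for w in R:
                    clauses.append([-var(x, y, z), -var(y, x, w), var(z, w, y)])
    return clauses
```

Note that the Stein constraints contribute n⁴ clauses, so the formula grows quickly with n; your summary for n = 4, …, 12 should reflect this.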
14. Given an undirected simple graph G = (V, E), an independent set of G is a subset X ⊆ V such that for any two vertices x, y of X, (x, y) ̸∈ E. The independent set problem is to find a maximum independent set of G. (a) Specify the independent set problem in hybrid MaxSAT; and (b) specify the decision version of the independent set problem in SAT.
15. Given an undirected simple graph G = (V, E), a vertex cover of G is a subset X ⊆ V such that for every edge (x, y) ∈ E, {x, y} ∩ X ̸= ∅. The vertex cover problem is to find a minimum vertex cover of G. (a) Specify the vertex cover problem in hybrid MaxSAT; and (b) specify the decision version of the vertex cover problem in SAT.
16. Given an undirected simple graph G = (V,E), the longest path problem is to find the longest simple path (no repeated vertices) in G. (a) Specify the longest path problem in hybrid MaxSAT; and (b) specify the decision version of the longest path problem in SAT.