Basic Logic and Mathematical Structures for COMP 330 Winter 2021
Prakash Panangaden McGill University
5th January 2021
These notes are not meant as a substitute for learning the subject of the title properly. They are meant to make sure we have some basic vocabulary in place. The section on “Logical Connectives”, for example, is not an ab initio explanation; rather, it is a brief reminder. Other sections are more expository in character.
Prolegomenon to any future mathematical study
One of the greatest difficulties students have in this course is with reading mathematical notation. They are used to natural language, with its usual ambiguities and conventions. This level of ambiguity is not appropriate for mathematics. It is essential that you1 exercise care and discipline in reading and writing mathematics. Even the surrounding text is important; don’t look at a mathematical formula in isolation. Use complete sentences. Be careful about which adjectives apply to which nouns. When asking a question make sure that you formulate a proper question; too many people just say, “I don’t get it.” That’s not a question.
1 Logical Connectives
The basic logical connectives are: and ∧, or ∨, not ¬, and implication ⇒, the last usually rendered in English with the word "if".
People have the greatest problem with implication. First of all they seem to
1Yes, you!
have a problem with the fact that implication is not symmetric. The word "if" signifies a one-directional implication. For bi-directional implication (equivalence) one uses the phrase "if and only if"; in mathematical notation we write p ⇐⇒ q. The statement p ⇒ q is not the same as q ⇒ p. For example, "if I turn on the light, I will be able to see" is not the same as "if I can see, I turned on the light." However, in ordinary English one often says "if" when one means "if and only if" (a phrase we almost never use in natural conversation). For example, "If I get hungry, I will eat." Usually, one understands that "If I am not hungry, I will not eat." In mathematical writing, however, we always mean the one-directional implication when we say "if."
People have problems with implication and its converse and contrapositive. The converse of p ⇒ q is q ⇒ p. In the preceding paragraph, I have emphasized the difference between an implication and its converse: they are not the same. The contrapositive of p ⇒ q is ¬q ⇒ ¬p: this has the same meaning as the original implication. One common way of rendering p ⇐⇒ q is p ⇒ q and ¬p ⇒ ¬q; for the second statement we have used the converse of the contrapositive. What is the negation of an implication? ¬(p ⇒ q) is most usefully thought of as p ∧ ¬q.
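These equivalences are finite enough to check exhaustively. Here is a small sketch in Python that verifies, over all four truth assignments, that p ⇒ q agrees with its contrapositive and that its negation is p ∧ ¬q (the helper name `implies` is my own, not standard notation):

```python
from itertools import product

def implies(p, q):
    # Material implication: p => q is false only when p is true and q is false.
    return (not p) or q

for p, q in product([True, False], repeat=2):
    # The contrapositive not-q => not-p has the same truth value as p => q.
    assert implies(p, q) == implies(not q, not p)
    # The negation of p => q is p and not-q.
    assert (not implies(p, q)) == (p and not q)

print("contrapositive and negation checks pass")
```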
How does one prove an implication like p ⇒ q? Here one assumes p and (using this assumption, if necessary) one proves q. The important point to realize is that one is under no obligation to prove p. I have proved statements of the form p ⇒ q in class only to have students ask me afterwards "But how do you know p is true?" I don't know p is true and I am making no promises about p. I am saying that if p turns out to be true then I can promise that q will be true; if p happens not to be true no promises are made. An implication is a contract; if something else makes p hold then q will hold. If p does not hold the contract is off and no promises are made.
How does one prove a statement like ¬p? I like to think of ¬p as shorthand for p ⇒ false. So one way is to assume p and derive a contradiction. This is where "proof by contradiction" is essential. If you want to prove p ⇒ q you can assume p and ¬q and then derive a contradiction. Essentially you are proving ¬¬q, which is equivalent to q.2 This is fine, but sometimes people are "trigger-happy" with this approach. Look for a direct proof if you can.
2This is not true in intuitionistic logic.
People generally understand what "and" and "or" mean but sometimes they are so sloppy that they confuse the two. Usually students do not need to have this explained to them, but, on occasion, they need to have it pointed out to them that they have confounded3 the two.
The basic rule for using implication is given the fancy name modus ponens4:
(p ⇒ q) ∧ p ⇒ q.
If I have proved p ⇒ q and also proved p then I am entitled to conclude q. Unfortunately, many people use incorrect reasoning principles like this one:
(p ⇒ q) ∧ q ⇒ p.
I think this stems from the subconscious confusion of implication and equivalence. The classic example of this is from the film Monty Python and the Holy Grail5: if X is made of wood then X floats in water; hence if X floats in water, X is made of wood.
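The contrast between the valid and the invalid principle can also be checked by brute force over the four truth assignments; a sketch in Python (the helper name `implies` is mine):

```python
from itertools import product

def implies(p, q):
    # Material implication: false only when p is true and q is false.
    return (not p) or q

# Modus ponens: ((p => q) and p) => q holds under every assignment.
assert all(implies(implies(p, q) and p, q)
           for p, q in product([True, False], repeat=2))

# Affirming the consequent: ((p => q) and q) => p is NOT valid.
counterexamples = [(p, q) for p, q in product([True, False], repeat=2)
                   if not implies(implies(p, q) and q, p)]
print(counterexamples)  # → [(False, True)]: the object that floats but is not wood
```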
2 Set notation
I assume familiarity with set notation {· | ·}. In order to avoid confusion about the “universe” that we are talking about one often writes something like
{x ∈ A | φ(x)}.
This stands for the collection of all things within the set A that satisfy the condition φ(·). Note the implicit universal quantifier in the set notation. I am not going to belabour the well-known set operations ∪, ∩, \ or the notation ∅ for the empty set. Sometimes I use the notation X̄ for the complement of a set X. One source of confusion (arising from careless use of language) is the distinction between containment and membership. We write A ⊂ X or A ⊆ X for "A is a subset of X" and we say X contains A. This is a relation between two sets. By contrast, we write a ∈ X to mean that a is one of the elements of X. We say "a is a member of X."
Often we talk about sets whose elements are themselves sets. The phrase “set of sets” sounds clumsy to the ear so we use phrases like “family of sets” or “collection of sets” instead.
3The word “confounded” means mixing up two different things.
4Using this name only makes you sound pompous; it is the logical principle that is important, not the name.
5Search on YouTube for "The Witch Scene by Monty Python".
One very important operation on sets is the cartesian product. Given two elements a and b we define a new element, written ⟨a, b⟩, called the ordered pair of a and b. It is simply a new element consisting of two pieces in a particular order. We define two pairs to be equal if and only if the corresponding components are equal, symbolically:
⟨a, b⟩ = ⟨c, d⟩ iff (a = c) ∧ (b = d).
We introduce the projections, π1 and π2, that take pairs apart:
π1(⟨a, b⟩) = a and π2(⟨a, b⟩) = b.
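Ordered pairs, projections and componentwise equality correspond directly to tuples in most programming languages; a quick illustrative sketch in Python (the names `pi1` and `pi2` are mine, and the cartesian product of two sets corresponds to `itertools.product`):

```python
from itertools import product

# Python tuples are ordered pairs; indexing gives the projections.
def pi1(pair):
    return pair[0]

def pi2(pair):
    return pair[1]

p = ("a", "b")
assert pi1(p) == "a" and pi2(p) == "b"

# Equality is componentwise, and order matters.
assert ("a", "b") == ("a", "b")
assert ("a", "b") != ("b", "a")

# The cartesian product A x B is the set of all ordered pairs.
A, B = {"a", "b"}, {0, 1}
assert set(product(A, B)) == {("a", 0), ("a", 1), ("b", 0), ("b", 1)}
```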
Given two sets A and B we define their cartesian product (or just product)
to be the collection of all ordered pairs:
A × B = {⟨a, b⟩ | a ∈ A, b ∈ B}.

3 Quantifiers and games
The meaning of quantifiers is subtle: one understands that ∀ means "for all" and that ∃ means "there exists" but reading them in context can be complicated. There are two crucial things to keep in mind: (a) the order of the quantifiers is essential, and (b) the conceptual complexity arises from alternation of quantifiers. In this section I describe a way of understanding quantifiers by using games.
The first point can be explained by the following example. Which of the two following formulas – the variables all refer to positive integers – is correct? The first: ∀n ∃m. n < m
for all y strictly less than x – and x2 is certainly in this collection – that F (y) does hold. Thus such a chain cannot exist. Thus we have shown that
∀x ∈ S.(∀y ∈ S.y < x ⇒ F(y)) ⇒ F(x)
which means we can apply the principle of induction to F(x) to conclude that ∀x.F(x). By our proposition above, this means that the poset is a well-founded order.
Here is another, slicker, argument for the reverse direction. Suppose that there is a set U ⊂ S which has no minimal element. This means that

∀u ∈ U ∃v. v < u and v ∈ U.

Now consider the predicate P(x) defined as x ̸∈ U. Suppose that for any x ∈ S we know that ∀y < x, P(y). I claim that P(x) must hold. If P(x) does not hold then x ∈ U, but our assumption about x is that every element y < x is not in U (this is what P(y) means); so, in other words, x is a minimal element of U. But this contradicts the assumption about U. Thus we have proved
∀x ∈ S [∀y < x P(y)] ⇒ P(x).
We are assuming that the principle of induction holds and the formula above is exactly the premise of the principle of induction. Thus, we have proved ∀x. P(x). But what is P? We have just proved ∀x. x ̸∈ U, i.e. U is the empty set. In short, every non-empty set must have a minimal element.
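The key notion in this argument, a minimal element of a subset, is easy to compute for finite examples. A small sketch in Python (the function name `minimal_elements` is mine):

```python
def minimal_elements(U, lt):
    # u is minimal in U when no v in U sits strictly below it.
    return {u for u in U if not any(lt(v, u) for v in U)}

# The usual order on the naturals: the unique minimal element is the minimum.
assert minimal_elements({3, 7, 5}, lambda a, b: a < b) == {3}

# A partial order (strict divisibility) can have several minimal elements.
def strictly_divides(a, b):
    return a != b and b % a == 0

assert minimal_elements({2, 3, 12}, strictly_divides) == {2, 3}
```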
Now we can check that the so-called principle of mathematical induction is a special case of this. The structure of interest here is the natural numbers, i.e. the set N = {0, 1, 2, 3, . . .}. This is clearly a well-founded order so the principle of induction is true. It only remains to put this in the familiar form. Let P be any predicate. There is a unique minimum element, namely 0. Thus if we choose x to be 0 in the statement of the principle of induction above we have to show that (∀y ∈ N. y < 0 ⇒ P(y)) ⇒ P(0). The antecedent is vacuously true, thus we must show that P(0) is true. Now if we choose x to be n + 1 we get
(∀y ∈ N.y < (n + 1) ⇒ P (y)) ⇒ P (n + 1)
or, rewriting this less clumsily, we can say (∀y ≤ n. P(y)) ⇒ P(n + 1). Putting
the pieces together we get the usual statement i.e.
(P(0) ∧ (∀n.(∀y ≤ n.P(y) ⇒ P(n + 1)))) ⇒ ∀n ∈ N.P(n).
The other principle that interests us is the principle of structural induction. Let S be any inductively defined set. Associated with each element of S is the stage at which it enters the set S. We can define a well-founded order on S by saying that x ≤ y if x enters S at the same stage as or before y in the inductive definition. Thus we can apply induction to this collection. An example will illustrate the idea.
Suppose that we define the set of binary trees as follows. The empty tree, written [], is a tree. If t1 and t2 are trees then maketree(t1, t2) is also a tree. This is a typical inductive definition. Now the principle of induction applied to these tree structures says the following. Let P be any predicate on trees. If you can prove P([]), and if you can prove for any trees t1 and t2 that whenever P(t1) and P(t2) hold then P(maketree(t1, t2)) also holds, then you can invoke the principle of structural induction to conclude that for all trees t, P(t) must hold. Thus we can do induction on trees and on a multitude of other structures as well.
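A sketch of this inductive definition in Python, with structural recursion playing the computational role of structural induction (representing the empty tree by `None` and the class name `Tree` are my own choices, not part of the notes):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Tree:
    left: Optional["Tree"]
    right: Optional["Tree"]

EMPTY = None  # the empty tree []

def maketree(t1, t2):
    return Tree(t1, t2)

# Structural induction says: a property that holds of [] and is preserved by
# maketree holds of every tree. Structural recursion is its twin: size below
# is defined on exactly those two cases, so it is defined on every tree.
def size(t):
    if t is EMPTY:
        return 0
    return 1 + size(t.left) + size(t.right)

t = maketree(maketree(EMPTY, EMPTY), EMPTY)
print(size(t))  # → 2
```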
5.2 Zermelo’s theorem
Please do not read this section. It will torment you needlessly.
Definition 13. A well-order is a well-founded total order.
The usual order on the natural numbers is a well-order. Given a well-order every element, except the largest one if such an element exists, has a unique next element. Many examples of well-ordered sets can be given. On the other hand try to find a well-order on the real numbers. After trying for a while one tends to think that this is not possible.
However, using the Axiom of Choice, Zermelo proved
Theorem 14. Every set can be well ordered.
This created a sensation. The axiom of choice seems like an “obvious” principle but Zermelo’s theorem seems unbelievable. As of yet no one can write down an explicit example of a well-order on the reals. This leads many logicians to reject the axiom of choice. However, the axiom of choice is equivalent to Zermelo’s theorem and also to Zorn’s lemma. The latter is essential to prove all kinds of theorems in mathematics: for example, every vector space has a basis. Thus most mathematicians are loath to give up the axiom of choice.
6 Infinite vs finite
Many students are confused by finite vs infinite numbers. It is not my intention to discuss cardinality here or to give a detailed discussion of infinity. I just want to recall some quick facts. I assume you know the words injection (one-to-one function), surjection (onto function) and bijection. A set X is infinite if there is a bijection between X and a proper subset of X. Thus, the natural numbers constitute an infinite set because they are in bijective correspondence with the set of even numbers, which is a proper subset of the set of all natural numbers.
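The standard witness that N is infinite is the bijection n ↦ 2n with the even numbers; a small sanity check in Python on a finite window (the names `f` and `f_inverse` are mine):

```python
# The map f(n) = 2n is a bijection between the naturals and the even
# naturals, a proper subset, which is what makes N an infinite set.
def f(n):
    return 2 * n

def f_inverse(m):
    return m // 2

window = range(100)  # a finite window onto N, for checking purposes only
evens = [f(n) for n in window]

assert len(set(evens)) == len(evens)              # injective on the window
assert all(f_inverse(f(n)) == n for n in window)  # f_inverse undoes f
assert 1 not in evens                             # the image misses the odds
```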
There is no such thing as an infinite natural number. The natural numbers {0, 1, 2, 3, 4, . . .} are each finite; the collection of all of them taken together is an infinite set. Please use the adjectives with the appropriate nouns. If you speak sloppily you will be confused; if you speak clearly you will find that you are not confused. Unfortunately, even mathematicians have become used to a certain sloppiness in their language. For example, one hears the statement, “There is an infinite number of foobars such that...” Of course they mean, “the set of foobars is an infinite set” but it would be tedious and sound pedantic to say it like that. As long as you understand what is really meant, it is acceptable to use such idioms.
If you do not know that there are different infinities and have never seen a diagonalization argument you should learn these things now.
7 Alphabets, words and languages
The basic structures that we will deal with are called languages. An alphabet is just an arbitrary set; in this course we will always consider finite alphabets. We view the elements of an alphabet as uninterpreted symbols. Typically, we use the letters a, b, c, . . . or perhaps the set of bits {0, 1} or some such thing. We call the elements of the alphabet letters. It is traditional to use the capital Greek letter Σ to represent an alphabet.
From letters we build words. A word is just a sequence, or string if you prefer, of letters. Letters and words are just like characters and strings in a programming language. Letters may be repeated and may occur in any order. We will always take strings to be finite, thus they have a well defined length. If our alphabet is {a, b}, for example, then examples of words constructed from this alphabet are abba, bab, aaaa, bbaabba and b; they have
length 4, 3, 4, 7 and 1 respectively.8 One very important string is the empty string or empty word, with no letters whatsoever. We need a symbol for this; if we were just to leave a blank space like this we cannot9 tell whether this is the empty word or just an anomaly in the typesetting. We write ε for the empty word. We typically use symbols like u, v, w, x, y, z to stand for words. We write |x| for the length of the word x. Note that a single letter by itself is also regarded as a word. The empty word has, of course, length zero; in symbols, |ε| = 0.
The set of all possible words over the alphabet Σ is written Σ∗. This set explicitly includes ε. The set Σ is (usually) finite but the set Σ∗ is (almost) always infinite. The only time it is not infinite is when Σ = ∅. Then only one word can possibly be constructed, namely the empty word. Thus we have ∅∗ = {ε}. Note ∅∗ ̸= ∅. Even if Σ has only one element, say Σ = {a}, then Σ∗ ≃ N in an obvious way.
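Enumerating Σ∗ by length makes these facts concrete; a sketch in Python (the function name `words_up_to` is mine), including the edge case ∅∗ = {ε}:

```python
from itertools import product

def words_up_to(sigma, n):
    # All words over the alphabet sigma of length at most n, as strings.
    # The empty string "" plays the role of the empty word epsilon.
    return ["".join(w) for k in range(n + 1) for w in product(sigma, repeat=k)]

print(words_up_to(["a", "b"], 2))  # → ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']

# Over the empty alphabet the only word is the empty word: emptyset* = {epsilon}.
assert words_up_to([], 5) == [""]
```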
The set Σ∗ is equipped with a binary operation written ·, often just indicated by juxtaposition, and called concatenation. It is the simple operation of sticking two words together. For example (abba) · (baba) = abbababa. This operation is not commutative: for example (ab) · (ba) = abba ̸= baab = (ba) · (ab). It is, however, associative: (x · y) · z = x · (y · z). This means that we do not have to be too scrupulous about bracketing. It becomes tedious to write · all the time10, so almost all the time we just use juxtaposition. For example, the associative rule is written ∀x, y, z ∈ Σ∗, (xy)z = x(yz). This concatenation operation has an identity element, namely ε. In symbols
∀x ∈ Σ∗,εx = xε = x.
A set with a binary associative operation is called a semi-group. If a semi-group operation has an identity element it is called a monoid. Other examples of monoids are the set of non-negative integers (the natural numbers) under addition, the set of all n × n matrices with integer entries with matrix multiplication as the operation, and the set of all functions from a fixed set to itself with function composition as the operation11.
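Python strings with `+` form exactly this monoid, so the laws can be spot-checked directly:

```python
# String concatenation in Python is the monoid operation on Sigma*:
# associative, with the empty string as the identity, but not commutative.
x, y, z = "ab", "ba", "b"

assert (x + y) + z == x + (y + z)   # associativity
assert "" + x == x + "" == x        # the empty word is the identity
assert x + y != y + x               # "abba" versus "baab"
```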
8It is possible to imagine infinite words and there is a rich theory of infinite words, but we will not consider them in this class.
9Please use the word "cannot"; people who write "can not" are usually using it incorrectly.
10Especially in LaTeX!
11If the operation has, in addition, an inverse for every element, it is called a group. None of the examples that I have given are groups.
Finally we define a language as a subset of Σ∗. A language can be empty, it can be all of Σ∗ or it can be something in between. Here are examples of languages over the alphabet {a, b}:
1. The set {abba, baba, ab, baa, baabaa, b}.
2. The set of all words that begin with a: {ax | x ∈ Σ∗} = {a, aa, ab, aab, aba, aaa, abb, . . .}.
3. The set of all words that contain a sequence of a's followed by a sequence of b's: {a^n b^m | n, m ≥ 0}.
In the first example I have just listed all the words; this works for finite languages. Some languages are finite, as in the first example, but others are not. We have to come up with a better scheme for describing them than listing the elements. In the second example I have used mathematical notation to give a description. I have listed a few examples of words in this language but that is only illustrative; it is definitely not a formal definition. In the third example I have used a new mathematical notation that is used in language theory. When I write a^n, I mean a repeated n times. Thus, for example, a^3 b^5 = aaabbbbb. It has nothing to do with exponentiation. Remember, letters in the alphabet are uninterpreted symbols; they do not stand for numbers, functions, goats or anything else. The form a^n b^m then just means a sequence of n a's followed by a sequence of m b's. Note that the n and m are allowed to be arbitrary natural numbers, so this is how we say "any bunch of a's followed by any bunch of b's." What do we do if we want to say "some bunch of a's followed by the same number of b's"? We write a^n b^n; here it is understood that n is an indeterminate number but, whatever it is, all its occurrences mean the same thing. Note that writing a^-3 is plain nonsense; it does not mean anything.
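Here is a sketch in Python of membership tests for the last two languages; the function names are mine, and this is only an illustration, not the formal machinery the course will develop:

```python
# Membership checkers for {a^n b^m | n, m >= 0} (any a's then any b's)
# and {a^n b^n | n >= 0} (equal counts of a's and b's).

def in_a_n_b_m(w):
    i = 0
    while i < len(w) and w[i] == "a":
        i += 1
    # Everything after the leading a's must be b's.
    return all(c == "b" for c in w[i:])

def in_a_n_b_n(w):
    n = len(w) // 2
    return len(w) % 2 == 0 and w == "a" * n + "b" * n

assert in_a_n_b_m("aaabbbbb") and in_a_n_b_m("") and in_a_n_b_m("bb")
assert not in_a_n_b_m("aba")
assert in_a_n_b_n("aabb") and not in_a_n_b_n("aab") and not in_a_n_b_n("abab")
```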
How do we specify languages more systematically? How do we algorithmically check whether a given word is in a language? How difficult is it to describe a language? These are questions that will occupy us for the rest of the term. Language recognition is not the only thing computers do but essentially every aspect of theoretical analysis of computer science — including complexity theory — can be cast in terms of language recognition.
Concluding remarks
I will not write detailed notes like this again. This was written to make sure we are on the same page. Speaking bluntly, if all this seemed way
over your head, you should drop the class. If you found the section on well-foundedness and induction difficult, be reassured: it is difficult and requires some time to digest. If all this seemed childish, be patient. Future notes, which will only be for material not in the book, will be more terse and to the point. The content will also be more mature.