COMP3161/COMP9164: Preliminaries
Gabriele ̊ ∗ September 30, 2021
1 Introduction
As we are going to discuss and reason about properties of various programming languages and language features, we need a formal meta-language which allows us to make statements about these properties. We need to specify the grammar of a language, the static semantics (often in form of typing and scoping rules) and dynamic semantics. Fortunately, it turns out that a single formalism, inductive definitions built on inference rules, is sufficient.
2 Judgements and Inference Rules
A judgement is simply a statement that a certain property holds for a specific object,1 for example: • 3 + 4 ∗ 5 is a valid arithmetic expression
• the string ”aba” is a palindrome
• 0.43423 is a floating point value
Judgements are not unlike predicates you might know from Predicate Logic. We write
for a judgement of the form The property S holds for object a. In predicate logic, this is usually written differently, as S(a). However, we will see later that for our purposes, it is much more convenient to write it in the above given post-fix notation. Alternatively, we can interpret S as a set of objects with a certain property, and read the judgement a S as: a is an element of the set S. Some examples of judgements and how to read them are:
– 5 even – 5
∗Minor edits and clarifications. Gabriele is the main author.
1More generally, a relationship between a number of objects holds. For now, we just look at statements about a single object.
is even, or
is an element of the set of even numbers
+ 4 ∗ 5 is a syntactically correct expression, or
+ 4 ∗ 5 an element of the set containing all syntactically correct expressions – 0.43423 float
– 0.43423 is a floating point value
– 0.43423 is an element of the set of floating points values
Judgements by themselves would be boring and fairly useless. Most interesting sets have an infinite number of elements, and to define such a set it would obviously be impossible to explicitly list them all using simple judgements. Luckily, the sets we are interested in are not random collections of objects, but the sets and properties can be systematically defined by so-called inference rules.
Inference rules allow us to combine judgements to obtain new judgements. They have the following general form:
If judgements J1 , and J2 , and . . . and Jn are inferable, then the judgement J is inferable and are usually given in the following standard form:
J1 J2 … Jn J
where the judgements J1 to Jn are called premises, and J is called a conclusion. An inference rule does not have to have premises, it can consist of a single conclusion. Such inference rules are called axioms. But let us have a look at a concrete example now.
We start by defining some simple properties over the set of natural numbers (Nat). For simplicity reasons, we represent them as 0, (s 0), (s ((s 0))), (s (s (s 0))) for 0, 1, 2, 3, and so on (s here stands for successor). So, first of all, how can we define Nat itself using inference rules? Listing all the element of Nat would be equivalent to including an axiom for each number:
(s (s 0)) Nat .
Apart from the first rule, all the rules have the form
where x Nat has been established by the previously listed axiom. In words, we have
1. 0 is in Nat, and
2. forallx,ifxisinNat,then(sx)inNat
which can be translated directly into the following two inference rules:
x Nat 0 Nat (s x) Nat
where x is a variable that can be instantiated to any term. In the same way, we can define the sets Even and Odd:
x Even x Odd
0 Even (s(s x)) Even (s 0) Odd (s(s x)) Odd
Rules do work in two ways: we can use them to define a property, but we can also use them to show that a judgement is valid. How does it work with inference rules? Assume we want to show that some judgement J is valid. We have to look for a rule which has J as a conclusion. If this rule is an axiom, we are already done. If not, we have to show that all of its premises are
valid by recursively applying the same strategy to all of them. For example, we can show that s(s(0)) Even, since
0 Even (s(s 0)) Even
As the last rule is an axiom, there are no premises left to prove, and we are done.
Similarly, we can show that 0 + 1 + 1 + 1 + 1 Even. An alternative and often quite convenient way to write inference proofs is to stack the rules we apply together and draw a “proof tree” — in our example, more a proof stack, since each rule has just a single premise.
(s(s 0)) Odd (s(s(s 0))) Odd (s(s(s(s(s 0))))) Odd
Note that inference works on a purely syntactic basis. Given the rules above, we are not able to prove 2 Even, even though we can show that s(s(0)) Even, and we know that s(s(0)) is equal to 2. We cannot apply that knowledge, since we have no rule which tells us it is ok to do so. We just mechanically manipulate terms according to the given rules.
Let us look at a slightly more interesting example: we want to define the language M which contains all expressions of properly matched parentheses (and no other characters):2
M = {ε,(),()(),()()(),…,(()),((())),…,()(()),()()(()),…}
Again, let us start by describing the rules in (semi-)natural language. There are basically two
ways to “legally” combine parentheses: we can either nest them, or concatenate them: 1. The empty string (denoted by ε) is in M
2. Ifs1 ands2 areinM,thens1s2 isinM
3. Ifs isinM,then(s)isinM
Again, these rules can be directly translated into inference rules:
How can we show that ()(()) is in M ? As we did previously, we check if there is a rule whose conclusion matches the judgement we want to infer. If we apply Rule (2), we have to show that both () and (()) are in M. Since () = (ε), we can apply Rule (3), and only have to show that ε is in M (Rule (1)). By applying Rule (3) in the same way, we can show that (()) is in M as well, and we are done. This argument can be captured in the following proof tree:
2ε represents the empty string
ε M(1) εM(1) ()M (3)
()M(3) (())M(3) ()(()) M (2)
Finding a proof tree is not always as straightforward as it seems, because sometime, multiple rules may be applicable. Instead of applying Rule (2) above, we could also have tried Rule (3). The latter rule has only a single premise, so we only have to prove that )(() is in M. The only rule that’s applicable now is Rule (2), and we could apply it in several different ways resulting in different premises:
)M (()M )(() M
)(M ()M )(() M
In the first application, we end up with the premise: ) is in M. But there is no rule which we can apply to get rid of it. This is not that surprising, since M should only contain expressions of properly matched parentheses, and neither ) nor )(() are such. So, by choosing the wrong rule, or applying the right rule in a wrong way—for example, splitting ()() up into ()( and )—we can easily end up with premises which are not actually valid, reaching a dead end. In our example, this was not hard to see. It can be extremely difficult to decide which rule to apply and how, without some background knowledge about the objects and properties, as there might be an infinite number of possibilities. This is one reason why it is not possible to write a program which automatically derives judgements, and is guaranteed to find such a derivation if it exists.3 It is, however, possible to write semi-automatic theorem provers, which come up with proofs in cases
where it is fairly standard, and rely on user input otherwise.
2.1 Derivable and Admissible Rules
Consider what would happen if we added the following rule to the language M:
sM ((s)) M
Does this change the set M in any way? That is, is there a string s for which we can infer s M if we use (4), but not otherwise? This doesn’t seem very likely: the rule just says that, if a string s is in M it is ok to add two pairs of matching parentheses. Since we already had a rule which allows us to add one pair of parentheses, we can just apply this rule twice and achieve the same effect:
This means that Rule (4) is derivable from the existing rules.
In all the previous rules, the strings in the premises were simpler than the string in the con-
clusion. The following rule is different in this respect:
derivable from any of the existing rules. Such a rule is called admissible. 3If it was, mathematicians and theoretical computer scientists would be out of a job.
Interestingly, although Rule (4) does not introduce any new strings to M, the rule is also not
2.2 Inductive Definitions
A set of inference rules R defining a set A is called an inductive definition of A, if the set of valid judgements s A are precisely those judgements that can be derived using R. All the examples we have discussed are inductive definitions.
Not all sets can be defined using inductive definitions. For example, while natural numbers are one of the standard examples of such sets, real numbers cannot be defined in such a way.
2.3 Judgements and Relations
So far, we defined a judgement to be a statement about a property of an object. We can generalise this definition to relation between a number of objects. Consider the following inductive definition of the relation “a < b” on natural numbers. For convenience reasons, we choose an infix notation here:
n