1 Introduction
is even, or
is an element of the set of even numbers
∗ 5 expr
+ 4 ∗ 5 is a syntactically correct expression, or
Preliminaries
Gabriele Keller September 24, 2019
As we are going to discuss and reason about properties of various programming languages and language features, we need a formal meta-language which allows us to make statements about these properties. We need to specify the grammar of a language, the static semantics (often in form of typing and scoping rules) and dynamic semantics. Fortunately, it turns out that a single formalism, inductive definitions built on inference rules, is sufficient.
2 Judgements and Inference Rules
A judgement is simply a statement that a certain property holds for a specific object,1 or alter- natively
• 3 + 4 ∗ 5 is a valid arithmetic expression • the string ”aba” is a palindrome
• 0.43423 is a floating point value
Judgements are not unlike predicates you might know from Predicate Logic. We write
aS
for a judgement of the form The property S holds for object a. In predicate logic, this is usually written differently, as S(a). However, we will see later that for our purpose, it is much more convenient to write it in the above given post-fix notation. Alternatively, we can interpret S as a set of objects with a certain property, and read the judgement aS as: a is an element of the set S. Some examples of judgements and how to read them are:
– 5 even – 5
– 5
– 3 + 4
– 3 – 3
– 0.43423 is a floating point value
– 0.43423 is an element of the set of floating points values
1More generally, a relationship between a number of objects holds. For now, we just look at statements about a single object.
+ 4 ∗ 5 an element of the set containing all syntactically correct expressions – 0.43423 float
1
Judgements by themselves would be boring and fairly useless. Most interesting sets have an infinite number of elements, and to define such a set it would obviously be impossible to explicitly list them all using simple judgements. Luckily, the sets we are interested in are also no random collections of objects, but the sets and properties can be systematically defined by so-called inference rules.
Inference rules allow us to combine judgements to obtain new judgements and have the following general form:
If judgements J1 , and J2 , and . . . and Jn are inferable, then the judgement J is inferable and are usually given in the following standard form:
J1 J2 … Jn J
where the judgements J1 to Jn are called premises, and J is called a conclusion. An inference rule does not have to have premises, it can consist of a single conclusion. Such inference rules are called axioms. But let us have a look at a concrete example now.
We start by defining some simple properties over the set of natural numbers (Nat). For simplicity reasons, we represent them as 0, (s 0), (s ((s 0))), (s (s (s 0))) for 0, 1, 2, 3, and so on (s here stands for successor). So, first of all, how can we define Nat itself using inference rules? Listing all the element of Nat would be equivalent to including an axiom for each number:
0 Nat
(s 0) Nat
(s (s 0)) Nat .
Apart from the first rule, all the rules have the form
(s x) Nat
where x Nat has been established by the previously listed axiom. In words, we have
1. 0 is in Nat, and
2. forallx,ifxisinNat,then(sx)inNat
which can be translated directly into the following two inference rules:
0 Nat
x Nat
(s x) Nat
where x can be instantiated to any In the same way, we can define the sets Even and Odd:
0 Even
x Even
(s(s x)) Even
(s 0) Odd x Odd
(s(s x)) Odd
Rules do work in two ways: we can use them to define a property, but we can also use them to show that a judgement is valid. How does it work with inference rules? Assume we want to show that some judgement J is valid. We have to look for a rule which has J as a conclusion. If
2
this rule is an axiom, we are already done. If not, we have to show that all of its premises are valid by recursively applying the same strategy to all of them. For example, we can show that s(s(0)) Even, since
0 Even (s(s 0)) Even
and
0 Even
As the last rule is an axiom, there are no premises left to prove, and we are done.
Similarly, we can show that 0 + 1 + 1 + 1 + 1 Even. An alternative and often quite convenient way to write inference proofs is to stack the rules we apply together and draw a “proof tree” — in our example, more a proof stack, since each rule has just a single premise.
(s 0) Odd
(s(s 0)) Odd (s(s(s 0))) Odd (s(s(s(s(s 0))))) Odd
Note that inference works on a purely syntactic basis. Given the rules above, we are not able to prove 2 Even, even though we can show that s(s(0)) Even, and we know that s(s(0)) is equal to 2, we cannot apply that knowledge, since we have no rule which tells us it is ok to do so. We just mechanically manipulate terms according to the given rules.
Let us look at a slightly more interesting example: we want to define the language M which contains all expressions of properly matched parenthesis (and no other characters):2
M = {ε,(),()(),()()(),…,(()),((())),…,()(()),()()(()),…}
Again, let us start by describing the rules in (semi-)natural language. There are basically two
ways to “legally” combine parenthesis: we can either nest them, or concatenate them: 1. The empty string (denoted by ε) is in M
2. Ifs1 ands2 areinM,thens1s2 isinM
3. Ifs isinM,then(s)isinM
Again, these rules can be directly translated into inference rules:
(1)
(2)
(3)
εM
s1M s2M
s1s2 M sM
(s) M
How can we show that ()(()) is in M ? As we did previously, we check if there is a rule whose conclusion matches the judgement we want to infer. If we apply Rule (2), we have to show that both () and (()) are in M. Since () = (ε), we can apply Rule (3), and only have to show that ε is in M (Rule (1)). By applying Rule (3) in the same way, we can show that (()) is in M as well, and we are done:
The problem is, however, not as straight forward as it seems, because instead of applying Rule (2), we could as well apply Rule (3). It has only a single premise, so we only have to prove
2ε represents the empty string
3
that )(() is in M. The only rule that’s applicable now is Rule (2), and we could apply it in several different ways resulting in different premises:
or or
)M (()M )(() M
)(M ()M )(() M
)((M )M
)(() M
In the first application, we end up with the premise: ) is in M, but there is no rule which we can apply to get rid of it. This is not that surprising, since M should only contain expressions of properly matched parenthesis, and ), as well as )(() are not properly matched. So, by choosing the wrong rule, or applying the right rule in a wrong way – for example, splitting ()() up into ()( and ) – it is easily possible to end up with premises which are not actually valid and reach a dead end. In our example, this was not hard to see. It can be extremely difficult to decide which rule to apply and how, without some background knowledge about the objects and properties, as there might be an infinite number of possibilities. This is one reason why it is not possible to write a program which automatically infers judgements, and which guarantees to find such a derivation if it exists. It is, however, possible to write semi-automatic theorem provers, which come up with proofs in cases where it is fairly standard, and rely on user input otherwise.
2.1 Derivable and Admissible Rules
What would happen if we added the following rule:
sM ((s)) M
Does this change the set M in any way, that is, is there a string s for which we can infer s M if we use 4, but not otherwise? This doesn’t seem very likely: the rule just says that, if a string s is in M it is ok to add two pairs of matching parenthesis. Since we already had a rule which allows us to add one pair of parenthesis, we can apply this rules twice and achieve the same effect:
sM (s) M
((s)) M
This means that Rule (4) is derivable from the existing rules.
In all the previous rules, the strings in the premises were simpler than the string in the con-
clusion. The following rule if different in this respect:
derivable from any of the existing rules. Such a rule is called admissible. 2.2 Inductive Definitions
A set of inference rules R defining a set A is called an inductive definition of A, if s A holds if and only if s A is derivable using R. All the examples we discussed are inductive definitions.
Not all sets can be defined using inductive definitions. For example, while natural numbers are one of the standard examples for such sets, floating point numbers cannot be defined in such a way.
(4)
(5)
Interestingly, although Rule (4) does not introduce any new strings to M, the rule is also not
()s M sM
4
2.3 Judgements and Relations
So far, we defined a judgement to be a statement about a property of an object. We can generalise this definition to relation between a number of objects. Consider the following inductive definition of the relation “a < b” on natural numbers. For convenience reasons, we choose an infix notation here:
n Nat
0 < (s n)
n