Limits of Computation 2020/21
Notes on Lectures 5–8
Dr Bernhard Reus February 13, 2021
5 WHILE-Extensions
The WHILE-language was simple enough to give it a precise semantics in the previous chapter. Accordingly, we adopt it as our language to write “effective procedures”. Nevertheless, it turns out that some language features not included in WHILE make the programmer’s life so much easier and WHILE-programs also more readable. Examples of such features are built-in equality, literals for num- ber and Boolean data values, additional atoms, explicit syntactic support for writing lists, (non-recursive) macro calls, and a switch statement (cascading if- then-else). These extensions are (to a degree) all available in our interpreter hwhile (available from our Canvas site). We will discuss the few exceptions at the end of this chapter.
5.1 Equality
Equality on trees is an arbitrary “complex” operation as it may involve arbitrary large trees. It does not seem to be in line with Turing’s idea of every computation step being “atomic” and thus was not included in our core WHILE-language. However, one can easily program equality with the other given features in WHILE, so in a way the equality is already syntax sugar for a “macro call”. This has been done in Exercises 3 (Question 4) and the code published on Canvas will reveal the detail of the complexity of such a general equality operator.
We can extend the syntax of expressions as follows: ⟨expression⟩ ::= . . .
| ⟨expression⟩ = ⟨expression⟩ .
Example uses of equality are given below
1. if X=Y { Z := X; } else { Z := Y; } 2. Z := hd X = cons nil nil;
1
(equality)
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
3. while X = hd Z { X:= tl X;}
We do not have to write parentheses around the equality test in the if and
while statements. 5.2 Literals
Literals as expressions are very common in programming languages, most of all string and number literals. Double (or single) quotes are pretty common to identify string literals. To make programming in WHILE easier we also introduce some literals as extensions.
5.2.1 Number Literals
We have seen in Chapter 3 how to encode natural numbers in WHILE as lists of nil atoms and as such they can be perfectly viewed as expressions. With the help of literals once can write e.g. the number 2 as con nil cons nil nil. But this becomes tedious for larger numbers. So we introduce syntax sugar for numbers We thus add to the BNF syntax the clause:
⟨expression⟩ ::= . . .
| ⟨number⟩ (number literals)
.
where nonterminal ⟨number⟩ contains the natural numbers N. Note that in this case we do not require any double quotes as numbers are otherwise not part of the language syntax and thus cannot be “misread”. Despite having those number literals we still do not have any built-in arithmetic for numbers . Arithmetic operators have to be implemented by programs.
Here are some examples for expressions that use number literals according to the extension described above:
1. 2
2. hd 99
3. cons 3 cons 1 nil
Since the numbers can be “unfolded” to lists of nil atoms we know the semantics of these expressions. For (1) it is [nil,nil]; for (2) it is nil; and for (3) it is [[nil, nil, nil], [nil]]. One might wish to have number literals that are translated into decimal or binary representation. This would require a more complicated translation of programs.
2
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
5.3 Boolean Literals
We have seen in Chapter 3 how to encode Booleans in WHILE. We thus add to the BNF syntax the clause:
⟨expression⟩ ::=
| true
| false .
5.4 Adding Atoms
The data type D of our core WHILE-language only has one atom: nil and the expression language accordingly knows only one atom constructor expression, nil.
In order to do some encodings, it is often advantageous to have several atoms. We always restrict to finitely many atoms in order to ensure we have an effective way to interpret the basic WHILE-expressions and commands1.
Definition Let Atoms denote a finite set of atoms that always includes the atom nil. The extended set D of data values is defined inductively as follows:
1. any atom a ∈ Atoms is a tree consisting of just a leaf labelled a.
2. for any two trees l and r, such that l ∈ D and r ∈ D the tree with left
subtree l and right subtree r, written linearly ⟨a.b⟩ is also in D. We extend the BNF-syntax, replacing the clause:
⟨expression⟩ ::= |
. by a new clause allowing more atoms:
. . .
nil
(atom nil)
(a ∈ Atoms)
⟨expression⟩ ::= . . . | a
.
. . .
We will use some additional atoms already in the next chapter for the purpose of encoding programs as data.
5.5 List Constructor
We have already seen how to encode lists in D in Chapter 3. To construct a list of natural numbers [1,2,3] in WHILE one writes expression cons 1 cons
1This will also be important in the Complexity part when we want to measure time usage. 3
(true literal) (false literal)
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
2 cons 3 nil. But often the list elements will be generated on the fly rather than being (number) literals. So syntax sugar for building lists (and not just “lists literals” like [1,2,3]) would be useful. Therefore, we introduce another extension to the WHILE-language, list expressions. We introduce list expressions using the square brackets [ ] (known from Haskell e.g.) already used to denote list values. A list expression will always evaluate to a list value without side effects. The elements of a list expression are themselves expressions separated by commas.
We add to the BNF-syntax the clause
⟨expression⟩ ::=
| [ ]
| [ ⟨expression-list⟩ ] .
⟨expression-list⟩ ::=
| ⟨expression⟩ , ⟨expression-list⟩
. . .
(empty list constructor) (nonempty list constructor)
assume variables X and Y that have values nil and 5, respectively.
1. [ X ]
is a list expression that in the given store will evaluate to [ nil ] (or [ 0 ] if the element is considered a number).
2.[ tlY,consXX,X ]
is a list expression that in the given store will evaluate to [ 4, 1, 0 ].
3. If we combine this with number and tree literals we can also write:
[1, 2, 3] or [ nil, [1,2,3], 1 ] which are constant lists (or “list literals” so to speak).
5.6 Macro Calls
The main reason of the success of high-level programming languages is the level of abstraction from the machine level they provide. Procedural abstraction allows one to implement procedures with parameters that can be used and re- used several times in a larger program but have to be defined only once. Every use of the procedure may instantiate the parameters differently.
When writing larger WHILE-programs, it makes a significant difference if one can use procedures as it reduces the amount of code one has to write while at the same time improving the readability of the code.
We won’t extend WHILE by proper (recursive) procedures or functions as this would need a proper language extension that cannot be simply achieved by a compilation. Yet, we will allow WHILE-programs to call other WHILE-programs by means of macro expansion. The emphasis here is on other programs since recursive self-calls will not be permitted as they would need a stack to keep intermediate results and thus a semantics extension too. We already have given
4
⟨expression⟩
Here are some examples for expressions that use the list constructor. We
(single expression list) (multiple expression list)
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
names (using nonterminal ⟨name⟩) to programs which can be used to call other programs. Of course it is important that no two programs have the same name. These names are used as handles in the macro call when they are enclosed in angular brackets.
We accordingly add to the BNF-syntax the clause ⟨command⟩ ::= . . .
| ⟨variable⟩ := < ⟨name⟩ > ⟨expression⟩ (macro call) .
The meaning of the macro calls is intuitive. First, the expression argument is to be evaluated and then the called program with the give name is to be run with the obtained value as input. The resulting output is then to be assigned to the variable in the assignment. We limit macro calls to assignments as we do not wish to change an important property of expressions, namely that they do not have side effects. If we allowed macros to be called in general expressions and the program in question does not terminate then we would have undefined expressions to deal with. More details how such a macro could be translated into a pure WHILE-program are to be found in the slides of Lecture 5. When translating away the macro call one has to replace it with the body of the macro and some “logistics” code. It is important that the copied (expanded) macro code does not share any variables with the code of th eprogram it is incorporated in (which is the one that calls this macro).
The program add that implements addition on unary numbers, which we encountered already, can be written differently to highlight the meaning of hd and tl on numbers :
succ read X {
X := cons nil X
}
write X
pred read X {
X := tl X
}
write X
add read L {
X:= hd L;
Y:= hd tl L;
while X {
X :=
Y :=
}
}
write Y
5
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
⟨command⟩ ::= |
|
⟨rule⟩ ::= ⟨rule-list⟩ ::=
|
…
switch ⟨expression⟩ {⟨rule-list⟩} switch ⟨expression⟩ {⟨rule-list⟩ default : ⟨statement-list⟩ }
.
case ⟨expression-list⟩:⟨statement-list⟩
⟨rule⟩
⟨rule⟩ ⟨rule-list⟩
caseanalysis with default case
caserule
Figure 1: BNF grammar for switch statement
5.7 Switch Statement
In WHILE, programs one often has to analyse the structure of a given tree, then break it down into parts and process those parts. This can be of course done with the help of a conditional, equality and sequence of tl and hd applications. A more concise and elegant way of achieving the same is with the help of pattern matching, known from functional languages like Haskell, Scala and F#. In order to avoid the complications of pattern matching we introduce just a useful form of case analysis in order to avoid cascading conditionals, called “switch” statement known from most programming languages.
We add to the BNF-syntax the clauses below:
The nonterminal ⟨expression-list⟩ is as defined in the list constructor ex- tension above. The meaning of the switch clause is as follows: the value of the expression argument is compared to the values in the rules in the case ex- pressions. The rules are inspected in the order of appearance in the switch statement. As soon as a match is found, the corresponding statement list of the matching rule is executed. Unlike in some other languages, we do not allow that more than one rule is ever triggered.
An example for a switch statement and how it can be translated can be found in the slides for Lecture 5.
5.8 hwhile support
The interpreter hwhile, available from our Canvas site, supports the core WHILE- language and the extensions explained above with some minor exceptions listed below:
1. there is only one expression allowed in a case clause of a switch statement 2. special atoms (for the self-interpreter, see Section 7) are available, but
need to be preceded by @, e.g. write @while (see also the next section). 6
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
3. the file passed to hwhile has to contain exactly one program and both, file and program, need to have the same name, but the file needs .while extension.
4. multiline comments of the form
(* this is a comment
that is long and fascinating
!
*)
can only be used before the start of the program, even if they are just one line long. Within programs you can only use the single line comments:
// this is a comment
For more examples see the programs published on our Canvas site.
You run a WHILE-program in hwhile in a shell (command prompt window)
as follows:
hwhile -FLAGS program.while INPUT
where FLAGS is a combination of characters that determine how the result is printed (unparsed) and whether debug output is produced. What flags are available can be checked with the command hwhile -h, program.while is the program you want to run, and INPUT is the program’s input tree, which must be enclosed in double quotes if it contains angle brackets or spaces. This input can be written using list expressions and number literal extensions as explained din this section. Please consult the Canvas page for a more detailed manual for hwhile.
6 Programs As Data Objects
Many interesting (types of) programs take as input a program or a program and some other data. In order to be able to write such programs in WHILE, we need to be able to include other WHILE-programs (or Turing machine programs for instance) as data.
Three types of programs in particular use programs as input:
• Compiler:
a compiler takes a program as input and translates it into another program (most likely in another language). The resulting program must have the same semantics as the original program, otherwise the compiler would not be correct. A compiler may involve three languages: the source language, the target language and the implementation language of the compiler.
7
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
• Interpreter:
an interpreter takes a program and a data value from the program’s datatype and returns as output the value that the corresponding program would return when run with the data value as input. An interpreter may involve two languages: the interpreted language and the implementation language of the interpreter.
• Program Specialiser:
A program specialiser takes a program with two (more generally n) in- puts and one data value from the programs datatype as input and par- tially evaluates the program with the given data value as partial input, thus generating a new program with 2 − 1 = 1 (or more generally n − m) inputs preserving the program’s behaviour. Or as Neil Jones puts it “A program specializer is given an S-program p together with part of its in- put data, s. Its effect is to construct a T-program which, when given p’s remaining input d, will yield the same result that p would have produced given both inputs.” A specialiser may involve two languages: the language of the source and target program (S and T in the quote above) and the implementation language of the specialiser. Often, for practical purposes, source and target languages are identical. The main purpose of a spe- cialiser is to achieve a faster programs in certain situations where some input is known.
To be able to be more precise (and formal) about programs that take other programs (in various languages) as input, we need to say what the semantics of a programming language is in general. We already said what the semantics of WHILE (programs) is. This can be generalised now:
Definition A programming language L consists of
1. two sets: L-programs (the set of L-programs) and L-data (the set of data
values described by the datatype used by this language)2. L
2. A function : L-programs → (L-data → L-data⊥) which maps L- programs into their semantic behaviour, namely a partial function map- ping inputs to outputs, which are both in L-data.
Definition A programming language L defined as above has pairing if its data type, L-data, permits the encoding of pairs. For a general (unknown) language that has pairing we denote pairs (a, b), i.e. using parenthesis and a comma.
From Lecture 3 we recall that WHILE has pairing. A pair (a, b) can be encoded as [a, b].
Definition A programming language L defined as above has programs as data if its data type, L-data, permits the encoding of L-programs. For a general
2Again, we make some simplifying assumptions here in the sense that we only have one datatype. We talk about untyped languages so it makes sense to have just one type.
8
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
(unknown) language that has programs as data the encoding of a program p is denoted p.
Definition Assume S has programs as data, S-data ⊆ L-data and L has pairing. An interpreter int for a language S written in L must fulfil the following equation for any given S-program p and d ∈ S-data:
int (p, d) = p (d) (1)
In order to be able to write compilers, interpreters, and specialisers in WHILE, we need a representation of WHILE-programs as data objects (“programs-as- data”). Since the WHILE-datatype of binary trees is not ideal for representing strings, it makes sense to represent programs directly as abstract syntax trees (a concept known from compilation). Such trees are usually obtained by the so-called lexical analysis phase that takes the string of ASCII symbols and rep- resent the structure of the program as abstract syntax tree (for compilation or interpretation purposes). In a way, by directly using abstract syntax trees as encodings, we dodge all issues of lexical analysis which in effect makes it even easier to use programs-as-data.
Since our binary trees have no labels for (inner) nodes we represent abstract syntax trees (ASTs) as lists. A tree of the form Node(op, t1, . . . , tn) where ti are the children (subtrees) of the node op is represented as a list the first element of which is a “label” representing op followed by the encoding of the n subtrees, which, in turn, have to be encoded the same way. Leaves for this abstract syntax tree are operations without arguments and the only ones of this kind we have are variables and literals.
The labels representing the type of operation are special atoms: var, cons, :=, while, if, quote, tl, hd, which could be added to the underlying type of binary trees or simply encoded as numbers. In hwhile, our interpreter the latter is the case. Note that in hwhile you also have to precede the atoms by the at-symbol @. There is no danger of confusing those numbers representing labels with “proper numbers” as the latter would be literals and they would not occur in the position where the operation labels occur (namely always as first elements of lists).
The only thing we need to sort out yet for the representation of programs as data is the encoding of variables. In the concrete WHILE-syntax they are identifiers which are difficult to represent (again this is as complex as encoding strings). We use numbers (unary representation) to encode variable names. This encoding is produced by the map varnum : VariableName → N for which it holds that varnumX = varnumY implies that X = Y. In other words, varnum uniquely encode variable names. The concrete numbers are actually not important, it is only important that the same variable is encoded by the same number. The i-th newly introduced/used variable is encoded by the representation of number i.
The representation p of a (core) WHILE-program p as data can now be
9
LS
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
defined by the map : WHILE−programs → WHILE−data as follows:
progname read X S write Y =
while E B = X := E = if E BT else BE = if E B =
{C1;C2;…;Cn } =
nil = X = cons E F = hd E = tl E =
The WHILE-program p below
p read X {
Y:= nil;
while X {
Y:= cons hd X Y;
X:= tl X
}
} write Y
varnum X , S, varnum Y
[ while, E, B ]
[ :=, varnumX, E ] [ if, E, BT, BE ] [if,E,B,[]]
[C1,C2,…,Cn]
[ quote, nil ]
[ var, varnum X ] [ cons, E, F ] [hd,E] [tl,E]
can be represented a data object in D = WHILE-data, i.e. as a list encoding such an abstract syntax tree, as shown below. Indentation is used to highlight the structure of the AST.
[0,
[[:=,1,[quote,nil]],
[while,[var,0],
[ [:=,1,[cons,[hd,[var,0]],[var,1]]],
[:=,0,[tl,[var,0]]]
]
]], 1]
The “tags” for commands and expressions, like := and var, are the atoms introduced earlier. They can also be considered encodings of numbers, but we do not unfold those here, to keep the AST readable. Of course, the lists on the right hand side of the definition can themselves be encoded again as “pure” trees in D according to our definitions in Chapter 3. But using the encoded version of lists and numbers (and extra atoms) makes everything more readable
10
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
and supports the intended reading of those trees as abstract syntax trees of WHILE-programs.
Note that quote could also be used to deal with the encoding of tree literals as a further extension (not part of the original “core” WHILE- language, see exercises).
7 A Self-Interpreter for WHILE
We already know what an interpreter is. In this section we will focus on an interpreter for WHILE written in WHILE itself. Why is this interesting? From a pragmatic point of view it is certainly desirable to have an interpreter for WHILE. That we can write one in WHILE itself is evidence that our WHILE-language was an acceptable choice for “effective procedures”. Or as Neil Jones in his book put it: “The WHILE language seems to have just the right mix of expressive power and simplicity.” [Preface, Page X].
But there is further good use in computability and complexity for such a self-interpreter (and for “efficient” self-interpreters). This will become clear in future chapters. Historically, the first person to define a self-interpreter was Alan Turing. His famous paper contained a Turing Machine that can take the description of another Turing machine and an input word, and simulate the run of the given machine on the given input. Turing called this machine universal for obvious reasons. Self-interpreters are therefore often called universal programs.
In the following a possible implementation of the self-interpreter for WHILE is presented in detail.
7.1 A self-interpreter for WHILE-programs with one vari- able
First we consider a slightly simpler language than WHILE, called WH1LE. Definition This language WH1LE is like WHILE but programs must only use one
program variable. The number 1 alludes to this fact.
This means, of course, that input and output variable must also be this variable. Note that one can compile any WHILE-program into a WH1LE-program by “storing” the values of all variables in a list which is stored in the single variable one is allowed to use in a WH1LE-program. But this is not our concern now. The reason for using WH1LE first is that the memory handling is much easier so one can see much better how the self-interpreter works.
The interpreter for WH1LE written in WHILE has the following main structure:
read PD {
P := hd PD ;
D := hd tl PD;
B := hd tl P;
CSt := B;
(* input is a list [P,D] *)
(* P = [X,B,X] *)
(* D input data *)
(* B is program block *)
(* CSt is code stack *)
11
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
(* initially commands of B *)
DSt := nil; (* DSt is computation stack for *)
(* intermediate results *)
val := D; (* D is initial value of variable *)
state := [ CSt, DSt, val ];
(* wrap up state for STEP macro *)
while CSt { (* main loop for interpretation *)
state :=
CSt := hd state
}
val := hd tl tl state
}
write val
(* get command stack *)
(* get final value of variable *)
(* return value of the one variable *)
Since interpretation of commands and evaluation of expressions has to be broken down in its constituent sub-expression and sub-commands we need a stack to store the parts of the AST that still needs to be visited (evaluated) and, at the same time, store the intermediate results of what we have already computed. We thus need more atoms, namely the following auxiliary atoms: doHd, doTl, doCons, doAsgn, doIf, and doWhile. With their help one can define the STEP macro.
The interpreter needs to traverse the abstract syntax tree and produce in- termediate results “on the fly”. Those intermediate results will be stored on the value stack (St). The template of the traversal algorithm is as follows:
initialise tree and value stack to be empty
push tree (to be traversed) on tree stack
while tree stack not empty do
pop a tree t from tree stack
if t is just an opcode o with arity n then //auxiliary marker encountered
pop n results r1,…,rn from value stack
r := o(r1,…,rn) // compute intermediate result
push r on value stack
else // t is proper tree and not auxiliary marker
if t’s top level operation has n arguments then
push t’s top level opcode as auxiliary marker on tree stack
push n subtrees of t on tree stack
// these subtrees are still to be evaluated/traversed
else // o is a leaf
compute result of leaf and push on value stack
The loop of this template is already given in the outline of the interpreter above. It remains to define the STEP macro. For the sake of readability, let us present the behaviour of STEP as a set of rewrite rules of the form
[CSt, DSt, val] ⇒ [CStnew , DStnew , valnew ]
12
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
that describe how the data structures of the traversal DSt, the value stack, CSt the code stack, and val the “store” for execution that consists of just one variable, change according to their current values. The STEP macro needs to behave as prescribed by the following rewrite rules. You find nice diagrams in the slides for Lecture 7 that explain the explicit shape of the stacks in each case.
Expressions
quote The evaluation of a quoted literal returns the literal as value which is therefore pushed on the result stack. The quote expression is popped from the code stack (as it has been dealt with).
var The evaluation of a variable (recall we have only one in language WH1LE) returns the value of this one variable (which we kept in val ) and this is pushed onto the result stack. The variable expression is popped from the command stack.
hd To evaluate an expression of the form hd E we first need to evaluate E. So a marker dohd is pushed onto the command stack, followed by the still to be evaluated expression E. Nothing happens on the value stack as no value is produced nor consumed at this stage.
doHd Once a marker doHd is found on top of the command stack we know that we can finish the computation of a hd expression. We also know that the evaluated argument has been pushed on top of the result stack. So we pop this value T off the result stack, compute hd T and push its result on the result stack. Since we have now dealt with the hd expression we pop the marker off the command stack.
tl To evaluate an expression of the form tl E we first need to evaluate E. So a marker doTl is pushed onto the command stack, followed by the still to be evaluated expression E. Nothing happens on the value stack as no value is produced nor consumed at this stage.
doTl Once a marker doTl is found on top of the command stack we know that we can finish the computation of a tl expression. We also know that the evaluated argument has been pushed on top of the result stack. So we pop this value T off the result stack, compute tl T and push its result on the result stack. Since we have now dealt with the tl expression we pop the marker off the command stack.
cons To evaluate an expression of the form cons E1 E2 we first need to evaluate expressions E1 and E2. So a marker doCons is pushed onto the command stack, followed by the still to be evaluated expressions E1 and E2 (on top). Note that the order in which we push both arguments is arbitrary in principle but it is important that the rule for doCons knows the order as it has to build the resulting tree correctly. Nothing happens on the value stack as no value is produced nor consumed at this stage.
13
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
doCons
Once a marker doCons is found on top of the command stack we know that we can finish the computation of a cons expression. We also know that two evaluated arguments have been pushed on top of the result stack. So we pop both values u and v off the data stack, construct the corresponding tree ⟨ u.v ⟩ which denotes the result of the cons operation according to the semantics and push its result on the data stack. Here is where we need to know that the first argument was pushed first in order to produce the right result.
Commands (statement lists)
:= To interpret an assignment of the form [ :=, X, E ] we first need to evaluate the expression E. So we push first a marker doAsgn on the code stack followed by the still to be evaluated E. Nothing happens on the value stack as no value is produced nor consumed at this stage.
doAsgn Once a marker doAsgn is found on top of the command stack, we know that we can finish the interpretation of an assignment command. We also know that the value to be assigned has been pushed onto the data stack. So we pop this value v off the data stack and update the “memory”, in this case the value for the one variable of the program. Since we have now dealt with the assignment we pop the marker off the code stack.
if For the conditional [ if, E, BT , BE ] we first need to evaluate the expres- sions E like for the assignment. We thus push it onto the command stack, but not before we pushed a marker to recall that we still have to finish the conditional, doIf and the entire conditional itself. In fact, it would suffice to just push both blocks for the then-case BT , and the else-case, BE, respectively. This is necessary to decide what branch to execute once we know the value of E.
doIf In this case we inspect the top element of the data stack which is supposed to be the value of E and act accordingly. If the top element represents true, i.e. is a non-nil value, we pop the marker and the conditional and push the commands of the then-block BT onto the command stack. We also pop the top element, the used value of E, from the data stack. In case the top element of the data stack is nil, representing false, we analogously pop marker and conditional and push the commands of the else-block BE instead. Again we pop the top element off the data stack. It is important to notice that one does not push the blocks BT and BE, respectively, in their entirety as a list but rather pushes all commands of the corresponding list in the correct order. This means that we push the last command first and the first command last, such that the first command of the block will now be executed first.
while To interpret a while loop [while, E, B], we first need to evaluate the guard expression E. Since we might need to evaluate the body of this loop for a
14
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
doWhile
yet unknown number of times we first push the entire while command on the code stack, followed by a marker doWhile, followed by the expression E still to be evaluated. Nothing happens on the value stack as no value is produced nor consumed at this stage.
Once a marker doWhile is found on top of the command stack we know that we have finished the evaluation of the guard expression and can decide what to do next accordingly. We also know that the value of the guard expression has been pushed on top of the data stack. In case it is nil, we pop from the result stack. Since this means we terminate the loop we also pop the marker as well as the pushed while loop off the command stack as the interpretation of this loop is now finished. In case it is different from nil, i.e. of the form ⟨ u.v ⟩, we pop it off the data stack. Since this means we need to execute the loop body another time, we pop the marker doWhile and push in exchange the body B onto the code stack. Since B is a block it is of the form [C1,C2,…,Cn] and we push the individual elements Ci of this list accordingly, making sure that the first one ends up on the top of the code stack as it is the first one to be executed. We leave the entire while loop on the code stack “below the loop body B”, so that the process of executing the loop can begin from the top, once the execution of block B has finished and it has disappeared from the top of the code stack.
A self-interpreter for WHILE
7.2
Now the only task that remains for the self-interpreter for WHILEis to explain how we can generalise from one program variable to many. The idea is simply to implement the variable store as a list of values. The current value of the first variable will be the first element of this list. The value of the second variable will be the second element of the list and so on. We adapt the main program for the self interpreter as follows where we make use of functions update and lookup that deal with variables and will be explained further below.
read PD {
P := hd PD ;
D := hd tl PD;
X := hd P;
Y := hd tl tl P;
B := hd tl P;
CSt := B;
DSt := nil;
bind := [ X, D ];
St := [ bind ];
state := [ CSt, DSt, St ];(* wrap state for STEP macro *)
while CSt { (* main loop for interpretation *)
15
(*inputisalist[P,D]*) (* P = [X,B,Y] *)
(* D input data *)
(* X is input var name *)
(* Y is output var name *)
(* B is program code block *)
(* CSt is code stack *)
(* initially contains only B *)
(* DSt is data stack for *)
(* intermediate results *)
(* initialise store *)
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
state :=
CSt := hd state
};
St := hd tl tl state;
arg := [ Y, St ];
Out :=
}
write Out
(* get command stack *)
(* get final store *)
(* wrap argument for lookup *)
(* lookup output variable value *)
(* return value of result variable *)
The STEP macro (now called STEPn) needs to be changed only at the places where variables are concerned, the evaluation of variables and the assignment command. Those cases must now look as follows: Let us have a look at the changes we needed to make because variables now come with a number in the form [var,X]. For the evaluation of such a variable we now need to find the right element in the list [ [X1, v1], [X2, v2], . . . , [Xn, vn] ] which will be stored in variable St. To obtain the value of variable X we must find the binding for X in the store list and then use the corresponding value v from binding [X, v]. To achieve this we will use macro lookup (the code of which can be found on Canvas). As before, the corresponding value v is then pushed on the data stack and the variable expression popped off the command stack.
For an AST representing an assignment [:=, X, E] we again push the marker and the still to be evaluated expression E but we also need to remember which variable this assignment is for so we push the variable number X onto the command stack before we also push the marker. Once we then hit the doAsgn marker on the command stack, we know the value of the assignment expression has been evaluated and resides on the top of the data stack.
We pop the marker doAsgn and the following variable number X off the command stack, pop the result value v off the data stack, and update the element that corresponds to variable X in the list [[X1,v1],[X2,v2],…,[Xn,vn]] to contain v in the binding for X. This will be done in code using the auxiliary macro update. The code for the lookup and update, respectively, is based on simple list manipulation checking for the right binding using the atom encoding the variable name. Their code can be found on the Canvas site.
8 Our first non-computable (undecidable) prob- lem
After having decided to use WHILE as our notion of “effective procedure” and after having discussed WHILE at great length it is now time to investigate our first major result of computability, and our first “interesting” problem. More precisely, a problem that is undecidable, i.e. that no computer program can decide. That means we cannot write a (WHILE) program that computes the decision procedure for this problem.
16
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
8.1 WHILE-computability
First of all, we formally define what computable means in WHILE. We do this
for functions and for sets (problems).
Definition A partial function f : D → D⊥ is WHILE-computable if there is a
WHILE
WHILE-program p such that f = p
semantics of p (we can also say “if p implements f”). The slogan here is:
, in other words if f is equal to the “A WHILE-computable (partial) function on trees is one that can be
implemented in WHILE.”
As we can encode numbers in WHILE’s data type, we know that a WHILE- computable partial function on natural numbers is one that can be implemented in WHILE. This is important when comparing WHILE-computability with other notions of computation that use the natural numbers as data type (this is quite common).
8.2 WHILE-decidability
A set A ⊂ D of trees is another way to describe a decision problem on trees.
The decision problem for A is, of course, the following: Is a given element d ∈ D also in A?
Or in other words: Given d ∈ D, does d ∈ A hold? Note how this is a problem in our sense: it is uniform, the input type is well defined (D) as well as the output type which is yes or no (one could also say: true or false). Elements of each of those types are finite and precisely describable.
We now define three properties of sets (problems) that are of great impor- tance in computability theory:
Definition A set A ⊆ D is WHILE-decidable if, and only if, there is a WHILE- WHILE WHILE
program p such that p (d)↓ (meaning p (d) is defined) for all d and WHILE
e in D, and, moreover, d ∈ A if, and only if, p (d) = true. 8.3 The Halting Problem (for WHILE)
Let us look at the following problem which is about WHILE-programs (repre- sented as elements of D, we already know how to encode programs as data so this is not a problem).
The Halting Problem – as set HALT ⊆ D – is defined as follows: Definition The Halting problem – as set HALT ⊆ D – is defined as follows:
WHILE
HALT={[p,d]∈D| p
What does the problem say informally speaking, given as a class of (uniform)
questions? Well, the question is this:
17
(d)↓}
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
“Given a WHILE-program p (as data) and a value d ∈ D, does program p terminate if we run it on input d?”
Note again the uniformity, the answer must be given for an arbitrary program and data. We also have a problem about two pieces of information here, the program and its input, but we can “pair them” together into one input tree in the way we already encountered for other WHILE-programs using the encoding of pairing (p,d) as list [p,d] which can be itself encoded in D. We drop the encoding brackets from the list to prove readability.
We will show in this section that the set HALT is not WHILE-decidable in the sense explained earlier. This equivalent to showing that the function halt defined below is not computable by a WHILE-program. In other words, the (total) function
halt(a) =
true if a = [p,d] and p false otherwise
WHILE
(d) ↓
(which is the membership test for the Halting set HALT) is not computable by a WHILE-program.
It should be clear that it would be really useful if we could compute/solve this problem. A compiler could then check if we had inadvertently written a non-terminating program for some specific input and warn us like a type checker warns us about incompatible types when we have used expressions in an ill-typed way.
Note that it is important to be able to recognise undecidable problems. You don’t want to start working on a project that seeks to develop a program that computes the solution for a non-computable problem. You’d like to avoid wasting your time on that. Note that there is an abundance of undecidable problems and we will soon present more of them. You will learn how some of them can be easily recognised. In general detecting that a problem is decidable or undecidable is not so easy.
It should be pointed out that there is still a considerable amount of research going on how to find out whether a program might not terminate for some input. Termination checkers have been recognised as a useful tool even if they cannot work for all input due to the undecidability of the Halting problem. The proof that a program terminates might be particularly important for systems code (operating system). Thus, there have been several projects about this topic at Microsoft Research.
8.4 Diagonalisation and the Barber’s Paradox
In the proof of the undecidability of the Halting Problem we will use an old idea first discovered by Georg Cantor3 in 1891, called diagonalisation. Let us first look at the diagonalisation in another context. The quite famous “barber
3Georg Ferdinand Ludwig Philipp Cantor (March 3, 1845 January 6, 1918) was a German mathematician. He is considered the inventor of set theory. For his revolutionary ideas he was as heavily critiqued during his lifetime as he is applauded now.
18
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
paradox”4 which is a version of a paradox found by the famous Welsh logician Bertrand Russell (1872–1970).
The Barber of Seville says
“In my town Seville, I shave all men who do not shave themselves. Those who shave themselves I do not shave.”
Now we can ask:
“Does the barber shave himself?”
Let us first remind ourselves that the Barber of Seville is a man and lives in Seville. Then let’s look at this question and assume for the time being the answer was “yes”. That means the barber shaves himself, but he stated that he does not shave the men in Seville who shave themselves. So since he is a man from Seville he does not shave himself. But then he stated that he shaves the men in Seville who do not shave themselves, so he must shave himself, and so on and so on.
Similarly, if we assume the answer is “no” we know that the barber does not shave himself but he stated that he shaves all men in Seville who do not shave themselves and he is a man living in Seville, so he must shave himself but then he said he does not shave those who shave themselves, so he must shave himself and so on and so on.
We see quickly that there is no answer to this question. The question is paradoxical. It does not make sense.5
We can visualise this by a two-dimensional table where on the x-axis and y-axis we list the men of Seville. We can clearly do this. We use numbers to represent them. In each cell at row x and column y we then write whether person number x shaves person number y. So, for instance for row 2 and column 3 we have ‘yes’ meaning that person number 2 shaves person number 3. Below is (part of) such a sample table. Note that the entries are arbitrarily chosen in this example.
4Strictly speaking, this is not a paradox as it can be resolved without violating any other fundamental laws of mathematics or physics.
5Note that raising such a paradoxical question is a typical Sci-Fi plot to overcomes some computer, robot, or probe that is about to destroy the spaceship/earth/galaxy but trying to answer the question it runs into the endless cycle indicated above and eventually explodes. That trick has been used at least several times in the classic Star Trek series.
19
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
# m.i.S.
1 2 3 4 5 6
.
number of man in Seville
1 2 3 4 5 6 … yesyesnoyesnono… nonoyesnonoyes… noyesnononono… yes yes no no yes yes … no no no no no no … yes yes no no no yes …
.
3466 3467 …
3466 yes yes no no yes yes … 3467 no yes yes no yes yes …
. .
no no no yes no yes
yes yes
no … yes … yes … yes … no … yes …
yes … no …
If there seems to be an “awful lot of shaving going on” in Seville, then this is by chance. Please note that it does not matter for demonstrating the argument what the concrete entries are.
Now the question is: “What does the barber’s row look like?” What entries does it have for each column. Assume we look at column y. So we ask whether the barber shaves person number y. Well he does shave y in case that y does not shave y, otherwise he does not shave y. So we have to look into the diagonal of the table at (y, y) and negate the entry. If the entry was ‘yes’ then the entry in column y for the barber must be ‘no’ and vice versa. Therefore the barber’s row is the “negated diagonal” so to speak. For the above table the barber’s row then looks as follows:
barber no yes yes yes yes no … no yes …
But this means the barber’s row cannot be one of the rows in the table. The reason is that the row is different from any other row in the table but we assumed we listed all men in Seville in the table. Why is the barber’s row different from all rows in the table? Well, it is so by construction. Assume without loss of generality, the barber’s row was the one for person number 2. In this case we look at the entry in the 2nd column. The entry for the barber will be ‘no’ if the entry for person number two was ‘yes’ and will be ’yes’ if the entry of person number 2 was ’no’. Remember this is because the barber’s row is the negated diagonal. We can play the same “spiel” for any number, not just 2. Thus for every row k we can find a column (namely the k-th column) where the barber’s row and the k-th row are different. Consequently, the barber cannot be already in the table. This is a problem though as he must be from Seville (by assumption). Thus we get a contradiction (unless we remove the assumption that the barber is from Seville or that he is a man).
Now, the “trick” here is the self reference of the barber’s statement. When he talks about all men in Seville he implicitly includes himself. So the statement becomes self-referential. This is the problem. We try to learn from this and apply the same technique to show the undecidability of the Halting Problem.
20
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
8.5 Proof of the Undecidability of the Halting Problem
We will assume that the Halting Problem is decidable, construct another pro- gram from this and then show using diagonalisation that this leads to a contra- diction. Thus, the assumption that the Halting Problem is decidable must have been wrong. This is called a proof by contradiction.
Let us assume more concretely that the WHILE-program that decides the Halting problem is called h and looks like this
h read A { B;
} write C
where B is the body of the program where all the “action happens” (we don’t need to know what it is exactly). Let us now construct a WHILE-program r that will correspond to the barber of the previous example. This program r is defined as follows:
r read X {
A := [ X, X ]; (* diagonalisation of argument *)
B;
Y := C;
while Y {
Y := Y
} }
write Y
(* run h on input [X, X] *)
(* store result of running h in Y *)
and uses A, B and C from the (assumed) program h above which is important as we will see shortly.
We now ask a question similar to “Does the barber shave himself?”, namely “Does program r terminate when run on itself as input?”. In other words, more concisely, “Does r (r) ↓ hold?” To answer this, let us go through r and interpret it with ( the encoding of) r as input. Let us do this step by step:
1. Initially, we read in the input and afterwards X will have the value r, i.e the AST of r. We now drop the encoding brackets for the sake of readability throughout.
2. After the first assignment of the program, A:= [ X, X ], has been exe- cuted, A has value [ r, r ].
3. Then block B is executed which decides the Halting problem (this was our assumption). So after that , variable C contains the result of the decision
WHILE
procedure, i.e. C = h
4. Next, Y:=C is executed and afterwards Y = h [ r, r ].
21
[r,r] .
WHILE
⃝c Bernhard Reus, Sussex University 2020/21 Limits of Computation
5. For the while loop that follows, we have to distinguish two cases: whether
Y is true or not.
First assume Y is indeed true then we are obviously caught in a non-
terminating loop, so we know that r does not terminate when run on
WHILE
(r) = ⊥. But, hang on a minute, was Y WHILE
input r, in other words, r
not supposed to be true? Indeed it was. In this case Y = h [ r, r ] must be true as well. This in turn means by assumption (h decides the
Halting problem) that r does terminate when run with input r, in other
WHILE WHILE
(r) ↓. This contradicts r (r) = ⊥. WHILE WHILE
words r
Ok, maybe this means Y is false then. In this case r will not go into
the non-terminating loop but instead terminate with result false. Thus,
Y = h
WHILE
[r,r] = false which means by assumption (h decides the
(r) = false or, ignoring the result, r (r) ↓. But, hang on
r
a minute, was Y not supposed to be false? Indeed it was, and therefore
Halting problem) that r does not terminate when run with input r, in WHILE
other words r (r) = ⊥. This contradicts r (r) ↓.
So whichever case we consider, we always get a contradiction.
The diagonalisation in this argument can be shown by writing a table similar to the one for the Barber’s paradox. Instead of the men in Seville, we list all WHILE-programs on both axes. We can do that as we can enumerate (generate) allprogramsandgiveeachanumber.6 Afterall,programsarejustfinitestrings (syntax). Now the entry in the table for row x and column y describes whether program x terminates when run with input being program y. Now r plays the role of the barber here. Its row of entries is again the negated diagonal of the above table, simply by construction. So r cannot appear in the table and thus it cannot be WHILE-program. Yet, we have seen that r is a WHILEprogram provided that WHILE-program h exists. Since the existence of h was the only assumption in our argument, we know that this assumption must be wrong. Thus we know for sure that WHILE-program h cannot exist and thus the Halting Problem is not decidable (by a WHILE-program).
6The enumerability of programs follows from the fact that for any alphabet Σ the set of all finite strings over Σ, typically denoted Σ∗, is enumerable. The latter is a good exercise.
22