University of Delaware CISC 108: Introduction to Computer Science I Fall 2016
Goals:
Lab 9
0. Goals and Instructions
• understand more complex self-referential data structures
• construct templates from self-referential type definitions
• design functions that consume data of self-referential types • perform a precise analysis of a simple game
• understand concepts related to Binary Search Trees
Instructions. Please submit all answers via Sakai in a single file named lab9.rkt. Do all work in pairs. Include the names of both partners at the top of the file in a comment. You need only submit one lab per pair.
1. Pick Up Sticks
In the game of sticks there is a heap of sticks on a board. On their turn, each player picks up 1 to 3 sticks. The one who has to pick the final stick will be the loser. The following is an example of the game of sticks.
(1) The game starts with 20 sticks on the board.
(2) Marvin takes 3 sticks, there are 17 sticks remaining. (3) Hal takes 2 sticks, there are 15 sticks remaining.
(4) Marvin takes 1 stick, there are 14 sticks remaining. (5) Hal takes 3 sticks, there are 11 sticks remaining.
(6) Marvin takes 2 sticks, there are 9 sticks remaining. (7) Hal takes 2 sticks, there are 7 sticks remaining.
(8) Marvin takes 3 sticks, there are 4 sticks remaining. (9) Hal takes 1 stick, there are 3 sticks remaining.
(10) Marvin takes 2 sticks, there is 1 stick remaining. (11) Hal has to take the final stick and loses.
Our goal is to analyze this game to determine whether there is a strategy that guarantees a win if you start with a given number of sticks.
We will analyze the game by constructing a game tree. The game tree encodes all possible games that can be played, starting from a specified number of sticks. The idea is to consider all possible moves the first player can make, and then, for each of those choices, all possible moves the second player can make, and for each of those, all possible moves the first player can make, and so on, until all possibilities have been explored.
Say the first player is Maxine (or Max for short) and the second player is Minnie (Min for short). The complete game tree starting from 3 sticks, with Max moving first, is shown in Figure 1. Note that there are two kinds of non-leaf nodes in this graph: “Max nodes” are nodes in which Max is to make the next move and have triangles pointing up, “Min nodes” are nodes in which Min is to make the next move and have downward pointing triangles. There are three kinds of leaf nodes: “Impossible” (a move that can’t happen), “Max wins”, and “Max loses”.
1
2
Max 3
1
23
Min 2
Min 1
Max Loses
Max 1
Impossible
Impossible
1
Max Wins
2
31
Impossible Max Wins
3 Impossible
2
1
Max Loses
3 2
Impossible
Figure 1. Complete game tree starting with 3 sticks. Max moves first. Triangles pointing up indicate a state in which Max is to make the next move; pointing down indicates Min is to move next. The number in the triangle is the number of sticks remaining.
Define the value of a node recursively, as follows: the value of an “Impossible” leaf node is #false. The value of a “Max wins” leaf node is 1, and the value of a “Max Loses” leaf node is -1. The value of a Max node is the maximum of the values of its children (ignoring any #false values). The value of a Min node is the minimum of the values of its children. Figure 2 shows the values of each node of the tree from Figure 1.
What is the interpretation of value? If the value of a Max node is 1, then Max has a guaranteed strategy to win the game. She simply has to choose a move that brings her to a child node with value 1 — there must be such a choice since the maximum value of the children is 1. Since that child node is a Min node with value 1, all of its children must have value 1, so no matter what move Min makes from that point, the result will be a Max node with value 1. Max just has to keep repeating this strategy until she reaches a leaf node with value 1.
If the value of a Max node is -1, then Max may win or lose. But if Min is also playing her optimal strategy, Max will definitely lose.
Similarly, if the value of a Min node is -1, then Min has a guaranteed winning strategy. If the value of a Min node is 1, then Min may win or lose, but will definitely lose if Max is playing her best game.
Figure 2 reveals that Max has a guaranteed winning strategy: she should choose to pick up 2 sticks.
Problem 1. Design a Data Definition for a GameTree [GT]. Make sure to include some GameTree examples, and the correct Template for a gt-fun. Notice that each non-leaf node has three children (corresponding to take1, take2, and take3 sticks) and that there are three base cases (Max wins, Max loses, impossible). Be sure to encode whether a non-leaf node is a Max node or a Min node.
3
1
1 1
3 2
-1 1
2
312
-1
3
#false
1
3 2
-1
#false
1
#false
1 -1
#false #false
Figure 2. Values of each node from the tree of Figure 1. The value of a Max node is computed by taking the maximum of the values of its children; the value of a Min node is computed by taking the minimum values of its children.
Problem 2. Design the function evaluate, which consumes a GameTree, and produces the value of that tree, which is either 1, -1, or #false.
Problem 3. Design the function build-game-tree, that consumes the number of sticks, and produces the game tree (starting with Max) that corresponds to that many sticks. Recall that building a tree has a template shape corresponding to the data definition. It is probably easier to think of building the tree starting with Max, and building starting with Min, and bounce between the two (you can define them either locally or globally). For example, building a tree starting at Max with 3 sticks (as in Figure 1) means creating a game tree node with three children: one a Min tree with 2 sticks, one a Min tree with 1 stick, and one base case “Max loses”.
Problem 4. Design the function max-winning-move which consumes the number of sticks. If Max has a guaranteed winning strategy starting from this number, it returns a move (1, 2, or 3) that Max should make to ensure her victory. Otherwise, it returns #false.
Problem 5. Many people say that the sticks game is always won by the first player. To investigate this hypothesis, design a function first-player-wins that consumes a positive integer n and returns a list of Booleans of length n. The first element of that list is #true if the first player has a guaranteed winning strategy starting from 1 stick, else it is #false. The second element is #true if the first player has a guaranteed winning strategy starting from 2 sticks, and so on, up to n sticks.
Apply your function to n = 25 and examine the result. Is it the case that the first player always has a guaranteed winning strategy? If not, find the general pattern: for which starting numbers does the first player not have a guaranteed winning strategy? Show your result and give your answer/explanation in a comment at the end of your solution to this problem.
4
2. Binary Search Trees
Problem 6. For this and the following problems, we will consider binary trees in which each node contains just one piece of data, a single number, called the value. Provide a data definition for this kind of binary tree, including a template for a function consuming it. Next, provide a data definition for a binary search tree.
Problem 7. Design a function bst? which consumes anything and determines whether that thing is a binary search tree. Make it efficient! A tree with 10,000 nodes should not be a problem.
Problem 8. Design a function random-bst to produce a random BST of a specified size. It should start with an empty tree and repeatedly add a randomly-chosen number to the tree in such a way that it remains a BST at each step. This function consumes two natural numbers: (1) num, the number of nodes in the tree that it returns, and (2) max, one more than the maximum value a node may have. The function returns a binary search tree with num nodes, and with values in the range [0, max − 1].
Your unit tests for this function should check that (1) the result is a binary search tree, (2) the number of nodes is num, and (3) the node values are all in [0, max − 1]. Define auxiliary functions as needed to help check these conditions. [Note: you may want to look up check-satisifed.]
Your function should also be reasonably efficient and capable of generating binary search trees of size 100,000 in at most a few seconds. (However, the tests can be on much smaller trees.)
Hint: Design a function to create a binary search tree. Use it in combination with some list- abstraction function(s).
2.1. Measuring balance. How balanced is a random BST? The easiest way to measure a tree’s balance is to look at its height.
The height of a tree is the maximum number of edges occurring in a path from the root node to a leaf node. For example, the height of a tree with one node is 0. The height of a tree with two nodes is 1. The height of a tree with 3 nodes in which the root has two children is also 1. An empty tree is considered to have height −1.
Given some fixed number of nodes, the smaller the height, the more balanced the tree. If the number of nodes is n, the smallest height possible is ⌊log2(n)⌋: that is the log base-2 of n rounded down to the nearest integer. For example if n = 8, then the smallest height possible is ⌊log2(8)⌋ = 3. In the worst case, a binary tree with n nodes can have height n − 1 (every node has at most one child). So the height of a binary tree with n nodes is in the interval [⌊log2(n)⌋,n−1]. We recommend that you draw a few trees and test this formula to convince yourself that it is true.
In general, if you have numbers in an interval [a, b], you can normalize those values so that they all fall into the range [0, 1]. Given x in [a, b] the normalized value is (x − a)/(b − a). The case a = b is special and in this case we define the normalized value to be 0.
Problem 9. Design a function height which consumes a binary tree and returns its height.
Problem 10. Design a function balance which consumes a non-empty binary tree and returns its balance factor. If the number of nodes of the tree is n and the height is h, the balance factor
5 is obtained by taking the height of the tree (a number in [⌊log2(n)⌋, n − 1]), and normalizing it to
get a value in the range [0, 1].
Problem 11. Run your balance function on 5 random binary search trees of size 100,000, with max 10,000,000. Report your results in a comment, and answer the question: how balanced is a random binary search tree?