Instructions:
Please attach this sheet to the front of your assignment.
• This assignment contributes 20% towards your final grade. The total mark is 100.
• Due by … p.m. on Friday, 3 May 2019. The submission requirements are specified at the end of each option.
• You can team up with another student.
Option A: Knowledge and Rational Agents
The aim of this assignment is for you to gain a better understanding of knowledge and rationality in building intelligent agents. During Workshop 1, you were given the opportunity to play a game named “Win an additional mark if U can!”. The same game was played in Workshop 2.
All the participants were presented with an online form like this:
Then about 5-10 minutes were given for thinking and submitting the answers. This is a variant of Keynes’ Beauty Contest Game discussed during Lecture 2. Your tasks are as follows.
Task 1 (12 Marks)
Here is a simplified version of the game “Win an additional mark if U can!”.
• There are two players.
• Each player names an integer between 1 and n.
• The player who names the smaller integer gets a reward of 2; the other player gets 0.
• If there is a tie, each player gets a reward of 1.
• Represent this game in Normal Form. You can select any n ≥ 5. (5 marks)
• Which pure strategy should each of the two rational players choose? Show your argument. (5 marks)
(Hint: Your selection of strategy should be obtained by applying the concept of dominated strategies to rule out a succession of inferior strategies until only one choice remains. A generic sketch of this elimination procedure is given after this list.)
• Explain why the above strategy profile is indeed a Nash equilibrium. (2 marks)
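To illustrate the technique mentioned in the hint (this is a generic sketch only, not a solution to the task; the example payoff matrices are made up and unrelated to this game), iterated elimination of weakly dominated pure strategies could be coded along the following lines:

# Generic sketch of iterated elimination of (weakly) dominated strategies for a
# two-player game in normal form. The example payoff matrices at the bottom are
# made up for illustration and are unrelated to the game in this task.

def dominates(u, s, t, opponent):
    """True if strategy s weakly dominates strategy t, where u(own, other) is the
    payoff to the player who owns s and t, against every surviving opponent strategy."""
    return (all(u(s, j) >= u(t, j) for j in opponent)
            and any(u(s, j) > u(t, j) for j in opponent))

def iterated_elimination(u1, u2):
    """u1[i][j], u2[i][j]: payoffs to players 1 and 2 when player 1 plays row i and
    player 2 plays column j. Returns the strategy indices surviving the elimination."""
    S1, S2 = set(range(len(u1))), set(range(len(u1[0])))
    changed = True
    while changed:
        changed = False
        for t in list(S1):                     # player 1's strategies
            if any(dominates(lambda own, other: u1[own][other], s, t, S2)
                   for s in S1 if s != t):
                S1.remove(t)
                changed = True
        for t in list(S2):                     # player 2's strategies
            if any(dominates(lambda own, other: u2[other][own], s, t, S1)
                   for s in S2 if s != t):
                S2.remove(t)
                changed = True
    return S1, S2

# Example game: player 1 chooses a row (0 or 1), player 2 a column (0, 1, or 2).
u1 = [[1, 1, 0],
      [0, 0, 2]]
u2 = [[0, 2, 1],
      [3, 1, 0]]
print(iterated_elimination(u1, u2))            # only row 0 and column 1 survive

In this made-up example, only row 0 for player 1 and column 1 for player 2 survive the succession of eliminations.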
Task 2 (88 Marks)
There are 3 labs (A, B, and C, in chronological order). In Workshop 1, the participants in B and C were informed of the winners' choices from the earlier sessions. In Workshop 2, the tutor publicly suggested what he thought the average would be for each lab: 30 for A, 20 for B, and 10 for C.
We have made a data file of all results by replacing the names and student IDs with pseudo IDs (letters followed by a number). Visit this link for the online version:
http://bit.ly/2Y9X9rs
The same game was played last year in three labs with a similar setting, and here is the data file (there was no second play):
https://bit.ly/2HQDDMi
Note that under the [File] menu, the file can be downloaded for offline access.
Your task is to study these two data files and write a report of 5 pages maximum. You need to discuss at least the following points:
• Discuss what rationality is. Did the winners win by rational thinking, pure luck, or a combination of both? Your argument should be supported by evidence from the data files. (11 marks)
• Give a classification of players in terms of their levels of rationality and their knowledge about others' rationality. A good classification scheme should cover at least 3 classes. For each class, give clear criteria so that each row in the data file can be classified.
To support your scheme, you should use the reasons given by the players (but be mindful that the players do not necessarily articulate their reasons very clearly). (22 marks)
• Across all plays of the same week in 2018 and 2019, Lab B's winning number is lower than Lab A's, and Lab C's winning number is lower than Lab B's. Explain the possible reasons for this phenomenon using the data sets.
Compare the differences between the 2018 and 2019 results.
Discuss whether the numbers suggested by the tutor may have affected the results. (40 marks)
• What is the pure-strategy Nash equilibrium of this game, assuming perfect rationality of all agents? None of the lab winners chose this strategy. If we played this game repeatedly, what would be likely to happen? Note that in 2019, each lab played the game a second time.
Design an experiment to test/verify your reasons. Recruit some friends or family members to play in your experiment (with no fewer than 6 players). (15 marks)
Hint: you will need to think about what reward to offer; e.g., we used additional marks for the labs.
Extra marks (up to 10) will be given if you are able to do something extra, e.g., visualising the data in an interesting and meaningful way when you discuss (b), or doing extensive experiments to verify your prediction when you discuss (d). These extra marks can make up for losses in other parts of this assignment. The total mark is capped at 100.
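Purely as an illustrative starting point (not a required approach), a short script along the following lines could summarise and visualise the guesses per lab from an offline export of the data; the file name results_2019.csv and the column names Lab and Number are hypothetical and should be replaced by whatever the actual export contains.

# Hypothetical sketch: summarise and plot the guesses per lab from a CSV export
# of the shared data file. The file name and column names are assumptions.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("results_2019.csv")              # offline export of the data file
summary = df.groupby("Lab")["Number"].agg(["count", "mean", "min"])
print(summary)                                    # per-lab count, average and minimum guess

df.boxplot(column="Number", by="Lab")             # distribution of guesses per lab
plt.savefig("guesses_by_lab.png")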
Submission
You can team up with a classmate to work on this assignment. One submission per team.
You are required to provide:
• Your solution to Task 1, and
• Your report for Task 2. (Limited to 5 pages)
We prefer that both (1) and (2) are included in a single PDF file. If you have extra data, your submission should include everything in a zip file with the following name format: AI2019A1A-XY.zip, where XY is your student ID(s). Submission is via Blackboard.
Option B: Tiger vs. Dogs in GDL
The aim of this assignment is to develop your ability to formalise abstract concepts and to enhance your understanding of logical reasoning in General Game Playing.
Game Description
Here is a very ancient game that originated in China: Tiger vs. Dogs.
The tiger is controlled by the tiger player and the dogs are controlled by the dog player. The tiger player goes first, and then they take turns. On each turn, a player moves one of their pieces one step along a line to an adjacent position that is not occupied.
When the tiger enters a position where the following condition holds: "two dogs are adjacent to this position such that the three pieces are on the same line, and these two dogs have no other adjacent dogs on that line", then these two dogs are killed by the tiger. If 6 dogs are killed, the tiger player wins and the dog player loses.
When the dogs surround the tiger such that there is no unoccupied adjacent position for the tiger to move to, the tiger player loses and the dog player wins.
Task 1 (20 Marks)
Describe this game mathematically (refer to the lecture slides). Here are some key points:
• The set of game states S. Give your estimate of the number of different states; a small code sketch for sanity-checking such an estimate on the Tic-Tac-Toe example is given after this list. (4 marks)
Hint: taking Tic-Tac-Toe as an example, a state can be described using:
• 9 cell variables, Cell(1,1), …, Cell(3,3), each with possible values x, o, and b (blank), to represent the markings on the board;
• 1 variable of control, with possible values xplayer and oplayer, to represent which player is in control.
e.g., the initial state s0 can be described as (Cell(1,1)=b, Cell(1,2)=b, …, Cell(3,3)=b, control=xplayer).
Note that not every combination of values corresponds to a possible state. For example, the state
(Cell(1,1)=x, Cell(1,2)=x, Cell(1,3)=x, Cell(2,1)=o, Cell(2,2)=o, Cell(2,3)=o, Cell(3,1)=b, Cell(3,2)=b, Cell(3,3)=b, control=xplayer)
is not possible: in it, both xplayer and oplayer have formed a line of their own markings, but once one player manages to get a line of their own marking, the game is over.
• The set of players: {tiger player, dog player}.
• Actions A. You may use abstraction to get as few types of actions as possible. (4 marks)
Hint: use A1 for the tiger player and A2 for the dog player, with A = A1 × A2. Just list all possible actions for each player; e.g., you can use left as one action for the tiger player. Note that for a particular state, not all actions are necessarily possible.
• The transition function S × A → S. (6 marks)
Hint: You can just use the following format: given a state s and the players' joint action (a1, a2), give the resulting state s′.
• Terminal states and utilities. (6 marks)
Hint: When a game terminates, it is either because the tiger is captured or because 6 dogs have been killed. So describe the conditions on a terminal state using the state representation you gave in (a). You may use a few representative states to illustrate.
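As a sanity check on the kind of estimate asked for in the first point above (this is an illustrative sketch for the Tic-Tac-Toe hint only, with an assumed board encoding; it is not part of the required answer for Tiger vs. Dogs), one can enumerate the states actually reachable in play and compare the count with a naive bound such as 3^9 × 2 = 39366 value assignments:

# Illustrative sketch (assumed encoding) for the Tic-Tac-Toe hint in the first point:
# each state is a board of nine cells holding 'x', 'o' or 'b' (blank) plus the player
# in control, and we count the states actually reachable in play, for comparison with
# the naive bound of 3^9 * 2 = 39366 value assignments.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def has_line(board, mark):
    return any(all(board[i] == mark for i in line) for line in LINES)

def count_reachable_states():
    start = (('b',) * 9, 'x')                  # empty board, xplayer in control
    seen = {start}
    frontier = [start]
    while frontier:
        board, control = frontier.pop()
        if has_line(board, 'x') or has_line(board, 'o') or 'b' not in board:
            continue                           # terminal state: no successors
        next_control = 'o' if control == 'x' else 'x'
        for i, cell in enumerate(board):
            if cell == 'b':
                state = (board[:i] + (control,) + board[i + 1:], next_control)
                if state not in seen:
                    seen.add(state)
                    frontier.append(state)
    return len(seen)

print(count_reachable_states())                # a few thousand, far below 39366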
To help you understand this game, here are a few examples:
Example: When the dog player makes the move indicated by the red arrow, the tiger is surrounded.
Example: When the tiger makes the move indicated by the red arrow, the two marked dogs are killed.
Example: When the tiger makes the move indicated by the red arrow, the two marked dogs are killed.
Task 2 (80 Marks)
• Write a game description in GDL (KIF form) for this game. Note that this involves translating the concepts developed in Task 1 into GDL; in particular, the transition function from Task 1 is now succinctly represented as a set of logical rules. Your GDL should be annotated for better readability, especially any new predicates that you have created. Have a clear separation of the following sections: (60 marks)
• Roles and the initial states;
• Legal moves;
• State transitions (next rules);
• Terminal conditions;
• Utilities (goal).
• Test and debug your game description using the game controller (refer to the workshop) to ensure that it runs correctly. (20 marks)
http://sourceforge.net/projects/ggpserver/files/gamecontroller/
In your submission, you need to include at least two execution traces generated by the game controller with both players using the random type.
Extra marks (up to 10) will be given if you are able to do something extra, e.g., classifying and visualising the transition function in 1(d) and the traces in 2(b). These extra marks can make up for losses in other parts of this assignment. The total mark is capped at 100.
Submission
You can team up with a classmate to work on this assignment. One submission per team.
You are required to provide
• Your solution to Task 1,
• Your GDL file for Task 2(a), and
• Your test results for Task 2 (b).
We prefer that (1) and (3) are included in a single PDF file. Your submission should include everything in a zip file with the following name format: AI2019A1B-XY.zip, where XY is your student ID(s). Submission is via Blackboard.
Option C: Game Playing Agent with Minimax
The aim of this assignment is for you to better understand computer game playing by implementing a game-playing agent with the minimax algorithm and an evaluation function.
Task 1 (20 Marks)
Tic-Tac-Toe. This problem exercises the basic concepts of game playing, using Tic-Tac-Toe (Noughts and Crosses) as an example, so that you gain a better understanding of the main method by which Deep Blue beat the best human chess player.
Tic-tac-toe has two players, X and O, who take turns marking the spaces in a 3×3 grid. X player starts first. The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game.
• Estimate the number of possible game plays of Tic-Tac-Toe. One game play is a sequence of moves from the initial state to a terminal state. (3 marks)
• Show the whole game tree starting from an empty board down to depth 2 (i.e. one X and one O on the board), taking symmetry into account. (4 marks)
By symmetry, we mean that if you turn the whole board around (without changing the relative positions of the pieces), you get the same configuration. For example, the following 4 states are the same under symmetry.
They are different from the following state, no matter how you turn the board.
• Mark on your tree the evaluations of all the positions at depth 2. The evaluation function is given as follows. (5 marks)
We define Xn as the number of rows, columns, or diagonals with exactly n X's and no O's. Similarly, On is the number of rows, columns, or diagonals with exactly n O's and no X's. The utility function assigns +10 to any position with X3 ≥ 1 and −10 to any position with O3 ≥ 1. All other terminal positions have utility 0. For the nonterminal positions, we use a linear evaluation function defined as
Eval(s) = 3X2(s) + X1(s) − (3O2(s) + O1(s)).
An illustrative code sketch of this evaluation function is given after this list.
• Using the minimax algorithm, mark on your tree the backed-up values for the positions at depths 1 and 0, and use those values to choose the best starting move. (5 marks)
• Circle the nodes at depth 2 that would not be evaluated if alpha-beta pruning were applied, assuming the nodes are generated in the optimal order for alpha-beta pruning (i.e., the order in which the largest number of nodes can be pruned). (3 marks)
• (Optional) If minimax (or alpha-beta) were to search the entire game tree, it would evaluate all opening moves equally, because it assumes the opponent will play optimally, which leads to a draw in the end no matter what the opening move is. However, if we assume the opponent could make an error, we can see that some opening moves are better than others. Explain why.
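For illustration only (a minimal sketch, not a prescribed implementation), the evaluation function defined above could be computed as follows; the board encoding as a list of nine entries holding 'X', 'O', or None is an assumption made for this sketch.

# Illustrative sketch of the evaluation function Eval(s) = 3*X2 + X1 - (3*O2 + O1).
# The board is assumed to be a list of 9 entries: 'X', 'O', or None for an empty cell.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals

def count_lines(board, player, n):
    """Number of lines with exactly n of player's marks and none of the opponent's."""
    opponent = 'O' if player == 'X' else 'X'
    total = 0
    for line in LINES:
        marks = [board[i] for i in line]
        if marks.count(player) == n and marks.count(opponent) == 0:
            total += 1
    return total

def evaluate(board):
    if count_lines(board, 'X', 3) >= 1:
        return 10                              # terminal: X has completed a line
    if count_lines(board, 'O', 3) >= 1:
        return -10                             # terminal: O has completed a line
    if None not in board:
        return 0                               # terminal: draw on a full board
    return (3 * count_lines(board, 'X', 2) + count_lines(board, 'X', 1)
            - (3 * count_lines(board, 'O', 2) + count_lines(board, 'O', 1)))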
Task 2 (80 Marks)
Consider the Take-away game discussed in the lecture:
• There is a pile of X chips on the table.
• Two players take turns to remove 1, 2, or 3 chips from the table, with player 1 starting first.
• The player removing the last chip(s) wins the game.
In this task, you are asked to develop a game playing agent which can play the Tic-Tac-Toe and the Take-away games against a human player. In particular, you will need to:
• Implement the minimax algorithm in two versions: (i) complete tree search, and (ii) depth-limited tree search with an evaluation function. (50 marks)
For complete tree search, you may use depth-first search (DFS). For depth-limited tree search, you just need to introduce a depth variable; when it decreases to 0, either the evaluation value or the terminal value is returned. For the Tic-Tac-Toe game, the evaluation function is outlined in Task 1. For the Take-away game, you need to come up with your own evaluation function. (A generic sketch of depth-limited minimax for the Take-away game is given after this list.)
Report on the details of your program design and compare the performance of the two versions (e.g., search time per move and memory consumption). Let the two versions compete against each other and report the win-draw-loss rate.
• Implement an intuitive user interface which lets the human player see the game state and the actions available to play. At the start of a game, the human player should be able to choose who plays first and which version of the minimax algorithm to use (in the depth-limited case, allowing the search depth to be set).
Note that for the Take-away game, when the number of chips is large, it might not be practical to display chips visually. In that case, you can use numbers.
Report on the program design.
(15 marks)
• Your agent is resource-bounded, so its responses will become slower as the game gets larger. Design experiments to study the scalability of your game-playing agent using the Take-away game.
Specifically, here are some important aspects to consider:
• Varying the value of X.
E.g., when you have a large value, say X = 1000, the game tree for Take-away will be extremely large, with more than 3^334 leaves (the branching factor is b = 3 (three actions), and the depth of the leaves ranges from 334 to 1000). So one question on scalability will be: what is the maximal value of X that your agent (running on your hardware) can handle, producing a response in a reasonable amount of time, say 100 seconds?
• Extending the number of chips the players are allowed to take away.
E.g., you could add two more actions: taking away 4 chips, taking away 5 chips. This will increase the branching factor to 5.
• Varying the limit of search depth.
Report on your experiment design and results. Your hardware and software configurations should also be noted as they are very relevant to the results.
(15 marks)
• (Optional) Implement the alpha-beta pruning algorithm and compare it with the minimax algorithm. Report on the performance difference. If you do this task, add an alpha-beta option in (b).
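As a generic illustration of the depth-limited minimax described in (a) (a sketch under assumptions, not a prescribed implementation), the following code plays the Take-away game with a deliberately uninformative placeholder evaluation function; designing a sensible evaluation function for Take-away remains part of your task.

# Generic sketch of depth-limited minimax for the Take-away game described above.
# A state is (chips, player), where player is +1 (the maximiser) or -1 (the minimiser).
# The evaluation function below is a deliberately uninformative placeholder; designing
# a sensible one for Take-away is part of the task.

ACTIONS = (1, 2, 3)                # numbers of chips that may be removed per turn

def evaluate(chips, player):
    return 0                       # placeholder heuristic used at depth cut-offs

def minimax(chips, player, depth):
    """Backed-up value of the state, always from the maximiser's point of view."""
    if chips == 0:
        return -10 if player == 1 else 10      # the previous player took the last chip and won
    if depth == 0:
        return evaluate(chips, player)         # depth-limited cut-off
    values = [minimax(chips - a, -player, depth - 1)
              for a in ACTIONS if a <= chips]
    return max(values) if player == 1 else min(values)

# Example: 7 chips, maximiser to move; depth 7 covers the longest possible game,
# so this call behaves like complete tree search.
print(minimax(7, 1, 7))

With a depth no smaller than the maximum game length, the cut-off never triggers and the search is complete, which corresponds to version (i); smaller depths correspond to version (ii).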
Extra marks (up to 10) will be given if you are able to do something extra, e.g., the optional tasks, or adding another game, such as the more complex Nine-Board Tic-Tac-Toe. These extra marks can make up for losses in other parts of this assignment. The total mark is capped at 100.
Submission
You can team up with a classmate to work on this assignment. One submission per team.
You are required to provide:
• Your solution to Task 1.
• A report on your program design in (a) and (b) and your experiments in (c). (Limited to 5 pages)
• The source code and the compiled program of your game playing agent (with notes on any libraries it depends on and instructions on how to run it).
We prefer that both (1) and (2) are included in a single PDF file. Your submission should include everything in a zip file with the following name format: COMP808A1C-XY.zip, where XY is your student ID(s). Submission is via Blackboard.