COMP4418, 2016 – Assignment 2
Due: Friday, 14 October, 23:59:59 Worth: 1
3
1. [25 Marks] (Planning)
The game of Lights Out consists of a 5×5 grid of lights that can be turned on and o↵. Clicking on a light will toggle both itself and its four adjacent lights. The goal of the game is to switch o↵ all the lights on the grid. A more complete description of the game can be found here:
https://en.wikipedia.org/wiki/Lights Out (game) and can be played on-line here:
http://www.logicgamesonline.com/lightsout/daily.php
Importantly this game can be treated as a planning problem. Your task in this question is to approach a simplified version of the Lights Out game using the delete relaxtion. Consider the following 3 ⇥ 3 Lights Out grid with 4 lights switched o↵ and 5 lights switched on.
123 1
2 3
(a) (2 Marks) We will use the following operator to formalise the planning domain in the classical representation:
Explain why we need separate operators for turning a light on and turning a light o↵. Why can we not express this by a single operator?
(b) (8 Marks) Use the state features off(X,Y) (for light at cell column X and row Y is o↵) to define the preconditions and e↵ects of these two operators.
(c) (2 Marks) Express formally the initial state and the goal of this planning problem.
(d) (4 Marks) The delete relaxation gives rise to a relaxed problem. Give the preconditions and
e↵ects of the operators and the representation of the goal in the relaxed problem.
(e) (9 Marks) For each action applicable in the original problem, determine the resulting state. In each case, give the state feature representation of the state and compute the value of the delete relaxtion heuristic.
name
abbreviation
meaning
turn on(X,Y)
ton(X,Y)
Turn on the light at cell column X and row Y.
turn off(X,Y)
toff(X,Y)
Turn o↵ the light at cell column X and row Y.
1
2. [25 Marks] (Answer Set Programming)
Your task is to use ASP to solve a Lights Out planning problems such as described in question 1.
(a) (5 Marks) First consider the following simplified program:
time(0).
cell(1..2).
on(1,0).
on(X,T+1) :- click(X,T), not on(X,T), cell(X), time(T).
on(X,T+1) :- not click(X,T), on(X,T), cell(X), time(T).
With the additional fact click(2,0) compute all stable models of this program. You must provide your working of how you computed the stable models.
Hint: You will first need to calculate the grounding of the program.
(b) (5 Marks) Now consider a representation for the Lights Out game. The actions of the planning domain can be represented in ASP using the following action name:
Encode the e↵ects of these actions on the status of cells in the grid. To represent that a grid cell is turned on use the predicate on(X,Y,T) to stand for “cell at column X and row Y is on at time T”. For example, the following diagram corresponds to the facts about the following grid at the initial time 0: on(1,1,0), on(2,3,0).
Hint: To encode your action e↵ects you are allowed to introduce auxiliary predicates.
(c) (15 Marks) Using the clauses from (b), write a uniform ASP encoding for solving the planning problem described in question 1 for arbitrary N x N grid sizes and arbitrary initial states. Use the ASP system Clingo to compute answer sets for your ASP program (each answer set will constitute a solution to the planning problem).
Clingo is installed on the CSE computers (as described in the lecture notes) or can be down- loaded from http://potassco.sourceforge.net/.
Your ASP program must be parameterised using the Clingo #const syntax to allow for the size of the grid and the maximum time horizon to be varied. Use the following at the start of your program:
#const size=5.
#const time horizon=10.
To test your program for varying grid sizes, time horizons, and problem instances:
clingo -c size=5 -c time horizon=20 lightsout.asp data.asp
where lightsout.asp provides the uniform encoding of the problem and data.asp provides a particular problem instance.
name
meaning
click(X,Y,T)
Click the cell at column X and row Y at time T.
2
3. [25 Marks] (Situation Calculus and GOLOG)
Consider the following Blocksworld planning problem: There are three tables with blocks a,b and c stacked on the far left table in some configuration. There is a robot arm that can pick up items from each table. The goal of the robot is to build the stack a-b-c on the far right table.
Initial State Goal State Additional assumptions:
• The arm can hold only one block at a time and can only access uncovered blocks.
• Each table has limited space so only a single block can be accommodated directly on a table.
• Unique name axioms are provided to guarantee the uniqueness of objects: a, b, c, arm, table1, table2, and table3.
• Use the following actions:
(a) (4 Marks) Provide a description of the fluents (i.e., state predicates) required to represent the blocksworld domain (similar to the description of the actions above). It may also be useful to define some abbreviations. For example, a goal abbreviation could be defined to capture the desired goal state:
def
Goal(s) = . . . specification of desired stack configuration on table3. . .
(b) (4 Marks) Use the Situation Calculus to specify a precondition axiom for each action.
(c) (5 Marks) Use the Situation Calculus to specify a successor state axiom for each fluent.
(d) (5 Marks) Define a Goal(s) abbreviation to specify the goal state and use this abbreviation to write a GOLOG procedure solve1 that solves the Blocksworld planning problem for arbitrary initial situations.
Hint: Your GOLOG program can be written with the purpose of running it o↵-line.
(e) (7 Marks) Hard question. Write a GOLOG program to solve Blocksworld for arbitrary initial situations consisting of the following procedures:
i. A procedure pop(t,u) that removes the top element o↵ the stack on table t and places it onto the stack on table u.
ii. A procedure move(x,t) that clears any blocks covering x (without changing the stack on table t) and then moves x to the stack on table t. This procedure should use pop(t,u).
iii. A solve procedure solve2 that uses procedures pop(t,u) and/or move(x,t) to achieve the goal.
Hint: This question is hard because it may require a slightly more sophisticated modelling of the fluents for the domain in order to refer to the block at the top of a stack.
pickup(x)
putdown(x,y)
Pickup block x.
Place the currently held block x onto block or table y.
3
4. [25 Marks] (Decision Making)
(a) (3 Marks) For each of the following games, choose the best model among (A) Markov chain (Markov process); (B) Markov decision process (MDP); (C) Hidden Markov model (HMM);
(D) Partially-observable Markov decision process (POMDP); and (E) • Blackjack
• Candy Crush
• Chess
• Minesweeper
• Snakes and Ladders
• Texas Hold ’em Poker
None/Other.
For the next questions, consider the Markov Decision Process depicted below. Edges are labelled “name of the action (reward associated), probability of the transition”.
Leave (5), 0.5
Leave (5), 0.5 Stay ( 2), 1
Stay (1), 1 s1
Leave ( 2), 1
(b) (3 Marks) Using your intuition, give an optimal policy for situations where the discount
factor is very high (for instance, = 0.999)? Explain your reasoning in two or three sentences.
(c) (3 Marks) Using your intuition, give an optimal policy for situations where the discount
factor is very low (for instance, = 0.001)? Explain your reasoning in two or three sentences.
(d) (7 Marks) Represent the values computed during the first three iterations of the Value Itera- tion algorithm using the following format where L represents the action Leave and S represents the action Stay. Use a discounting factor of 0.6.
V0(s) V0(s, S) V0(s, L) V1(s) V1(s, S) V1(s, L) V2(s) V2(s, S) V2(s, L) V3(s)
s1 0 1 … s2 0 …
s3 0
(e) (3 Marks). Let ⇡ be the following policy: ⇡(s1) = L, ⇡(s2) = L, ⇡(s3) = S. If ⇡ is assumed to hold, the MDP turns into a Markov Chain. Represent this Markov Chain / Markov Process.
(f) (6 Marks) Assuming the agent uses ⇡, express the value associated to each state as a function of the discount factor . Provide the formal derivation of the result as part of your answer. Elaborate on whether the computations of this question support the intuition of questions 4b and 4c.
Leave (0), 1
s2 Stay (0), 1 s3
4
Submission
• Put your written solutions to questions 1, 2(a)–(b),3 and 4 in a single PDF file answers.pdf • Put your solution to 2(c) in a file lightsout.asp
Submit using the command:
give cs4418 a2 answers.pdf lightsout.asp
The deadline for this submission is 23:59:59 Friday 14 October.
Late Submissions
In case of late submissions, the maximum available mark is reduced by 10 points for each day late. No extensions will be given for the assignment (except in case of illness or misadventure).
Academic Honesty and Plagiarism
All work submitted for assessment must be your own work. Assignments must be completed individually. We regard copying of assignments, in whole or part, as a very serious o↵ence. Be warned that:
• the submission of work derived from another person, or jointly written with someone else will, at the very least, result in automatic failure for COMP4418 with a mark of zero;
• allowing another student to copy from you will, at the very least, result in a mark of zero for your own assignment; and
• severe or second o↵ences will result in automatic failure, exclusion from the University, and possibly other academic discipline.
Collaborative work in the form of “think tanking” is encouraged, but students are not allowed to derive solutions together as a group during such discussions. Students are also warned not to send solution fragments of the assignments to each other in any form (e.g. as email or listings). In addition, copy- ing/purchasing of solutions that is available on the web is also not permitted. Students are strongly advised to protect their work. Do not leave your terminal/computer unattended, or leave printouts at the printer for others to take. Read the study guide carefully for the rules regarding plagiarism.
5