EECS 3401 — AI and Logic Prog. — Lecture 13
Adapted from slides of Yves Lesperance
Vitaliy Batusov
vbatusov@cse.yorku.ca
York University
November 4, 2020
Vitaliy Batusov vbatusov@cse.yorku.ca (YorkU) EECS 3401 Lecture 13 November 4, 2020 1 / 25
Today: Search Algorithms and Constraint Satisfaction Required reading: Russell & Norvig Chapters 3.6, 4.1, 6
Recap from last time
Searching in a state space for a sequence of transitions taking us from initial state to goal
Frontier: a list of nodes to expand; at the start, it contains just the initial state
Choice of which node to expand defines the search strategy
Formalized as: sort frontier by some criterion and always choose the first node
Blind (uninformed) searches: BFS, DFS, IDS, UC (pros and cons)
Informed search: guided by a heuristic function (domain-specific)
Heuristics: admissible, monotone
Recap: Heuristics and A*
A* Search: sort frontier by f-values (increasing), where
f(n) = g(n) + h(n),
g(n) is the actual cost of reaching n, and h(n) is the heuristic estimate of the cost from n to a goal.
h is admissible if h(n) ≤ h∗(n) for all n
h is monotone if h(n) ≤ c(n → n′) + h(n′) for every edge n → n′ with cost c(n → n′)
Every monotone h is admissible
Monotonicity makes A* amazing (relatively): with a monotone h, the first time A* expands a node, it has already found an optimal path to that node
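As an illustration of the recap, here is a minimal A* sketch using a priority queue ordered by f-values. This is an illustrative sketch, not a reference implementation from the course; the graph encoding (a neighbours function yielding (successor, cost) pairs) is an assumption made for the example. It remembers the cheapest g found per node, so a node is re-explored if a cheaper path to it turns up later.

```python
import heapq

def a_star(start, goal, neighbors, h):
    """A* search: expand nodes in increasing f = g + h order.

    neighbors(n) yields (successor, edge_cost) pairs; h is the heuristic.
    Returns (path, cost), or None if no path to goal exists.
    """
    frontier = [(h(start), 0, start, [start])]   # entries: (f, g, node, path)
    best_g = {start: 0}                          # cheapest g found so far per node
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if g > best_g.get(node, float("inf")):   # stale entry: a cheaper path was found
            continue
        for nxt, cost in neighbors(node):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2                 # remember cost; re-explore if cheaper
                heapq.heappush(frontier, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return None
```

For example, on a 4-node graph with an admissible table heuristic, `a_star` returns the cheapest path even when a short expensive edge to the goal exists.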
A* search with monotonicity
Admissibility without monotonicity
What happens to the properties of A* when h is admissible but not monotonic?
Time and space complexity remain the same
Completeness still holds
Without cycle checking, optimality still holds, but for a different reason
Assume, toward a contradiction, that the goal path ⟨S,…,G⟩ found by A* has cost g(G) greater than the optimal cost C∗. Then there must exist a node n on the optimal path that is still in the frontier. Since h is admissible and g(n) is the cost along the optimal path:
f(n) = g(n) + h(n) ≤ g(n) + h∗(n) = C∗ < g(G) = f(G)
(the last equality holds because h(G) = 0). Since the f-value of n is smaller than that of the goal, n would have been selected for expansion before G, a contradiction.
Admissibility without monotonicity
No longer guaranteed to get an optimal path to a node on first visit. Cycle checking, as defined previously, will not preserve optimality
To fix this, must remember cost of previous path. If new path is cheaper, must explore again.
Monotonic contours no longer exist
Building Heuristics
How to come up with heuristic functions?
A good approach: simplify the problem, let h(n) be the cost of reaching the goal in the simplified version
Example: 8-Puzzle
In the original problem, can move a tile from square A to B if A is adjacent to B and B is empty.
Can relax this in several ways:
1 Ignore whether B is empty
2 Ignore whether A is adjacent to B
3 Ignore all preconditions on actions
Building Heuristics: Relaxing the Problem
“Ignore all preconditions on actions” — known as the misplaced tiles heuristic
To solve the relaxed puzzle, need to move each tile into its final position
Number of required moves = number of misplaced tiles
Let h(n) = (# of misplaced tiles). Clearly, this underestimates the actual cost
“Ignore whether B is empty” — known as the Manhattan distance heuristic
To solve the relaxed puzzle, need to slide each tile (in a sequence of vertical or horizontal steps) to its proper place (different for each tile)
Number of required moves = ∑_{t ∈ Tiles} manhattanDistance(t)
Let h(n) = ∑_{t ∈ Tiles} manhattanDistance(t). This also underestimates the actual cost
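Both heuristics can be sketched in a few lines, assuming (as an encoding choice for this example) that a state is a tuple of 9 entries read row by row, with 0 standing for the blank:

```python
def misplaced(state, goal):
    """Misplaced-tiles heuristic: count tiles not in their goal position.
    The blank (0) is not counted."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan(state, goal):
    """Manhattan-distance heuristic: sum over tiles of horizontal plus
    vertical distance from the tile's current cell to its goal cell."""
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        j = goal.index(tile)
        total += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return total
```

Since every misplaced tile is at Manhattan distance at least 1 from home, misplaced(n) ≤ manhattan(n) for every state, which is the domination discussed below.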
Building Heuristics: Relaxing the Problem
The optimal cost of reaching the goal from a node in the relaxed problem is an admissible heuristic for the original problem
Proof: The optimal solution in the original problem is also a solution for the relaxed problem. Therefore, it must be at least as expensive as the optimal solution in the relaxed problem.
Real example: solving the 8-Puzzle using IDS, A*+misplaced, and A*+manhattan (average total nodes expanded)

Depth | IDS     | A*+mispl. | A*+manh.
10    | 47127   | 93        | 39
14    | 3473941 | 539       | 113
24    | —       | 39135     | 1641
Building Heuristics: Relaxing the Problem
Does manhattan always expand fewer nodes than misplaced?
Yes: h_mispl(n) ≤ h_manh(n) for all n. We say that h_manh “dominates” h_mispl.
Among several admissible heuristics, the one with the highest values expands no more nodes and is typically fastest.
Building Heuristics: Pattern Databases
Admissible heuristics can also be derived from solutions to subproblems
Each state is mapped into a partial specification
Example: in the 15-Puzzle, assume only the position of some specific tiles matters
By searching backwards from these goal states, we can compute the distance of any configuration of these tiles to their goal locations. We are ignoring the identity of the other tiles, thus underestimating the effort in the general case.
Building Heuristics: Pattern Databases
These configurations are stored in a database, along with the number of moves required to move the tiles into place
The maximum number of moves taken over all of the databases can be used as a heuristic
In the 15-Puzzle:
The “fringe” database yields about 300-fold decrease in the search tree size
The “corner” database yields about a 400-fold decrease
Can also generate a database of disjoint patterns (sharing no tiles, so that each move is counted by at most one pattern), so that the numbers of moves can be added rather than taking the maximum. This gives about a 10000-fold decrease compared to h_manh.
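The combination step can be sketched as follows. This is a toy sketch under stated assumptions: the databases are given as (pattern_tiles, dict) pairs where each dict maps an abstracted state to a precomputed move count (in practice these come from backward breadth-first search from the goal, which is omitted here), and the hand-made values in the usage example are stand-ins, not real PDB data.

```python
def abstract(state, pattern_tiles):
    """Project a full state down to just the positions of the pattern's
    tiles; the identities of all other tiles are ignored."""
    return tuple(state.index(t) for t in pattern_tiles)

def pdb_heuristic(state, databases, disjoint=False):
    """Combine pattern-database lookups into one admissible heuristic.

    For overlapping patterns, only the max of the lookups is admissible;
    for disjoint patterns (built so that each move is charged to at most
    one pattern), the lookups can be added.
    """
    values = [db[abstract(state, tiles)] for tiles, db in databases]
    return sum(values) if disjoint else max(values)
```

Taking the max is always safe; the sum is the stronger heuristic but is only admissible under the disjointness condition.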
Assignment 2
To be posted by Friday
Local Search
So far, we were concerned with finding a path to the goal
For many problems, we don’t care about the path—we only want to find a goal state
Examples: scheduling, IC layout, network optimization, etc.
Local search algorithms operate using a single current state and generally move to neighbours of that state
There is an objective function that assigns a value to each state. The goal is defined as the state with the highest value (the global maximum)
Algorithms like Hill Climbing try to move to a neighbour with the highest value
Danger of getting stuck in a local maximum. To deal with that, some degree of random exploration is added
Local Search Algorithms
Simulated Annealing: instead of the best available move, take a random move and if it improves the situation then always accept, otherwise accept with a probability < 1. Progressively decrease the probability of accepting such moves.
Local Beam Search: like a parallel version of Hill Climbing. Keeps K states and at each iteration chooses the K best neighbours (so information is shared between the parallel threads). There is also a stochastic version.
Genetic Algorithms: similar to Stochastic Local Beam Search, but mainly use crossover operation to generate new nodes. This swaps feature values between the parent nodes to obtain children. This gives a hierarchical flavor to the search: chunks of solutions get combined. Choice of state representation becomes very important. Has had wide impact, but not clear if/when better than other approaches.
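The simulated-annealing recipe above can be sketched directly. This is a minimal illustration, not the textbook's pseudocode verbatim; the cooling schedule, step budget, and the toy objective in the usage example are assumptions chosen for the demo.

```python
import math
import random

def simulated_annealing(state, neighbors, value, schedule, steps=10000, seed=0):
    """Maximize `value` by local moves: always accept an improving random
    neighbour; accept a worsening one with probability exp(delta / T),
    which shrinks as the temperature T cools according to `schedule`."""
    rng = random.Random(seed)
    best = state
    for t in range(1, steps + 1):
        T = schedule(t)
        if T <= 0:
            break
        nxt = rng.choice(neighbors(state))
        delta = value(nxt) - value(state)
        # Improving moves always accepted; worsening ones with prob exp(delta/T)
        if delta > 0 or rng.random() < math.exp(delta / T):
            state = nxt
        if value(state) > value(best):
            best = state
    return best
```

On a simple one-dimensional objective like -(x - 3)^2 with neighbours x ± 1, the search wanders while T is high and then climbs to the global maximum at x = 3 as T cools.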
Constraint Satisfaction Problems
The search algorithms we discussed so far had no knowledge of the state representation (black box)
Couldn’t take advantage of domain-specific information
CSPs are a special class of search problems with a uniform and simple state representation
This allows us to design more efficient algorithms
Constraint Satisfaction Problems
Many problems can be represented as a search for a vector of feature values
k features: the variables
Each feature has a value from some domain
Example: height = {short, average, tall}, weight = {light, average, heavy}
In such problems, the task is to search for a set of values for the features (variables) so that the values satisfy some given conditions (constraints)
Constraint Satisfaction Problems
Sudoku
81 variables: a 9-by-9 grid, each cell needs a value
Values: a fixed value for cells that are already filled in; some value from the set {1, 2, 3, 4, 5, 6, 7, 8, 9} for each empty cell
Solution: a value for each cell satisfying the constraints:
No two cells in the same column can have the same value
No two cells in the same row can have the same value
No two cells in the same sub-square can have the same value
Constraint Satisfaction Problems
Scheduling
Want to schedule a time and a space for each final exam, so that
no student is scheduled to take more than one final exam at any one time
The space allocated has to be available at the time chosen
The space has to be large enough to accommodate all of the students taking the exam
Variables:
T1, ..., Tm: each Ti represents the scheduled time for the i-th final
(Assume domains are fixed to something like {MonAM, MonPM, ..., FriPM})
S1, ..., Sm: each Si is the space variable for the i-th final
(Domain of Si is the set of all rooms big enough to hold the i-th final)
Constraint Satisfaction Problems
Want to find an assignment of values to each variable, subject to the constraints:
For all pairs of finals i, j such that there is a student taking both, want Ti ≠ Tj
For all pairs of finals i, j, want either Ti ≠ Tj or Si ≠ Sj
CSP, Formally
A Constraint Satisfaction Problem (CSP) consists of:
A set of variables V1, ..., Vn
For each variable Vi, a domain of possible values Dom[Vi]
A set of constraints C1, ..., Cm
Each variable Vi can be assigned any value from Dom[Vi]
Each constraint C has
A scope: a subset of the problem’s variables which it concerns, e.g., V1, V2, V4
A boolean function that maps assignments to these variables to True/False, e.g.,
C(V1 = a, V2 = b, V4 = c) = True
C(V1 = b, V2 = c, V4 = c) = False
A solution to a CSP is an assignment of a value to all of the variables such that every constraint is satisfied.
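The definition above translates almost literally into code. A minimal sketch, assuming constraints are encoded as (scope, predicate) pairs; the brute-force solver enumerates the full cross-product of domains, so it is for illustration only, not how CSPs are solved in practice (that is next lecture's topic).

```python
from itertools import product

def is_solution(assignment, constraints):
    """A full assignment solves the CSP iff every constraint's boolean
    function holds on the values of the variables in its scope."""
    return all(pred(*(assignment[v] for v in scope))
               for scope, pred in constraints)

def brute_force_solve(domains, constraints):
    """Enumerate all complete assignments (exponential; illustration only)
    and return the first one that satisfies every constraint, else None."""
    variables = list(domains)
    for values in product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        if is_solution(assignment, constraints):
            return assignment
    return None
```

For instance, with Dom[V1] = Dom[V2] = {1, 2, 3} and the constraints V1 < V2 and V1 + V2 = 4, the only solution is V1 = 1, V2 = 3.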
Example: Sudoku
Variables: V1,1, V1,2, . . . , V9,9
Domains:
Dom[Vi,j] = {1..9} for empty cells
Dom[Vi,j] = {k} for each cell pre-filled with some k
Row constraints:
CR1(V1,1, V1,2, . . . , V1,9)
CR2(V2,1, V2,2, . . . , V2,9)
...
CR9(V9,1, V9,2, . . . , V9,9)
Column constraints: CC1(V1,1, V2,1, . . . , V9,1) etc.
Sub-square constraints:
CSS1(V1,1, V1,2, V1,3, V2,1, V2,2, V2,3, V3,1, V3,2, V3,3) etc.
Example: Sudoku
Each of these constraints is over 9 variables, and they are all the same constraint: all values must be distinct
Such constraints are often called ALL-DIFF constraints
Thus, Sudoku has 3 × 9 = 27 ALL-DIFF constraints: one over each set of variables in the same row, one over each set in the same column, and one over each set in the same sub-square
Note: An ALL-DIFF constraint over k variables can be equivalently expressed by k-choose-2 not-equal constraints over each pair of these variables.
Let NEQ be a not-equal constraint; then
CSS1(V1,1, V1,2, V1,3, V2,1, V2,2, V2,3, V3,1, V3,2, V3,3) ≡ NEQ(V1,1, V1,2) ∧ NEQ(V1,1, V1,3) ∧ . . . ∧ NEQ(V3,2, V3,3)
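The expansion is a one-liner with itertools: a sketch, using the (scope, predicate) encoding of constraints assumed for these examples.

```python
from itertools import combinations

def all_diff_to_neq(variables):
    """Expand one ALL-DIFF constraint over k variables into the
    equivalent k-choose-2 pairwise NEQ (not-equal) constraints."""
    neq = lambda a, b: a != b
    return [((u, v), neq) for u, v in combinations(variables, 2)]
```

An ALL-DIFF over one Sudoku row (9 variables) therefore expands into 9-choose-2 = 36 NEQ constraints.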
End of Lecture
Next time: Solving CSPs