COMP90038 – Algorithms and Complexity Lecture 22
COMP90038 Algorithms and Complexity
Lecture 22: NP-completeness
(with thanks to Harald Søndergaard & Michael Kirley)
Casey Myers Casey.Myers@unimelb.edu.au David Caro Building (Physics) 274
COMP90038 – Algorithms and Complexity
Lecture 22
Review from Lecture 21: Huffman Encoding
• Sometimes (for example for common English text) we may know the frequencies of letters fairly well.
• If we don’t know about frequencies then we can still count all characters in the given text as a first step.
• But how do we assign codes to the characters once we know their frequencies?
• By repeatedly selecting the two smallest weights and fusing them.
• This is Huffman’s algorithm—another example of a greedy method.
• The resulting tree is a Huffman tree.
COMP90038 – Algorithms and Complexity
Lecture 22
Review from Lecture 21: Huffman Encoding
Symbol
𝐴
𝐵
𝐶
𝐷
_
Frequency
0.35
0.1
0.2
0.2
0.15
Codeword
11
100
00
01
101
1
0
0.6
0
1 1
0.2 0.2 0.1 0.15 0.35 𝐶𝐷𝐵_𝐴
0.4 0.25 010
COMP90038 – Algorithms and Complexity
Lecture 22
Review from Lecture 21: Huffman Encoding
Symbol
𝐴
𝐵
𝐶
𝐷
_
Frequency
0.4
0.1
0.2
0.15
0.15
Codeword
0
100
111
101
110
Encode 𝐴𝐵𝐴𝐶𝐴𝐵𝐴𝐷: 0100011101000101
0
1 0.6
1
Decode 100010111001010:
100
0
101
110
0
101
0
𝐵
𝐴
𝐷
_
𝐴
𝐷
𝐴
0
0.25 010
0.35
1
0.4 0.1 0.15 0.15 0.2 𝐴𝐵𝐷_𝐶
COMP90038 – Algorithms and Complexity Lecture 22
Concrete Complexity
• We have been concerned with the analysis of algorithm’s running times (best, average, worst cases).
• Our approach has been to give a bound for the asymptotic behaviour of running time, as a function of input size.
• For example, the quicksort algorithm is Ο(𝑛!) in the worst case, whereas mergesort is Ο log 𝑛 .
COMP90038 – Algorithms and Complexity Lecture 22
Abstract Complexity
• Complexity theory instead asks:
“What is the inherent difficulty of the problem?”
• How do we know when we have come up with an algorithm which is optimal (in the asymptotic case).
COMP90038 – Algorithms and Complexity Lecture 22
Difficult Problems
• Which problems are difficult to solve?
• The travelling Salesman problem can be solved through brute force for very small instances
– One solution is: 𝑎 − 𝑏 − 𝑑 − 𝑐 − 𝑎
• However, it becomes very difficult as the
number of nodes and connections increase.
– The solution can be checked to determine if it is a good solution or not.
COMP90038 – Algorithms and Complexity Lecture 22
Does P=NP?
• The “P versus NP” problem comes from computational complexity theory.
• P means with polynomial time complexity
– That is, algorithms that have Ο poly( 𝑛)
– Sorting is a type of polynomial time problem
• NP means mon-deterministic polynomial
– The answer can be checked in polynomial time, but cannot find the answer in polynomial time for large 𝑛.
– TheTSPproblemisanNPproblem.
• This is the most important question in Computer Science
COMP90038 – Algorithms and Complexity Lecture 22
Algorithmic Problems
• When we talk about a problem in computer science, we almost always mean a family of instances of a general problem.
• An algorithm for the problem has to work for all possible instances (inputs).
• Example: The sorting problem—an instance
is a sequence of items.
• Example: The graph 𝑘-colouring problem—an instance is a graph.
• Example: Equation solving problems—an instance is a set of, say, linear equations.
COMP90038 – Algorithms and Complexity
Lecture 22
Easy and Hard Problems
• A path in a graph 𝐺 is simple if it visits each node of 𝐺 at most once.
• Consider this problem for undirected graphs 𝐺:
SPATH: Given 𝐺 and two nodes 𝑎 and 𝑏 in 𝐺, is there a simple path from a to 𝑏 of length at most 𝑘?
• And this problem:
LPATH: Given 𝐺and two nodes 𝑎 and 𝑏 in 𝐺, is
there a simple path from a to 𝑏 of length at least 𝑘?
• If you had a large graph 𝐺, which of the two problems would you rather have to solve?
COMP90038 – Algorithms and Complexity
Lecture 22
Easy and Hard Problems
• There are fast algorithms to solve SPATH.
• Nobody know of a fast algorithm for
LPATH.
• It is likely that the LPATH problem cannot be solved in polynomial time (But we do not know for sure).
COMP90038 – Algorithms and Complexity
Lecture 22
Easy and Hard Problems
• Two other related problems:
– EUL: The Eulerian tour problem: in a graph is there a path which visits each edge of the graph once, returning to the origin?
– HAM: The Hamiltonian tour problem: in a graph, is there a path which visits each node of the graph once, returning to the origin?
• Is the Eulerian tour problem P?
– We just need to know whether the edge distribution
is even.
• Is the Hamiltonian tour problem P?
– No.Asthenumberofnodesincreases,theruntime becomes exponential.
COMP90038 – Algorithms and Complexity
Lecture 22
Easy and Hard Problems
• Try to rank these problems according to inherent difficulty:
– SAT: Given a propositional formula 𝜑, is 𝜑 satisfiable?
– SUBSET-SUM: Given a set 𝑆 of positive integers and a positive integer 𝑡, is there a subset of 𝑆 that adds up to 𝑡?
– 3COL: Given a graph 𝐺, is it possible to colour the nodes of 𝐺 using only three colours, so that no edge connects two nodes of the same colour?
COMP90038 – Algorithms and Complexity
Lecture 22
Easy and Hard Problems
• Try to rank these problems according to inherent difficulty:
– SAT: Given a propositional formula 𝜑, is 𝜑 satisfiable?
– SUBSET-SUM: Given a set 𝑆 of positive integers and a positive integer 𝑡, is there a subset of 𝑆 that adds up to 𝑡?
– 3COL: Given a graph 𝐺, is it possible to colour the nodes of 𝐺 using only three colours, so that no edge connects two nodes of the same colour?
Then 𝑂𝑅 = 𝑝⋁𝑞, etc.
Satisfiable problem: 𝜑 = 𝑥!⋁¬ 𝑥”⋀ 𝑥#⋁¬ 𝑥$
Can we find an assignment of values (F or T) to the variables 𝑥% that result in 𝜑 == 𝑇?
𝐴𝑁𝐷
p
q
𝒑⋀𝒒
F
F
F
F
T
F
T
F
F
T
T
T
COMP90038 – Algorithms and Complexity
Lecture 22
Polynomial-Time Verifiability
• SAT, SUBSET-SUM, 3COL, LPATH and HAM share an interesting property.
• If someone claims to have a solution (a “yes instance”) then we can quickly test
their claim.
• In other words, these problems seem hard to solve, but the solutions allow for
efficient verification. They are polynomial-time verifiable.
• That property is shared by a very large number of interesting problems in planning,
scheduling, design, information retrieval, networks, games, …
• To understand this concept, we need to talk about Turing Machines.
COMP90038 – Algorithms and Complexity Lecture 22
Turing Machines
• Turing Machines are an abstract model of a computer.
• Despite their simplicity they appear to have the same computational power as any
other computing device.
– That is, any function that can be implemented in C, Java, Python, etc. can be implemented in a Turing Machine.
• Moreover, a Turing Machine is able to simulate any other Turing Machine. – This is known as the universality property.
COMP90038 – Algorithms and Complexity Lecture 22
Turing Machines
• A Turing Machine are has an infinite tape through which it takes its input and perform its computations.
• It can:
– Both read from and write to the tape, – Move either left or right over the tape.
• Whether the Turing Machine reads, write or moves left or right depends on a control sequence.
• The tape is unbounded in both directions.
• A Turing machine may fail to halt.
COMP90038 – Algorithms and Complexity
Lecture 22
Turing Machines: Example
• Let the control sequence be:
– If read 1, write 0, go LEFT
– If read 0, write 1, HALT
– If read _, write 1, HALT.
• Alphabet is _, 𝟎, 𝟏
• The input will be 47″# = 101111!
• The output is 48″# = 11000!
• The rules add one to a number.
COMP90038 – Algorithms and Complexity
Lecture 22
Turing Machines: Complex Example
• This (halting) Turing machine sorts a sequence of 𝑎s and 𝑏s.
• The tape alphabet is ⊔, 𝒂, 𝒃
COMP90038 – Algorithms and Complexity
Lecture 22
Turing Machines: Complex Example
• This (halting) Turing machine sorts a sequence of 𝑎s and 𝑏s.
• The tape alphabet is _, 𝒂, 𝒃
• We will develop a state automaton:
𝑺𝟏 𝑺𝟐 𝑺𝟑
𝑺𝟒
i. If 𝑺𝟏
ii. If 𝑺𝟏
iii. If 𝑺𝟐
iv. If 𝑺𝟐
v. If 𝑺𝟑
vi. If 𝑺𝟑
vii. If 𝑺𝟒
and 𝒂, go RIGHT, stay in 𝑺𝟏 and 𝒃, go RIGHT, go to 𝑺𝟐
and 𝒂, write 𝒃, go LEFT, go to 𝑺𝟑 and 𝒃, go RIGHT, stay in 𝑺𝟐
and 𝒂 or _, go RIGHT, go to 𝑺𝟒 and 𝒃, go LEFT, stay in 𝑺𝟑
and 𝒃, write 𝒂, go RIGHT, go to 𝑺𝟐
COMP90038 – Algorithms and Complexity
Lecture 22
Turing Machines: Complex Example
• This (halting) Turing machine sorts a sequence of 𝑎s and 𝑏s.
• The tape alphabet is _, 𝒂, 𝒃
• We will develop a state automaton:
i. If 𝑺𝟏
ii. If 𝑺𝟏
iii. If 𝑺𝟐
iv. If 𝑺𝟐
v. If 𝑺𝟑
vi. If 𝑺𝟑
vii. If 𝑺𝟒
and 𝒂, go RIGHT, stay in 𝑺𝟏 and 𝒃, go RIGHT, go to 𝑺𝟐
and 𝒂, write 𝒃, go LEFT, go to 𝑺𝟑 and 𝒃, go RIGHT, stay in 𝑺𝟐
and 𝒂 or _, go RIGHT, go to 𝑺𝟒 and 𝒃, go LEFT, stay in 𝑺𝟑
and 𝒃, write 𝒂, go RIGHT, go to 𝑺𝟐
COMP90038 – Algorithms and Complexity
Lecture 22
Turing Machines: Complex Example
• This (halting) Turing machine sorts a sequence of 𝑎s and 𝑏s.
• The tape alphabet is _, 𝒂, 𝒃
• We will develop a state automaton:
i. If 𝑺𝟏
ii. If 𝑺𝟏
iii. If 𝑺𝟐
iv. If 𝑺𝟐
v. If 𝑺𝟑
vi. If 𝑺𝟑
vii. If 𝑺𝟒
and 𝒂, go RIGHT, stay in 𝑺𝟏 and 𝒃, go RIGHT, go to 𝑺𝟐
and 𝒂, write 𝒃, go LEFT, go to 𝑺𝟑 and 𝒃, go RIGHT, stay in 𝑺𝟐
and 𝒂 or _, go RIGHT, go to 𝑺𝟒 and 𝒃, go LEFT, stay in 𝑺𝟑
and 𝒃, write 𝒂, go RIGHT, go to 𝑺𝟐
COMP90038 – Algorithms and Complexity
Lecture 22
Turing Machines: Complex Example
• This (halting) Turing machine sorts a sequence of 𝑎s and 𝑏s.
• The tape alphabet is _, 𝒂, 𝒃
• We will develop a state automaton:
i. If 𝑺𝟏
ii. If 𝑺𝟏
iii. If 𝑺𝟐
iv. If 𝑺𝟐
v. If 𝑺𝟑
vi. If 𝑺𝟑
vii. If 𝑺𝟒
and 𝒂, go RIGHT, stay in 𝑺𝟏 and 𝒃, go RIGHT, go to 𝑺𝟐
and 𝒂, write 𝒃, go LEFT, go to 𝑺𝟑 and 𝒃, go RIGHT, stay in 𝑺𝟐
and 𝒂 or _, go RIGHT, go to 𝑺𝟒 and 𝒃, go LEFT, stay in 𝑺𝟑
and 𝒃, write 𝒂, go RIGHT, go to 𝑺𝟐
COMP90038 – Algorithms and Complexity
Lecture 22
Turing Machines: Complex Example
• This (halting) Turing machine sorts a sequence of 𝑎s and 𝑏s.
• The tape alphabet is _, 𝒂, 𝒃
• We will develop a state automaton:
i. If 𝑺𝟏
ii. If 𝑺𝟏
iii. If 𝑺𝟐
iv. If 𝑺𝟐
v. If 𝑺𝟑
vi. If 𝑺𝟑
vii. If 𝑺𝟒
and 𝒂, go RIGHT, stay in 𝑺𝟏 and 𝒃, go RIGHT, go to 𝑺𝟐
and 𝒂, write 𝒃, go LEFT, go to 𝑺𝟑 and 𝒃, go RIGHT, stay in 𝑺𝟐
and 𝒂 or _, go RIGHT, go to 𝑺𝟒 and 𝒃, go LEFT, stay in 𝑺𝟑
and 𝒃, write 𝒂, go RIGHT, go to 𝑺𝟐
COMP90038 – Algorithms and Complexity
Lecture 22
Non-Deterministic Turing Machines
• From now onwards we will assume that a Turing Machine will be used to implement decision procedures: algorithms with a YES/NO answer.
• One variant of the Turing Machine has a powerful guessing capability: If different moves are available the machine will favour a move that ultimately leads to a ‘yes’ answer.
• Adding this kind of non-determinism to the capabilities of Turing Machines does not change what Turing machine can compute, but it may have an impact of efficiency.
• What a non-deterministic Turing Machine can compute in polynomial time corresponds exactly to the class of polynomial-time verifiable problems.
COMP90038 – Algorithms and Complexity
Lecture 22
Non-Deterministic Turing Machines
• • • •
What a non-deterministic Turing Machine can compute in polynomial time corresponds exactly to the class of polynomial-time verifiable problems.
P is class of problems solvable in polynomial time by a deterministic Turing Machine.
NP is the class of problems solvable in polynomial time by a non-deterministic Turing Machine.
Clearly 𝑃 ⊆ 𝑁𝑃. Is 𝑃 = 𝑁𝑃?
COMP90038 – Algorithms and Complexity Lecture 22
Problem Reduction
• The main tool used to determine the class of a problem is reducibility.
• A decision problem 𝑃 is said to be polynomially reducible to a decision problem 𝑄,
if there exists a function 𝑡 that transforms instances of 𝑃 to instances of 𝑄 such that: – 𝑡 maps all yes instances of 𝑃 to yes instances of 𝑄 and all no instances of 𝑃 to
no instances of 𝑄
– 𝑡 is computable by a polynomial time algorithm.
• A decision problem 𝐷 is said to be NP-complete if: – it belongs to class NP
– Every problem in NP is polynomially reducible to 𝐷.
COMP90038 – Algorithms and Complexity
Lecture 22
NP-Complete Problems
• SAT, SUBSET-SUM, 3COL, LPATH and HAM, as well as thousands of other interesting and practically relevant problems have been shown to be NP-complete.
• This explains the continuing interest in NP-completeness and related concepts from complexity theory.
• For such problems, we do not know of solutions that are essentially better than exhaustive search.
• There are many other interesting complexity classes, including space complexity classes and probabilistic classes.
COMP90038 – Algorithms and Complexity Lecture 22
Open Problems
• We do not know whether P=NP, and there are many other unsolved problems.
• We know that 𝑃 ⊆ 𝐸𝑋𝑃𝑇𝐼𝑀𝐸, that is, there are problems that can be solved in
exponential time but provably not in polynomial time.
• We also know:
𝑃 ⊆ 𝑅𝑃 ⊆ 𝑁𝑃 ⊆ 𝑃𝑆𝑃𝐴𝐶𝐸 ⊆ 𝑁𝑃𝑆𝑃𝐴𝐶𝐸 ⊆ 𝐸𝑋𝑃𝑇𝐼𝑀𝐸
• But which if these inclusions are strict? (Some must be; all could be…)
COMP90038 – Algorithms and Complexity
Lecture 22
Dealing with NP-Completeness
• Pseudo-polynomial problems (SUBSET-SUM and KNAPSACK are in this class): Unless you have really large data, don’t worry; for reasonably sized sets and numbers, the bad behaviour will not have kicked in yet.
• Clever engineering to push the boundary slowly: SAT solvers.
• Approximation algorithms: Settle for less than perfection.
• Live happily with intractability: Sometimes the bad instances never turn up in practice.
COMP90038 – Algorithms and Complexity Lecture 22
Coming Up Next Week
• Special topic lecture on quantum computing
• Subject review