Analysis of Algorithms
V. Adamchik CSCI 570 Spring 2020 Lecture 12 University of Southern California
NP-Completeness
Reading: chapter 9
Outline
The P vs. NP problem Polynomial reduction
23 Problems of Hilbert
In 1900 D. Hilbert presented a list of 23 challenging (unsolved) problems in math.
Hilbert’s 10th problem:
Given a multivariate polynomial with integer coeffs, e.g. 4x2y3 − 2x4z5 + x8, devise an effectively calculable procedure for determining whether it has an integer root.
Described a model of computation, known today as the Turing Machine.
A problem P is decidable (or computable) if it can be solved by a Turing machine that halts on every input.
We say that P has an algorithm.
Alan Turing (1936, age 22)
Turing Machines were adopted in the 1960’s, years after his death.
Turing’s Inspiration
Human writes symbols on paper
The paper is a sequence of squares
No upper bound on the number of squares At most finitely many kinds of symbols Human observes one square at a time Human has only finitely many mental states
Human can change symbols and change
focus to a neighboring square, but only
based on its state and the symbol it observes
Human acts deterministically
Machine with the read/write “head”
Head moves to left or right
…
input (the “tape”), indefinitely extensible to the right. input consists of symbols from an alphabet Σ
# and Δ represent the start and end of the input
The machine never overwrites the leftmost symbol, but
can overwrite the rightmost symbol.
When halt(accept/reject) state is reached, the machine halts. It might also never halt, in which case we say it loops.
#
0
1
1
0
1
0
0
Δ
Δ
High Level Example of a Turing Machine
transition
The machine that takes a binary string and appends 0 to the left side of the string.
states
start
0,0,R
1,1,R
Input: #10010Δ Output: #010010Δ
# – leftmost char Δ – rightmost char
Transition on each edge read,write,move (L or R)
S0
1,0,R 0,1,R
S1
halt
Runtime Complexity
Let M be a Turing machine that halts on all inputs.
Assume we compute the running time purely as a function of the length of the input string.
Definition: The running complexity is the function
f : N → N such that f(n) is the maximum number of steps that M uses on any input of length n.
Algorithms
A problem P is decidable if it can be solved by a Turing machine T that always halt.
We say that P has an algorithm.
A problem P is undecidable if there is no algorithm that solves it.
The intuition is based on the fact that there are as many problems as there real numbers, and only as many programs as there are integers, so there are not
enough programs to solve all the problems.
Complexity Classes
A fundamental complexity class P (or PTIME) is a class of decision problems that can be solved by a deterministic Turing machine in polynomial time.
A fundamental complexity class EXPTIME is a class of decision problems that can be solved by a deterministic Turing machine in
O(2 p(n) ) time, where p(n) is a polynomial.
A fundamental complexity class PSPACE is a class of decision problems that can be solved using a reasonable amount of memory (regardless time).
The Church–Turing Thesis
“Any natural / reasonable notion of computation can be simulated by a TM.”
This is not a theorem.
Is it… …an observation? …a definition? …a hypothesis? …a law of nature? …a philosophical statement?
Everyone believes it. No counterexample yet.
Hypercomputation
In 1995 Hava Siegelman proposed
Artificial Recurrent Neural Networks (ARNN). She claims that ARNNs can “compute” non-computable functions.
In 2006 Martin Davis criticized the idea in a paper “The Myth of Hypercomputation.”
Some people were disagree with Martin saying that it’s a myth that hypercomputation is a myth.
Nondeterministic Turing Machine
The deterministic Turing machine means that there is only one valid computation starting from any given input. A computation path is like a linked list.
Nondeterministic TM defined in the same way as deterministic, except that a computation is like a tree, where at any state, it’s allowed to have a number of choices.
The big advantage: it is able to try out many possible computations in parallel and to accept its input if any one of these computations accepts it.
deterministic computation
nondeterministic computation
Computations
. . .
accept or reject
reject
accept
accepts if some branch reaches an accepting configuration
Complexity Class: NP
A fundamental complexity class NP is a class of decision problems that can be solved by a nondeterministic Turing machine in polynomial time.
NP does NOT stand for “non-polynomial”.
Equivalently, the NP decision problem has a certificate that can be checked by a polynomial time deterministic Turing machine.
NP-problems like FIND a needle in a haystack: hard to find but always easy to verify.
P versus NP
It has been proven that Nondeterministic TM can be simulated by Deterministic TM.
But how fast we can do that simulation?
The famous P ≠ NP conjecture, would answer that we cannot hope to simulate nondeterministic Turing machines very fast (in polynomial time).
Dana Scott and Michael Rabin have introduced the concept of nondeterminism (Turing award in 1976) Without the notion of nondeterminism,
there would be no P=NP
Undecidable Problems
Undecidable means that there is no computer program that always gives the correct answer: it may give the wrong answer or run forever without giving any answer.
The halting problem is the problem of
deciding whether a given Turing machine halts when presented with a given input.
Turing’s Theorem:
The Halting Problem is undecidable.
Polynomial Reduction
Denoted by
Y ≤p X
Problem Y is “easier” than X.
Polynomial Reduction: Y ≤p X
To reduce a decision problem Y to a decision problem X (we write Y ≤p X) we want a function f that maps Y to X such that:
1) f is a polynomial time computable 2) ∀y ∈ Y (y is instance of Y) is YES
if and only if f(y) ∈ X is YES.
Y ≤p X
If we can solve X in polynomial time, we can solve Y in polynomial time.
Examples:
Bipartite Matching ≤p Max-Flow Circulation ≤p Max-Flow
Y ≤p X
If we can solve X, we can solve Y. The contrapositive of the above statement:
If we cannot solve Y, we cannot solve X.
We use this to prove NP completeness: knowing that Y is hard, we prove that X is harder.
Two ways of using Y ≤p X
1) X is easy
If we can solve X in polynomial time, we can solve Y in polynomial time.
2) Y is hard
Then X is at least as hard as Y
P and NP
P = set of problems that can be solved in
polynomial time by a deterministic TM.
NP = set of problems that can be solved in polynomial time by a nondeterministic TM.
NP = set of problems for which solution can be verified in polynomial time by a deterministic TM.
NP-Hard and NP-Complete
XisNP-Hard,if∀YNP andY≤p X.
X is NP-Complete, if X is NP-Hard and X NP.
NP-hard
NP-complete
NP
P
Halting Problem Chess,TSP, optim.
Tetris,Knapsack, SAT, Vertex Cover
Max-Flow Sorting
NPC problems can be solved by a nondeterministic TM in polynomial time.
It’s not known if NPC problems can be solved by a deterministic TM in polynomial time.
Note a gap?
NP-hard
NP-complete
Graph isomorphism
Factorization Discrete Log
and few
other candidates
NP-Intermediate
NP
P
?
It is proven that the gap is not empty.
But we do not know any natural problems in there.
Graph Isomorphism
Two graphs G1=(V1,E1) and G2=(V2,E2) are isomorphic if there is a bijective function f: V1 → V2 such that (v,w) E1 {f(v),f(w)} E2
b12 e
dc543
a
The two graphs look differently but are structurally the ‘same’, up to the renaming of the vertices.
Problem is in NP, but
• No NP-completeness proof is known
• No polynomial time algorithm is known
Boolean Satisfiability Problem (SAT)
A propositional logic formula is built from variables, operators AND (conjunction, ∧), OR (disjunction, ∨), NOT (negation, ¬), and parentheses:
(X1∨¬X3)∧(X1 ∨¬X2∨X4∨X5)∧…
A formula is said to be satisfiable if it can be made TRUE by assigning appropriate logical values (TRUE, FALSE) to its variables.
A formula is in conjunctive normal form (CNF) if it is a conjunction of clauses.
A literal is a variable or its negation.
A clause is a disjunction of literals.
Cook-Levin Theorem (1971) Theorem. CNF SAT is NP-complete.
No proof…
Cook received a Turing Award for his work. You are not responsible for knowing the proof.
Disjunctive Normal Form
A propositional logic formula that is a disjunction of conjunctions of literals:
(X1 ⋀¬X3 ⋀X4 )∨(¬X1 ⋀X2 ⋀¬X5)∨… Such a formula is satisfiable if and only if at least one of
its clauses is satisfiable.
And a conjunction is satisfiable if and only if it does not contain both X and X for some variable X.
This can be checked in linear time.
Independent Set
Given a graph, we say that a subset of vertices is “independent” if no two of them are joined by an edge.
12 345
67
Independent Set
Optimization Version:
Given a graph, find the largest independent set.
Decision Version:
Given a graph and a number k, does the graph contains an independent set of size at least k?
Optimization vs. Decision
Optimization vs. Decision
If one can solve an optimization problem (in polynomial time), then one can answer the decision version (in polynomial time)
Conversely, by doing binary search on the bound b, one can transform a polynomial time answer to a decision version into a polynomial time algorithm for the corresponding optimization problem
In that sense, these are essentially equivalent.
However, they belong to two different complexity classes.
Independent Set is NP Complete
Given a graph and a number k, does the graph contains an independent set of size at least k?
Is it in NP?
We need to show we can verify a solution in polynomial time.
Given a set of vertices, we can easily count them and then verify that any two of them are not joined by an edge.
Independent Set is NP Complete
Given a graph and a number k, does the graph contains an independent set of size at least k?
Is it in NP-hard?
We need to pick Y such that Y ≤p IndSet for ∀ Y NP
Reduce from 3-SAT.
3-SAT is SAT where each clause has at most 3 literals.
3SAT ≤p IndSet
We construct a graph G that will have an independent set of size k iff the 3-SAT instance with k clauses is satisfiable.
For each clause (X ⋁ Y ⋁ Z) we will be using a special gadget:
XZ
Y
(X ⋁ Y ⋁ Z) ⋀ (X ⋁ ¬Y ⋁ Z) ⋀ (¬X ⋁ Y ⋁ ¬Z) ⋀ (¬X ⋁ ¬Y)
Next, we need to connect gadgets.
As an example, consider the following instance:
3SAT ≤p IndSet
(X ⋁ Y ⋁ Z) ⋀ (X ⋁ ¬Y ⋁ Z) ⋀ (¬X ⋁ Y ⋁ ¬Z) ⋀ (¬X ⋁ ¬Y) X Z X Z ¬X ¬Z ¬X
Y ¬Y Y How do we connect gadgets?
Claim:
¬Y
3SAT ≤p IndSet
(X ⋁ Y ⋁ Z) ⋀ (X ⋁ ¬Y ⋁ Z) ⋀ (¬X ⋁ Y ⋁ ¬Z) ⋀ (¬X ⋁ ¬Y) X Z X Z ¬X ¬Z ¬X
Y ¬Y Y Proof. ⟹)
¬Y
3SAT ≤p IndSet
(X ⋁ Y ⋁ Z) ⋀ (X ⋁ ¬Y ⋁ Z) ⋀ (¬X ⋁ Y ⋁ ¬Z) ⋀ (¬X ⋁ ¬Y) X Z X Z ¬X ¬Z ¬X
Y ¬Y Y Proof. ⟸)
¬Y
Vertex Cover
Given G=(V,E), find the smallest S⊂V s.t. every edge is incident to vertices in S.
Vertex Cover
Theorem: for a graph G=(V,E), S is an independent set if and only if V-S is a vertex cover
Proof. ⟹)
Vertex Cover
Theorem: for a graph G=(V,E), S is a independent set if and only if V-S is a vertex cover
Proof. ⟸)
Vertex Cover in NP-Complete
Ind. Set ≤p Vertex Cover
Claim: a graph G=(V,E) has an independent set of size at least k if and only if G has a vertex cover of size at most V-k.
Discussion Problem 1
Show that vertex cover remains NP-Complete even if the instances are restricted to graphs with only even degree vertices. Let us call this problem VC-even.
Prove: VC ≤p VC-even
VC ≤p VC-even
VC ≤p VC-even
Discussion Problem 2
You are given an undirected graph G = (V, E) and for each vertex v, you are given a number p(v) that denotes the number of pebbles placed on v. We will now play a game where the following move is the only move allowed. You can pick a vertex u that contains at least two pebbles, and remove two pebbles from u and add one pebble to an adjacent vertex. The objective of the game is to perform a sequence of moves such that we are left with exactly one pebble in the whole graph. Show that the problem of deciding if we can reach the objective is NP-complete.