O. Braun
Algorithmic Complexity
Algorithmic Complexity
and NP-hard problems
O. Braun
NP-hard problems
[Flowchart: Problem → "Can you find an efficient algorithm?" → Yes: continue with the
next problem. No: try again harder! Still no? How do you convey the bad
information to your boss?]
Approach 1: Take the loser's way out
"I can't find an efficient algorithm.
I guess I'm just too dumb."
Drawback: This could seriously damage
your position within the company …
Approach 2: Prove that the problem is inherently intractable
"I can't find an efficient algorithm,
because no such algorithm is possible!"
Drawback: Proving inherent
intractability can be just as hard as finding
an efficient algorithm. Even the best
theoreticians have failed!
Approach 3: Prove that the problem is NP-complete
"I can't find an efficient algorithm,
but neither can all these famous people."
Advantage: This tells your boss
that it would do no good to fire you
and hire another expert on algorithms.
Problem-solving process
[Flowchart: Problem → "Can you find an efficient algorithm?" → No: try again harder,
or show that the problem is NP-complete.]
NP-complete problems
A large class of problems consists of problems that are equivalent in the
following sense:
• A polynomial-time algorithm for any one of them would imply the existence
of a polynomial-time algorithm for all of these problems.
• These are the NP-complete problems.
• In 1971, Steve Cook proved the existence of NP-complete problems by
showing that the so-called Satisfiability problem (Can a given Boolean
formula be satisfied or not?) is NP-complete; the notion of NP-completeness
was first identified in his work.
• The notion of NP-completeness always refers to decision problems, whereas the
corresponding optimization problems are NP-hard.
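To make the Satisfiability problem concrete, here is a minimal brute-force checker (an illustrative sketch only, exponential in the number of variables; the example formula is made up for this illustration):

```python
from itertools import product

def satisfiable(formula, variables):
    """Brute-force SAT check: try all 2^n truth assignments."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            return True
    return False

# Example formula: (x or y) and (not x or z) and (not y or not z)
f = lambda a: (a["x"] or a["y"]) and (not a["x"] or a["z"]) and (not a["y"] or not a["z"])
print(satisfiable(f, ["x", "y", "z"]))  # True, e.g. x=True, y=False, z=True
```

No essentially faster method for general formulas is known; this exhaustive search is exactly the kind of exponential behavior the P = NP question asks about.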
NP-Completeness
1 million US$
for the solution of one of
the Millennium Prize Problems.
Clay Mathematics Institute
of Cambridge,
Massachusetts (CMI)
NP-complete problems:
Problems that are "just as hard" as a large number of other problems that are
widely recognized as difficult by algorithmic experts.
Properties:
– exponential running time (as far as we know today)
– polynomially reducible to one another
P = NP?
The question of whether or not NP-complete problems are intractable is considered to be
one of the foremost open questions of contemporary computer science.
Problem-solving process
[Flowchart: Problem → "Can you find an efficient algorithm?" → No: try again harder,
or show that the problem is NP-complete; then use a heuristic / approximation
algorithm and perform a worst-case analysis.]
Optimization vs. Decision problems
Optimization problem
Find a subset of all objects such that,
under the condition that the knapsack
capacity is not exceeded, this subset
has the maximum possible value.
Answer: an optimal solution
NP-hard
Decision problem
The theory of NP-completeness applies
only to decision problems, where the
solution is either a "Yes" or a "No".
We associate with the optimization
problem a decision problem that
includes a numerical bound B as an
additional parameter and asks
whether there exists a solution having a
value greater than or equal to B.
Answer: yes or no
NP-complete
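The association between the two versions can be sketched as follows (a toy illustration with brute force standing in for a real knapsack solver; the instance numbers are made up):

```python
from itertools import combinations

def knapsack_opt(weights, values, W):
    """Optimization version: best total value within capacity W (brute force)."""
    n = len(weights)
    best = 0
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            if sum(weights[i] for i in subset) <= W:
                best = max(best, sum(values[i] for i in subset))
    return best

def knapsack_decision(weights, values, W, B):
    """Decision version: is there a feasible subset with value >= B?"""
    return knapsack_opt(weights, values, W) >= B

weights, values, W = [3, 4, 5], [4, 5, 6], 7
print(knapsack_opt(weights, values, W))           # 9 (items of weight 3 and 4)
print(knapsack_decision(weights, values, W, 10))  # False
```

Conversely, binary search on B over a polynomial-size range answers the optimization question with polynomially many decision queries, which is why the two versions stand or fall together.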
Complexity Classes
Terminology
P = Polynomial: The class P is the class of all decision problems that, under reasonable encoding
schemes, can be solved by polynomial-time deterministic algorithms.
NP = Non-deterministic Polynomial: The class NP is the class of all decision problems that, under
reasonable encoding schemes, can be solved by polynomial-time non-deterministic algorithms.
NP-complete: The class of NP-complete problems is the class of all problems X in NP such that every
problem Y in NP can be reduced to X in polynomial time. Intuitively this means that we can
solve Y quickly if we know how to solve X quickly. Precisely, Y is reducible to X if there is an
algorithm f that transforms instances y of Y into instances f(y) = x of X in polynomial time, with the
property that the answer to y is yes if and only if the answer to f(y) = x is yes.
NP-hard: Intuitively, these are the problems that are at least as hard as the NP-complete problems. NP-
hard problems do not have to be in NP, and they do not have to be decision problems. The halting
problem (given a program P and input I, will it halt?) is an NP-hard problem that is not in NP. The
precise definition is that a problem Z is NP-hard if there is an NP-complete problem Y such that Y is
reducible to Z in polynomial time.
Proving NP-completeness
Let's say we want to show that a (new) problem X that we can't solve is NP-complete.
The guideline for devising an NP-completeness proof for a decision problem
X is the recipe described in Garey and Johnson's book [GJ79]:
1. Show that X is in NP.
2. Select a known NP-complete problem Y.
3. Show that Y is polynomially reducible to X.
4. Prove that the input of problem X can be computed in polynomial time from the
input of Y.
We say that Y is polynomially reducible to X (denoted Y ∝ X) if it is possible to transform
inputs for Y into inputs for X in polynomial time such that the answer to an instance of Y is
yes if and only if the answer to the transformed instance of X is yes. Reducibility among
decision problems is transitive, so for an NP-completeness proof of a decision problem
X ∈ NP it is sufficient to show that a known NP-complete problem Y ∈ NP is polynomially
reducible to X.
Proving NP-completeness
The first step (showing that X is in NP by verifying that a certificate for X, i.e. a solution to
the problem X, actually solves the problem) and proving that the input of problem X can
be computed in polynomial time from the input of Y are often trivial.
The challenging parts are selecting a known NP-complete problem Y and showing that Y is
reducible to X. The recipe for doing this is as follows:
• Take an instance of Y and add some more input so that it becomes a special instance of X.
• Show that the only way to solve this special instance of X is to solve problem Y.
Script
• Y=Subset Sum: Partition is NP-complete. Sequencing within intervals is NP-complete.
• Y=Partition: Sequencing within intervals is NP-complete. Parallel processor scheduling
with two processors (P2||Cmax) is NP-complete. Knapsack is NP-complete. Flow-Shop
Scheduling with three processors (F3||Cmax) is NP-complete.
• NP-complete in the ordinary and strong sense. Time complexity as a function of natural
parameters [GJ79], pp.106-107.
• Y=3-Partition: Sequencing within intervals is NP-complete in the strong sense. Parallel
processor scheduling is NP-complete in the strong sense. F3||Cmax is NP-complete in
the strong sense.
Algorithmic complexity
and NP-hard problems
(Oliver Braun, June 20, 2021)
A large class of problems consists of problems that are equivalent in the following sense: a
polynomial-time algorithm for any one of them would imply the existence of a polynomial-
time algorithm for all of these problems. These are the NP-complete problems. In 1971,
Steve Cook [Coo71] proved the existence of NP-complete problems by showing that the
so-called Satisfiability problem (Can a given Boolean formula be satisfied or not?) is NP-
complete; the notion of NP-completeness was first identified in his work. The notion of
NP-completeness always refers to decision problems, whereas the corresponding
optimization problems are NP-hard. The guideline for devising an NP-completeness proof
for a decision problem X is the recipe described in Garey and Johnson's book [GJ79]:
1. Show that X is in NP,
2. select a known NP-complete problem Y ,
3. show that Y is reducible to X,
4. prove that the input of X can be computed in polynomial time from the input of
Y .
The first step (showing that X is in NP by verifying that a certificate for X, i.e. a solution to
problem X, actually solves the problem) and proving that the input of problem X can be
computed in polynomial time from the input of Y are often trivial. The challenging parts are
selecting a known NP-complete problem Y and showing that Y is reducible to X. The
recipe for doing this is as follows: take an instance of Y and add some more input so that it
becomes a special instance of X; then show that the only way to solve this special
instance of X is to solve problem Y.
We say that Y is polynomially reducible to X (denoted Y ∝ X) if it is possible to
transform inputs for Y into inputs for X in polynomial time such that the answer to an
instance of Y is yes if and only if the answer to the transformed instance of X is yes.
Reducibility among decision problems is transitive, so for an NP-completeness proof of a
decision problem X ∈ NP it is sufficient to show that a known NP-complete problem
Y ∈ NP is polynomially reducible to X.
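The composition behind Y ∝ X can be sketched with a toy pair of problems from the next section: Partition reduces to Subset Sum by asking for a subset reaching half the total (note this is the easy direction, opposite to the reduction in Theorem 6.1, and the brute-force solver below merely stands in for a hypothetical fast decision procedure for X):

```python
from itertools import combinations

def solve_subset_sum(s, z):
    """Stand-in decision procedure for X = Subset Sum (brute force)."""
    return any(sum(c) == z for r in range(len(s) + 1)
               for c in combinations(s, r))

def solve_partition(s):
    """Decide Y = Partition via the reduction Y ∝ X: transform the
    instance (compute z = sigma/2) and call the solver for X."""
    sigma = sum(s)
    return sigma % 2 == 0 and solve_subset_sum(s, sigma // 2)

print(solve_partition([2, 3, 3, 5, 6, 7, 8, 9, 9]))  # True
```

If the transformation runs in polynomial time and the solver for X runs in polynomial time, their composition solves Y in polynomial time; this is exactly why one NP-complete problem falling would bring down the whole class.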
6.1 NP-complete problems
The literature on NP-complete problems is vast; we have chosen the following
NP-complete problems:
• Subset Sum,
• Partition,
• 3-Partition,
• (Numerical) 3-Dimensional Matching,
• Sequencing within intervals
(from which we reduce), and
• Parallel processor scheduling (with m or m = 2 processors),
• Flow-Shop scheduling (with m = 3 processors),
• Knapsack,
• Bin Packing,
• Zero-One Equations (ZOE)
(to which we reduce). The formal descriptions of these problems, together with problem
instances, follow.
[Subset Sum] Given a list of n positive integers s1, s2, . . . , sn and a positive integer z:
does there exist a subset I ⊆ {1, . . . , n} such that ∑_{i∈I} si = z? (I can be considered as
an index set.)
[Example 6.1] n = 9 positive integers 2, 3, 3, 5, 6, 7, 8, 9, 9, z = 35.
The answer to the decision problem is yes, such a subset exists: I = {2, 3, 4, 6, 7, 8} or
si = 3, 3, 5, 7, 8, 9.
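Example 6.1 can be checked with the standard dynamic program over reachable sums (a minimal sketch; it runs in O(n · z) time, a point we return to in Section 6.4):

```python
def subset_sum(s, z):
    """Decide Subset Sum by dynamic programming over reachable sums.
    Runs in O(n * z) time -- pseudo-polynomial in the input size."""
    reachable = {0}
    for x in s:
        reachable |= {r + x for r in reachable if r + x <= z}
    return z in reachable

print(subset_sum([2, 3, 3, 5, 6, 7, 8, 9, 9], 35))  # True: 3+3+5+7+8+9 = 35
```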
[Partition] Given a list of n positive integers s1, s2, . . . , sn with σ = ∑_{i=1}^n si and a
positive integer z = σ/2: does there exist a subset I ⊆ {1, . . . , n} such that
∑_{i∈I} si = ∑_{i∉I} si = z? (It is interesting that the problem remains NP-complete even if
it is required that |I| = n/2 ([GJ79], p. 223).)
[Example 6.2] n = 9 positive integers 2, 3, 3, 5, 6, 7, 8, 9, 9, z = 26.
The answer to the decision problem is yes, such a subset exists: I = {1, 6, 7, 8} or
si = 2, 7, 8, 9. Another yes-answer would be I = {7, 8, 9} or si = 8, 9, 9.
[3-Partition] Given a list of n = 3m positive integers s1, s2, . . . , sn with σ = ∑_{i=1}^n si
and a positive integer z = σ/m: does there exist a partition of {1, . . . , n} into m disjoint sets
I1, . . . , Im such that ∑_{i∈I1} si = . . . = ∑_{i∈Im} si = z? (Note that often it is assumed that
z/4 < si < z/2 and that each set I1, . . . , Im must therefore contain exactly three elements
of 1, . . . , n.)

[Example 6.3] n = 12 positive integers 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, ∑_{i=1}^n si = 56,
m = n/3 = 4, z = 56/4 = 14 and 14/4 < si < 14/2. The answer to the decision problem is
yes, such a partition exists: I1 = {4, 4, 6}, I2 = {4, 4, 6}, I3 = {4, 5, 5}, I4 = {4, 5, 5}.

[Example 6.4] If we omit the assumption that z/4 < si < z/2, there might nevertheless be a
solution in which each set I1, . . . , Im contains exactly three elements of 1, . . . , n: n = 12
positive integers 1, 2, 2, 3, 3, 3, 4, 6, 7, 7, 8, 10, ∑_{i=1}^n si = 56, m = n/3 = 4,
z = 56/4 = 14. The answer to the decision problem is yes, such a partition exists:
I1 = {2, 2, 10}, I2 = {3, 3, 8}, I3 = {1, 6, 7}, I4 = {3, 4, 7}. Note that another yes-solution is:
I1 = {4, 10}, I2 = {6, 8}, I3 = {7, 7}, I4 = {1, 2, 2, 3, 3, 3}.

[Example 6.5] When we add a sufficiently large number to each of the si, we can force
each of the sets I1, . . . , Im to contain exactly 3 elements of 1, . . . , n: n = 12 positive
integers 1001, 1002, 1002, 1003, 1003, 1003, 1004, 1006, 1007, 1007, 1008, 1010,
∑_{i=1}^n si = 12056, m = n/3 = 4, z = 12056/4 = 3014. The answer to the decision
problem is yes, such a partition exists: I1 = {1002, 1002, 1010}, I2 = {1003, 1003, 1008},
I3 = {1001, 1006, 1007}, I4 = {1003, 1004, 1007}.

[Sequencing within intervals: 1 | rj, dj |] Given one processor and n jobs with positive
integer processing times p1, . . . , pn, release times r1, . . . , rn and deadlines d1, . . . , dn:
does there exist a feasible schedule in which each job completes by its deadline?

[Parallel processor scheduling (with m or m = 2 processors): P || Cmax, P2 || Cmax]
Given an arbitrarily large number m (or m = 2) of parallel identical processors, n jobs with
positive integer processing times p1, p2, . . . , pn and a number z.
Does there exist a schedule with makespan at most z?

[Flow-Shop Scheduling with 3 processors: F3 || Cmax] Given a Flow-Shop with three
processors, n jobs with positive integer processing times of the operations p1,1, . . . , p1,n
on processor P1, p2,1, . . . , p2,n on processor P2, p3,1, . . . , p3,n on processor P3, and a
number z: does there exist a schedule with makespan at most z?

[Knapsack] Given positive integer weights w1, . . . , wn and values v1, . . . , vn, an integer
weight constraint W ≥ 1 and an integer value goal V ≥ 1: does there exist a subset
I ⊆ {1, . . . , n} such that ∑_{j∈I} wj ≤ W and ∑_{j∈I} vj ≥ V?

6.2 Y=Subset Sum

In our first example of an NP-completeness proof, we are going to show that Subset Sum
is polynomially reducible to Partition, i.e. Subset Sum ∝ Partition.

Theorem 6.1 Partition is NP-complete.

Proof: It is easy to see that Partition is in NP. A certifier takes as a certificate the subset
A ⊂ {1, . . . , n} that is purported to add up to z. In polynomial time, we can compute the
sum of these numbers and verify that it is equal to z.

We now show that Subset Sum is reducible to Partition. Consider an instance of Subset
Sum with n positive integers s1, s2, . . . , sn and a value z. We construct (in polynomial
time) an equivalent instance of Partition as follows: The numbers si are the same as in the
Subset Sum instance. In addition, we define two more numbers for the Partition instance
(Garey and Johnson introduce the term Enforcer for such elements [GJ79], p. 69). Let
σ = ∑_{i=1}^n si be the sum of the first n numbers. Then we set sn+1 = 2σ − z and
sn+2 = σ + z.

Now we consider any feasible solution to this instance of Partition. In this solution, the
numbers sn+1 and sn+2 cannot be in the same subset A, as their sum (3σ) is too big to be
part of the same half of the partition (all other numbers add up to just σ, and the sum of all
of the numbers s1, . . . , sn+2 is exactly 4σ).
[Figure 6.1: Gantt-style sketch of the two halves, each of total size 2σ. Half 1: sn+1 (size
2σ − z) followed by sj, j ∈ A; half 2: sn+2 (size σ + z) followed by sj, j ∉ A. We have to
solve Subset Sum in order to solve Partition.]

The only way to achieve this is to solve a Subset Sum problem: out of the first n numbers
s1, . . . , sn we have to choose a subset A ⊆ {1, . . . , n} such that ∑_{j∈A} sj = z. The
numbers in A sum up together with sn+1 to 2σ, and the numbers in {1, . . . , n} \ A sum up
together with sn+2 to 2σ as well. Conversely, if there are numbers that add up to exactly z
in the Subset Sum instance, then this leads directly to a solution of the Partition instance.

Since Partition is in NP, since the input for Partition can be computed in polynomial time
from the input for Subset Sum, and since we can reduce the NP-complete problem Subset
Sum to Partition, Partition must also be NP-complete. □

Next, we want to look at a scheduling problem with release times and deadlines,
1 | rj, dj |. Garey and Johnson name this problem Sequencing within intervals (p. 70 in
[GJ79]). The following proof can be found in similar form in the book of Kleinberg and
Tardos (pp. 493–494).

Theorem 6.2 Sequencing within intervals is NP-complete.

Proof: It is easy to see that Sequencing within intervals is in NP. A certifier takes as a
certificate a schedule (given e.g. as starting times for each job) and checks that each job
runs between its release time and deadline.

We now show that Subset Sum is reducible to Sequencing within intervals. Consider an
instance of Subset Sum with n positive integers s1, s2, . . . , sn, σ = ∑_{j=1}^n sj and a
positive integer z. We construct (in polynomial time) an equivalent instance of Sequencing
within intervals as follows: pj = sj, rj = 0, dj = σ + 1 for all j ∈ {1, . . . , n}. In addition, we
define for the 1 | rj, dj | instance one more job Jn+1 with pn+1 = 1, rn+1 = z,
dn+1 = z + 1. Garey and Johnson call this job the Enforcer (p. 69 in [GJ79]).
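This construction can be sketched in a few lines (an illustration only; the (p, r, d) tuple encoding per job is an assumption of this sketch, not part of the formal problem statement):

```python
def subset_sum_to_sequencing(s, z):
    """Build the 1 | rj, dj | instance of Theorem 6.2 from a Subset Sum
    instance (s_1..s_n, z): n ordinary jobs that may run anywhere in
    [0, sigma + 1], plus one Enforcer job pinned to the slot [z, z + 1]."""
    sigma = sum(s)
    jobs = [(p, 0, sigma + 1) for p in s]  # (processing time, release, deadline)
    jobs.append((1, z, z + 1))             # the Enforcer J_{n+1}
    return jobs

for p, r, d in subset_sum_to_sequencing([2, 3, 3, 5, 6, 7, 8, 9, 9], 35):
    print(p, r, d)
```

The transformation clearly runs in polynomial (in fact linear) time, which is the part of the recipe that is "trivial to see".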
Now we consider any feasible solution to this instance of Sequencing within intervals. In
this solution, job Jn+1 must be run in the interval [z, z + 1]. This leaves σ available time
units for the first n jobs (which all have release time 0 and deadline σ + 1) in two separate
blocks, namely from 0 to z and from z + 1 to σ + 1. Thus, the processor must not have
any idle time.

[Figure 6.2: One processor P1; the jobs pj, j ∈ I fill [0, z], the Enforcer pn+1 fills [z, z + 1],
and the jobs pj, j ∉ I fill [z + 1, σ + 1]. We have to solve Subset Sum in order to solve
1 | rj, dj |.]

The only way to achieve this is to solve a Subset Sum problem: out of the first n numbers
s1, . . . , sn we have to choose a subset I ⊆ {1, . . . , n} such that ∑_{j∈I} sj = z. These jobs
are scheduled in the interval [0, z), and the remaining jobs are scheduled in the interval
[z + 1, σ + 1). Conversely, if we can solve a given instance of the scheduling problem,
then this leads directly to a solution of the Subset Sum instance. □

6.3 Y=Partition

We start with the scheduling problem with release dates and deadlines that has already
been considered in Section 6.2. This time, we use Partition to show that Sequencing
within intervals is NP-complete. Garey and Johnson give a similar version of this proof in
their book on pp. 70–71 [GJ79]. The proof is almost identical to the proof of Theorem 6.2.

Theorem 6.3 Sequencing within intervals is NP-complete.

Proof: It is easy to see that Sequencing within intervals is in NP. A certifier takes as a
certificate a schedule (given e.g. as starting times for each job) and checks that each job
runs between its release time and deadline.

We now show that Partition is reducible to Sequencing within intervals. Consider an
instance of Partition with n positive integers s1, s2, . . . , sn, σ = ∑_{i=1}^n si and a positive
integer z = ⌈σ/2⌉. We construct (in polynomial time) an equivalent instance of Sequencing
within intervals as follows: pj = sj, rj = 0, dj = σ + 1 for all j ∈ {1, . . . , n}. In addition, we
define for the 1 | rj, dj | instance one more job (the Enforcer) Jn+1 with pn+1 = 1,
rn+1 = z, and dn+1 = z + 1 if σ is even and dn+1 = z if σ is odd. If σ is odd, then we would
have rn+1 = dn+1 = z, so that Jn+1 could not possibly be scheduled (and indeed no
partition can exist). Thus, from now on, let us assume that σ is even.

Now we consider any feasible solution to this instance of Sequencing within intervals. In
this solution, job Jn+1 must be run in the interval [z, z + 1]. This leaves σ available time
units for the first n jobs (which all have release time 0 and deadline σ + 1) in two separate
blocks, namely from 0 to z and from z + 1 to σ + 1. Thus, the processor must not have
any idle time. This can be done only if we can solve the Partition problem, i.e. if there is a
partition I ⊆ {1, . . . , n} such that ∑_{i∈I} si = ∑_{i∉I} si = z = σ/2. Conversely, if we can
solve a given instance of the scheduling problem, then this leads directly to a solution of
the Partition instance. □

[Figure 6.3: One processor P1; the jobs pj, j ∈ I fill [0, z = σ/2] (σ/2 time units), the
Enforcer pn+1 fills [z, z + 1], and the jobs pj, j ∉ I fill [z + 1, σ + 1] (σ/2 time units). We
have to solve Partition in order to solve Sequencing within intervals.]

Next, we look at the decision version of the parallel processor scheduling problem with
two processors, P2 || Cmax. The formal definition of this problem can be found in Garey
and Johnson [GJ79] on p. 65, where they also give a very easy way to prove the NP-
completeness of P2 || Cmax. We would just have to write: Restrict P2 || Cmax to Partition
by allowing only instances with z = (∑_{j=1}^n pj)/2. As it is a good exercise, we want to
give a complete reduction proof.

Theorem 6.4 Parallel processor scheduling with two processors (P2 || Cmax) is NP-
complete.

Proof: It is easy to see that P2 || Cmax is in NP. A certifier takes as a certificate a
schedule (given e.g. as starting times for each job on P1 and P2) and checks that the
makespan is at most the given bound z.
We now show that Partition is reducible to P2 || Cmax. Consider an instance of Partition
with n positive integers s1, s2, . . . , sn, σ = ∑_{j=1}^n sj and an integer value z = σ/2. If σ
is odd, then it would be impossible to find a partition and the answer to the decision
problem would be no. Thus, from now on, let us assume that σ is even. We construct (in
polynomial time) an equivalent instance of P2 || Cmax as follows: pj = sj for j = 1, . . . , n
and the same z as in Partition.

Now we consider any feasible solution to this instance of P2 || Cmax. Let I be a subset of
{1, . . . , n}. In this solution, the jobs are scheduled on P1 (all pj, j ∈ I) and P2 (all pj, j ∉ I)
such that the makespan is equal to z = σ/2. The only way to achieve this is to solve a
Partition problem: out of the numbers p1, . . . , pn we have to choose a subset
I ⊆ {1, . . . , n} such that ∑_{j∈I} sj = z = σ/2. The numbers in I sum up to z = σ/2, and
the numbers in {1, . . . , n} \ I sum up to z = σ/2 as well.

Conversely, if there are numbers that add up to exactly z = σ/2 in the Partition instance,
then we can schedule these on P1 and the remainder on P2 and get a feasible solution to
the instance of P2 || Cmax.

[Figure 6.4: Processor P1 runs the jobs pj, j ∈ I in [0, z = σ/2]; processor P2 runs the
jobs pj, j ∉ I in [0, z = σ/2]. We have to solve Partition in order to solve P2 || Cmax.]

Since P2 || Cmax is in NP, since the input for P2 || Cmax can be computed in polynomial
time from the input for Partition, and since we can reduce the NP-complete problem
Partition to P2 || Cmax, P2 || Cmax must also be NP-complete. □

Similarly as before, we can prove by restriction that the decision version of the knapsack
problem is NP-complete.

Theorem 6.5 Knapsack is NP-complete.

Proof: Restrict Knapsack to Partition by allowing only instances in which wj = vj for all
j ∈ {1, . . . , n} and W = V = (∑_{j=1}^n wj)/2. □

We now look at the decision version of the Flow-Shop problem with three processors,
F3 || Cmax.
Theorem 6.6 Flow-Shop Scheduling with three processors (F3 || Cmax) is NP-complete.

Proof: It is easy to see that F3 || Cmax is in NP. A certifier takes as a certificate a
schedule (given e.g. as starting times for each operation on P1, P2 and P3) and checks
that the makespan is at most the given bound.

We now show that Partition is reducible to F3 || Cmax. Consider an instance of Partition
with n positive integers s1, s2, . . . , sn, σ = ∑_{i=1}^n si and a positive integer z = σ/2. We
construct (in polynomial time) an equivalent instance of F3 || Cmax as follows: We define
n jobs J1, . . . , Jn, corresponding to s1, . . . , sn, and one additional job Jn+1.

Jobs J1, . . . , Jn: p1,j = 0, p2,j = sj, p3,j = 0 for j = 1, . . . , n (this means that we can
schedule these n jobs, with processing times corresponding to the sj, whenever we want
on processor P2). In addition to these jobs we define an additional job Jn+1 with
p1,n+1 = p2,n+1 = p3,n+1 = z. Finally, we set the makespan bound of the F3 || Cmax
instance to z' = 3z.

Now we consider any feasible solution to this instance of F3 || Cmax. Imagine we only had
to schedule the last job Jn+1 with processing times p1,n+1 = p2,n+1 = p3,n+1 = z on the
three processors P1, P2 and P3. The best we can do (in order to minimize the makespan)
is to schedule this job Jn+1 as depicted in Figure 6.5, and the resulting makespan is
z' = 3z.

[Figure 6.5: Jn+1 has processing times p1,n+1 = p2,n+1 = p3,n+1 = z; it runs on P1 in
[0, z], on P2 in [z, 2z] and on P3 in [2z, 3z = z'].]

The question is: Can we schedule the remaining n jobs without increasing the makespan?
Again, as the first n jobs only have operations on processor P2, we can schedule these
jobs whenever we want on processor P2. The only way to achieve this is to solve a
Partition problem: out of the numbers p2,1, . . . , p2,n we have to choose a subset
I ⊆ {1, . . . , n} such that ∑_{j∈I} p2,j = z.
The processing times p2,j, j ∈ I sum up to z, and the processing times
p2,j, j ∈ {1, . . . , n} \ I sum up to z as well.

[Figure 6.6: P1 runs pn+1 in [0, z] and is idle afterwards; P2 runs the jobs p2,j, j ∈ I in
[0, z], then pn+1 in [z, 2z], then the jobs p2,j, j ∈ {1, . . . , n} \ I in [2z, 3z = z']; P3 is idle
until 2z and runs pn+1 in [2z, 3z = z']. We have to solve Partition in order to solve
F3 || Cmax.]

Conversely, if there are numbers that add up to exactly z in the Partition instance, then we
can schedule these as shown in Figure 6.6. Since F3 || Cmax is in NP, since the input for
F3 || Cmax can be computed in polynomial time from the input for Partition, and since we
can reduce the NP-complete problem Partition to F3 || Cmax, F3 || Cmax must also be
NP-complete. □

6.4 NP-complete in the ordinary and strong sense

Sources: [GJ79], pp. 90–92, [KT14], pp. 492–495. Subset Sum and Partition are NP-
complete in the ordinary sense, i.e. the problems cannot be solved by polynomial-time
algorithms (unless P=NP), but they can be solved by algorithms of time complexity
O(n · z) (Subset Sum) and O(n · ∑ sj) (Partition). When z and max sj are bounded by a
polynomial function of n, these are in fact polynomial-time algorithms. In such cases, we
say that the algorithms run in pseudo-polynomial time to indicate that they run in time
polynomial in the magnitude of the input numbers, but not polynomial in the size of their
representation. If a problem is NP-complete in the strong sense, there is (unless P=NP)
not even a fully polynomial-time approximation scheme for this problem. For example, the
NP-hard knapsack problem can be solved by a dynamic programming algorithm requiring
a number of steps polynomial in the size of the knapsack and the number of items
(assuming that all data are scaled to be integers); however, the runtime of this algorithm is
exponential in the input length, since the encoding sizes of the objects and the knapsack
are only logarithmic in their magnitudes.
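The dynamic program just mentioned can be sketched as follows (a minimal illustration with made-up instance data; O(n · W) steps, i.e. pseudo-polynomial):

```python
def knapsack(weights, values, W):
    """0-1 knapsack by dynamic programming over capacities 0..W.
    Time O(n * W): polynomial in the magnitude of W, but exponential
    in the length of W's binary encoding."""
    best = [0] * (W + 1)
    for w, v in zip(weights, values):
        for c in range(W, w - 1, -1):  # descending: each item used at most once
            best[c] = max(best[c], best[c - w] + v)
    return best[W]

print(knapsack([3, 4, 5], [4, 5, 6], 7))  # 9
```

Doubling the number of bits of W squares its magnitude and hence roughly squares the running time, which is exactly the pseudo-polynomial behavior described above.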
However, as Garey and Johnson [GJ79] observed: "A pseudo-polynomial-time algorithm
[...] will display exponential behavior only when confronted with instances containing
exponentially large numbers, [which] might be rare for the application we are interested in.
If so, this type of algorithm might serve our purposes almost as well as a polynomial time
algorithm."

Subset Sum is another example of a weakly NP-complete problem. For a contrasting pair:
Bin Packing is strongly NP-complete, while the 0-1-Knapsack problem is only weakly NP-
complete. Thus the version of Bin Packing where the object and bin sizes are integers
bounded by a polynomial remains NP-complete, while the corresponding version of the
Knapsack problem can be solved in pseudo-polynomial time by dynamic programming.
While weakly NP-complete problems may admit efficient solutions in practice as long as
their inputs are of relatively small magnitude, strongly NP-complete problems do not admit
efficient solutions in these cases. From a theoretical perspective, any strongly NP-hard
optimization problem with a polynomially bounded objective function cannot have a fully
polynomial-time approximation scheme (FPTAS) unless P = NP.

Time complexity as a function of natural parameters ([GJ79], pp. 106–107). In practice,
we consider time complexity expressed in terms of natural problem parameters instead of
the artificially constructed input length. For example, consider Multiprocessor Scheduling
(P2 || Cmax and P3 || Cmax). Here a collection of natural parameters might consist of the
number n of jobs, the number m of processors, and the length L of the longest job. The
ordinary NP-completeness result for this problem proved in Theorem 6.4 implies that,
unless P=NP, the problem cannot be solved in time polynomial in the three parameters
n, m, and log L.
However, one can still ask whether it is possible to have an algorithm with time complexity
polynomial in m^n and log L, or polynomial in n^m and log L, or polynomial in n, m, and L,
or even polynomial in (nL)^m. The original NP-completeness result for P2 || Cmax
actually shows that the subproblem in which m is restricted to the value 2 is NP-complete,
thus ruling out an algorithm polynomial in n^m and log L (unless P=NP), since such an
algorithm would be a polynomial-time algorithm for P2 || Cmax. The NP-completeness
result for P2 || Cmax does not rule out an algorithm polynomial in m^n = 2^n and log L,
and indeed exhaustive search algorithms having such a time complexity can be designed.
Analogously, the strong NP-completeness result for P3 || Cmax claimed in Theorem 6.8
rules out an algorithm polynomial in n, m, and L (unless P=NP). It leaves open the
possibility of an algorithm polynomial in (nL)^m (which would give a pseudo-polynomial
time algorithm for each fixed value of m), and such an algorithm can be shown to exist.
Thus, considering the subproblems P2 || Cmax and P3 || Cmax obtained by placing
restrictions on one or more of the natural problem parameters, we obtain useful
information about what types of algorithms are possible for the general problem
P || Cmax. In this way, the theory of NP-completeness can be used to guide our search
not only for polynomial-time algorithms, but for exponential-time algorithms as well. (Note
that the input for X must be polynomially related to the input for Y with respect to both the
length of the input and the largest number [GJ79].)

6.5 Y=3-Partition

We start this section again with the scheduling problem with release dates and deadlines
that has already been considered in Sections 6.2 and 6.3. This time, we use 3-Partition to
show that Sequencing within intervals is NP-complete in the strong sense. Garey and
Johnson give a similar version of this proof in their book on pp. 102–103 [GJ79].
Theorem 6.7 Sequencing within intervals is NP-complete in the strong sense.

Proof: It is easy to see that Sequencing within intervals is in NP. A certifier takes as a
certificate a schedule (given e.g. as starting times for each job) and checks that each job
runs between its release time and deadline.

We now show that 3-Partition is reducible to Sequencing within intervals. Consider an
instance of 3-Partition with n = 3m positive integers s1, s2, . . . , sn, σ = ∑_{i=1}^n si and a
positive integer z = σ/m. We construct (in polynomial time) an equivalent instance of
Sequencing within intervals as follows: pj = sj, rj = 0, dj = σ + (m − 1) for all
j ∈ {1, . . . , n}. In addition, we define for the 1 | rj, dj | instance m − 1 enforcers Jn+k with
pn+k = 1, rn+k = kz + k − 1, dn+k = kz + k for k = 1, . . . , m − 1.

Now we consider any feasible solution to this instance of Sequencing within intervals. In
this solution, the enforcer jobs Jn+k, k = 1, . . . , m − 1, must be run in the intervals
[kz + k − 1, kz + k], as shown in Figure 6.7.

[Figure 6.7: One processor P1; the enforcers Jn+1, Jn+2, . . . , Jn+m−1 occupy the unit
intervals [z, z + 1], [2z + 1, 2z + 2], . . . , [(m − 1)z + m − 2, (m − 1)z + m − 1], leaving m
blocks of length z each. We have to solve 3-Partition in order to solve Sequencing within
intervals.]

This leaves m separate blocks of time, each of length exactly z, and since this is just
enough time in total to accommodate all the jobs Jj, j ∈ {1, . . . , n}, each block must be
completely filled. These blocks therefore play the same role as the sets I1, I2, . . . , Im in
the desired partition. It follows that the desired sequence exists if and only if the desired
partition exists for the given 3-Partition instance.
Sequencing within intervals is in NP, the input for Sequencing within intervals can be computed in polynomial time from the input for 3-Partition, the constructed instance of Sequencing within intervals is polynomially related to the length of the given 3-Partition instance, the largest number in the constructed instance is mz + m − 1 (which is polynomial in z), and we can reduce the strongly NP-complete problem 3-Partition to Sequencing within intervals. Therefore, there exists a pseudo-polynomial transformation from 3-Partition to Sequencing within intervals, and this proves that the latter problem is NP-complete in the strong sense (see Lemma 4.1 on p. 101 in [GJ79]). □

Next, we look at the decision version of the parallel processor scheduling problem with an arbitrary number m of processors, P || Cmax.

Theorem 6.8 Parallel processor scheduling is NP-complete in the strong sense.

Proof: It is easy to see that Parallel processor scheduling is in NP. A certifier takes as a certificate a schedule (given e.g. as starting times for each job on P1, . . . , Pm) and checks that the makespan is at most the given bound z.

We now show that 3-Partition is reducible to Parallel processor scheduling. Consider an instance of 3-Partition with n = 3m positive integers s1, s2, . . . , sn, σ = ∑_{i=1}^{n} si and a positive integer z = σ/m. We construct (in polynomial time) an equivalent instance of Parallel processor scheduling as follows: pj = sj for j = 1, . . . , n, m processors, and the same bound z as in 3-Partition.

Now we consider any feasible solution to this instance of Parallel processor scheduling. Since the total processing time is σ = mz and the makespan is at most z, the jobs must be scheduled on the processors P1, . . . , Pm such that the completion time of the jobs on each processor is exactly z = σ/m. The only way to achieve this is to solve a 3-Partition problem and answer the question whether there exists a partition of {1, . . . , n} into m disjoint sets I1, . . . , Im such that ∑_{i∈I1} pi = . . .
= ∑_{i∈Im} pi = z. Note that we do not need for this proof that each set of the partition contains exactly 3 elements, i.e. it is not necessary to assume that z/4 < si < z/2.

Figure 6.8: Each processor Pk, k = 1, . . . , m, processes the jobs pj, j ∈ Ik, in the interval [0, z = σ/m]; we have to solve 3-Partition in order to solve Parallel processor scheduling.

Parallel processor scheduling is in NP, the input for Parallel processor scheduling can be computed in polynomial time from the input for 3-Partition, the constructed instance of Parallel processor scheduling is polynomially related to the length of the given 3-Partition instance, the largest number in the constructed instance is z, and we can reduce the strongly NP-complete problem 3-Partition to Parallel processor scheduling. Therefore, there exists a pseudo-polynomial transformation from 3-Partition to Parallel processor scheduling, and this proves that the latter problem is NP-complete in the strong sense. □

In the next theorem, we show that F3 || Cmax is not only NP-complete, it is even NP-complete in the strong sense. This result has been published by Garey, Johnson, and Sethi in 1976 [GJS76].

Theorem 6.9 F3 || Cmax is NP-complete in the strong sense.

Proof: It is easy to see that F3 || Cmax is in NP. A certifier takes as a certificate a schedule (given e.g. as starting times for each operation on P1, P2 and P3) and checks that the makespan is at most the given bound z.

We now show that 3-Partition is reducible to F3 || Cmax. Consider an instance of 3-Partition with n = 3m positive integers s1, s2, . . . , sn, σ = ∑_{i=1}^{n} si and a positive integer z = σ/m. We construct (in polynomial time) an equivalent instance of F3 || Cmax as follows. Jobs J1, . . . , Jn: p1,j = 0, p2,j = sj, p3,j = 0 for j = 1, . . . , n (this means that we can schedule these n jobs, with processing times corresponding to the sj, whenever we want on the second processor). In addition to these n = 3m jobs we define m + 1 additional jobs Jn+1, . . .
, J4m+1 with p1,n+1 = 0, p2,n+1 = z, p3,n+1 = 2z; p1,t = 2z, p2,t = z, p3,t = 2z for t = n + 2, . . . , 4m; and p1,4m+1 = 2z, p2,4m+1 = z, p3,4m+1 = 0. Finally, we set zf = (2m + 1) · z.

Imagine we only had to schedule the jobs Jn+1, . . . , J4m+1. The best we can do (in order to minimize the makespan) is to schedule them in the given order, as depicted in Figure 6.9, and the resulting makespan is zf = (2m + 1) · z. Note that a makespan of value < (2m + 1) · z cannot be achieved for these m + 1 jobs: none of them can start on processor P3 before time z (each must first be processed for at least z time units on P1 or P2), and the sum of the processing times of the jobs Jn+1, . . . , J4m+1 on P3 is 2m · z, so the makespan is at least z + 2mz = (2m + 1)z.

Figure 6.9: Schedule for F3 || Cmax: P1 processes Jn+2, . . . , J4m+1 in [0, 2mz]; P2 processes Jn+1, . . . , J4m+1 with m idle gaps of length z each; P3 processes Jn+1, . . . , J4m in [z, (2m + 1)z].

The question is: Can we schedule the first n = 3m jobs without increasing the makespan? Again, as the first n jobs J1, . . . , Jn have operations only on processor P2, we can schedule these jobs whenever we want on processor P2. On P2, there are m gaps of idle time of length z each (marked gray in Figure 6.9), and if we do not want to increase the makespan, there must be a way to fit the first n jobs into these gaps. The only way to achieve this is to solve a 3-Partition problem and answer the question whether there exists a partition of {1, . . . , n} into m disjoint sets I1, . . . , Im such that ∑_{i∈I1} p2,i = . . . = ∑_{i∈Im} p2,i = z. These subsets will fit into the gaps in our schedule. Note that we do not need for this proof that each set of the partition contains exactly 3 elements, i.e. it is not necessary to assume that z/4 < si < z/2.
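The construction of the F3 || Cmax instance from a 3-Partition instance can also be sketched in a few lines of code (illustrative only; function names are ours):

```python
def build_flowshop_instance(s, m):
    """Reduction from 3-Partition to F3 || Cmax.
    Returns (jobs, z_f): per-job triples (p1, p2, p3) and the makespan
    bound z_f = (2m + 1) * z. Assumes sum(s) is divisible by m."""
    z = sum(s) // m
    jobs = [(0, sj, 0) for sj in s]        # J_1 .. J_n: work only on P2
    jobs.append((0, z, 2 * z))             # enforcer J_{n+1}
    jobs.extend((2 * z, z, 2 * z) for _ in range(m - 1))  # J_{n+2} .. J_{4m}
    jobs.append((2 * z, z, 0))             # enforcer J_{4m+1}
    return jobs, (2 * m + 1) * z
```

Applied to the numbers of Example 6.6 below (m = 4, z = 14), this produces exactly the 17 jobs of Table 6.1 and the bound zf = 9z = 126.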
F3 || Cmax is in NP, the input for F3 || Cmax can be computed in polynomial time from the input for 3-Partition, the constructed instance of F3 || Cmax is polynomially related to the length of the given 3-Partition instance, the largest number in the constructed instance is (2m + 1)z (which is polynomial in z), and we can reduce the strongly NP-complete problem 3-Partition to F3 || Cmax. Therefore, there exists a pseudo-polynomial transformation from 3-Partition to F3 || Cmax, and this proves that the latter problem is NP-complete in the strong sense. □

Example 6.6 Consider the problem instance of F3 || Cmax in Table 6.1.

      J1  J2  J3  J4  J5  J6  J7  J8  J9  J10 J11 J12 J13 J14 J15 J16 J17
P1    0   0   0   0   0   0   0   0   0   0   0   0   0   28  28  28  28
P2    1   2   2   3   3   3   4   6   7   7   8   10  14  14  14  14  14
P3    0   0   0   0   0   0   0   0   0   0   0   0   28  28  28  28  0

Table 6.1: Problem instance of F3 || Cmax that is used for the reduction from 3-Partition.

The schedule as described in the proof of Theorem 6.9 is displayed in Figure 6.10.

Figure 6.10: Schedule for a specific problem instance of F3 || Cmax: P1 processes J14, J15, J16, J17 in [0, 8z]; P2 processes J13, . . . , J17 with four idle gaps of length z each; P3 processes J13, J14, J15, J16 in [z, 9z].

In this instance, we have n = 3m = 12, m = 4, σ = 56, z = 56/4 = 14. Jobs Jn+1, . . . , J4m+1 = J13, . . . , J17 are the enforcers that are used in the proof as a framework into which 3-Partition must fit. A makespan of value less than (2m + 1)z = 9z cannot be achieved for these m + 1 = 5 jobs, as none of them can start on processor P3 before time z = 14 and the sum of the processing times of the jobs Jn+1, . . . , J4m+1 on P3 is 2m · z = 8z. The question is: Can we schedule the first n jobs J1, . . . , J12 without increasing the makespan? As the first n = 12 jobs have operations only on processor P2, we can schedule these jobs whenever we want on processor P2.
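Answering this question mechanically amounts to a small search for the required partition. The following backtracking sketch is illustrative only (names are ours) and, like the proof, does not require the sets to be triples:

```python
def split_into_gaps(s, m):
    """Backtracking search: partition the indices of s into m sets whose
    sums all equal z = sum(s) / m. Returns the index sets, or None.
    Assumes sum(s) is divisible by m."""
    z = sum(s) // m
    order = sorted(range(len(s)), key=lambda i: -s[i])  # large items first
    gaps = [[] for _ in range(m)]
    loads = [0] * m

    def place(k):
        if k == len(order):
            return True          # all items placed, so every load equals z
        i = order[k]
        tried = set()
        for g in range(m):
            # Skip gaps with a load we already tried (symmetry pruning).
            if loads[g] + s[i] <= z and loads[g] not in tried:
                tried.add(loads[g])
                loads[g] += s[i]
                gaps[g].append(i)
                if place(k + 1):
                    return True
                loads[g] -= s[i]
                gaps[g].pop()
        return False

    return gaps if place(0) else None
```

Since the total processing time is exactly mz, filling every gap to at most z while placing all items forces every gap to sum to exactly z; the symmetry pruning merely avoids re-trying interchangeable gaps.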
On P2, there are m = 4 gaps of idle time of length z = 14 each (marked white in Figure 6.10). If we do not want to increase the makespan, there must be a way to fit these n = 12 jobs into these gaps. The only way to achieve this is to solve a 3-Partition problem and answer the question whether there exists a partition of {1, . . . , n = 12} into m = 4 disjoint sets I1, . . . , I4 such that ∑_{i∈I1} p2,i = ∑_{i∈I2} p2,i = ∑_{i∈I3} p2,i = ∑_{i∈I4} p2,i = z = 14. One possible partition is {10, 4}, {8, 6}, {7, 3, 3, 1}, {7, 3, 2, 2} with the corresponding index sets I1 = {12, 7}, I2 = {11, 8}, I3 = {10, 6, 5, 1}, I4 = {9, 4, 3, 2}.

Homework: Y=3-Dimensional Matching (3DM)
1. 3-Dimensional Matching (3DM) ∝ Partition [GJ79], pp. 60-62
2. 3-Dimensional Matching (3DM) ∝ Subset Sum [KT14], pp. 492-493
3. 3-Dimensional Matching (3DM) ∝ Zero-One Equations (ZOE) [DPV08], pp. 254-255

Bibliography
[Coo71] Cook, S.A., The complexity of theorem-proving procedures, Proceedings of the 3rd ACM Symposium on the Theory of Computing, 151-158, 1971.
[DPV08] Dasgupta, S., Papadimitriou, C., Vazirani, U., Algorithms, McGraw-Hill, 2008.
[GJS76] Garey, M.R., Johnson, D.S., Sethi, R., The complexity of flowshop and jobshop scheduling, Mathematics of Operations Research 1, 117-129, 1976.
[GJ79] Garey, M.R., Johnson, D.S., Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman, 1979.
[KT14] Kleinberg, J., Tardos, E., Algorithm Design, Pearson, 2014.