Week 2: Simplex
Today we will introduce the simplex algorithm for solving linear programs in standard form¹.
The main idea behind the algorithm is very simple indeed. It can be distilled into a single sentence: Starting from a basic feasible solution, find an adjacent basic feasible solution with better cost until it is not possible to improve the cost anymore.
However, the devil is in the details, and there are many details to work out in order to fully develop the algorithm. The aim of this lecture is to introduce the main ideas of the simplex algorithm without overwhelming you with too many technicalities. To that end, we will start by making a number of simplifying assumptions, which we will later remove one by one.
2.1 Simplifying Assumptions
Let us assume that the linear program in standard form we are to solve satisfies the following properties:
1. We are provided with an initial feasible basis.
2. Every feasible basis B is non-degenerate, that is, x_B = A_B^{-1} b > 0.
3. Its feasible region is bounded, that is, there exists C such that {x ∈ R^n : Ax = b, x ≥ 0} ⊆ [0, C]^n.
2.2 Developing the Simplex algorithm
We now outline a simplified version of the simplex algorithm for programs satisfying Assumptions (1)-(3).
Our first take on simplex is easy to understand. The algorithm maintains a feasible basis B at all times. In each iteration, it tries to find an index b_i inside the basis and an index j outside the basis such that swapping b_i with j takes us to another basic feasible solution of lesser cost. When no such swap exists, the algorithm terminates and returns the current basis.
¹ Recall that a linear program is said to be in standard form if it has the following structure:

    minimize    c · x
    subject to  Ax = b
                x ≥ 0,

and the rows of the constraint matrix A are linearly independent.
[Figure: Simplex hops from one basic feasible solution to the next until it finds an optimal solution.]
© Copyright 2021 Julián Mestre.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
 1: function simplex-take-1(A, b, c)
 2:     Let B = {b_1, ..., b_m} be some feasible basis of A
 3:     repeat
 4:         for j ∈ {1, ..., n} \ B and b_i ∈ B do
 5:             D = B + j − b_i
 6:             x_B = A_B^{-1} b
 7:             if D is a basis of A then
 8:                 y_D = A_D^{-1} b
 9:                 if y_D ≥ 0 and c_D · y_D < c_B · x_B then
10:                     B = D
11:     until basis B hasn't changed in this iteration
12:     return B
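To make the brute-force search concrete, here is a minimal Python/NumPy sketch of simplex-take-1 under Assumptions (1)-(3); the function names (basic_solution, simplex_take_1) are mine, not part of the notes.

    import numpy as np

    def basic_solution(A, b, basis):
        # Return x with x_B = A_B^{-1} b and x_i = 0 for i not in the basis,
        # or None if the chosen columns are not linearly independent.
        m, n = A.shape
        AB = A[:, basis]
        if np.linalg.matrix_rank(AB) < m:
            return None
        x = np.zeros(n)
        x[basis] = np.linalg.solve(AB, b)
        return x

    def simplex_take_1(A, b, c, basis):
        # basis: list of m column indices forming an initial feasible basis.
        A, b, c = np.asarray(A, float), np.asarray(b, float), np.asarray(c, float)
        n = A.shape[1]
        while True:
            x = basic_solution(A, b, basis)
            improved = False
            # try swapping every outside index j with every basis index b_i
            for j in set(range(n)) - set(basis):
                for bi in basis:
                    D = sorted((set(basis) - {bi}) | {j})
                    y = basic_solution(A, b, D)
                    if y is not None and np.all(y >= 0) and c @ y < c @ x:
                        basis, improved = D, True
                        break
                if improved:
                    break
            if not improved:
                return basis, x

Under the non-degeneracy assumption the cost strictly decreases with every accepted swap, so the loop terminates after finitely many iterations.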
To turn this into an efficient algorithm, it helps to understand how the solution changes when we swap an index into the basis. Let x be the basic feasible solution induced by B, let y be the one induced by D = B + j − b_i, and consider the direction d = y − x. Clearly d_k = 0 for every k ∉ B + j, and d_j = y_j − x_j = y_j, which is strictly positive². Normalizing so that d_j = 1, the condition Ay = Ax = b forces

    d_B = −A_B^{-1} A_j.

We can imagine the algorithm as picking j, computing d, and then finding the largest θ such that y = x + θd ≥ 0. This means that

    θ = min_{b_i ∈ B : d_{b_i} < 0} −x_{b_i} / d_{b_i}.

This expression is always strictly positive by our assumption that the instance is non-degenerate. Furthermore, by the same assumption, there is a unique index b_i that attains the minimum, which is the index that will leave the basis.

Finally, the change in cost in going from B to D is just

    c · y − c · x = c_D · y_D − c_B · x_B = θ (c_j − c_B^T A_B^{-1} A_j).
The expression in parentheses is called the reduced cost of j. As long as there is a j with strictly negative reduced cost, we are guaranteed to improve the cost of the basic feasible solution by bringing j into the basis.
This simplifies how we choose the index j to bring into the basis. Once that choice has been made, we can compute d_B, and from that, the index b_i to take out of the basis.
[Figure: going from x to y along the direction d.]

² Why is it the case that d_j > 0? What if x_j > y_j? Is this possible?

For every b_i ∈ B, it must be the case that y_{b_i} ≥ 0; in other words, x_{b_i} + θ d_{b_i} ≥ 0. If d_{b_i} < 0, this means θ ≤ −x_{b_i} / d_{b_i}. If d_{b_i} ≥ 0, this coordinate does not constrain θ.
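The quantities derived above (the reduced costs, the direction d_B, the step length θ, and the leaving index) are all a few lines of linear algebra. The following NumPy sketch, with illustrative names of my own, computes them for a given feasible basis.

    import numpy as np

    def iteration_quantities(A, b, c, basis, tol=1e-9):
        # Given a feasible basis (list of m column indices), compute the reduced
        # costs and, for the first improving column j, the direction d_B, the
        # step length theta, and the leaving index b_i.
        A, b, c = np.asarray(A, float), np.asarray(b, float), np.asarray(c, float)
        AB = A[:, basis]
        x_B = np.linalg.solve(AB, b)                      # current basic solution
        # reduced costs: c_bar^T = c^T - c_B^T A_B^{-1} A
        reduced = c - A.T @ np.linalg.solve(AB.T, c[basis])
        improving = [j for j in range(A.shape[1])
                     if j not in basis and reduced[j] < -tol]
        if not improving:
            return None                                   # current basis is optimal
        j = improving[0]
        d_B = -np.linalg.solve(AB, A[:, j])               # d_B = -A_B^{-1} A_j
        # ratio test: theta = min over d_{b_i} < 0 of -x_{b_i} / d_{b_i}
        ratios = [(-x_B[k] / d_B[k], basis[k]) for k in range(len(basis))
                  if d_B[k] < -tol]
        theta, leaving = min(ratios)                      # assumes boundedness
        return j, leaving, theta, reduced[j]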
 1: function simplex-take-2(A, b, c)
 2:     Let B = {b_1, ..., b_m} be some feasible basis of A
 3:     repeat
 4:         c̄^T = c^T − c_B^T A_B^{-1} A
 5:         if ∃ j : c̄_j < 0 then
 6:             d_B = −A_B^{-1} A_j
 7:             b_i = index in B s.t. d_{b_i} < 0 minimizing −x_{b_i}/d_{b_i}
 8:             B = B + j − b_i
 9:     until basis B hasn't changed in this iteration
10:     return B
Lemma 2.1. If the input linear program fulfills Assumptions (1)-(3), then Algorithm 2 (simplex-take-2) is equivalent to Algorithm 1 (simplex-take-1).
Proof sketch. First, we argue that Algorithm 2 is well defined, meaning that in Line 7 there is always a b_i such that d_{b_i} < 0. If this were not true, then d ≥ 0 and thus x + θd ≥ 0 for all θ > 0, which would mean that the feasible region is not bounded.
Second, because the program is non-degenerate, we always have θ > 0, which means c·y < c·x.
Finally, we need to argue that D = B + j − b_i is a basis. For the sake of contradiction, suppose it were not. Then A_j can be written as a linear combination of the columns indexed by B − b_i; in other words, A_j = Σ_{k ∈ [m]−i} α_k A_{b_k}. Recall that d_B = −A_B^{-1} A_j, therefore

    d_B = −Σ_{k ∈ [m]−i} α_k A_B^{-1} A_{b_k} = −Σ_{k ∈ [m]−i} α_k e_k,

where e_k is the unit vector with a 1 in the kth coordinate and 0 elsewhere. This means d_{b_i} = 0, a contradiction. Hence, D is a basis.
Lemma 2.2. Let B be a basis such that

    A_B^{-1} b ≥ 0   and   c̄^T = c^T − c_B^T A_B^{-1} A ≥ 0.

Then the basic feasible solution induced by B is optimal.
Proof. Let x be the basic feasible solution induced³ by B and let y be any other feasible solution. Consider the direction u = x − y. From the feasibility of x and y it follows that

    0 = Au = A_B u_B + Σ_{k ∉ B} A_k u_k,   and thus   u_B = −A_B^{-1} Σ_{k ∉ B} A_k u_k.

Now the difference in cost between x and y is

    c · u = c_B^T u_B + Σ_{k ∉ B} c_k u_k = Σ_{k ∉ B} (c_k − c_B^T A_B^{-1} A_k) u_k = Σ_{k ∉ B} c̄_k u_k.

Notice that for all k ∉ B we have x_k = 0 and y_k ≥ 0, and therefore u_k ≤ 0. By our assumption, all reduced costs are non-negative. Thus, c·x − c·y = c·u ≤ 0, and so c·x ≤ c·y. Since this holds for all feasible solutions y, it follows that x is optimal.
Therefore, provided Assumptions (1)-(3) hold, Algorithm 2 returns an optimal feasible solution after a finite number of steps.
³ Recall that the basic feasible solution induced by B is x_B = A_B^{-1} b and x_i = 0 for i ∉ B.
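Lemma 2.2 gives a certificate of optimality that is easy to check numerically. The following sketch verifies it on a small instance of my own (maximize 3x_1 + 2x_2 subject to x_1 + x_2 ≤ 4 and x_1 + 2x_2 ≤ 6, written in standard form with slack variables); neither the instance nor the code comes from the notes.

    # Checking the optimality certificate of Lemma 2.2 on a toy instance.
    import numpy as np

    A = np.array([[1.0, 1, 1, 0],
                  [1.0, 2, 0, 1]])
    b = np.array([4.0, 6.0])
    c = np.array([-3.0, -2, 0, 0])           # minimize -3 x1 - 2 x2

    basis = [0, 3]                            # claim: this basis is optimal
    AB = A[:, basis]
    x_B = np.linalg.solve(AB, b)              # A_B^{-1} b = (4, 2) >= 0
    reduced = c - A.T @ np.linalg.solve(AB.T, c[basis])
    print(x_B, reduced)                       # reduced costs (0, 1, 3, 0) >= 0
    assert np.all(x_B >= 0) and np.all(reduced >= -1e-9)
    # By Lemma 2.2, the induced solution x = (4, 0, 0, 2) is optimal (cost -12).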
2.3 Removing the boundedness assumption
The only place we needed the condition that the linear program is bounded was when we argued that the computation in Line 7 of simplex-take-2 was well defined.
What are we to do then if we run into an index j with negative reduced cost such that d_B = −A_B^{-1} A_j ≥ 0? In that case, the linear program has an unbounded objective⁴, and we can simply report that.
⁴ A linear program is said to have an unbounded objective if for any fixed value μ we can find a feasible solution whose cost is less than μ.
If d_B ≥ 0, then for any θ ≥ 0 we have x + θd ≥ 0, A(x + θd) = b, and c·(x + θd) = c·x + c̄_j θ. Therefore, for any μ we can find a θ such that x + θd is a feasible solution with cost less than μ.
 1: function simplex(A, b, c)
 2:     Let B = {b_1, ..., b_m} be some feasible basis of A
 3:     repeat
 4:         c̄^T = c^T − c_B^T A_B^{-1} A
 5:         if ∃ j : c̄_j < 0 then
 6:             d_B = −A_B^{-1} A_j
 7:             if d_B ≥ 0 then
 8:                 return "unbounded objective"
 9:             b_i = index in B s.t. d_{b_i} < 0 minimizing −x_{b_i}/d_{b_i}
10:             B = B + j − b_i
11:     until basis B hasn't changed in this iteration
12:     return B
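Putting the pieces together, here is a compact NumPy rendering of simplex as a sketch, not a reference implementation. It assumes an initial feasible basis is supplied and that no basis is degenerate; anti-cycling and the construction of an initial basis are addressed in the next two sections. All names and the example instance are my own.

    import numpy as np

    def simplex(A, b, c, basis, tol=1e-9):
        A, b, c = np.asarray(A, float), np.asarray(b, float), np.asarray(c, float)
        basis = list(basis)
        while True:
            AB = A[:, basis]
            x_B = np.linalg.solve(AB, b)
            # reduced costs: c_bar = c - A^T (A_B^{-T} c_B)
            reduced = c - A.T @ np.linalg.solve(AB.T, c[basis])
            candidates = [j for j in range(A.shape[1])
                          if j not in basis and reduced[j] < -tol]
            if not candidates:                    # no improving column: optimal
                x = np.zeros(A.shape[1])
                x[basis] = x_B
                return x, c @ x
            j = candidates[0]
            d_B = -np.linalg.solve(AB, A[:, j])
            if np.all(d_B >= -tol):               # no coordinate limits theta
                raise ValueError("unbounded objective")
            # ratio test: leaving index minimizes -x_{b_i}/d_{b_i} over d_{b_i} < 0
            ratios = [(-x_B[k] / d_B[k], k) for k in range(len(basis))
                      if d_B[k] < -tol]
            _, k = min(ratios)
            basis[k] = j                          # B = B + j - b_i

    # Example: minimize -3x1 - 2x2 s.t. x1+x2+x3 = 4, x1+2x2+x4 = 6, x >= 0,
    # starting from the basis of the slack columns (indices 2 and 3).
    A = [[1, 1, 1, 0], [1, 2, 0, 1]]
    b = [4, 6]
    c = [-3, -2, 0, 0]
    x, value = simplex(A, b, c, basis=[2, 3])
    print(x, value)                               # expect x = (4, 0, 0, 2), value -12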
2.4 Removing the non-degeneracy assumption
Where did we make use of the non-degeneracy assumption? Basically, we used it to argue that θ = min_{b_i ∈ B : d_{b_i} < 0} −x_{b_i}/d_{b_i} is strictly positive, so the cost of the solution improves in each iteration. This, in turn, implies that we never consider the same basis twice, which means the algorithm always terminates.
Therefore, if we have a general linear program that has degenerate bases, we must pay attention to which variable to bring into the basis and which variable to take out. Otherwise, we may find ourselves cycling through the same set of bases.
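The following tiny instance (my own, not from the notes) illustrates the problem: the basis {0, 1} (0-indexed columns) induces a degenerate basic feasible solution, column 2 has negative reduced cost, yet the ratio test gives θ = 0, so a pivot changes the basis without changing the solution or its cost.

    import numpy as np

    A = np.array([[1.0, 0, 1],
                  [0.0, 1, 1]])
    b = np.array([1.0, 0.0])
    c = np.array([0.0, 0.0, -1.0])

    basis = [0, 1]
    AB = A[:, basis]                                   # the identity matrix
    x_B = np.linalg.solve(AB, b)                       # (1, 0): degenerate
    reduced = c - A.T @ np.linalg.solve(AB.T, c[basis])
    d_B = -np.linalg.solve(AB, A[:, 2])                # (-1, -1)
    theta = min(-x_B[k] / d_B[k] for k in range(2) if d_B[k] < 0)
    print(x_B, reduced[2], theta)                      # reduced[2] = -1, theta = 0.0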
A pivoting rule is a procedure for choosing which j to bring into and which b_i to take out of the basis⁵. The rule is said to be anti-cycling if we never go through the same basis twice. There are many anti-cycling rules. Perhaps the most elegant is Bland's rule, which asks us to pick j to be the smallest index with c̄_j < 0 and to pick b_i to be the smallest index in the basis with d_{b_i} < 0 that minimizes −x_{b_i}/d_{b_i}. Proving that the rule indeed works is beyond the scope of this class.
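Bland's rule is easy to state in code. The sketch below, with names of my own, returns the entering and leaving indices for one pivot (or None when the reduced costs certify optimality).

    import numpy as np

    def bland_pivot(A, basis, x_B, reduced, tol=1e-9):
        # Return (entering j, leaving b_i) according to Bland's rule,
        # or None if all reduced costs are non-negative.
        improving = [j for j in range(A.shape[1])
                     if j not in basis and reduced[j] < -tol]
        if not improving:
            return None
        j = min(improving)                              # smallest improving index
        d_B = -np.linalg.solve(A[:, basis], A[:, j])
        ratios = [(-x_B[k] / d_B[k], basis[k]) for k in range(len(basis))
                  if d_B[k] < -tol]
        theta = min(r for r, _ in ratios)
        # among the minimizers of the ratio test, take the smallest basic index
        leaving = min(bi for r, bi in ratios if r <= theta + tol)
        return j, leaving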
2.5 Removing the need for an initial feasible basis
There are a few ways of getting around this restriction. One alternative is to solve an auxiliary linear program for which it is easy to find an initial basis⁶.
⁵ Lines 5 and 9 in simplex.

⁶ Notice that it is crucial to be able to easily find the initial solution of the new program; otherwise, we would be left with an infinite recursion!
Assuming b ≥ 0, we solve the following auxiliary program:

    minimize    Σ_j y_j
    subject to  Ax + y = b
                x ≥ 0
                y ≥ 0

The new program is in standard form⁷. Furthermore, finding an initial basic feasible solution is trivial: the indices of the y variables form a basis, which induces the basic feasible solution x = 0 and y = b. If the optimal value of the new program is strictly greater than 0, then it must be the case that the original program is not feasible. Otherwise, the optimal solution (x, y) is such that y = 0, which means Ax = b and x ≥ 0.

⁷ We can rewrite the constraints as [A I] (x, y) = b. The constraint matrix [A I] has m linearly independent rows, since it contains the m columns of the identity matrix.
Unfortunately, this is not enough, since what we need is a feasible basis. Luckily, the optimal solution (x, y) that the simplex algorithm finds for the auxiliary program has an associated basis B. If this basis does not use any variables from y, then we are done. Otherwise, let C be the subset of B that corresponds to indices of x. The columns of A indexed by C are clearly linearly independent. Now, because rank(A) = m, there must be a way of extending C into a set of m linearly independent columns of A. This extended set can be our initial feasible basis for the original program.
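Here is a sketch of this Phase-I construction, assuming b ≥ 0 and a hypothetical routine simplex_with_basis that behaves like the simplex sketch above but also returns the final basis; the extension of C to a full basis is left out.

    import numpy as np

    def initial_feasible_basis(A, b, tol=1e-9):
        # Phase I: solve  min 1·y  s.t.  Ax + y = b, x >= 0, y >= 0,
        # starting from the basis formed by the y-columns.
        A, b = np.asarray(A, float), np.asarray(b, float)
        m, n = A.shape
        A_aux = np.hstack([A, np.eye(m)])                # constraint matrix [A | I]
        c_aux = np.concatenate([np.zeros(n), np.ones(m)])
        start = list(range(n, n + m))                    # the y-columns form a basis
        # simplex_with_basis is assumed to return (solution, value, final basis)
        x_aux, value, basis = simplex_with_basis(A_aux, b, c_aux, start)
        if value > tol:
            return None                                  # original LP is infeasible
        if all(j < n for j in basis):
            return basis                                 # a feasible basis of A
        # Otherwise, extend the x-columns of `basis` to m linearly independent
        # columns of A (possible since rank(A) = m); omitted in this sketch.
        raise NotImplementedError("extending C to a full basis is omitted")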
2.6 Complexity analysis
Each iteration of the Simplex algorithm takes polynomial time. Therefore, in order to prove that the algorithm runs in polynomial time, one would only need to argue that it performs a polynomial number of iterations. Unfortunately, for many pivoting rules there are instances on which the algorithm runs for an exponential number of iterations. In practice, Simplex is usually very efficient and only requires a few iterations to find an optimal solution.

Designing a pivoting rule under which Simplex always performs at most a polynomial number of iterations, or showing that no such rule exists, is a major open problem in algorithms theory.
Further good news is that there are other algorithms for solving linear programs that run in polynomial time in the worst case. Later in this class we will see one of them: the Ellipsoid Algorithm.

Exercises

1. Suppose we are given a polynomial-time algorithm that, given a linear program, tests whether the program is feasible or not. Show how to use such an algorithm to solve linear programs.
2. Give an example of a linear program with multiple optimal basic feasible solutions.
3. Let x be a basic feasible solution associated with the basis B. Prove that if every non-basic variable (those not indexed by B) has strictly positive reduced cost, then x is the unique optimal solution of the program.
4. Let x be a basic feasible solution of a linear program in standard form. Let B be the basis associated with x. Prove that every basic variable (those indexed by B) has reduced cost 0.
* 5. We would like to find a solution x ≥ 0 to a system of linear equations Ax = 0 having the largest number of non-zero coordinates. Reduce this problem to a linear programming computation.
* 6. Given a polyhedron {x : Ax = b, x ≥ 0} in standard form, we define the graph G = (V, E) whose vertices are the feasible bases,

    V = { B : B is a basis of A and A_B^{-1} b ≥ 0 },

and two vertices are joined by an edge if their corresponding bases are adjacent,

    E = { (C, D) ∈ V × V : |C ⊕ D| = 2 }.

Prove that the graph G is connected. Here C ⊕ D denotes the symmetric difference of the two sets: C ⊕ D = (C \ D) ∪ (D \ C).
Solutions of selected exercises
4. Let b_k be the kth basic variable in B. Recall that the reduced cost of b_k is c̄_{b_k} = c_{b_k} − c_B^T A_B^{-1} A_{b_k}. Also recall that the kth column of A_B is A_{b_k}. Therefore, A_B^{-1} A_{b_k} = e_k, and so c̄_{b_k} = c_{b_k} − c_B^T e_k = c_{b_k} − c_{b_k} = 0.
5. Consider the following linear program:

    maximize    1 · y
    subject to  A(z + y) = 0
                y ≤ 1
                z, y ≥ 0
Let OPT be the maximum number of non-zero entries that a solution to the system Ax = 0 can have, and let LP be the value of the above linear program.
Given a solution x to the first problem, we can construct a solution (z, y) to the program whose value equals the number of non-zero entries of x. Assume all non-zero entries of x are at least 1; otherwise, we can scale up x. For each x_i ≥ 1, we set z_i = x_i − 1 and y_i = 1, and for every other index we set z_i = y_i = 0. It follows that OPT ≤ LP.
Let (y, z) be a solution to the linear program. Then x = y + z is a solution to the system Ax = 0 such that the number of non-zero entries of x is at least 1 · y. It follows that OPT ≥ LP.
Therefore, OPT = LP and an optimal solution (y, z) to the linear program induces an optimal solution x = y + z to the first problem.
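The reduction in this solution can also be handed directly to an off-the-shelf LP solver. Below is a sketch using scipy.optimize.linprog on a small matrix A of my own choosing.

    import numpy as np
    from scipy.optimize import linprog

    A = np.array([[1.0, -1, 0],
                  [0.0,  1, -1]])          # seek x >= 0 with Ax = 0, many nonzeros
    m, n = A.shape

    # variables are (z, y); maximize 1·y  <=>  minimize -1·y
    c = np.concatenate([np.zeros(n), -np.ones(n)])
    A_eq = np.hstack([A, A])               # A(z + y) = 0
    b_eq = np.zeros(m)
    bounds = [(0, None)] * n + [(0, 1)] * n    # z >= 0 and 0 <= y <= 1

    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    z, y = res.x[:n], res.x[n:]
    x = z + y                              # solution to Ax = 0 with maximum support
    print(x, np.count_nonzero(x > 1e-9))   # for this A: x = (1, 1, 1), 3 nonzeros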