Algorithmic Game Theory and Applications
Lecture 6:
The Simplex Algorithm
Kousha Etessami AGTA: Lecture 6
Recall our example
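[Figure: feasible region of the example LP, showing the constraint lines x + y <= 6 and y <= 4, a labelled "vertex", and the objective level lines 2x + y = 8 and 2x + y = 11.]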
Note that:
• Starting at (0, 0), we can find the optimal vertex (5,1), by repeatedly moving from a vertex to a neighboring vertex (by crossing an “edge”) that improves the value of the objective function.
• We don’t seem to get “stuck” in any “locally optimal” vertex, at least in this trivial example.
• That’s the geometric idea of simplex.
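The walk described in the bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the objective 2x + y and the constraints x + y <= 6 and y <= 4 are read off the figure labels, while the constraint x <= 5 is an assumption (it is not stated in the text, but it is consistent with the optimal vertex (5, 1) and the level line 2x + y = 11). The helper names are made up for this sketch, and treating two vertices as neighbors when they share a tight constraint only works in two dimensions.

```python
from fractions import Fraction
from itertools import combinations

# Each row (a, b, c) encodes a*x + b*y <= c.
CONSTRAINTS = [
    (1, 1, 6),    # x + y <= 6   (from the figure)
    (0, 1, 4),    # y <= 4       (from the figure)
    (1, 0, 5),    # x <= 5       ASSUMED, not stated in the text
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]

def objective(point):
    x, y = point
    return 2 * x + y

def is_feasible(point):
    x, y = point
    return all(a * x + b * y <= c for a, b, c in CONSTRAINTS)

def find_vertices():
    """Map each vertex to the set of constraints that are tight there."""
    tight = {}
    for (i, (a1, b1, c1)), (j, (a2, b2, c2)) in combinations(enumerate(CONSTRAINTS), 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines never meet
        # Cramer's rule for the intersection of the two boundary lines.
        point = (Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det))
        if is_feasible(point):
            tight.setdefault(point, set()).update({i, j})
    return tight

def walk(start):
    """Move to the best strictly improving neighbor until none exists."""
    tight = find_vertices()
    path = [start]
    while True:
        here = path[-1]
        # In 2D, two vertices sharing a tight constraint lie on a common edge.
        better = [v for v in tight
                  if tight[v] & tight[here] and objective(v) > objective(here)]
        if not better:
            return path
        path.append(max(better, key=objective))

path = walk((0, 0))
print(path, objective(path[-1]))
```

Under the assumed constraints the walk goes from (0, 0) to (5, 0) and then to (5, 1), where no neighbor improves the objective.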
geometric idea of simplex
• Input: Given (f, Opt, C), and given some start “vertex” x ∈ K(C) ⊆ R^n.
(Never mind, for now, that we have no idea how to find any x ∈ K(C), or even whether C is feasible, let alone a “vertex” x.)
While (x has some “neighbor vertex” x′ ∈ K(C) such that f(x′) > f(x))
– Pick such a neighbor x′. Let x := x′.
– (If the neighbor is at “infinity”, Output: “Unbounded”.)
Output: x∗ := x, and f(x∗) is the optimal value.
Question: Why should this work? Why don’t we get “stuck” in some “local optimum”?
Key reason: The region K(C) is convex, meaning if x,y ∈ K(C) then every point z on the “line segment” between x and y is also in K(C). (Recall: K is convex iff x, y ∈ K ⇒ λx + (1 − λ)y ∈ K, for λ ∈ [0, 1].) On a convex region, a “local optimum” of a linear objective is always the “global optimum”.
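The last claim can be checked with a short calculation. The following is a sketch of the standard argument, written for an objective of the form f(x) = c^T x + d; it concerns local optima over the whole region, and relating it to "no improving neighbor vertex" takes a little more work that is not shown here.

```latex
\textbf{Claim.} Let $K$ be convex and $f(x) = c^{\top}x + d$. If $x \in K$ is a
local optimum of $f$ on $K$, then it is a global optimum.

\textbf{Sketch.} Suppose some $y \in K$ has $f(y) > f(x)$. For
$\lambda \in (0,1]$ put
\[
  z_\lambda = \lambda y + (1-\lambda)x \in K \qquad \text{(by convexity)}.
\]
Since $f$ is affine,
\[
  f(z_\lambda) = \lambda f(y) + (1-\lambda) f(x)
               = f(x) + \lambda\,\bigl(f(y) - f(x)\bigr) > f(x).
\]
As $\lambda \to 0$ we have $z_\lambda \to x$, so every neighborhood of $x$
contains a strictly better feasible point, contradicting local optimality.
```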
Ok. The geometry sounds nice and simple. But realizing it algebraically is not a trivial matter!
LP’s in “Primal Form”
Using the simplification rules from the last lecture, we can convert any LP into the following form:
Maximize c1 x1 + c2 x2 + … + cn xn + d
Subject to:
a1,1 x1 + a1,2 x2 + … + a1,n xn ≤ b1
a2,1 x1 + a2,2 x2 + … + a2,n xn ≤ b2
…
am,1 x1 + am,2 x2 + … + am,n xn ≤ bm
x1, …, xn ≥ 0
Aside: You may be wondering why we are carrying along the constant d in the objective f(x): it doesn’t affect the optimality of a solution (although it does shift the value of a solution by d). We do so for convenience, as will become apparent later.
slack variables
We can add a “slack” variable yi to each inequality, to get equalities:
Maximize c1 x1 + c2 x2 + … + cn xn + d
Subject to:
a1,1 x1 + a1,2 x2 + … + a1,n xn + y1 = b1
a2,1 x1 + a2,2 x2 + … + a2,n xn + y2 = b2
…
am,1 x1 + am,2 x2 + … + am,n xn + ym = bm
x1, …, xn ≥ 0; y1, …, ym ≥ 0
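The two LPs are “equivalent”: any feasible x for the first extends to a feasible (x, y) for the second by setting yi := bi − (ai,1 x1 + … + ai,n xn) ≥ 0, and dropping y from any feasible (x, y) gives a feasible x with the same objective value. The new LP has some particularly nice properties: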
1. Every equality constraint Ci has at least one variable on the left with coefficient 1 and which doesn’t appear in any other equality constraint.
2. Picking one such variable for each equality, we obtain a set of m variables B called a Basis. An obvious basis above is B = {y1, . . . , ym}.
3. Objective f(x) involves only non-Basis variables.
Let us call an LP in such a form a “dictionary”.
Basic Feasible Solutions
Rewrite our dictionary (renaming “yi” as “xn+i”) as:
Maximize c1 x1 + c2 x2 + … + cn xn + d
Subject to:
xn+1 = b1 − a1,1 x1 − a1,2 x2 − … − a1,n xn
xn+2 = b2 − a2,1 x1 − a2,2 x2 − … − a2,n xn
…
xn+m = bm − am,1 x1 − am,2 x2 − … − am,n xn
x1, …, xn+m ≥ 0
Suppose, somehow, bi ≥ 0 for all i = 1,…,m. Then we have what’s called a “feasible dictionary” and a feasible solution for it, namely,
let xn+i = bi, for i = 1,…,m, and
let xj = 0, for j = 1,…,n. The value is f(0) = d!
Call this a basic feasible solution (BFS), with basis B.
Geometry: A BFS corresponds to a “vertex”. (But different Bases B may yield the same BFS!)
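As a rough illustration of the dictionary and its BFS, the sketch below builds the rows xn+i = bi − Σ ai,j xj from the data (A, b, c, d) and reads off the basic feasible solution. The representation (a Python dict per expression, with the key "1" holding the constant term) and the function names are choices made for this sketch, not part of the lecture. The example data is the LP from the earlier figure, with the row x1 ≤ 5 again an assumption.

```python
from fractions import Fraction

def initial_dictionary(A, b, c, d=0):
    """Build x_{n+i} = b_i - sum_j a_ij x_j and f = d + sum_j c_j x_j.

    Each expression is a dict from variable name to coefficient; the
    key "1" (a naming choice made here) holds the constant term.
    """
    n = len(c)
    rows = {}
    for i, (a_row, b_i) in enumerate(zip(A, b), start=1):
        row = {"1": Fraction(b_i)}
        for j, a_ij in enumerate(a_row, start=1):
            row[f"x{j}"] = Fraction(-a_ij)
        rows[f"x{n + i}"] = row
    objective = {"1": Fraction(d)}
    for j, c_j in enumerate(c, start=1):
        objective[f"x{j}"] = Fraction(c_j)
    return rows, objective

def basic_feasible_solution(rows, objective):
    """Set non-basic variables to 0 and read basic variables off the constants."""
    if any(row["1"] < 0 for row in rows.values()):
        raise ValueError("some b_i < 0: this is not a feasible dictionary")
    solution = {var: Fraction(0) for var in objective if var != "1"}
    for basic, row in rows.items():
        solution[basic] = row["1"]
    return solution, objective["1"]

# Maximize 2 x1 + x2 subject to x1 + x2 <= 6, x2 <= 4, x1 <= 5 (last row ASSUMED).
rows, objective = initial_dictionary(A=[[1, 1], [0, 1], [1, 0]], b=[6, 4, 5], c=[2, 1])
solution, value = basic_feasible_solution(rows, objective)
print(solution, value)
```

Here the slack variables x3, x4, x5 form the basis and take the values 6, 4, 5, while x1 = x2 = 0, which is the vertex (0, 0) with objective value d = 0.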
Question: How do we move from one BFS with basis B to a “neighboring” BFS with basis B′?
Answer: Pivoting!
Suppose our current dictionary basis (the variables on the left) is B = {xi1,…,xim}, with xir the variable on the left of constraint Cr.
The following pivoting procedure moves us from basis B to basis B′ := (B \ {xir}) ∪ {xj}.
Pivoting to add xj and remove xir from basis B:
1. Assuming Cr involves xj, rewrite Cr as xj = α.
2. Substitute α for xj in all other constraints Cl, obtaining Cl′.
3. The new constraints C′ have a new basis: B′ := (B \ {xir}) ∪ {xj}.
4. Also substitute α for xj in f (x), so that f (x) again only depends on variables not in the new basis B′.
This new basis B′ is a “possible neighbor” of B. However, not every such basis B′ is eligible!
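A single pivot can be worked by hand. The example below uses an illustrative dictionary (maximize 2x1 + x2 with slack variables x3, x4, x5; the row x5 = 5 − x1 is an assumption chosen to match the optimal vertex (5, 1) mentioned earlier) and applies the four steps to add x1 to the basis and remove x5.

```latex
\textbf{Start:}\quad f = 2x_1 + x_2, \qquad
\begin{aligned}
  x_3 &= 6 - x_1 - x_2\\
  x_4 &= 4 - x_2\\
  x_5 &= 5 - x_1
\end{aligned}
\qquad B = \{x_3, x_4, x_5\}.

\textbf{Step 1.} Rewrite the row of $x_5$ as $x_1 = \alpha$ with $\alpha = 5 - x_5$.

\textbf{Step 2.} Substitute $\alpha$ for $x_1$ in the other rows:
\[
  x_3 = 6 - (5 - x_5) - x_2 = 1 + x_5 - x_2, \qquad x_4 = 4 - x_2 .
\]

\textbf{Step 3.} The new basis is $B' = \{x_3, x_4, x_1\}$.

\textbf{Step 4.} Substitute in the objective:
\[
  f = 2(5 - x_5) + x_2 = 10 - 2x_5 + x_2 .
\]

Setting the non-basic variables $x_2 = x_5 = 0$ gives the BFS
$x_1 = 5,\ x_3 = 1,\ x_4 = 4$ with value $10$, i.e.\ the vertex $(5, 0)$.
```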
sanity checks for pivoting
To check eligibility of a pivot, we have to make sure:
1. The new constants b′i remain ≥ 0, so we retain a “feasible dictionary”, and thus B′ yields a BFS.
2. The new BFS must improve, or at least must not decrease, the value d′ = f(0) of the new objective function. (Recall, all non-basic variables are set to 0 in a BFS, thus f(BFS) = f(0).)
3. We should also check for the following situations:
• Suppose all non-basic variables involved in f(x) have negative coefficients. Then any increase from 0 in these variables will decrease the objective. We are thus (it turns out) at an optimal BFS x∗. Output: Optimal solution: x∗ and f(x∗) = f(0) = d′.
• Suppose there is a non-basic variable xj in f(x) with coefficient cj > 0, and such that the coefficient of xj in every constraint Cr (written in dictionary form, with the basic variable alone on the left) is also ≥ 0. Then we can increase xj, and the objective value, to “infinity” without violating any constraints. So, Output: “Feasible but Unbounded”.
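One common way to find pivots that pass checks 1 and 2 is the "ratio test": pick an entering variable with positive objective coefficient, and let the leaving variable be one whose row limits the increase most tightly. The sketch below assumes a dictionary stored as Python dicts, with the key "1" holding the constant term and each row written as "basic variable = constant + coefficients"; these are conventions of this sketch. It returns only pivots found this way, which need not be every eligible pivot in the sense of the slide. The example LP uses the assumed row x5 = 5 − x1.

```python
from fractions import Fraction as F

def eligible_pivots(rows, objective):
    """Return "optimal", "unbounded", or a list of (entering, leaving) pairs."""
    improving = [v for v, c in objective.items() if v != "1" and c > 0]
    if not improving:
        return "optimal"          # no variable can raise the objective
    pairs = []
    for xj in improving:
        # Rows where increasing xj pushes the basic variable towards 0,
        # together with the largest value of xj each such row allows.
        limits = {basic: row["1"] / -row[xj]
                  for basic, row in rows.items() if row.get(xj, 0) < 0}
        if not limits:
            return "unbounded"    # xj can grow forever: check 3, second bullet
        tightest = min(limits.values())
        # Leaving on a tightest row keeps every new constant >= 0 (check 1);
        # the objective changes by c_j * tightest >= 0 (check 2).
        pairs += [(xj, basic) for basic, t in limits.items() if t == tightest]
    return pairs

# x3 = 6 - x1 - x2, x4 = 4 - x2, x5 = 5 - x1 (last row ASSUMED); f = 2 x1 + x2.
rows = {
    "x3": {"1": F(6), "x1": F(-1), "x2": F(-1)},
    "x4": {"1": F(4), "x2": F(-1)},
    "x5": {"1": F(5), "x1": F(-1)},
}
objective = {"1": F(0), "x1": F(2), "x2": F(1)}
print(eligible_pivots(rows, objective))
```

For this dictionary, x1 can rise to 5 before x5 hits zero, and x2 can rise to 4 before x4 hits zero, so the two pivots found are (x1, x5) and (x2, x4).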
finding and choosing eligible pivots
• In principle, we could exhaustively check the sanity conditions for eligibility of all potential pairs of entering and leaving variables. There are at most (n ∗ m) candidates.
• But, there are much more efficient ways to choose pivots, by inspection of the coefficients in the dictionary.
• We can also efficiently choose pivots according to lots of additional criteria, or pivoting rules, such as, e.g., “most improvement in objective value”, etc.
• There are many such “rules”, and it isn’t clear a priori what is “best”.
The Simplex Algorithm
Dantzig’s Simplex algorithm can be described as follows:
Input: a feasible dictionary;
1. Check if we are at an optimal solution, and if so, Halt and output the solution.
2. Check if we have an “infinity” neighbor, and if so Halt and output “Unbounded”.
3. Otherwise, choose an eligible pivot pair of variables, and Pivot!
Fact: If this halts, the output is correct: an output solution is an optimal solution of the LP.
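The three steps can be put together in a short loop. The following is a minimal sketch, not a definitive implementation: it assumes a feasible dictionary stored as Python dicts (key "1" for the constant term), picks the entering variable with the largest positive coefficient, and checks unboundedness only for that chosen variable. The `max_pivots` guard and all function names are assumptions of this sketch. The example LP includes the assumed row x5 = 5 − x1.

```python
from fractions import Fraction as F

def pivot(rows, objective, entering, leaving):
    """Solve the leaving row for `entering` and substitute it everywhere."""
    row = rows[leaving]
    a = row[entering]                       # nonzero by choice of the pivot
    alpha = {var: -coef / a for var, coef in row.items() if var != entering}
    alpha[leaving] = 1 / a                  # entering = alpha

    def substitute(expr):
        k = expr.get(entering, 0)
        out = {var: coef for var, coef in expr.items() if var != entering}
        for var, coef in alpha.items():
            out[var] = out.get(var, 0) + k * coef
        return out

    new_rows = {basic: substitute(expr)
                for basic, expr in rows.items() if basic != leaving}
    new_rows[entering] = alpha
    return new_rows, substitute(objective)

def simplex(rows, objective, max_pivots=1000):
    """Return (status, value, values of the basic variables)."""
    for _ in range(max_pivots):
        improving = {v: c for v, c in objective.items() if v != "1" and c > 0}
        if not improving:                   # step 1: optimal
            basic_values = {basic: row["1"] for basic, row in rows.items()}
            return "optimal", objective["1"], basic_values
        entering = max(improving, key=improving.get)
        limits = {basic: row["1"] / -row[entering]
                  for basic, row in rows.items() if row.get(entering, 0) < 0}
        if not limits:                      # step 2: nothing limits `entering`
            return "unbounded", None, None
        leaving = min(limits, key=limits.get)
        rows, objective = pivot(rows, objective, entering, leaving)  # step 3
    raise RuntimeError("pivot limit reached; the run may be cycling")

# x3 = 6 - x1 - x2, x4 = 4 - x2, x5 = 5 - x1 (last row ASSUMED); f = 2 x1 + x2.
rows = {
    "x3": {"1": F(6), "x1": F(-1), "x2": F(-1)},
    "x4": {"1": F(4), "x2": F(-1)},
    "x5": {"1": F(5), "x1": F(-1)},
}
objective = {"1": F(0), "x1": F(2), "x2": F(1)}
status, value, basic_values = simplex(rows, objective)
print(status, value, basic_values)
```

On this example the loop pivots twice (x1 in for x5, then x2 in for x3) and stops at x1 = 5, x2 = 1 with value 11. Variables that do not appear in the returned dict are non-basic and equal 0.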
Oops! We could cycle back to the same basis forever, never strictly improving by pivoting.
There are several ways to address this problem…
how to prevent cycling
Several Solutions:
• Carefully choose rules for variable pairs to pivot at, in a way that forces cycling to never happen. Fact: This can be done.
(For example, use “Bland’s rule”: For all eligible pivot pairs (xi,xj), where xi is being added to the basis and xj is being removed from it, choose the pair such that, first, i is as small as possible, and second, j is as small as possible.)
• Choose randomly among eligible pivots.
With probability 1, you’ll eventually get out of any cycle and reach an optimal BFS.
• “Perturb” the constraints slightly to make the LP “non-degenerate”. (More rigorously, implement this using, e.g., the “lexicographic method”.)
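Of the options above, Bland's rule is the easiest to sketch in code. The function below assumes variables are named x1, x2, … and that the dictionary is stored as Python dicts with the key "1" for the constant term (both conventions of this sketch). It takes "eligible" to mean: the entering variable has a positive objective coefficient and the leaving variable attains the minimum ratio. The tiny degenerate-looking example is made up to show the tie-breaking.

```python
from fractions import Fraction as F

def var_index(name):
    """'x12' -> 12; assumes variables are named x1, x2, ... as in the lecture."""
    return int(name[1:])

def blands_rule(rows, objective):
    """Return (entering, leaving), or None if no variable can improve f."""
    improving = [v for v, c in objective.items() if v != "1" and c > 0]
    if not improving:
        return None
    # First: the entering variable with the smallest index.
    entering = min(improving, key=var_index)
    limits = {basic: row["1"] / -row[entering]
              for basic, row in rows.items() if row.get(entering, 0) < 0}
    if not limits:
        return entering, None   # nothing blocks the increase: unbounded
    tightest = min(limits.values())
    # Second: among the tied tightest rows, the smallest-index leaving variable.
    leaving = min((b for b, t in limits.items() if t == tightest), key=var_index)
    return entering, leaving

# Hypothetical dictionary: x4 = 2 - x1, x3 = 2 - x1 - x2; f = x1 + 3 x2.
rows = {
    "x4": {"1": F(2), "x1": F(-1)},
    "x3": {"1": F(2), "x1": F(-1), "x2": F(-1)},
}
objective = {"1": F(0), "x2": F(3), "x1": F(1)}
print(blands_rule(rows, objective))
```

Here a largest-coefficient rule would bring in x2, whereas Bland's rule brings in x1 (smallest index) and, since x3 and x4 tie in the ratio test, removes x3.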
the geometry revisited
• Moving to a “neighboring” basis by pivoting roughly corresponds to moving to a neighboring “vertex”.
However, this is not literally true because several Bases can correspond to the same BFS, and thus to the same “vertex”.
We may not have any neighboring bases that strictly improve the objective, and yet still not be optimal, because all neighboring bases B′ describe the same BFS “from a different point of view”.
• pivoting rules can be designed so we never return to the same “point of view” twice.
• choosing pivots randomly guarantees that we eventually get out.
• properly “perturbing” the constraints makes sure every BFS corresponds to a unique basis (i.e., we are non-degenerate), and thus bases and “vertices” are in 1-1 correspondence.
Hold on! What about finding an initial BFS?
• So far, we have cheated: we have assumed we start with an initial “feasible dictionary”, and thus have an initial BFS.
• Recall, the LP may not even be feasible!
• Luckily, it turns out, it is as easy (using Simplex) to find whether a feasible solution exists (and if so to find a BFS) as it is to find the optimal BFS given an initial BFS…
checking feasibility via simplex
Consider the following new LP:
Maximize −x0
Subject to:
a1,1 x1 + a1,2 x2 + … + a1,n xn − x0 ≤ b1
a2,1 x1 + a2,2 x2 + … + a2,n xn − x0 ≤ b2
…
am,1 x1 + am,2 x2 + … + am,n xn − x0 ≤ bm
x0, x1, …, xn ≥ 0
• This LP is feasible: let x0 = −min{b1,…,bm,0}, xj = 0, for j = 1,…,n. We can also set up a feasible dictionary, and thus an initial BFS, for it by introducing slack variables in an appropriate way.
• Key point: the original LP is feasible if and only if in an optimal solution to the new LP, x0∗ = 0.
• It also turns out, it is easy to derive a BFS for the original LP from an optimal BFS for this new LP.
• (In fact, finding an optimal solution given a feasible solution can also be reduced to checking whether a feasible solution exists.)
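The feasible starting point of the new LP is easy to check numerically. With x1 = … = xn = 0, constraint i reads −x0 ≤ bi, so its slack is bi + x0. The sketch below (the function name and the sample values of b are made up for illustration) computes x0 = −min{b1,…,bm,0} and the resulting slacks.

```python
def auxiliary_start(b):
    """Starting point of the auxiliary LP with x1 = ... = xn = 0."""
    x0 = -min(list(b) + [0])
    slacks = [b_i + x0 for b_i in b]   # slack of constraint i when x = 0
    return x0, slacks

# Hypothetical right-hand sides; two of them are negative, so x = 0 alone
# would not be feasible for the original LP.
x0, slacks = auxiliary_start([4, -2, -5])
print(x0, slacks)
```

All slacks come out non-negative, so the point is feasible. The constraint whose slack is exactly 0 is the one with the most negative bi, and one standard way to obtain a feasible dictionary is to bring x0 into the basis in place of that constraint's slack variable.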
how efficient is simplex?
• Each pivoting iteration can be performed in O(mn) arithmetic operations.
Also, it can be shown that the coefficients never get “too large” (they stay polynomial-sized), as long as rational coefficients are kept in reduced form (e.g., removing common factors from numerator and denominator).
So, each pivot can be done in “polynomial time”.
• How many pivots are required to get to the optimal solution?
Unfortunately, it can be exponentially many!
• In fact, for most “pivoting rules” known, there exist worst case examples that force exponentially many iterations.
(E.g., the Klee-Minty (1972) “skewed hypercube examples”.)
• Fortunately, simplex tends to be very efficient in practice: requiring O(m) pivots on typical examples.
more on theoretical efficiency
• It is an open problem whether there exists a pivoting rule that achieves polynomially many pivots on all LPs.
• A randomized pivoting rule is known that requires m^O(√n) expected pivots [Kalai’92], [Matousek-Sharir-Welzl’92].
• Is there, in every LP, a polynomial-length “path via edges” from every vertex to every other? Without this, there can’t be any polynomial pivoting rule. This “diameter” is conjecturally O(m), but the best known bound is m^O(log n) [Kalai-Kleitman’92].
Hirsch conjecture: ≤ m − n, disproved [Santos’10].
• Breakthrough: [Khachian’79] proved the LP problem does have a polynomial time algorithm, using a completely different approach: The “Ellipsoid Algorithm”. The ellipsoid algorithm is theoretically very important, but it is not practical.
• Breakthrough: [Karmarkar’84] gave a completely different P-time algorithm, using “the interior-point method”. Karmarkar’s algorithm is competitive with simplex in many cases.
final remarks and food for thought
• Why is the Simplex algorithm so fast in practice?
Some explanation is offered by [Borgwardt’77]’s “average case” analysis of Simplex.
A more convincing explanation is offered by [Spielman-Teng’2001]’s “smoothed analysis” of Simplex (not “light reading”).
• Ok. Enough about Simplex.
So we now have an efficient algorithm for, among other things, finding minimax solutions to 2-player zero-sum games.
• Next time, we will learn about the very important concept of Linear Programming Duality.
LP Duality is closely related to the Minimax theorem, but it has far reaching consequences in many subjects.
• Food for thought: Suppose you have a solution to an LP in Primal Form, and you want to convince someone it is optimal. How would you do it?