Introduction
to Linear Optimization
ATHENA SCIENTIFIC SERIES
IN OPTIMIZATION AND NEURAL COMPUTATION
1. Dynamic Programming and Optimal Control, Vols. I and II, by Dimitri P. Bertsekas, 1995.
2. Nonlinear Programming, by Dimitri P. Bertsekas, 1995.
3. Neuro-Dynamic Programming, by Dimitri P. Bertsekas and John N.
Tsitsiklis, 1996.
4. Constrained Optimization and Lagrange Multiplier Methods, by Dimitri P. Bertsekas, 1996.
5. Stochastic Optimal Control: The Discrete-Time Case, by Dimitri P.
Bertsekas and Steven E. Shreve, 1996.
6. Introduction to Linear Optimization, by Dimitris Bertsimas and John N. Tsitsiklis, 1997.
Introduction
to Linear Optimization
Dimitris Bertsimas John N. Tsitsiklis
Massachusetts Institute of Technology
Athena Scientific, Belmont, Massachusetts
Athena Scientific
Post Office Box 391 Belmont, Mass. 02178-9998 U.S.A.
Email: athenasc@world.std.com
WWW information and orders: http://world.std.com/~athenasc/
Cover Design: Ann Gallager
© 1997 Dimitris Bertsimas and John N. Tsitsiklis
All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.
Publisher’s Cataloging-in-Publication Data
Bertsimas, Dimitris, Tsitsiklis, John N.
Introduction to Linear Optimization
Includes bibliographical references and index
1. Linear programming. 2. Mathematical optimization. 3. Integer programming. I. Title.
T57.74.B465 1997 519.7 96-78786
ISBN 1-886529-19-1
To Georgia,
and to George Michael, who left us so early
To Alexandra and Melina
Contents

Preface

1. Introduction
1.1. Variants of the linear programming problem
1.2. Examples of linear programming problems
1.3. Piecewise linear convex objective functions
1.4. Graphical representation and solution
1.5. Linear algebra background and notation
1.6. Algorithms and operation counts
1.7. Exercises
1.8. History, notes, and sources

2. The geometry of linear programming
2.1. Polyhedra and convex sets
2.2. Extreme points, vertices, and basic feasible solutions
2.3. Polyhedra in standard form
2.4. Degeneracy
2.5. Existence of extreme points
2.6. Optimality of extreme points
2.7. Representation of bounded polyhedra*
2.8. Projections of polyhedra: Fourier-Motzkin elimination*
2.9. Summary
2.10. Exercises
2.11. Notes and sources

3. The simplex method
3.1. Optimality conditions
3.2. Development of the simplex method
3.3. Implementations of the simplex method
3.4. Anticycling: lexicography and Bland's rule
3.5. Finding an initial basic feasible solution
3.6. Column geometry and the simplex method
3.7. Computational efficiency of the simplex method
3.8. Summary
3.9. Exercises
3.10. Notes and sources

4. Duality theory
4.1. Motivation
4.2. The dual problem
4.3. The duality theorem
4.4. Optimal dual variables as marginal costs
4.5. Standard form problems and the dual simplex method
4.6. Farkas' lemma and linear inequalities
4.7. From separating hyperplanes to duality*
4.8. Cones and extreme rays
4.9. Representation of polyhedra
4.10. General linear programming duality*
4.11. Summary
4.12. Exercises
4.13. Notes and sources

5. Sensitivity analysis
5.1. Local sensitivity analysis
5.2. Global dependence on the right-hand side vector
5.3. The set of all dual optimal solutions*
5.4. Global dependence on the cost vector
5.5. Parametric programming
5.6. Summary
5.7. Exercises
5.8. Notes and sources

6. Large scale optimization
6.1. Delayed column generation
6.2. The cutting stock problem
6.3. Cutting plane methods
6.4. Dantzig-Wolfe decomposition
6.5. Stochastic programming and Benders decomposition
6.6. Summary
6.7. Exercises
6.8. Notes and sources

7. Network flow problems
7.1. Graphs
7.2. Formulation of the network flow problem
7.3. The network simplex algorithm
7.4. The negative cost cycle algorithm
7.5. The maximum flow problem
7.6. Duality in network flow problems
7.7. Dual ascent methods*
7.8. The assignment problem and the auction algorithm
7.9. The shortest path problem
7.10. The minimum spanning tree problem
7.11. Summary
7.12. Exercises
7.13. Notes and sources

8. Complexity of linear programming and the ellipsoid method
8.1. Efficient algorithms and computational complexity
8.2. The key geometric result behind the ellipsoid method
8.3. The ellipsoid method for the feasibility problem
8.4. The ellipsoid method for optimization
8.5. Problems with exponentially many constraints*
8.6. Summary
8.7. Exercises
8.8. Notes and sources

9. Interior point methods
9.1. The affine scaling algorithm
9.2. Convergence of affine scaling*
9.3. The potential reduction algorithm
9.4. The primal path following algorithm
9.5. The primal-dual path following algorithm
9.6. An overview
9.7. Exercises
9.8. Notes and sources

10. Integer programming formulations
10.1. Modeling techniques
10.2. Guidelines for strong formulations
10.3. Modeling with exponentially many constraints
10.4. Summary
10.5. Exercises
10.6. Notes and sources

11. Integer programming methods
11.1. Cutting plane methods
11.2. Branch and bound
11.3. Dynamic programming
11.4. Integer programming duality
11.5. Approximation algorithms
11.6. Local search
11.7. Simulated annealing
11.8. Complexity theory
11.9. Summary
11.10. Exercises
11.11. Notes and sources

12. The art in linear optimization
12.1. Modeling languages for linear optimization
12.2. Linear optimization libraries and general observations
12.3. The fleet assignment problem
12.4. The air traffic flow management problem
12.5. The job shop scheduling problem
12.6. Summary
12.7. Exercises
12.8. Notes and sources

References

Index
Preface
The purpose of this book is to provide a unified, insightful, and modern treatment of linear optimization, that is, linear programming, network flow problems, and discrete linear optimization. We discuss both classical topics and the state of the art. We give special attention to theory, but also cover applications and present case studies. Our main objective is to help the reader become a sophisticated practitioner of (linear) optimization, or a researcher. More specifically, we wish to develop the ability to formulate fairly complex optimization problems, provide an appreciation of the main classes of problems that are practically solvable, describe the available solution methods, and build an understanding of the qualitative properties of the solutions they provide.
Our general philosophy is that insight matters most. For the subject matter of this book, this necessarily requires a geometric view. On the other hand, problems are solved by algorithms, and these can only be described algebraically. Hence, our focus is on the beautiful interplay between algebra and geometry. We build understanding using figures and geometric arguments, and then translate ideas into algebraic formulas and algorithms. Given enough time, we expect that the reader will develop the ability to pass from one domain to the other without much effort.
Another of our objectives is to be comprehensive, but economical. We have made an effort to cover and highlight all of the principal ideas in this field. However, we have not tried to be encyclopedic, or to discuss every possible detail relevant to a particular algorithm. Our premise is that once mature understanding of the basic principles is in place, further details can be acquired by the reader with little additional effort.
Our last objective is to bring the reader up to date with respect to the state of the art. This is especially true in our treatment of interior point methods, large scale optimization, and the presentation of case studies that stretch the limits of currently available algorithms and computers.
The success of any optimization methodology hinges on its ability to deal with large and important problems. In that sense, the last chapter, on the art of linear optimization, is a critical part of this book. It will, we hope, convince the reader that progress on challenging problems requires both problem specific insight and a deeper understanding of the underlying theory.
In any book dealing with linear programming, there are some important choices to be made regarding the treatment of the simplex method. Traditionally, the simplex method is developed in terms of the full simplex tableau, which tends to become the central topic. We have found that the full simplex tableau is a useful device for working out numerical examples. But other than that, we have tried not to overemphasize its importance.
Let us also mention another departure from many other textbooks. Introductory treatments often focus on standard form problems, which is sufficient for the purposes of the simplex method. On the other hand, this approach often leaves the reader wondering whether certain properties are generally true, and can hinder the deeper understanding of the subject. We depart from this tradition: we consider the general form of linear programming problems and define key concepts (e.g., extreme points) within this context. (Of course, when it comes to algorithms, we often have to specialize to the standard form.) In the same spirit, we separate the structural understanding of linear programming from the particulars of the simplex method. For example, we include a derivation of duality theory that does not rely on the simplex method.
Finally, this book contains a treatment of several important topics that are not commonly covered. These include a discussion of the column geometry and of the insights it provides into the efficiency of the simplex method, the connection between duality and the pricing of financial assets, a unified view of delayed column generation and cutting plane methods, stochastic programming and Benders decomposition, the auction algorithm for the assignment problem, certain theoretical implications of the ellipsoid algorithm, a thorough treatment of interior point methods, and a whole chapter on the practice of linear optimization. There are also several noteworthy topics that are covered in the exercises, such as Leontief systems, strict complementarity, options pricing, von Neumann's algorithm, submodular function minimization, and bounds for a number of integer programming problems.
Here is a chapter by chapter description of the book.
Chapter 1: Introduces the linear programming problem, together with a number of examples, and provides some background material on linear algebra.
Chapter 2: Deals with the basic geometric properties of polyhedra, focusing on the definition and the existence of extreme points, and emphasizing the interplay between the geometric and the algebraic viewpoints.
Chapter 3: Contains more or less the classical material associated with the simplex method, as well as a discussion of the column geometry. It starts with a high-level and geometrically motivated derivation of the simplex method. It then introduces the revised simplex method, and concludes with the simplex tableau. The usual topics of Phase I and anticycling are
also covered.
Chapter 4: It is a comprehensive treatment of linear programming duality. The duality theorem is first obtained as a corollary of the simplex method. A more abstract derivation is also provided, based on the separating hyperplane theorem, which is developed from first principles. It ends with a deeper look into the geometry of polyhedra.
Chapter 5: Discusses sensitivity analysis, that is, the dependence of solutions and the optimal cost on the problem data, including parametric programming. It also develops a characterization of dual optimal solutions as subgradients of a suitably defined optimal cost function.
Chapter 6: Presents the complementary ideas of delayed column generation and cutting planes. These methods are first developed at a high level, and are then made concrete by discussing the cutting stock problem, Dantzig-Wolfe decomposition, stochastic programming, and Benders decomposition.
Chapter 7: Provides a comprehensive review of the principal results and methods for the different variants of the network flow problem. It contains representatives from all major types of algorithms: primal descent (the simplex method), dual ascent (the primal-dual method), and approximate dual ascent (the auction algorithm). The focus is on the major algorithmic ideas, rather than on the refinements that can lead to better complexity estimates.
Chapter 8: Includes a discussion of complexity, a development of the ellipsoid method, and a proof of the polynomiality of linear programming. It also discusses the equivalence of separation and optimization, and provides examples where the ellipsoid algorithm can be used to derive polynomial time results for problems involving an exponential number of constraints.
Chapter 9: Contains an overview of all major classes of interior point methods, including affine scaling, potential reduction, and path following (both primal and primal-dual) methods. It includes a discussion of the underlying geometric ideas and computational issues, as well as convergence proofs and complexity analysis.
Chapter 10: Introduces integer programming formulations of discrete optimization problems. It provides a number of examples, as well as some intuition as to what constitutes a “strong” formulation.
Chapter 11: Covers the major classes of integer programming algorithms, including exact methods (branch and bound, cutting planes, dynamic programming), approximation algorithms, and heuristic methods (local search and simulated annealing). It also introduces a duality theory for integer programming.
Chapter 12: Deals with the art in linear optimization, i.e., the process
of modeling, exploiting problem structure, and fine tuning of optimization algorithms. We discuss the relative performance of interior point methods and different variants of the simplex method, in a realistic large scale setting. We also give some indication of the size of problems that can be currently solved.
An important theme that runs through several chapters is the modeling, complexity, and algorithms for problems with an exponential number of constraints. We discuss modeling in Section 10.3, complexity in Section 8.5, algorithmic approaches in Chapter 6 and Section 8.5, and we conclude with a case study in Section 12.5.
There is a fair number of exercises that are given at the end of each chapter. Most of them are intended to deepen the understanding of the subject, or to explore extensions of the theory in the text, as opposed to routine drills. However, several numerical exercises are also included. Starred exercises are supposed to be fairly hard. A solutions manual for qualified instructors can be obtained from the authors.
We have made a special effort to keep the text as modular as possible, allowing the reader to omit certain topics without loss of continuity. For example, much of the material in Chapters 5 and 6 is rarely used in the rest of the book. Furthermore, in Chapter 7 (on network flow problems), a reader who has gone through the problem formulation (Sections 7.1-7.2) can immediately move to any later section in that chapter. Also, the interior point algorithms of Chapter 9 are not used later, with the exception of some of the applications in Chapter 12. Even within the core chapters (Chapters 1-4), there are many sections that can be skipped during a first reading. Some sections have been marked with a star indicating that they contain somewhat more advanced material that is not usually covered in an introductory course.
The book was developed while we took turns teaching a first-year graduate course at M.I.T., for students in engineering and operations research. The only prerequisite is a working knowledge of linear algebra. In fact, it is only a small subset of linear algebra that is needed (e.g., the concepts of subspaces, linear independence, and the rank of a matrix). However, these elementary tools are sometimes used in subtle ways, and some mathematical maturity on the part of the reader can lead to a better appreciation of the subject.
The book can be used to teach several different types of courses. The first two suggestions below are one-semester variants that we have tried at M.I.T., but there are also other meaningful alternatives, depending on the students' background and the course's objectives.
(a) Cover most of Chapters 1-7, and if time permits, cover a small number of topics from Chapters 9-12.
(b) An alternative could be the same as above, except that interior point algorithms (Chapter 9) are fully covered, replacing network flow problems (Chapter 7).
(c) A broad overview course can be constructed by concentrating on the easier material in most of the chapters. The core of such a course could consist of Chapter 1, Sections 2.1-2.4, 3.1-3.5, 4.1-4.3, 5.1, 7.1- 7.3, 9.1, 10.1, some of the easier material in Chapter 11, and an application from Chapter 12.
(d) Finally, the book is also suitable for a half-course on integer programming, based on parts of Chapters 1 and 8, as well as Chapters 10-12.
There is a truly large literature on linear optimization, and we make no attempt to provide a comprehensive bibliography. To a great extent, the sources that we cite are either original references of historical interest, or recent texts where additional information can be found. For those topics, however, that touch upon current research, we also provide pointers to recent journal articles.
We would like to express our thanks to a number of individuals. We are grateful to our colleagues Dimitri Bertsekas and Rob Freund, for many discussions on the subjects in this book, as well as for reading parts of the manuscript. Several of our students, colleagues, and friends have contributed by reading parts of the manuscript, providing critical comments, and working on the exercises: Jim Christodouleas, Thalia Chryssikou, Austin Frakt, David Gamarnik, Leon Hsu, Spyros Kontogiorgis, Peter Marbach, Gina Mourtzinou, Yannis Paschalidis, Georgia Perakis, Lakis Polymenakos, Jay Sethuraman, Sarah Stock, Paul Tseng, and Ben Van Roy. But mostly, we are grateful to our families for their patience, love, and support in the course of this long project.
Dimitris Bertsimas John N. Tsitsiklis Cambridge, January 1997
Chapter 1

Introduction
Contents
1.1. Variants of the linear programming problem
1.2. Examples of linear programming problems
1.3. Piecewise linear convex objective functions
1.4. Graphical representation and solution
1.5. Linear algebra background and notation
1.6. Algorithms and operation counts
1.7. Exercises
1.8. History, notes, and sources
In this chapter, we introduce linear programming, the problem of minimizing a linear cost function subject to linear equality and inequality constraints. We consider a few equivalent forms and then present a number of examples to illustrate the applicability of linear programming to a wide variety of contexts. We also solve a few simple examples and obtain some basic geometric intuition on the nature of the problem. The chapter ends with a review of linear algebra and of the conventions used in describing the computational requirements (operation count) of algorithms.
1.1 Variants of the linear programming problem
In this section, we pose the linear programming problem, discuss a few special forms that it takes, and establish some standard notation that we will be using. Rather than starting abstractly, we first state a concrete example, which is meant to facilitate understanding of the formal definition that will follow. The example we give is devoid of any interpretation. Later on, in Section 1.2, we will have ample opportunity to develop examples that arise in practical settings.
Example 1.1 The following is a linear programming problem:

    minimize    2x_1 − x_2 + 4x_3
    subject to  x_1 + x_2 + x_4 ≤ 2
                3x_2 − x_3 = 5
                x_3 + x_4 ≥ 3
                x_1 ≥ 0
                x_3 ≤ 0.

Here x_1, x_2, x_3, and x_4 are variables whose values are to be chosen to minimize the linear cost function 2x_1 − x_2 + 4x_3, subject to a set of linear equality and inequality constraints. Some of these constraints, such as x_1 ≥ 0 and x_3 ≤ 0, amount to simple restrictions on the sign of certain variables. The remaining constraints are of the form a'x ≤ b, a'x = b, or a'x ≥ b, where a = (a_1, a_2, a_3, a_4) is a given vector¹, x = (x_1, x_2, x_3, x_4) is the vector of decision variables, a'x is their inner product Σ_{i=1}^4 a_i x_i, and b is a given scalar. For example, in the first constraint, we have a = (1, 1, 0, 1) and b = 2.
We now generalize. In a general linear programming problem, we are given a cost vector c = (c_1, ..., c_n), and we seek to minimize a linear cost function c'x = Σ_{i=1}^n c_i x_i over all n-dimensional vectors x = (x_1, ..., x_n), subject to a set of linear equality and inequality constraints.
¹ As discussed further in Section 1.5, all vectors are assumed to be column vectors, and are treated as such in matrix-vector products. Row vectors are indicated as transposes of (column) vectors. However, whenever we refer to a vector x inside the text, we use the more economical notation x = (x_1, ..., x_n), even though x is a column vector. The reader who is unfamiliar with our notation may wish to consult Section 1.5 before continuing.
In particular, let M_1, M_2, and M_3 be some finite index sets, and suppose that for every i in any one of these sets, we are given an n-dimensional vector a_i and a scalar b_i, that will be used to form the ith constraint. Let also N_1 and N_2 be subsets of {1, ..., n} that indicate which variables x_j are constrained to be nonnegative or nonpositive, respectively. We then consider the problem

    minimize    c'x
    subject to  a_i'x ≥ b_i,   i ∈ M_1,
                a_i'x ≤ b_i,   i ∈ M_2,
                a_i'x = b_i,   i ∈ M_3,
                x_j ≥ 0,       j ∈ N_1,
                x_j ≤ 0,       j ∈ N_2.        (1.1)
The variables x_1, ..., x_n are called decision variables, and a vector x satisfying all of the constraints is called a feasible solution or feasible vector. The set of all feasible solutions is called the feasible set or feasible region. If j is in neither N_1 nor N_2, there are no restrictions on the sign of x_j, in which case we say that x_j is a free or unrestricted variable. The function c'x is called the objective function or cost function. A feasible solution x* that minimizes the objective function (that is, c'x* ≤ c'x, for all feasible x) is called an optimal feasible solution or, simply, an optimal solution. The value of c'x* is then called the optimal cost. On the other hand, if for every real number K we can find a feasible solution x whose cost is less than K, we say that the optimal cost is −∞ or that the cost is unbounded below. (Sometimes, we will abuse terminology and say that the problem is unbounded.) We finally note that there is no need to study maximization problems separately, because maximizing c'x is equivalent to minimizing the linear cost function −c'x.

An equality constraint a_i'x = b_i is equivalent to the two constraints a_i'x ≤ b_i and a_i'x ≥ b_i. In addition, any constraint of the form a_i'x ≤ b_i can be rewritten as (−a_i)'x ≥ −b_i. Finally, constraints of the form x_j ≥ 0 or x_j ≤ 0 are special cases of constraints of the form a_i'x ≥ b_i, where a_i is a unit vector and b_i = 0. We conclude that the feasible set in a general linear programming problem can be expressed exclusively in terms of inequality constraints of the form a_i'x ≥ b_i. Suppose that there is a total of m such constraints, indexed by i = 1, ..., m, let b = (b_1, ..., b_m), and let A be the m × n matrix whose rows are the row vectors a_1', ..., a_m', that is,

    A = [ a_1' ]
        [  ⋮   ]
        [ a_m' ].

Then, the constraints a_i'x ≥ b_i, i = 1, ..., m, can be expressed compactly in the form Ax ≥ b, and the linear programming problem can be written as

    minimize    c'x
    subject to  Ax ≥ b.        (1.2)

Inequalities such as Ax ≥ b will always be interpreted componentwise; that is, for every i, the ith component of the vector Ax, which is a_i'x, is greater than or equal to the ith component b_i of the vector b.
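The general form also maps directly onto the interface of modern linear programming software. As a purely illustrative aside (not part of the original text), the sketch below feeds Example 1.1 to SciPy's linprog routine, assumed to be available; the example was written only to illustrate the different types of constraints, so no claim is made about what the solver will report for this particular instance.

```python
# A hedged sketch: mapping Example 1.1 onto scipy.optimize.linprog, which
# minimizes c'x subject to A_ub x <= b_ub, A_eq x = b_eq, and variable bounds.
from scipy.optimize import linprog

c = [2, -1, 4, 0]                 # cost vector of Example 1.1 (x4 has zero cost)

A_ub = [[1, 1, 0, 1],             # x1 + x2 + x4 <= 2
        [0, 0, -1, -1]]           # x3 + x4 >= 3, rewritten as -(x3 + x4) <= -3
b_ub = [2, -3]

A_eq = [[0, 3, -1, 0]]            # 3x2 - x3 = 5
b_eq = [5]

bounds = [(0, None),              # x1 >= 0
          (None, None),           # x2 free
          (None, 0),              # x3 <= 0
          (None, None)]           # x4 free

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(res.status, res.message)    # the solver reports optimality, unboundedness,
print(res.x, res.fun)             # or infeasibility for the given data
```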
Example 1.2 The linear programming problem in Example 1.1 can be rewritten as

    minimize    2x_1 − x_2 + 4x_3
    subject to  −x_1 − x_2 − x_4 ≥ −2
                3x_2 − x_3 ≥ 5
                −3x_2 + x_3 ≥ −5
                x_3 + x_4 ≥ 3
                x_1 ≥ 0
                −x_3 ≥ 0,

which is of the same form as the problem (1.2), with c = (2, −1, 4, 0),

    A = [ −1  −1   0  −1 ]
        [  0   3  −1   0 ]
        [  0  −3   1   0 ]
        [  0   0   1   1 ]
        [  1   0   0   0 ]
        [  0   0  −1   0 ],

and b = (−2, 5, −5, 3, 0, 0).

Standard form problems

A linear programming problem of the form

    minimize    c'x
    subject to  Ax = b        (1.3)
                x ≥ 0,

is said to be in standard form. We provide an interpretation of problems in standard form. Suppose that x has dimension n and let A_1, ..., A_n be the columns of A. Then, the constraint Ax = b can be written in the form

    Σ_{i=1}^n A_i x_i = b.

Intuitively, there are n available resource vectors A_1, ..., A_n, and a target vector b. We wish to "synthesize" the target vector b by using a nonnegative amount x_i of each resource vector A_i, while minimizing the cost Σ_{i=1}^n c_i x_i, where c_i is the unit cost of the ith resource. The following is a more concrete example.
Example 1.3 (The diet problem) Suppose that there are n different foods and m different nutrients, and that we are given the following table with the nutritional content of a unit of each food:
                   food 1   ...   food n
    nutrient 1      a_11    ...    a_1n
       ⋮              ⋮              ⋮
    nutrient m      a_m1    ...    a_mn

Let A be the m × n matrix with entries a_ij. Note that the jth column A_j of this matrix represents the nutritional content of the jth food. Let b be a vector with the requirements of an ideal diet or, equivalently, a specification of the nutritional contents of an "ideal food." We then interpret the standard form problem as the problem of mixing nonnegative quantities x_i of the available foods, to synthesize the ideal food at minimal cost. In a variant of this problem, the vector b specifies the minimal requirements of an adequate diet; in that case, the constraints Ax = b are replaced by Ax ≥ b, and the problem is not in standard form.
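To make the diet variant concrete, the sketch below (an aside, not part of the original text) sets up a tiny instance with invented numbers, two nutrients and three foods, and solves the version with requirements Ax ≥ b using SciPy's linprog, assumed available.

```python
# A small illustrative diet LP with hypothetical data (the numbers are made up).
from scipy.optimize import linprog

# Rows = nutrients, columns = foods: A[i][j] is the amount of nutrient i
# contained in one unit of food j.
A = [[2.0, 1.0, 0.5],      # nutrient 1 content of foods 1, 2, 3
     [0.5, 1.5, 3.0]]      # nutrient 2 content of foods 1, 2, 3
b = [10.0, 12.0]           # minimal requirements of an adequate diet
c = [1.5, 1.0, 2.0]        # unit cost of each food

# The requirements Ax >= b are passed to linprog as -Ax <= -b; food
# quantities must be nonnegative.
A_ub = [[-a for a in row] for row in A]
b_ub = [-v for v in b]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
print(res.x, res.fun)      # a least-cost mix meeting both requirements
```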
Reduction to standard form
As argued earlier, any linear programming problem, including the standard form problem (1.3), is a special case of the general form (1.1). We now argue that the converse is also true and that a general linear programming problem can be transformed into an equivalent problem in standard form. Here, when we say that the two problems are equivalent, we mean that given a feasible solution to one problem, we can construct a feasible solution to the other, with the same cost. In particular, the two problems have the same optimal cost and given an optimal solution to one problem, we can construct an optimal solution to the other. The problem transformation we have in mind involves two steps:
(a) Elimination of free variables: Given an unrestricted variable x_j in a problem in general form, we replace it by x_j⁺ − x_j⁻, where x_j⁺ and x_j⁻ are new variables on which we impose the sign constraints x_j⁺ ≥ 0 and x_j⁻ ≥ 0. The underlying idea is that any real number can be written as the difference of two nonnegative numbers.
(b) Elimination of inequality constraints: Given an inequality constraint of the form

        Σ_{j=1}^n a_ij x_j ≤ b_i,

    we introduce a new variable s_i and the standard form constraints

        Σ_{j=1}^n a_ij x_j + s_i = b_i,
        s_i ≥ 0.

    Such a variable s_i is called a slack variable. Similarly, an inequality constraint Σ_{j=1}^n a_ij x_j ≥ b_i can be put in standard form by introducing a surplus variable s_i and the constraints Σ_{j=1}^n a_ij x_j − s_i = b_i, s_i ≥ 0.
We conclude that a general problem can be brought into standard form
and, therefore, we only need to develop methods that are capable of solving standard form problems.
Example 1.4 The problem

    minimize    2x_1 + 4x_2
    subject to  x_1 + x_2 ≥ 3
                3x_1 + 2x_2 = 14
                x_1 ≥ 0,

is equivalent to the standard form problem

    minimize    2x_1 + 4x_2⁺ − 4x_2⁻
    subject to  x_1 + x_2⁺ − x_2⁻ − x_3 = 3
                3x_1 + 2x_2⁺ − 2x_2⁻ = 14
                x_1, x_2⁺, x_2⁻, x_3 ≥ 0.

For example, given the feasible solution (x_1, x_2) = (6, −2) to the original problem, we obtain the feasible solution (x_1, x_2⁺, x_2⁻, x_3) = (6, 0, 2, 1) to the standard form problem, which has the same cost. Conversely, given the feasible solution (x_1, x_2⁺, x_2⁻, x_3) = (8, 1, 6, 0) to the standard form problem, we obtain the feasible solution (x_1, x_2) = (8, −5) to the original problem with the same cost.
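Since the two problems in Example 1.4 are equivalent, any solver should report the same optimal cost for both. As an illustration only (not from the book), the sketch below checks this with SciPy's linprog, assuming it is installed.

```python
# A hedged sketch: Example 1.4 in its original form and in standard form.
from scipy.optimize import linprog

# Original form: minimize 2x1 + 4x2, x1 + x2 >= 3, 3x1 + 2x2 = 14, x1 >= 0.
orig = linprog(c=[2, 4],
               A_ub=[[-1, -1]], b_ub=[-3],         # x1 + x2 >= 3
               A_eq=[[3, 2]], b_eq=[14],
               bounds=[(0, None), (None, None)])    # x2 is free

# Standard form: variables (x1, x2+, x2-, x3), all nonnegative.
std = linprog(c=[2, 4, -4, 0],
              A_eq=[[1, 1, -1, -1],                # x1 + x2+ - x2- - x3 = 3
                    [3, 2, -2, 0]],                # 3x1 + 2x2+ - 2x2- = 14
              b_eq=[3, 14],
              bounds=[(0, None)] * 4)

# The two optimal costs should coincide; for these data the optimum is
# expected at the corner (x1, x2) = (8, -5).
print(orig.fun, std.fun)
```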
In the sequel, we will often use the general form Ax ≥ b to develop the theory of linear programming. However, when it comes to algorithms, and especially the simplex and interior point methods, we will be focusing on the standard form Ax = b, x ≥ 0, which is computationally more convenient.

1.2 Examples of linear programming problems

In this section, we discuss a number of examples of linear programming problems. One of our purposes is to indicate the vast range of situations to which linear programming can be applied. Another purpose is to develop some familiarity with the art of constructing mathematical formulations of loosely defined optimization problems.
A production problem
A firm produces n different goods using m different raw materials. Let b_i, i = 1, ..., m, be the available amount of the ith raw material. The jth good, j = 1, ..., n, requires a_ij units of the ith material and results in a revenue of c_j per unit produced. The firm faces the problem of deciding how much of each good to produce in order to maximize its total revenue.

In this example, the choice of the decision variables is simple. Let x_j, j = 1, ..., n, be the amount of the jth good. Then, the problem facing the firm can be formulated as follows:

    maximize    c_1 x_1 + ··· + c_n x_n
    subject to  a_i1 x_1 + ··· + a_in x_n ≤ b_i,    i = 1, ..., m,
                x_j ≥ 0,                            j = 1, ..., n.
Production planning by a computer manufacturer
The example that we consider here is a problem that Digital Equipment Corporation (DEC) had faced in the fourth quarter of 1988. It illustrates the complexities and uncertainties of real world applications, as well as the usefulness of mathematical modeling for making important strategic decisions.

In the second quarter of 1988, DEC introduced a new family of (single CPU) computer systems and workstations: GP-1, GP-2, and GP-3, which are general purpose computer systems with different memory, disk storage, and expansion capabilities, as well as WS-1 and WS-2, which are workstations. In Table 1.1, we list the models, the list prices, the average disk usage per system, and the memory usage. For example, GP-1 uses four 256K memory boards, and 3 out of every 10 units are produced with a disk drive.

    System    Price      # disk drives    # 256K boards
    GP-1      $60,000         0.3               4
    GP-2      $40,000         1.7               2
    GP-3      $30,000         0                 2
    WS-1      $30,000         1.4               2
    WS-2      $15,000         0                 1

    Table 1.1: Features of the five different DEC systems.
Shipments of this new family of products started in the third quarter and ramped slowly during the fourth quarter. The following difficulties were anticipated for the next quarter:
(a) The in-house supplier of CPUs could provide at most 7,000 units, due
to debugging problems.
(b) The supply of disk drives was uncertain and was estimated by the manufacturer to be in the range of 3,000 to 7,000 units.
(c) The supply of 256K memory boards was also limited in the range of 8,000 to 16,000 units.
On the demand side, the marketing department established that the maximum demand for the first quarter of 1989 would be 1,800 for GP-1 systems, 300 for GP-3 systems, 3,800 systems for the whole GP family, and 3,200 systems for the WS family. Included in these projections were 500 orders for GP-2, 500 orders for WS-1, and 400 orders for WS-2 that had already been received and had to be fulfilled in the next quarter.
In the previous quarters, in order to address the disk drive shortage, DEC had produced GP-1, GP-3, and WS-2 with no disk drive (although 3 out of 10 customers for GP-1 systems wanted a disk drive), and GP-2, WS-1 with one disk drive. We refer to this way of configuring the systems as the constrained mode of production.
In addition, DEC could address the shortage of 256K memory boards by using two alternative boards, instead of four 256K memory boards, in the GP-1 system. DEC could provide 4,000 alternative boards for the next quarter.
It was clear to the manufacturing staff that the problem had become complex, as revenue, profitability, and customer satisfaction were at risk. The following decisions needed to be made:
(a) The production plan for the first quarter of 1989.
(b) Concerning disk drive usage, should DEC continue to manufacture products in the constrained mode, or should it plan to satisfy customer preferences?
(c) Concerning memory boards, should DEC use alternative memory boards for its GP-1 systems?
(d) A final decision that had to be made was related to tradeoffs between shortages of disk drives and of 256K memory boards. The manufacturing staff would like to concentrate their efforts on either decreasing the shortage of disks or decreasing the shortage of 256K memory boards. Hence, they would like to know which alternative would have a larger effect on revenue.
In order to model the problem that DEC faced, we introduce variables
x_1, x_2, x_3, x_4, x_5, that represent the number (in thousands) of GP-1, GP-2, GP-3, WS-1, and WS-2 systems, respectively, to be produced in the next quarter. Strictly speaking, since 1000x_i stands for number of units, it must be an integer. This can be accomplished by truncating each x_i after the third decimal point; given the size of the demand and the size of the variables x_i, this has a negligible effect and the integrality constraint on 1000x_i can be ignored.
DEC had to make two distinct decisions: whether to use the constrained mode of production regarding disk drive usage, and whether to use alternative memory boards for the GP-1 system. As a result, there are four different combinations of possible choices.
We first develop a model for the case where alternative memory boards are not used and the constrained mode of production of disk drives is selected. The problem can be formulated as follows:
maximize    60x_1 + 40x_2 + 30x_3 + 30x_4 + 15x_5    (total revenue)

subject to the following constraints:

    x_1 + x_2 + x_3 + x_4 + x_5 ≤ 7          (CPU availability)
    4x_1 + 2x_2 + 2x_3 + 2x_4 + x_5 ≤ 8      (256K availability)
    x_2 + x_4 ≤ 3                            (disk drive availability)
    x_1 ≤ 1.8                                (max demand for GP-1)
    x_3 ≤ 0.3                                (max demand for GP-3)
    x_1 + x_2 + x_3 ≤ 3.8                    (max demand for GP)
    x_4 + x_5 ≤ 3.2                          (max demand for WS)
    x_2 ≥ 0.5                                (min demand for GP-2)
    x_4 ≥ 0.5                                (min demand for WS-1)
    x_5 ≥ 0.4                                (min demand for WS-2)
    x_1, x_2, x_3, x_4, x_5 ≥ 0.
Notice that the objective function is in millions of dollars. In some respects, this is a pessimistic formulation, because the 256K memory and disk drive availability were set to 8 and 3, respectively, which is the lowest value in the range that was estimated. It is actually of interest to determine the solution to this problem as the 256K memory availability ranges from 8 to 16, and the disk drive availability ranges from 3 to 7, because this provides valuable information on the sensitivity of the optimal solution on availability. In another respect, the formulation is optimistic because, for example, it assumes that the revenue from GP-1 systems is 60x_1 for any x_1 ≤ 1.8, even though a demand for 1,800 GP-1 systems is not guaranteed.
In order to accommodate the other three choices that DEC had, some of the problem constraints have to be modified, as follows. If we use the unconstrained mode of production for disk drives, the constraint x_2 + x_4 ≤ 3 is replaced by

    0.3x_1 + 1.7x_2 + 1.4x_4 ≤ 3,

using the per-system disk usage figures of Table 1.1. Furthermore, if we wish to use alternative memory boards in GP-1 systems, we replace the constraint 4x_1 + 2x_2 + 2x_3 + 2x_4 + x_5 ≤ 8 by the two constraints

    2x_1 ≤ 4,
    2x_2 + 2x_3 + 2x_4 + x_5 ≤ 8.

The four combinations of choices lead to four different linear programming problems, each of which needs to be solved for a variety of parameter values because, as discussed earlier, the right-hand side of some of the constraints is only known to lie within a certain range. Methods for solving linear programming problems, when certain parameters are allowed to vary, will be studied in Chapter 5, where this case study is revisited.
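For readers who want to experiment with this case study, the sketch below (an aside, not part of the original text) sets up the base case described above (constrained disk mode, no alternative boards, and the pessimistic availabilities of 8 and 3) using SciPy's linprog, assumed to be available. Maximization is handled by negating the revenue vector, and the outcome is simply whatever the solver reports for these data.

```python
# A hedged sketch of the base-case DEC model; units are thousands of systems
# and millions of dollars, as in the text.
from scipy.optimize import linprog

revenue = [60, 40, 30, 30, 15]          # GP-1, GP-2, GP-3, WS-1, WS-2

A_ub = [
    [1, 1, 1, 1, 1],                    # CPU availability        <= 7
    [4, 2, 2, 2, 1],                    # 256K board availability <= 8
    [0, 1, 0, 1, 0],                    # disk drive availability <= 3
    [1, 0, 0, 0, 0],                    # max demand for GP-1     <= 1.8
    [0, 0, 1, 0, 0],                    # max demand for GP-3     <= 0.3
    [1, 1, 1, 0, 0],                    # max demand for GP       <= 3.8
    [0, 0, 0, 1, 1],                    # max demand for WS       <= 3.2
    [0, -1, 0, 0, 0],                   # min demand for GP-2: x2 >= 0.5
    [0, 0, 0, -1, 0],                   # min demand for WS-1: x4 >= 0.5
    [0, 0, 0, 0, -1],                   # min demand for WS-2: x5 >= 0.4
]
b_ub = [7, 8, 3, 1.8, 0.3, 3.8, 3.2, -0.5, -0.5, -0.4]

res = linprog([-r for r in revenue], A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None)] * 5)
print(res.x)                            # production plan (thousands of units)
print(-res.fun)                         # total revenue (millions of dollars)
```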
Multiperiod planning of electric power capacity
A state wants to plan its electricity capacity for the next T years. The state has a forecast of d_t megawatts, presumed accurate, of the demand for electricity during year t = 1, ..., T. The existing capacity, which is in oil-fired plants, that will not be retired and will be available during year t, is e_t. There are two alternatives for expanding electric capacity: coal-fired or nuclear power plants. There is a capital cost of c_t per megawatt of coal-fired capacity that becomes operational at the beginning of year t. The corresponding capital cost for nuclear power plants is n_t. For various political and safety reasons, it has been decided that no more than 20% of the total capacity should ever be nuclear. Coal plants last for 20 years, while nuclear plants last for 15 years. A least cost capacity expansion plan is desired.
The first step in formulating this problem as a linear programming problem is to define the decision variables. Let x_t and y_t be the amount of coal (respectively, nuclear) capacity brought on line at the beginning of year t. Let w_t and z_t be the total coal (respectively, nuclear) capacity available in year t. The cost of a capacity expansion plan is therefore

    Σ_{t=1}^T (c_t x_t + n_t y_t).

Since coal-fired plants last for 20 years, we have

    w_t = Σ_{s=max{1, t−19}}^t x_s,    t = 1, ..., T.

Similarly, for nuclear power plants,

    z_t = Σ_{s=max{1, t−14}}^t y_s,    t = 1, ..., T.

Since the available capacity must meet the forecasted demand, we require

    w_t + z_t + e_t ≥ d_t,    t = 1, ..., T.

Finally, since no more than 20% of the total capacity should ever be nuclear, we have

    z_t / (w_t + z_t + e_t) ≤ 0.2,

which can be written as

    0.8 z_t − 0.2 w_t ≤ 0.2 e_t,    t = 1, ..., T.

Summarizing, the capacity expansion problem is as follows:

    minimize    Σ_{t=1}^T (c_t x_t + n_t y_t)
    subject to  w_t − Σ_{s=max{1, t−19}}^t x_s = 0,    t = 1, ..., T,
                z_t − Σ_{s=max{1, t−14}}^t y_s = 0,    t = 1, ..., T,
                w_t + z_t + e_t ≥ d_t,                 t = 1, ..., T,
                0.8 z_t − 0.2 w_t ≤ 0.2 e_t,           t = 1, ..., T,
                x_t, y_t, w_t, z_t ≥ 0,                t = 1, ..., T.
We note that this formulation is not entirely realistic, because it disregards certain economies of scale that may favor larger plants. However, it can provide a ballpark estimate of the true cost.
A scheduling problem
In the previous examples, the choice of the decision variables was fairly straightforward. We now discuss an example where this choice is less obvious.
A hospital wants to make a weekly night shift (12pm-8am) schedule for its nurses. The demand for nurses for the night shift on day j is an integer d_j, j = 1, ..., 7. Every nurse works 5 days in a row on the night shift. The problem is to find the minimal number of nurses the hospital needs to hire.
One could try using a decision variable y_j equal to the number of nurses that work on day j. With this definition, however, we would not be able to capture the constraint that every nurse works 5 days in a row. For this reason, we choose the decision variables differently, and define x_j as
the number of nurses starting their week on day j. (For example, a nurse whose week starts on day 5 will work days 5, 6, 7, 1, 2.) We then have the following problem formulation:
    minimize    x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7
    subject to  x_1 + x_4 + x_5 + x_6 + x_7 ≥ d_1
                x_1 + x_2 + x_5 + x_6 + x_7 ≥ d_2
                x_1 + x_2 + x_3 + x_6 + x_7 ≥ d_3
                x_1 + x_2 + x_3 + x_4 + x_7 ≥ d_4
                x_1 + x_2 + x_3 + x_4 + x_5 ≥ d_5
                x_2 + x_3 + x_4 + x_5 + x_6 ≥ d_6
                x_3 + x_4 + x_5 + x_6 + x_7 ≥ d_7
                x_j ≥ 0,    x_j integer.
This would be a linear programming problem, except for the constraint that each x_j must be an integer, and we actually have a linear integer programming problem. One way of dealing with this issue is to ignore ("relax") the integrality constraints and obtain the so-called linear programming relaxation of the original problem. Because the linear programming problem has fewer constraints, and therefore more options, the optimal cost will be less than or equal to the optimal cost of the original problem. If the optimal solution to the linear programming relaxation happens to be integer, then it is also an optimal solution to the original problem. If it is not integer, we can round each x_j upwards, thus obtaining a feasible, but not necessarily optimal, solution to the original problem. It turns out that for this particular problem, an optimal solution can be found without too much effort. However, this is the exception rather than the rule: finding optimal solutions to general integer programming problems is typically difficult; some methods will be discussed in Chapter 11.
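The relax-and-round idea in the preceding paragraph is easy to try out in code. The sketch below (an illustration, not part of the original text) builds the cyclic coverage matrix, solves the linear programming relaxation with SciPy's linprog (assumed available) for some invented demands, and rounds the solution upwards.

```python
# A hedged sketch of the nurse scheduling LP relaxation with made-up demands.
import math
from scipy.optimize import linprog

d = [3, 4, 5, 6, 5, 4, 3]                 # hypothetical demands d_1, ..., d_7

# A nurse starting on day i covers days i, i+1, ..., i+4 (mod 7), so
# A[j][i] = 1 if a nurse starting on day i works on day j.
A = [[1 if (j - i) % 7 <= 4 else 0 for i in range(7)] for j in range(7)]

# Coverage constraints A x >= d become -A x <= -d for linprog.
res = linprog(c=[1] * 7,
              A_ub=[[-a for a in row] for row in A],
              b_ub=[-dj for dj in d],
              bounds=[(0, None)] * 7)

x_relaxed = res.x
x_rounded = [math.ceil(v - 1e-9) for v in x_relaxed]   # round up: feasible but
print(x_relaxed, x_rounded, sum(x_rounded))            # not necessarily optimal
```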
Choosing paths in a communication network
Consider a communication network consisting of n nodes. Nodes are connected by communication links. A link allowing one-way transmission from node i to node j is described by an ordered pair (i, j). Let A be the set of all links. We assume that each link (i, j) ∈ A can carry up to u_ij bits per second. There is a positive charge c_ij per bit transmitted along that link. Each node k generates data, at the rate of b^{kℓ} bits per second, that have to be transmitted to node ℓ, either through a direct link (k, ℓ) or by tracing a sequence of links. The problem is to choose paths along which all data reach their intended destinations, while minimizing the total cost. We allow the data with the same origin and destination to be split and be transmitted along different paths.
In order to formulate this problem as a linear programming problem, we introduce variables x_ij^{kℓ} indicating the amount of data with origin k and destination ℓ that traverse link (i, j). Let

    b_i^{kℓ} =  b^{kℓ},     if i = k,
               −b^{kℓ},     if i = ℓ,
                0,          otherwise.

Thus, b_i^{kℓ} is the net inflow at node i, from outside the network, of data with origin k and destination ℓ. We then have the following formulation:

    minimize    Σ_{(i,j)∈A} Σ_{k=1}^n Σ_{ℓ=1}^n c_ij x_ij^{kℓ}
    subject to  Σ_{{j | (i,j)∈A}} x_ij^{kℓ} − Σ_{{j | (j,i)∈A}} x_ji^{kℓ} = b_i^{kℓ},    i, k, ℓ = 1, ..., n,
                Σ_{k=1}^n Σ_{ℓ=1}^n x_ij^{kℓ} ≤ u_ij,    (i, j) ∈ A,
                x_ij^{kℓ} ≥ 0,    (i, j) ∈ A,  k, ℓ = 1, ..., n.

The first constraint is a flow conservation constraint at node i for data with origin k and destination ℓ. The expression Σ_{{j | (i,j)∈A}} x_ij^{kℓ} represents the amount of data with origin and destination k and ℓ, respectively, that leave node i along some link. The expression Σ_{{j | (j,i)∈A}} x_ji^{kℓ} [...]

1.3 Piecewise linear convex objective functions

[...] we must have either x_i⁺ = 0 or x_i⁻ = 0, because otherwise we could reduce both x_i⁺ and x_i⁻ by the same amount and preserve feasibility, while reducing the cost, in contradiction of optimality. Having guaranteed that either x_i⁺ = 0 or x_i⁻ = 0, the desired relation |x_i| = x_i⁺ + x_i⁻ now follows.

The formal correctness of the two reformulations that have been presented here, and in a somewhat more general setting, is the subject of Exercise 1.5. We also note that the nonnegativity assumption on the cost coefficients c_i is crucial because, otherwise, the cost criterion is nonconvex.
Example 1.5 Consider the problem

    minimize    2|x_1| + x_2
    subject to  x_1 + x_2 ≥ 4.

Our first reformulation yields

    minimize    2z_1 + x_2
    subject to  x_1 + x_2 ≥ 4
                x_1 ≤ z_1
                −x_1 ≤ z_1,

while the second yields

    minimize    2x_1⁺ + 2x_1⁻ + x_2
    subject to  x_1⁺ − x_1⁻ + x_2 ≥ 4
                x_1⁺ ≥ 0
                x_1⁻ ≥ 0.
We now continue with some applications involving piecewise linear
convex objective functions.
Data fitting
We are given m data points of the form (a_i, b_i), i = 1, ..., m, where a_i ∈ ℝⁿ and b_i ∈ ℝ, and wish to build a model that predicts the value of the variable b from knowledge of the vector a. In such a situation, one often uses a linear model of the form b = a'x, where x is a parameter vector to be determined. Given a particular parameter vector x, the residual, or prediction error, at the ith data point is defined as |b_i − a_i'x|. Given a choice between alternative models, one should choose a model that "explains" the available data as best as possible, i.e., a model that results in small residuals.
One possibility is to minimize the largest residual. This is the problem of minimizing

    max_{i=1,...,m} |b_i − a_i'x|,

with respect to x, subject to no constraints. Note that we are dealing here with a piecewise linear convex cost criterion. The following is an equivalent linear programming formulation:

    minimize    z
    subject to  b_i − a_i'x ≤ z,     i = 1, ..., m,
                −b_i + a_i'x ≤ z,    i = 1, ..., m,

the decision variables being z and x.
In an alternative formulation, we could adopt the cost criterion

    Σ_{i=1}^m |b_i − a_i'x|.

Since |b_i − a_i'x| is the smallest number z_i that satisfies b_i − a_i'x ≤ z_i and −b_i + a_i'x ≤ z_i, we obtain the formulation

    minimize    z_1 + ··· + z_m
    subject to  b_i − a_i'x ≤ z_i,     i = 1, ..., m,
                −b_i + a_i'x ≤ z_i,    i = 1, ..., m.

In practice, one may wish to use the quadratic cost criterion Σ_{i=1}^m (b_i − a_i'x)², in order to obtain a "least squares fit." This is a problem which is easier than linear programming; it can be solved using calculus methods, but its discussion is outside the scope of this book.
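The largest-residual formulation is easy to pass to an LP solver: stack the unknown parameter vector x and the extra scalar z into one variable vector. The sketch below (an aside with invented data, not part of the original text) does this with NumPy and SciPy's linprog, both assumed available.

```python
# A hedged sketch: fitting b ~ a'x by minimizing the largest residual.
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: m = 6 points, n = 2 parameters (intercept and slope).
A_data = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0],
                   [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]])
b_data = np.array([0.1, 1.1, 1.9, 3.2, 3.8, 5.1])
m, n = A_data.shape

# Decision vector is (x, z); only z appears in the objective.
c = np.concatenate([np.zeros(n), [1.0]])

#  b_i - a_i'x <= z   becomes  -a_i'x - z <= -b_i
# -b_i + a_i'x <= z   becomes   a_i'x - z <=  b_i
A_ub = np.block([[-A_data, -np.ones((m, 1))],
                 [ A_data, -np.ones((m, 1))]])
b_ub = np.concatenate([-b_data, b_data])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (n + 1))
x_fit, z_fit = res.x[:n], res.x[n]
print(x_fit, z_fit)     # fitted parameters and the minimized worst residual
```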
Optimal control of linear systems
Consider a dynamical system that evolves according to a model of the form
    x(t+1) = Ax(t) + Bu(t),
    y(t) = c'x(t).
Here x(t) is the state of the system at time t, y(t) is the system output, assumed scalar, and u(t) is a control vector that we are free to choose subject to linear constraints of the form Du(t) ≤ d [these might include saturation constraints, i.e., hard bounds on the magnitude of each component of u(t)]. To mention some possible applications, this could be a model of an airplane, an engine, an electrical circuit, a mechanical system, a manufacturing system, or even a model of economic growth. We are also given the initial state x(0). In one possible problem, we are to choose the values of the control variables u(0), ..., u(T−1) to drive the state x(T) to a target state, assumed for simplicity to be zero. In addition to zeroing the state, it is often desirable to keep the magnitude of the output small at all intermediate times, and we may wish to minimize
    max_{t=1,...,T−1} |y(t)|.
We then obtain the following linear programming problem:
    minimize    z
    subject to  −z ≤ y(t) ≤ z,                 t = 1, ..., T−1,
                x(t+1) = Ax(t) + Bu(t),         t = 0, ..., T−1,
                y(t) = c'x(t),                  t = 1, ..., T−1,
                Du(t) ≤ d,                      t = 0, ..., T−1,
                x(T) = 0,
                x(0) = given.
Additional linear constraints on the state vectors x(t), or a more general piecewise linear convex cost function of the state and the control, can also be incorporated.
Rocket control
Consider a rocket that travels along a straight path. Let x_t, v_t, and a_t be the position, velocity, and acceleration, respectively, of the rocket at time t. By discretizing time and by taking the time increment to be unity, we obtain an approximate discrete-time model of the form

    x_{t+1} = x_t + v_t,
    v_{t+1} = v_t + a_t.

We assume that the acceleration a_t is under our control, as it is determined by the rocket thrust. In a rough model, the magnitude |a_t| of the acceleration can be assumed to be proportional to the rate of fuel consumption at time t.

Suppose that the rocket is initially at rest at the origin, that is, x_0 = 0 and v_0 = 0. We wish the rocket to take off and "land softly" at unit distance from the origin after T time units, that is, x_T = 1 and v_T = 0. Furthermore, we wish to accomplish this in an economical fashion. One possibility is to minimize the total fuel Σ_{t=0}^{T−1} |a_t| spent subject to the preceding constraints. Alternatively, we may wish to minimize the maximum thrust required, which is max_t |a_t|. Under either alternative, the problem can be formulated as a linear programming problem (Exercise 1.6).
1.4 Graphical representation and solution
In this section, we consider a few simple examples that provide useful geometric insights into the nature of linear programming problems. Our first example involves the graphical solution of a linear programming problem with two variables.
Example 1.6 Consider the problem
    minimize    −x_1 − x_2
    subject to  x_1 + 2x_2 ≤ 3
                2x_1 + x_2 ≤ 3
                x_1, x_2 ≥ 0.
The feasible set is the shaded region in Figure 1.3. In order to find an optimal solution, we proceed as follows. For any given scalar z, we consider the set of all points whose cost c'x is equal to z; this is the line described by the equation −x_1 − x_2 = z. Note that this line is perpendicular to the vector c = (−1, −1). Different values of z lead to different lines, all of them parallel to each other. In
Figure 1.3: Graphical solution of the problem in Example 1.6.
particular, increasing z corresponds to moving the line z = −x_1 − x_2 along the direction of the vector c. Since we are interested in minimizing z, we would like to move the line as much as possible in the direction of −c, as long as we do not leave the feasible region. The best we can do is z = −2 (see Figure 1.3), and the vector x = (1, 1) is an optimal solution. Note that this is a corner of the feasible set. (The concept of a "corner" will be defined formally in Chapter 2.)
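The graphical answer is easy to cross-check numerically. As an aside (not part of the original text), the snippet below hands Example 1.6 to SciPy's linprog, assumed available; it should report the corner found above, x = (1, 1) with cost −2.

```python
# A quick numerical check of Example 1.6.
from scipy.optimize import linprog

res = linprog(c=[-1, -1],
              A_ub=[[1, 2], [2, 1]], b_ub=[3, 3],
              bounds=[(0, None), (0, None)])
print(res.x, res.fun)    # expected: approximately [1, 1] and -2
```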
For a problem in three dimensions, the same approach can be used except that the set of points with the same value of c’x is a plane, instead of a line. This plane is again perpendicular to the vector c, and the objective is to slide this plane as much as possible in the direction of -c, as long as we do not leave the feasible set.
Example 1.7 Suppose that the feasible set is the unit cube, described by the constraints 0 ≤ x_i ≤ 1, i = 1, 2, 3, and that c = (−1, −1, −1). Then, the vector x = (1, 1, 1) is an optimal solution. Once more, the optimal solution happens to be a corner of the feasible set (Figure 1.4).
Figure 1.4: The three-dimensional linear programming problem in Example 1.7.
In both of the preceding examples, the feasible set is bounded (does not extend to infinity), and the problem has a unique optimal solution. This is not always the case and we have some additional possibilities that are illustrated by the example that follows.
Example 1.8 Consider the feasible set in ℝ² defined by the constraints

    −x_1 + x_2 ≤ 1
    x_1 ≥ 0
    x_2 ≥ 0,

which is shown in Figure 1.5.

(a) For the cost vector c = (1, 1), it is clear that x = (0, 0) is the unique optimal solution.
(b) For the cost vector c = (1, 0), there are multiple optimal solutions, namely, every vector x of the form x = (0, x_2), with 0 ≤ x_2 ≤ 1, is optimal. Note that the set of optimal solutions is bounded.

(c) For the cost vector c = (0, 1), there are multiple optimal solutions, namely, every vector x of the form x = (x_1, 0), with x_1 ≥ 0, is optimal. In this case, the set of optimal solutions is unbounded (contains vectors of arbitrarily large magnitude).

(d) Consider the cost vector c = (−1, −1). For any feasible solution (x_1, x_2), we can always produce another feasible solution with less cost, by increasing the value of x_1. Therefore, no feasible solution is optimal. Furthermore, by considering vectors (x_1, x_2) with ever increasing values of x_1 and x_2, we can obtain a sequence of feasible solutions whose cost converges to −∞. We therefore say that the optimal cost is −∞.

(e) If we impose an additional constraint of the form x_1 + x_2 ≤ −2, it is evident that no feasible solution exists.
Figure 1.5: The feasible set in Example 1.8. For each choice of c, an optimal solution is obtained by moving as much as possible in the direction of -c.
To summarize the insights obtained from Example 1.8, we have the following possibilities:
(a) There exists a unique optimal solution.
(b) There exist multiple optimal solutions; in this case, the set of optimal solutions can be either bounded or unbounded.
(c) The optimal cost is −∞, and no feasible solution is optimal.
(d) The feasible set is empty.
In principle, there is an additional possibility: an optimal solution
does not exist even though the problem is feasible and the optimal cost is not −∞; this is the case, for example, in the problem of minimizing 1/x subject to x > 0 (for every feasible solution, there exists another with less cost, but the optimal cost is not −∞). We will see later in this book that this possibility never arises in linear programming.
In the examples that we have considered, if the problem has at least one optimal solution, then an optimal solution can be found among the corners of the feasible set. In Chapter 2, we will show that this is a general feature of linear programming problems, as long as the feasible set has at least one corner.
Visualizing standard form problems
We now discuss a method that allows us to visualize standard form problems even if the dimension n of the vector x is greater than three. The reason for wishing to do so is that when n ≤ 3, the feasible set of a standard form problem does not have much variety and does not provide enough insight into the general case. (In contrast, if the feasible set is described by constraints of the form Ax ≥ b, enough variety is obtained even if x has dimension three.)
Suppose that we have a standard form problem, and that the matrix A has dimensions m × n. In particular, the decision vector x is of dimension n and we have m equality constraints. We assume that m ≤ n and that the constraints Ax = b force x to lie on an (n − m)-dimensional set. (Intuitively, each constraint removes one of the "degrees of freedom" of x.) If we "stand" on that (n − m)-dimensional set and ignore the m dimensions orthogonal to it, the feasible set is only constrained by the linear inequality constraints x_i ≥ 0, i = 1, ..., n. In particular, if n − m = 2, the feasible set can be drawn as a two-dimensional set defined by n linear inequality constraints.
To illustrate this approach, consider the feasible set in ℝ³ defined by the constraints x_1 + x_2 + x_3 = 1 and x_1, x_2, x_3 ≥ 0 [Figure 1.6(a)], and note that n = 3 and m = 1. If we stand on the plane defined by the constraint x_1 + x_2 + x_3 = 1, then the feasible set has the appearance of a triangle in two-dimensional space. Furthermore, each edge of the triangle corresponds to one of the constraints x_1, x_2, x_3 ≥ 0; see Figure 1.6(b).
Figure 1.6: (a) An n-dimensional view of the feasible set. (b) An (n – m)-dimensional view of the same set.
1.5 Linear algebra background and notation
This section provides a summary of the main notational conventions that we will be employing. It also contains a brief review of those results from linear algebra that are used in the sequel.
Set theoretic notation

If S is a set and x is an element of S, we write x ∈ S. A set can be specified in the form S = {x | x satisfies P}, as the set of all elements having property P. The cardinality of a finite set S is denoted by |S|. [...] For any real numbers a and b, the closed and open intervals are defined by

    [a, b] = {x ∈ ℝ | a ≤ x ≤ b},    and    (a, b) = {x ∈ ℝ | a < x < b}.
Matrix inversion

Let A be a square matrix. If there exists a square matrix B of the same dimensions satisfying AB = BA = I, we say that A is invertible or nonsingular. Such a matrix B, called the inverse of A, is unique and is denoted by A⁻¹. We note that (A')⁻¹ = (A⁻¹)'. Also, if A and B are invertible matrices of the same dimensions, then AB is also invertible and (AB)⁻¹ = B⁻¹A⁻¹.

Given a finite collection of vectors x¹, ..., x^K ∈ ℝⁿ, we say that they are linearly dependent if there exist real numbers a_1, ..., a_K, not all of them zero, such that Σ_{k=1}^K a_k x^k = 0; otherwise, they are called linearly independent. An equivalent definition of linear independence requires that none of the vectors x¹, ..., x^K is a linear combination of the remaining vectors (Exercise 1.18). We have the following result.
Theorem 1.2 Let A be a square matrix. Then, the following statements are equivalent:
(a) The matrix A is invertible.
(b) The matrix A’ is invertible.
(c) The determinant of A is nonzero.
(d) The rows of A are linearly independent.
(e) The columns of A are linearly independent.
(f) For every vector b, the linear system Ax = b has a unique solution.
(g) There exists some vector b such that the linear system Ax = b has a unique solution.
Assuming that A is an invertible square matrix, an explicit formula for the solution x = A-lb of the system Ax = b, is given by Cramer’s rule. Specifically, the jth component of x is given by
    x_j = det(A_j) / det(A),
where Aj is the same matrix as A, except that its jth column is replaced by b. Here, as well as later, the notation det(A) is used to denote the determinant of a square matrix A.
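As a concrete illustration, the following sketch uses Python with NumPy (an assumed tool, not part of the text; the data are made up) to compute a solution by Cramer's rule and to compare it with a general-purpose solver.

    import numpy as np

    def cramer_solve(A, b):
        """Solve Ax = b for an invertible square A using Cramer's rule:
        x_j = det(A_j) / det(A), where A_j has its j-th column replaced by b."""
        A = np.asarray(A, dtype=float)
        b = np.asarray(b, dtype=float)
        d = np.linalg.det(A)
        if abs(d) < 1e-12:
            raise ValueError("matrix is (numerically) singular")
        x = np.empty(A.shape[1])
        for j in range(A.shape[1]):
            Aj = A.copy()
            Aj[:, j] = b                 # replace the j-th column by b
            x[j] = np.linalg.det(Aj) / d
        return x

    A = [[2.0, 1.0], [1.0, 3.0]]         # hypothetical example data
    b = [3.0, 5.0]
    print(cramer_solve(A, b))            # [0.8, 1.4]
    print(np.linalg.solve(A, b))         # should agree, up to roundoff

Cramer's rule is convenient for stating explicit formulas, but for numerical work a solver based on Gaussian elimination is preferred; see Section 1.6.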
Subspaces and bases
A nonempty subset S of ℝ^n is called a subspace of ℝ^n if ax + by ∈ S for every x, y ∈ S and every a, b ∈ ℝ. If, in addition, S ≠ ℝ^n, we say that S is a proper subspace. Note that every subspace must contain the zero vector.

The span of a finite number of vectors x^1, ..., x^K in ℝ^n is the subspace of ℝ^n defined as the set of all vectors y of the form y = Σ_{k=1}^K a_k x^k, where each a_k is a real number. Any such vector y is called a linear combination of x^1, ..., x^K.

Given a subspace S of ℝ^n, with S ≠ {0}, a basis of S is a collection of vectors that are linearly independent and whose span is equal to S. Every basis of a given subspace has the same number of vectors and this number is called the dimension of the subspace. In particular, the dimension of ℝ^n is equal to n and every proper subspace of ℝ^n has dimension smaller than n. Note that one-dimensional subspaces are lines through the origin; two-dimensional subspaces are planes through the origin. Finally, the set {0} is a subspace and its dimension is defined to be zero.
If S is a proper subspace of ℝ^n, then there exists a nonzero vector a which is orthogonal to S, that is, a'x = 0 for every x ∈ S. The result that follows provides some important facts regarding bases.

For a second example, we are given an m × n matrix A and a vector b ∈ ℝ^m, and we consider the set

    S = {x ∈ ℝ^n | Ax = b},

which we assume to be nonempty. Let us fix some x^0 such that Ax^0 = b. An arbitrary vector x belongs to S if and only if Ax = b = Ax^0, or A(x − x^0) = 0. Hence, x ∈ S if and only if x − x^0 belongs to the subspace S_0 = {y | Ay = 0}. We conclude that S = {y + x^0 | y ∈ S_0}, and S is an affine subspace of ℝ^n. If A has m linearly independent rows, its nullspace S_0 has dimension n − m. Hence, the affine subspace S also has dimension n − m. Intuitively, if a_i' are the rows of A, each one of the constraints a_i'x = b_i removes one degree of freedom from x, thus reducing the dimension from n to n − m; see Figure 1.7 for an illustration.
Figure 1.7: Consider a set S in ℝ³ defined by a single equality constraint a'x = b. Let x^0 be an element of S. The vector a is perpendicular to S. If x^1 and x^2 are linearly independent vectors that are orthogonal to a, then every x ∈ S is of the form x = x^0 + λ_1 x^1 + λ_2 x^2. In particular, S is a two-dimensional affine subspace.
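The dimension count n − m can be checked numerically. The sketch below uses Python with NumPy (an assumed tool; the matrix and right-hand side are made up for illustration) to compute rank(A) and the dimension n − rank(A) of the nullspace, and to verify that a particular solution of Ax = b exists.

    import numpy as np

    # A hypothetical 2 x 4 matrix with linearly independent rows (m = 2, n = 4).
    A = np.array([[1.0, 1.0, 0.0, 2.0],
                  [0.0, 1.0, 1.0, 1.0]])
    m, n = A.shape

    rank = np.linalg.matrix_rank(A)
    nullity = n - rank                  # dimension of the nullspace {y | Ay = 0}
    print(rank, nullity)                # expect 2 and 2, i.e., n - m degrees of freedom

    # A particular solution x0 of Ax = b; adding any nullspace vector stays in S.
    b = np.array([3.0, 2.0])
    x0, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(np.allclose(A @ x0, b))       # True: x0 lies in the affine subspace S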
1.6 Algorithms and operation counts
Optimization problems such as linear programming and, more generally, all computational problems are solved by algorithms. Loosely speaking, an algorithm is a finite set of instructions of the type used in common programming languages (arithmetic operations, conditional statements, read and write statements, etc.). Although the running time of an algorithm may depend substantially on clever programming or on the computer hardware available, we are interested in comparing algorithms without having to examine the details of a particular implementation. As a first approximation, this can be accomplished by counting the number of arithmetic operations (additions, multiplications, divisions, comparisons) required by an algorithm. This approach is often adequate even though it ignores the fact that adding or multiplying large integers or high-precision floating point numbers is more demanding than adding or multiplying single-digit integers. A more refined approach will be discussed briefly in Chapter 8.
Example 1.9
(a) Let a and b be vectors in ℝ^n. The natural algorithm for computing a'b requires n multiplications and n − 1 additions, for a total of 2n − 1 arithmetic operations.
(b) Let A and B be matrices of dimensions n × n. The traditional way of computing AB forms the inner product of a row of A and a column of B to obtain an entry of AB. Since there are n² entries to be evaluated, a total of (2n − 1)n² arithmetic operations are involved.
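The counts in Example 1.9 can be verified by instrumenting the natural algorithms. The following Python sketch (not part of the text; the counting is done explicitly inside the loops) returns both the result and the number of arithmetic operations performed.

    def inner_product(a, b):
        """Natural inner product algorithm: n multiplications and n - 1 additions."""
        ops = 0
        total = a[0] * b[0]
        ops += 1
        for i in range(1, len(a)):
            total += a[i] * b[i]      # one multiplication and one addition
            ops += 2
        return total, ops

    def matrix_multiply(A, B):
        """Row-by-column matrix multiplication: (2n - 1) operations per entry."""
        n = len(A)
        ops = 0
        C = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                column = [B[k][j] for k in range(n)]
                C[i][j], entry_ops = inner_product(A[i], column)
                ops += entry_ops
        return C, ops

    a = [1.0, 2.0, 3.0]
    b = [4.0, 5.0, 6.0]
    print(inner_product(a, b))        # (32.0, 5): 2n - 1 = 5 operations

    A = [[1.0, 0.0], [2.0, 1.0]]
    print(matrix_multiply(A, A)[1])   # (2n - 1) n^2 = 3 * 4 = 12 operations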
In Example 1.9, an exact operation count was possible. However,
for more complicated problems and algorithms, an exact count is usually very difficult. For this reason, we will settle for an estimate of the rate of growth of the number of arithmetic operations, as a function of the problem parameters. Thus, in Example 1.9, we might be content to say that the number of operations in the computation of an inner product increases linearly with n, and the number of operations in matrix multiplication increases cubically with n. This leads us to the order of magnitude notation that we define next.
Definition 1.2 Let f and g be functions that map positive numbers to positive numbers.
(a) We write f(n) = O(g(n)) if there exist positive numbers n_0 and c such that f(n) ≤ c g(n) for all n ≥ n_0.
(b) We write f(n) = Ω(g(n)) if there exist positive numbers n_0 and c such that f(n) ≥ c g(n) for all n ≥ n_0.
(c) We write f(n) = Θ(g(n)) if both f(n) = O(g(n)) and f(n) = Ω(g(n)) hold.
For example, we have 3n³ + n² + 10 = Θ(n³), n log n = O(n²), and n log n = Ω(n).
While the running time of the algorithms considered in Example 1.9 is predictable, the running time of more complicated algorithms often depends on the numerical values of the input data. In such cases, instead of trying to estimate the running time for each possible choice of the input, it is customary to estimate the running time for the worst possible input data of a given “size.” For example, if we have an algorithm for linear programming, we might be interested in estimating its worst-case running time over all problems with a given number of variables and constraints. This emphasis on the worst case is somewhat conservative and, in practice, the “average” running time of an algorithm might be more relevant . However, the average running time is much more difficult to estimate, or even to define, and for this reason, the worst-case approach is widely used.
Example 1.10 (Operation count of linear system solvers and matrix inversion) Consider the problem of solving a system of n linear equations in n unknowns. The classical method that eliminates one variable at a time (Gaussian elimination) is known to require O(n3) arithmetic operations in order to either compute a solution or to decide that no solution exists. Practical methods for matrix inversion also require O(n3) arithmetic operations. These facts will be of use later on.
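As an illustration of where the O(n³) count comes from, here is a minimal sketch of Gaussian elimination with partial pivoting in Python (an assumed, simplified implementation, not how production solvers are written): for each of the n pivots, the elimination loop below does O(n²) work.

    def gaussian_elimination(A, b):
        """Solve Ax = b by Gaussian elimination with partial pivoting.
        A is an n x n list of lists, b a list of length n; both are copied."""
        n = len(b)
        A = [row[:] for row in A]
        b = b[:]
        for k in range(n):
            # Partial pivoting: bring the largest remaining pivot into row k.
            p = max(range(k, n), key=lambda i: abs(A[i][k]))
            if abs(A[p][k]) < 1e-12:
                raise ValueError("matrix is (numerically) singular")
            A[k], A[p] = A[p], A[k]
            b[k], b[p] = b[p], b[k]
            # Eliminate the k-th variable from the rows below: O(n^2) work per k.
            for i in range(k + 1, n):
                factor = A[i][k] / A[k][k]
                for j in range(k, n):
                    A[i][j] -= factor * A[k][j]
                b[i] -= factor * b[k]
        # Back substitution.
        x = [0.0] * n
        for i in range(n - 1, -1, -1):
            s = sum(A[i][j] * x[j] for j in range(i + 1, n))
            x[i] = (b[i] - s) / A[i][i]
        return x

    print(gaussian_elimination([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]))  # [0.8, 1.4]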
Is the O(n³) running time of Gaussian elimination good or bad? Some perspective into this question is provided by the following observation: each time that technological advances lead to computer hardware that is faster by a factor of 8 (presumably every few years), we can solve problems twice the size of those previously possible, since 2³ = 8. A similar argument applies to algorithms whose running time is O(n^k) for some positive integer k. Such algorithms are said to run in polynomial time.
Algorithms also exist whose running time is O(2^{cn}), where n is a parameter representing problem size and c is a constant; these are said to take at least exponential time. For such algorithms, and if c = 1, each time that computer hardware becomes faster by a factor of 2, we can increase the value of n that we can handle only by 1. It is then reasonable to expect that no matter how much technology improves, problems with truly large values of n will always be difficult to handle.

Example 1.11 Suppose that we have a choice of two algorithms. The running time of the first is 10^n/100 (exponential) and the running time of the second is 10n³ (polynomial). For very small n, e.g., for n = 3, the exponential time algorithm is preferable. To gain some perspective as to what happens for larger n, suppose that we have access to a workstation that can execute 10^7 arithmetic operations per second and that we are willing to let it run for 1000 seconds. Let us figure out what size problems each algorithm can handle within this time frame. The equation 10^n/100 = 10^7 × 1000 yields n = 12, whereas the equation 10n³ = 10^7 × 1000 yields n = 1000, indicating that the polynomial time algorithm allows us to solve much larger problems.
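A quick Python sketch (the helper below is hypothetical, not from the text) reproduces the two problem sizes in Example 1.11 by finding the largest n whose running time fits within the budget of 10^7 × 1000 = 10^10 operations.

    budget = 10**7 * 1000               # operations available: 10^10

    def largest_n(running_time, limit=10**6):
        """Largest n whose running time fits within the budget (simple linear scan)."""
        n = 0
        while n + 1 <= limit and running_time(n + 1) <= budget:
            n += 1
        return n

    print(largest_n(lambda n: 10**n / 100))   # 12   (exponential algorithm)
    print(largest_n(lambda n: 10 * n**3))     # 1000 (polynomial algorithm)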
The point of view emerging from the above discussion is that, as a first cut, it is useful to juxtapose polynomial and exponential time algorithms, the former being viewed as relatively fast and efficient, and the latter as relatively slow. This point of view is justified in many – but not all – contexts and we will be returning to it later in this book.
1.7 Exercises
Exercise 1.1* Suppose that a function f : ℝ^n → ℝ is both concave and convex. Prove that f is an affine function.
Exercise 1.2 Suppose that f_1, ..., f_m are convex functions from ℝ^n into ℝ and let f(x) = Σ_{i=1}^m f_i(x).
(a) Show that if each f_i is convex, so is f.
(b) Show that if each f_i is piecewise linear and convex, so is f.
Exercise 1.3 Consider the problem of minimizing a cost function of the form c'x + f(d'x), subject to the linear constraints Ax ≥ b. Here, d is a given vector and the function f : ℝ → ℝ is as specified in Figure 1.8. Provide a linear programming formulation of this problem.
Figure 1.8: The function f of Exercise 1.3.
Exercise 1.4 Consider the problem
    minimize   2x_1 + 3|x_2 − 10|
    subject to |x_1 + 2| + |x_2| ≤ 5,

and reformulate it as a linear programming problem.
Exercise 1.5 Consider a linear optimization problem, with absolute values, of the following form:

    minimize   c'x + d'y
    subject to Ax + By ≤ b
               y_i = |x_i|, ∀ i.

Assume that all entries of B and d are nonnegative.
(a) Provide two different linear programming formulations, along the lines discussed in Section 1.3.
(b) Show that the original problem and the two reformulations are equivalent in the sense that either all three are infeasible, or all three have the same optimal cost.
(c) Provide an example to show that if B has negative entries, the problem may have a local minimum that is not a global minimum. (It will be seen in Chapter 2 that this is never the case in linear programming problems. Hence, in the presence of such negative entries, a linear programming reformulation is implausible.)
Exercise 1.6 Provide linear programming formulations of the two variants of the rocket control problem discussed at the end of Section 1.3.
Exercise 1.7 (The moment problem) Suppose that Z is a random variable taking values in the set {0, 1, ..., K}, with probabilities p_0, p_1, ..., p_K, respectively. We are given the values of the first two moments E[Z] = Σ_{k=0}^K k p_k and E[Z²] = Σ_{k=0}^K k² p_k of Z and we would like to obtain upper and lower bounds on the value of the fourth moment E[Z⁴] = Σ_{k=0}^K k⁴ p_k of Z. Show how linear programming can be used to approach this problem.
Exercise 1.8 (Road lighting) Consider a road divided into n segments that is illuminated by m lamps. Let p_j be the power of the jth lamp. The illumination I_i of the ith segment is assumed to be Σ_{j=1}^m a_ij p_j, where a_ij are known coefficients. Let I_i* be the desired illumination of segment i.
We are interested in choosing the lamp powers p_j so that the illuminations I_i are close to the desired illuminations I_i*. Provide a reasonable linear programming formulation of this problem. Note that the wording of the problem is loose and there is more than one possible formulation.
Exercise 1.9 Consider a school district with I neighborhoods, J schools, and G grades at each school. Each school j has a capacity of C_jg for grade g. In each neighborhood i, the student population of grade g is S_ig. Finally, the distance of school j from neighborhood i is d_ij. Formulate a linear programming problem whose objective is to assign all students to schools, while minimizing the total distance traveled by all students. (You may ignore the fact that numbers of students must be integer.)
Exercise 1.10 (Production and inventory planning) A company must deliver d_i units of its product at the end of the ith month. Material produced during
a month can be delivered either at the end of the same month or can be stored as inventory and delivered at the end of a subsequent month; however, there is a storage cost of c_1 dollars per month for each unit of product held in inventory. The year begins with zero inventory. If the company produces x_i units in month i and x_{i+1} units in month i + 1, it incurs a cost of c_2|x_{i+1} − x_i| dollars, reflecting the cost of switching to a new production level. Formulate a linear programming problem whose objective is to minimize the total cost of the production and inventory schedule over a period of twelve months. Assume that inventory left at
the end of the year has no value and does not incur any storage costs.
Exercise 1.11 (Optimal currency conversion) Suppose that there are N available currencies, and assume that one unit of currency i can be exchanged for r_ij units of currency j. (Naturally, we assume that r_ij > 0.) There are also certain regulations that impose a limit u_i on the total amount of currency i that can be exchanged on any given day. Suppose that we start with B units of currency 1 and that we would like to maximize the number of units of currency N that we end up with at the end of the day, through a sequence of currency transactions. Provide a linear programming formulation of this problem. Assume that for any sequence i_1, ..., i_k of currencies, we have r_{i_1 i_2} r_{i_2 i_3} ⋯ r_{i_{k−1} i_k} r_{i_k i_1} ≤ 1, which means that wealth cannot be multiplied by going through a cycle of currencies.
Exercise 1.12 (Chebychev center) Consider a set P described by linear inequality constraints, that is, P = {x ∈ ℝ^n | a_i'x ≤ b_i, i = 1, ..., m}. A ball with center y and radius r is defined as the set of all points within (Euclidean) distance r from y. We are interested in finding a ball with the largest possible radius, which is entirely contained within the set P. (The center of such a ball is called the Chebychev center of P.) Provide a linear programming formulation of this problem.
Exercise 1.13 (Linear fractional programming) Consider the problem

    minimize   (c'x + d)/(f'x + g)
    subject to Ax ≤ b
               f'x + g > 0.

Suppose that we have some prior knowledge that the optimal cost belongs to an interval [K, L]. Provide a procedure, that uses linear programming as a subroutine, and that allows us to compute the optimal cost within any desired accuracy. Hint: Consider the problem of deciding whether the optimal cost is less than or equal to a certain number.
Exercise 1.14 A company produces and sells two different products. The de mand for each product is unlimited, but the company is constrained by cash availability and machine capacity.
Each unit of the first and second product requires 3 and 4 machine hours, respectively. There are 20,000 machine hours available in the current production period. The production costs are $3 and $2 per unit of the first and second product, respectively. The selling prices of the first and second product are $6 and $5.40 per unit, respectively. The available cash is $4,000; furthermore, 45%
of the sales revenues from the first product and 30% of the sales revenues from the second product will be made available to finance operations during the current period.
(a) Formulate a linear programming problem that aims at maximizing net income subject to the cash availability and machine capacity limitations.
(b) Solve the problem graphically to obtain an optimal solution.
(c) Suppose that the company could increase its available machine hours by 2,000, after spending $400 for certain repairs. Should the investment be made?
Exercise 1.15 A company produces two kinds of products. A product of the first type requires 1/4 hours of assembly labor, 1/8 hours of testing, and $1.2 worth of raw materials. A product of the second type requires 1/3 hours of assembly, 1/3 hours of testing, and $0.9 worth of raw materials. Given the current personnel of the company, there can be at most 90 hours of assembly labor and 80 hours of testing, each day. Products of the first and second type have a market value of $9 and $8, respectively.
(a) Formulate a linear programming problem that can be used to maximize the daily profit of the company.
(b) Consider the following two modifications to the original problem:
(i) Suppose that up to 50 hours of overtime assembly labor can be scheduled, at a cost of $7 per hour.
(ii) Suppose that the raw material supplier provides a 10% discount if the daily bill is above $300.
Which of the above two elements can be easily incorporated into the linear programming formulation and how? If one or both are not easy to incorporate, indicate how you might nevertheless solve the problem.
Exercise 1.16 A manager of an oil refinery has 8 million barrels of crude oil A and 5 million barrels of crude oil B allocated for production during the coming month. These resources can be used to make either gasoline, which sells for $38 per barrel, or home heating oil, which sells for $33 per barrel. There are three production processes with the following characteristics:

                  Input crude A   Input crude B   Output gasoline   Output heating oil   Cost
    Process 1           3               5                4                   3            $51
    Process 2           1               1                1                   1            $11
    Process 3           5               3                3                   4            $40
All quantities are in barrels. For example, with the first process, 3 barrels of crude A and 5 barrels of crude B are used to produce 4 barrels of gasoline and
3 barrels of heating oil. The costs in this table refer to variable and allocated overhead costs, and there are no separate cost items for the cost of the crudes. Formulate a linear programming problem that would help the manager maximize net revenue over the next month.
Exercise 1.17 (Investment under taxation) An investor has a portfolio of n different stocks. He has bought s_i shares of stock i at price p_i, i = 1, ..., n. The current price of one share of stock i is q_i. The investor expects that the price of one share of stock i in one year will be r_i. If he sells shares, the investor pays transaction costs at the rate of 1% of the amount transacted. In addition, the investor pays taxes at the rate of 30% on capital gains. For example, suppose that the investor sells 1,000 shares of a stock at $50 per share. He has bought these shares at $30 per share. He receives $50,000. However, he owes 0.30 × (50,000 − 30,000) = $6,000 on capital gain taxes and 0.01 × 50,000 = $500 on transaction costs. So, by selling 1,000 shares of this stock he nets 50,000 − 6,000 − 500 = $43,500. Formulate the problem of selecting how many shares the investor needs to sell in order to raise an amount of money K, net of capital gains and transaction costs, while maximizing the expected value of his portfolio next year.
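The arithmetic in this example can be checked with a small Python sketch; the function name and the treatment of losses (no tax credit) are assumptions made only for illustration.

    def net_proceeds(shares_sold, sale_price, purchase_price,
                     tax_rate=0.30, transaction_rate=0.01):
        """Cash raised by a sale, net of capital gains tax and transaction costs."""
        revenue = shares_sold * sale_price
        capital_gain = shares_sold * (sale_price - purchase_price)
        taxes = tax_rate * max(capital_gain, 0.0)   # assume taxes only on positive gains
        costs = transaction_rate * revenue
        return revenue - taxes - costs

    print(net_proceeds(1000, 50.0, 30.0))   # 43500.0, as in the example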
Exercise 1.18 Show that the vectors in a given finite collection are linearly independent if and only if none of the vectors can be expressed as a linear combination of the others.
Exercise 1.19 Suppose that we are given a set of vectors in ℝ^n that form a basis, and let y be an arbitrary vector in ℝ^n. We wish to express y as a linear combination of the basis vectors. How can this be accomplished?
Exercise 1.20
(a) Let S = {Ax | x ∈ ℝ^n}, where A is a given matrix. Show that S is a subspace of ℝ^n.
(b) Assume that S is a proper subspace of ℝ^n. Show that there exists a matrix B such that S = {y ∈ ℝ^n | By = 0}. Hint: Use vectors that are orthogonal to S to form the matrix B.
(c) Suppose that V is an m-dimensional affine subspace of ℝ^n, with m < n. Show that there exist linearly independent vectors a_1, ..., a_{n−m}, and scalars b_1, ..., b_{n−m}, such that

    V = {y | a_i'y = b_i, i = 1, ..., n − m}.

1.8 History, notes, and sources
The word "programming" has been used traditionally by planners to describe the process of operations planning and resource allocation. In the 1940s, it was realized that this process could often be aided by solving optimization problems involving linear constraints and linear objectives. The term "linear programming" then emerged. The initial impetus came in the aftermath of World War II, within the context of military planning problems. In 1947, Dantzig proposed an algorithm, the simplex method, which
made the solution of linear programming problems practical. There followed a period of intense activity during which many important problems in transportation, economics, military operations, scheduling, etc., were cast in this framework. Since then, computer technology has advanced rapidly, the range of applications has expanded, new powerful methods have been discovered, and the underlying mathematical understanding has become deeper and more comprehensive. Today, linear programming is a routinely used tool that can be found in some spreadsheet software packages.
Dantzig's development of the simplex method has been a defining moment in the history of the field, because it came at a time of growing practical needs and of advances in computing technology. But, as is the case with most "scientific revolutions," the history of the field is much richer. Early work goes back to Fourier, who in 1824 developed an algorithm for solving systems of linear inequalities. Fourier's method is far less efficient than the simplex method, but this issue was not relevant at the time. In 1910, de la Vallée Poussin developed a method, similar to the simplex method, for minimizing max_i |b_i − a_i'x|, a problem that we discussed in Section 1.3.
In the late 1930s, the Soviet mathematician Kantorovich became interested in problems of optimal resource allocation in a centrally planned economy, for which he gave linear programming formulations. He also provided a solution method, but his work did not become widely known at the time. Around the same time, several models arising in classical, Walrasian, economics were studied and refined, and led to formulations closely related to linear programming. Koopmans, an economist, played an important role and eventually (in 1975) shared the Nobel Prize in economic science with Kantorovich.
On the theoretical front, the mathematical structures that underlie linear programming were independently studied, in the period 1870-1930, by many prominent mathematicians, such as Farkas, Minkowski, Carathéodory, and others. Also, in 1928, von Neumann developed an important result in game theory that would later prove to have strong connections with the deeper structure of linear programming.
Subsequent to Dantzig's work, there has been much and important research in areas such as large scale optimization, network optimization, interior point methods, integer programming, and complexity theory. We defer the discussion of this research to the notes and sources sections of later chapters. For a more detailed account of the history of linear programming, the reader is referred to Schrijver (1986), Orden (1993), and the volume edited by Lenstra, Rinnooy Kan, and Schrijver (1991) (see especially the article by Dantzig in that volume) .
There are several texts that cover the general subject of linear programming, starting with a comprehensive one by Dantzig (1963). Some more recent texts are Papadimitriou and Steiglitz (1982), Chvátal (1983), Murty (1983), Luenberger (1984), Bazaraa, Jarvis, and Sherali (1990). Finally, Schrijver (1986) is a comprehensive, but more advanced reference on the subject.
1.1. The formulation of the diet problem is due to Stigler (1945).
1.2. The case study on DEC's production planning was developed by Freund and Shannahan (1992). Methods for dealing with the nurse scheduling and other cyclic problems are studied by Bartholdi, Orlin, and Ratliff (1980). More information on pattern classification can be found in Duda and Hart (1973), or Haykin (1994).
1.3. A deep and comprehensive treatment of convex functions and their properties is provided by Rockafellar (1970). Linear programming arises in control problems, in ways that are more sophisticated than what is described here; see, e.g., Dahleh and Diaz-Bobillo (1995).
1.5. For an introduction to linear algebra, see Strang (1988).
1.6. For a more detailed treatment of algorithms and their computational requirements, see Lewis and Papadimitriou (1981), Papadimitriou and Steiglitz (1982), or Cormen, Leiserson, and Rivest (1990).
1.7. Exercise 1.8 is adapted from Boyd and Vandenberghe (1995). Exercises 1.9 and 1.14 are adapted from Bradley, Hax, and Magnanti (1977). Exercise 1.11 is adapted from Ahuja, Magnanti, and Orlin (1993).
Chapter 2
The geometry of linear programming
Contents
2.1. Polyhedra and convex sets
2.2. Extreme points, vertices, and basic feasible solutions
2.3. Polyhedra in standard form
2.4. Degeneracy
2.5. Existence of extreme points
2.6. Optimality of extreme points
2.7. Representation of bounded polyhedra*
2.8. Projections of polyhedra: Fourier-Motzkin elimination*
2.9. Summary
2.10. Exercises
2.11. Notes and sources
In this chapter, we define a polyhedron as a set described by a finite number of linear equality and inequality constraints. In particular, the feasible set in a linear programming problem is a polyhedron. We study the basic geometric properties of polyhedra in some detail, with emphasis on their "corner points" (vertices). As it turns out, common geometric intuition derived from the familiar three-dimensional polyhedra is essentially correct when applied to higher-dimensional polyhedra. Another interesting aspect of the development in this chapter is that certain concepts (e.g., the concept of a vertex) can be defined either geometrically or algebraically. While the geometric view may be more natural, the algebraic approach is essential for carrying out computations. Much of the richness of the subject lies in the interplay between the geometric and the algebraic points of view.
Our development starts with a characterization of the corner points of feasible sets in the general form {x | Ax ≥ b}. Later on, we focus on the case where the feasible set is in the standard form {x | Ax = b, x ≥ 0}, and we derive a simple algebraic characterization of the corner points. The latter characterization will play a central role in the development of the simplex method in Chapter 3.
The main results of this chapter state that a nonempty polyhedron has at least one corner point if and only if it does not contain a line, and if this is the case, the search for optimal solutions to linear programming problems can be restricted to corner points. These results are proved for the most general case of linear programming problems using geometric arguments. The same results will also be proved in the next chapter, for the case of standard form problems, as a corollary of our development of the simplex method. Thus, the reader who wishes to focus on standard form problems may skip the proofs in Sections 2.5 and 2.6. Finally, Sections 2.7 and 2.8 can also be skipped during a first reading; any results in these sections that are needed later on will be rederived in Chapter 4, using different techniques.
2.1 Polyhedra and convex sets
In this section, we introduce some important concepts that will be used to study the geometry of linear programming, including a discussion of convexity.
Hyperplanes, halfspaces, and polyhedra
We start with the formal definition of a polyhedron.

Definition 2.1 A polyhedron is a set that can be described in the form {x ∈ ℝ^n | Ax ≥ b}, where A is an m × n matrix and b is a vector in ℝ^m.
As discussed in Section 1.1, the feasible set of any linear programming problem can be described by inequality constraints of the form Ax ≥ b, and is therefore a polyhedron. In particular, a set of the form {x ∈ ℝ^n | Ax = b, x ≥ 0} is also a polyhedron and will be referred to as a polyhedron in standard form.
A polyhedron can either "extend to infinity," or can be confined in a finite region. The definition that follows refers to this distinction.
Definition 2.2 A set S ⊂ ℝ^n is bounded if there exists a constant K such that the absolute value of every component of every element of S is less than or equal to K.
The next definition deals with polyhedra determined by a single linear constraint .
Definition 2.3 Let a be a nonzero vector in ℝ^n and let b be a scalar.
(a) The set {x ∈ ℝ^n | a'x = b} is called a hyperplane.
(b) The set {x ∈ ℝ^n | a'x ≥ b} is called a halfspace.

Note that a hyperplane is the boundary of a corresponding halfspace. In addition, the vector a in the definition of the hyperplane is perpendicular to the hyperplane itself. [To see this, note that if x and y belong to the same hyperplane, then a'x = a'y. Hence, a'(x − y) = 0 and therefore a is orthogonal to any direction vector confined to the hyperplane.] Finally, note that a polyhedron is equal to the intersection of a finite number of halfspaces; see Figure 2.1.
Convex Sets
We now define the important notion of a convex set.
Definition 2.4 A set S ⊂ ℝ^n is convex if for any x, y ∈ S, and any λ ∈ [0, 1], we have λx + (1 − λ)y ∈ S.

Note that if λ ∈ [0, 1], then λx + (1 − λ)y is a weighted average of the vectors x, y, and therefore belongs to the line segment joining x and y. Thus, a set is convex if the segment joining any two of its elements is contained in the set; see Figure 2.2.
Our next definition refers to weighted averages of a finite number of vectors; see Figure 2.3.
Figure 2.1: (a) A hyperplane and two halfspaces. (b) The polyhedron {x | a_i'x ≥ b_i, i = 1, ..., 5} is the intersection of five halfspaces. Note that each vector a_i is perpendicular to the hyperplane {x | a_i'x = b_i}.
Definition 2.5 Let x^1, ..., x^k be vectors in ℝ^n and let λ_1, ..., λ_k be nonnegative scalars whose sum is unity.
(a) The vector Σ_{i=1}^k λ_i x^i is said to be a convex combination of the vectors x^1, ..., x^k.
(b) The convex hull of the vectors x^1, ..., x^k is the set of all convex combinations of these vectors.
The result that follows establishes some important facts related to convexity.
Theorem 2.1
(a) The intersection of convex sets is convex.
(b) Every polyhedron is a convex set.
(c) A convex combination of a finite number of elements of a convex set also belongs to that set.
(d) The convex hull of a finite number of vectors is a convex set.
Figure 2.2: The set S is convex, but the set Q is not, because the segment joining x and y is not contained in Q.
Figure 2.3: The convex hull of seven points in R2 .
Proof.
(a) Let S_i, i ∈ I, be convex sets where I is some index set, and suppose that x and y belong to the intersection ∩_{i∈I} S_i. Let λ ∈ [0, 1]. Since each S_i is convex and contains x, y, we have λx + (1 − λ)y ∈ S_i, which proves that λx + (1 − λ)y also belongs to the intersection of the sets S_i. Therefore, ∩_{i∈I} S_i is convex.
(b) Let a be a vector and let b be a scalar. Suppose that x and y satisfy a'x ≥ b and a'y ≥ b, respectively, and therefore belong to the same halfspace. Let λ ∈ [0, 1]. Then, a'(λx + (1 − λ)y) ≥ λb + (1 − λ)b = b, which proves that λx + (1 − λ)y also belongs to the same halfspace. Therefore a halfspace is convex. Since a polyhedron is the intersection of a finite number of halfspaces, the result follows from part (a).
(c) A convex combination of two elements of a convex set lies in that
set, by the definition of convexity. Let us assume, as an induction hypothesis, that a convex combination of k elements of a convex set belongs to that set. Consider k + 1 elements x^1, ..., x^{k+1} of a convex set S and let λ_1, ..., λ_{k+1} be nonnegative scalars that sum to 1. We assume, without loss of generality, that λ_{k+1} ≠ 1. We then have

    Σ_{i=1}^{k+1} λ_i x^i = λ_{k+1} x^{k+1} + (1 − λ_{k+1}) Σ_{i=1}^{k} (λ_i / (1 − λ_{k+1})) x^i.      (2.1)

The coefficients λ_i/(1 − λ_{k+1}), i = 1, ..., k, are nonnegative and sum to unity; using the induction hypothesis, Σ_{i=1}^k λ_i x^i/(1 − λ_{k+1}) ∈ S. Then, the fact that S is convex and Eq. (2.1) imply that Σ_{i=1}^{k+1} λ_i x^i ∈ S, and the induction step is complete.

(d) Let S be the convex hull of the vectors x^1, ..., x^k and let y = Σ_{i=1}^k ζ_i x^i, z = Σ_{i=1}^k θ_i x^i be two elements of S, where ζ_i ≥ 0, θ_i ≥ 0, and Σ_{i=1}^k ζ_i = Σ_{i=1}^k θ_i = 1. Let λ ∈ [0, 1]. Then,

    λy + (1 − λ)z = λ Σ_{i=1}^k ζ_i x^i + (1 − λ) Σ_{i=1}^k θ_i x^i = Σ_{i=1}^k (λζ_i + (1 − λ)θ_i) x^i.

We note that the coefficients λζ_i + (1 − λ)θ_i, i = 1, ..., k, are nonnegative and sum to unity. This shows that λy + (1 − λ)z is a convex combination of x^1, ..., x^k and, therefore, belongs to S. This establishes the convexity of S. □

2.2 Extreme points, vertices, and basic feasible solutions
We observed in Section 1.4 that an optimal solution to a linear programming problem tends to occur at a “corner” of the polyhedron over which we are optimizing. In this section, we suggest three different ways of defining the concept of a “corner” and then show that all three definitions are equivalent.
Our first definition defines an extreme point of a polyhedron as a point that cannot be expressed as a convex combination of two other elements of the polyhedron, and is illustrated in Figure 2.4. Notice that this definition is entirely geometric and does not refer to a specific representation of a polyhedron in terms of linear constraints.
Definition 2.6 Let P be a polyhedron. A vector x ∈ P is an extreme point of P if we cannot find two vectors y, z ∈ P, both different from x, and a scalar λ ∈ [0, 1], such that x = λy + (1 − λ)z.
Figure 2.4: The vector w is not an extreme point because it is a convex combination of v and u. The vector x is an extreme point: if x = λy + (1 − λ)z and λ ∈ [0, 1], then either y ∉ P, or z ∉ P, or x = y, or x = z.
An alternative geometric definition defines a vertex of a polyhedron P as the unique optimal solution to some linear programming problem with feasible set P.
Definition 2.7 Let P be a polyhedron. A vector x ∈ P is a vertex of P if there exists some c such that c'x < c'y for all y satisfying y ∈ P and y ≠ x.

In other words, x is a vertex of P if and only if P is on one side of a hyperplane (the hyperplane {y | c'y = c'x}) which meets P only at the point x; see Figure 2.5.
The two geometric definitions that we have given so far are not easy
to work with from an algorithmic point of view. We would like to have a
definition that relies on a representation of a polyhedron in terms of linear
constraints and which reduces to an algebraic test. In order to provide such
a definition, we need some more terminology.
    b_i = a_i'x = Σ_{j=1}^k λ_{ij} a_j'x = Σ_{j=1}^k λ_{ij} b_j,   i = 1, ..., m.
Consider now an element y of Q. We will show that it belongs to P. Indeed, for any i,

    a_i'y = Σ_{j=1}^k λ_{ij} a_j'y = Σ_{j=1}^k λ_{ij} b_j = b_i,

which establishes that y ∈ P and Q ⊂ P. □

Notice that the polyhedron Q in Theorem 2.5 is in standard form; namely, Q = {x | Dx = f, x ≥ 0} where D is a k × n submatrix of A, with rank equal to k, and f is a k-dimensional subvector of b. We conclude that as long as the feasible set is nonempty, a linear programming problem in standard form can be reduced to an equivalent standard form problem (with the same feasible set) in which the equality constraints are linearly independent.
Example 2.3 Consider the (nonempty) polyhedron defined by the constraints

    2x_1 + x_2 + x_3 = 2
     x_1 + x_2       = 1
     x_1       + x_3 = 1
     x_1, x_2, x_3 ≥ 0.
The corresponding matrix A has rank two. This is because the last two rows (1, 1, 0) and (1, 0, 1) are linearly independent, but the first row is equal to the sum of the other two. Thus, the first constraint is redundant and after it is eliminated, we still have the same polyhedron.
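The rank computation in Example 2.3 can be confirmed numerically. The sketch below uses Python with NumPy (an assumed tool, not part of the text) to verify that the first row of A is the sum of the other two and that removing it does not change the rank.

    import numpy as np

    A = np.array([[2.0, 1.0, 1.0],    # 2x1 + x2 + x3 = 2
                  [1.0, 1.0, 0.0],    #  x1 + x2      = 1
                  [1.0, 0.0, 1.0]])   #  x1      + x3 = 1
    b = np.array([2.0, 1.0, 1.0])

    print(np.linalg.matrix_rank(A))          # 2
    print(np.allclose(A[0], A[1] + A[2]))    # True: first row = sum of the others
    print(np.linalg.matrix_rank(A[1:]))      # still 2 after dropping the first row
    print(b[0] == b[1] + b[2])               # True: the dropped constraint is implied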
2.4 Degeneracy
According to our definition, at a basic solution, we must have n linearly independent active constraints. This allows for the possibility that the number of active constraints is greater than n . (Of course, in n dimensions, no more than n of them can be linearly independent.) In this case, we say that we have a degenerate basic solution. In other words, at a degenerate basic solution, the number of active constraints is greater than the minimum necessary.
Definition 2.10 A basic solution x ∈ ℝ^n is said to be degenerate if more than n of the constraints are active at x.
In two dimensions, a degenerate basic solution is at the intersection of three or more lines; in three dimensions, a degenerate basic solution is at the intersection of four or more planes; see Figure 2.9 for an illustration. It turns out that the presence of degeneracy can strongly affect the behavior of linear programming algorithms and for this reason, we will now develop some more intuition.
Example 2.4 Consider the polyhedron P defined by the constraints

    x_1 + x_2 + 2x_3 ≤ 8
          x_2 + 6x_3 ≤ 12
    x_1              ≤ 4
          x_2        ≤ 6
    x_1, x_2, x_3 ≥ 0.

The vector x = (2, 6, 0) is a nondegenerate basic feasible solution, because there are exactly three active and linearly independent constraints, namely, x_1 + x_2 + 2x_3 ≤ 8, x_2 ≤ 6, and x_3 ≥ 0. The vector x = (4, 0, 2) is a degenerate basic feasible solution, because there are four active constraints, three of them linearly independent, namely, x_1 + x_2 + 2x_3 ≤ 8, x_2 + 6x_3 ≤ 12, x_1 ≤ 4, and x_2 ≥ 0.
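A short Python sketch (the helper below is illustrative and not part of the text) lists the constraints of Example 2.4 that are active at each of the two vectors, confirming which basic feasible solution is degenerate.

    import numpy as np

    # Constraints of Example 2.4, each written as a function that is zero when active.
    constraints = [
        ("x1 + x2 + 2x3 <= 8",  lambda x: x[0] + x[1] + 2 * x[2] - 8),
        ("x2 + 6x3 <= 12",      lambda x: x[1] + 6 * x[2] - 12),
        ("x1 <= 4",             lambda x: x[0] - 4),
        ("x2 <= 6",             lambda x: x[1] - 6),
        ("x1 >= 0",             lambda x: -x[0]),
        ("x2 >= 0",             lambda x: -x[1]),
        ("x3 >= 0",             lambda x: -x[2]),
    ]

    def active(x, tol=1e-9):
        return [name for name, g in constraints if abs(g(x)) <= tol]

    print(active(np.array([2.0, 6.0, 0.0])))  # 3 active constraints: nondegenerate
    print(active(np.array([4.0, 0.0, 2.0])))  # 4 active constraints: degenerate (n = 3)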
Figure 2.9: The points A and C are degenerate basic feasible solutions. The points B and E are nondegenerate basic feasible solutions. The point D is a degenerate basic solution.
Degeneracy in standard form polyhedra
At a basic solution of a polyhedron in standard form, the m equality constraints are always active. Therefore, having more than n active constraints is the same as having more than n − m variables at zero level. This leads us to the next definition which is a special case of Definition 2.10.
Definition 2.11 Consider the standard form polyhedron P = {x ∈ ℝ^n | Ax = b, x ≥ 0} and let x be a basic solution. Let m be the number of rows of A. The vector x is a degenerate basic solution if more than n − m of the components of x are zero.
Example 2.5 Consider once more the polyhedron of Example 2.4. By introducing the slack variables x_4, ..., x_7, we can transform it into the standard form {x = (x_1, ..., x_7) | Ax = b, x ≥ 0}, where

    A = | 1  1  2  1  0  0  0 |         |  8 |
        | 0  1  6  0  1  0  0 |,   b =  | 12 |
        | 1  0  0  0  0  1  0 |         |  4 |
        | 0  1  0  0  0  0  1 |         |  6 |

Consider the basis consisting of the linearly independent columns A_1, A_2, A_3, A_7. To calculate the corresponding basic solution, we first set the nonbasic variables x_4, x_5, and x_6 to zero, and then solve the system Ax = b for the remaining variables, to obtain x = (4, 0, 2, 0, 0, 0, 6). This is a degenerate basic feasible solution, because we have a total of four variables that are zero, whereas
n − m = 7 − 4 = 3. Thus, while we initially set only the three nonbasic variables to zero, the solution to the system Ax = b turned out to satisfy one more of the constraints (namely, the constraint x_2 ≥ 0) with equality. Consider now the basis consisting of the linearly independent columns A_1, A_3, A_4, and A_7. The corresponding basic feasible solution is again x = (4, 0, 2, 0, 0, 0, 6).
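The calculation in Example 2.5 can be reproduced mechanically: fix a basis, set the nonbasic variables to zero, and solve an m × m linear system for the basic variables. The following sketch uses Python with NumPy (an assumed tool; it is not a simplex implementation) with the data of the example.

    import numpy as np

    A = np.array([[1.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 6.0, 0.0, 1.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0]])
    b = np.array([8.0, 12.0, 4.0, 6.0])

    def basic_solution(A, b, basis):
        """Given 0-based column indices 'basis', return the corresponding basic solution."""
        m, n = A.shape
        B = A[:, basis]                  # m x m basis matrix
        xB = np.linalg.solve(B, b)       # values of the basic variables
        x = np.zeros(n)
        x[list(basis)] = xB              # nonbasic variables stay at zero
        return x

    print(basic_solution(A, b, [0, 1, 2, 6]))   # columns A1, A2, A3, A7 -> (4,0,2,0,0,0,6)
    print(basic_solution(A, b, [0, 2, 3, 6]))   # columns A1, A3, A4, A7 -> same point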
The preceding example suggests that we can think of degeneracy in the following terms. We pick a basic solution by picking n linearly independent constraints to be satisfied with equality, and we realize that certain other constraints are also satisfied with equality. If the entries of A or b were chosen at random, this would almost never happen. Also, Figure 2.10 illustrates that if the coefficients of the active constraints are slightly perturbed, degeneracy can disappear (cf. Exercise 2.18). In practical problems, however, the entries of A and b often have a special (nonrandom) structure, and degeneracy is more common than the preceding argument would seem to suggest.
Figure 2.10: Small changes in the constraining inequalities can remove degeneracy.
In order to visualize degeneracy in standard form polyhedra, we assume that n − m = 2 and we draw the feasible set as a subset of the two-dimensional set defined by the equality constraints Ax = b; see Figure 2.11. At a nondegenerate basic solution, exactly n − m of the constraints x_i ≥ 0 are active; the corresponding variables are nonbasic. In the case of a degenerate basic solution, more than n − m of the constraints x_i ≥ 0 are active, and there are usually several ways of choosing which n − m variables to call nonbasic; in that case, there are several bases corresponding to that same basic solution. (This discussion refers to the typical case. However, there are examples of degenerate basic solutions to which there corresponds only one basis.)
Degeneracy is not a purely geometric property
We close this section by pointing out that degeneracy of basic feasible solutions is not, in general, a geometric (representation independent) property,
Figure 2.11: An (n − m)-dimensional illustration of degeneracy. Here, n = 6 and m = 4. The basic feasible solution A is nondegenerate and the basic variables are x_1, x_2, x_3, x_6. The basic feasible solution B is degenerate. We can choose x_1, x_6 as the nonbasic variables. Other possibilities are to choose x_1, x_5, or to choose x_5, x_6. Thus, there are three possible bases, for the same basic feasible solution B.
but rather it may depend on the particular representation of a polyhedron. To illustrate this point, consider the standard form polyhedron (cf. Figure 2.12)

    P = {(x_1, x_2, x_3) | x_1 − x_2 = 0, x_1 + x_2 + 2x_3 = 2, x_1, x_2, x_3 ≥ 0}.

We have n = 3, m = 2 and n − m = 1. The vector (1, 1, 0) is nondegenerate because only one variable is zero. The vector (0, 0, 1) is degenerate because two variables are zero. However, the same polyhedron can also be described
Figure 2.12: An example of degeneracy in a standard form problem.
Figure 2.13: The polyhedron P contains a line and does not have an extreme point, while Q does not contain a line and has extreme points.
in the (nonstandard) form
    P = {(x_1, x_2, x_3) | x_1 − x_2 = 0, x_1 + x_2 + 2x_3 = 2, x_1 ≥ 0, x_3 ≥ 0}.
The vector (0,0,1) is now a nondegenerate basic feasible solution, because there are only three active constraints.
For another example, consider a nondegenerate basic feasible solution x* of a standard form polyhedron P = {x | Ax = b, x ≥ 0}, where A is of dimensions m × n. In particular, exactly n − m of the variables x_i* are equal to zero. Let us now represent P in the form P = {x | Ax ≥ b, −Ax ≥ −b, x ≥ 0}. Then, at the basic feasible solution x*, we have n − m variables set to zero and an additional 2m inequality constraints are satisfied with equality. We therefore have n + m active constraints and x* is degenerate. Hence, under the second representation, every basic feasible solution is degenerate.
We have established that a degenerate basic feasible solution under one representation could be nondegenerate under another representation. Still, it can be shown that if a basic feasible solution is degenerate under one particular standard form representation, then it is degenerate under every standard form representation of the same polyhedron (Exercise 2.19).
2.5 Existence of extreme points
We obtain in this section necessary and sufficient conditions for a polyhedron to have at least one extreme point. We first observe that not every polyhedron has this property. For example, if n > 1, a halfspace in ℝ^n is a polyhedron without extreme points. Also, as argued in Section 2.2 (cf. the discussion after Definition 2.9), if the matrix A has fewer than n rows, then the polyhedron {x ∈ ℝ^n | Ax ≥ b} does not have a basic feasible solution.
It turns out that the existence of an extreme point depends on whether a polyhedron contains an infinite line or not; see Figure 2.13. We need the following definition.
Definition 2.12 A polyhedron P ⊂ ℝ^n contains a line if there exists a vector x ∈ P and a nonzero vector d ∈ ℝ^n such that x + λd ∈ P for all scalars λ.
We then have the following result.
Theorem 2.6 Suppose that the polyhedron P = {x ∈ ℝ^n | a_i'x ≥ b_i, i = 1, ..., m} is nonempty. Then, the following are equivalent:
(a) The polyhedron P has at least one extreme point.
(b) The polyhedron P does not contain a line.
(c) There exist n vectors out of the family a_1, ..., a_m, which are linearly independent.
Proof. (b) ⇒ (a)
We first prove that if P does not contain a line, then it has a basic feasible solution and, therefore, an extreme point. A geometric interpretation of this proof is provided in Figure 2.14.
Let x be an element of P and let I = {i | a_i'x = b_i}. If n of the vectors a_i, i ∈ I, corresponding to the active constraints are linearly independent, then x is, by definition, a basic feasible solution and, therefore, a basic feasible solution exists. If this is not the case, then all of the vectors a_i, i ∈ I, lie in a proper subspace of ℝ^n, and there exists a nonzero vector d ∈ ℝ^n such that a_i'd = 0, for every i ∈ I. Let us consider the line consisting of all points of the form y = x + λd, where λ is an arbitrary scalar. For i ∈ I, we have a_i'y = a_i'x + λa_i'd = a_i'x = b_i. Thus, those constraints that were active at x remain active at all points on the line. However, since the polyhedron is assumed to contain no lines, it follows that as we vary λ, some constraint will be eventually violated. At the point where some constraint is about to be violated, a new constraint must become active, and we conclude that there exists some λ* and some j ∉ I such that a_j'(x + λ*d) = b_j.

We claim that a_j is not a linear combination of the vectors a_i, i ∈ I. Indeed, we have a_j'x ≠ b_j (because j ∉ I) and a_j'(x + λ*d) = b_j (by the definition of λ*). Thus, a_j'd ≠ 0. On the other hand, a_i'd = 0 for every i ∈ I (by the definition of d) and therefore, d is orthogonal to any linear combination of the vectors a_i, i ∈ I. Since d is not orthogonal to a_j, we
Figure 2.14: Starting from an arbitrary point of a polyhedron, we choose a direction along which all currently active constraints remain active. We then move along that direction until a new constraint is about to be violated. At that point, the number of linearly independent active constraints has increased by at least one. We repeat this procedure until we end up with n linearly independent active constraints, at which point we have a basic feasible solution.
conclude that a_j is not a linear combination of the vectors a_i, i ∈ I. Thus, by moving from x to x + λ*d, the number of linearly independent active constraints has been increased by at least one. By repeating the same argument, as many times as needed, we eventually end up with a point at which there are n linearly independent active constraints. Such a point is, by definition, a basic solution; it is also feasible since we have stayed within the feasible set.
(a) ⇒ (c)
If P has an extreme point x, then x is also a basic feasible solution (cf. Theorem 2.3), and there exist n constraints that are active at x, with the corresponding vectors a_i being linearly independent.
(c) ⇒ (b)
Suppose that n of the vectors a_i are linearly independent and, without loss of generality, let us assume that a_1, ..., a_n are linearly independent. Suppose that P contains a line x + λd, where d is a nonzero vector. We then have a_i'(x + λd) ≥ b_i for all i and all λ. We conclude that a_i'd = 0 for all i. (If a_i'd < 0, we can violate the constraint by picking λ very large; a symmetric argument applies if a_i'd > 0.) Since the vectors a_i, i = 1, ..., n, are linearly independent, this implies that d = 0. This is a contradiction and establishes that P does not contain a line. □
Notice that a bounded polyhedron does not contain a line. Similarly,
the positive orthant {x I x � O} does not contain a line. Since a polyhedron in standard form is contained in the positive orthant, it does not contain a line either. These observations establish the following important corollary of Theorem 2.6.
Corollary 2.2 Every nonempty bounded polyhedron and every nonempty polyhedron in standard form has at least one basic feasible solution.
2.6 Optimality of extreme points
Having established the conditions for the existence of extreme points, we will now confirm the intuition developed in Chapter 1: as long as a linear programming problem has an optimal solution and as long as the feasible set has at least one extreme point, we can always find an optimal solution within the set of extreme points of the feasible set. Later in this section, we prove a somewhat stronger result, at the expense of a more complicated proof.
Theorem 2.7 Consider the linear programming problem of minimizing c'x over a polyhedron P. Suppose that P has at least one extreme point and that there exists an optimal solution. Then, there exists an optimal solution which is an extreme point of P.
Proof. (See Figure 2.15 for an illustration.) Let Q be the set of all optimal solutions, which we have assumed to be nonempty. Let P be of the form P = {x ∈ ℝ^n | Ax ≥ b} and let v be the optimal value of the cost c'x. Then, Q = {x ∈ ℝ^n | Ax ≥ b, c'x = v}, which is also a polyhedron. Since
Figure 2.15: Illustration of the proof of Theorem 2.7. Here, Q is the set of optimal solutions and an extreme point x* of Q is also an extreme point of P.
Q c P, and since P contains no lines (cf. Theorem 2.6) , Q contains no lines either. Therefore, Q has an extreme point.
Let x* be an extreme point of Q. We will show that x* is also an extreme point of P. Suppose, in order to derive a contradiction, that x* is not an extreme point of P. Then, there exist y ∈ P, z ∈ P, such that y ≠ x*, z ≠ x*, and some λ ∈ [0, 1] such that x* = λy + (1 − λ)z. It follows that v = c'x* = λc'y + (1 − λ)c'z. Furthermore, since v is the optimal cost, we have c'y ≥ v and c'z ≥ v. This implies that c'y = c'z = v, and therefore y ∈ Q and z ∈ Q. But this contradicts the fact that x* is an extreme point of Q. The contradiction establishes that x* is an extreme point of P. In addition, since x* belongs to Q, it is optimal. □
The above theorem applies to polyhedra in standard form, as well as to bounded polyhedra, since they do not contain a line.
Our next result is stronger than Theorem 2.7. It shows that the existence of an optimal solution can be taken for granted, as long as the optimal cost is finite.
Theorem 2.8 Consider the linear programming problem of minimizing c'x over a polyhedron P. Suppose that P has at least one extreme point. Then, either the optimal cost is equal to −∞, or there exists an extreme point which is optimal.
Proof. The proof is essentially a repetition of the proof of Theorem 2.6. The difference is that as we move towards a basic feasible solution, we will also make sure that the costs do not increase. We will use the following terminology: an element x of P has rank k if we can find k, but not more than k, linearly independent constraints that are active at x.
Let us assume that the optimal cost is finite. Let P = {x ∈ ℝ^n | Ax ≥ b} and consider some x ∈ P of rank k < n. We will show that there exists some y ∈ P which has greater rank and satisfies c'y ≤ c'x. Let I = {i | a_i'x = b_i}. Since k < n, the vectors a_i, i ∈ I, lie in a proper subspace of ℝ^n, and we can choose a nonzero vector d ∈ ℝ^n orthogonal to every a_i, i ∈ I; by possibly taking the negative of d, we can also assume that c'd ≤ 0.

Suppose that c'd < 0, and consider the half-line y = x + λd, where λ is a nonnegative scalar. As in the proof of Theorem 2.6, all points on this half-line satisfy a_i'y = b_i, i ∈ I. If the entire half-line were contained in P, the optimal cost would be −∞, which we have assumed not to be the case. Therefore, the half-line must eventually exit P, and at that point some j ∉ I satisfies a_j'(x + λ*d) = b_j. We let y = x + λ*d and note that c'y < c'x. As in the proof of Theorem 2.6, a_j is linearly independent from a_i, i ∈ I, and the rank of y is at least k + 1.
Suppose now that c'd = 0. We consider the line y = x + λd, where λ is an arbitrary scalar. Since P contains no lines, the line must eventually exit P and when that is about to happen, we are again at a vector y of rank greater than that of x. Furthermore, since c'd = 0, we have c'y = c'x.

In either case, we have found a new point y such that c'y ≤ c'x, and whose rank is greater than that of x. By repeating this process as many times as needed, we end up with a vector w of rank n (thus, w is a basic feasible solution) such that c'w ≤ c'x.

Let w^1, ..., w^r be the basic feasible solutions in P and let w* be a basic feasible solution such that c'w* ≤ c'w^i for all i. We have already shown that for every x there exists some i such that c'w^i ≤ c'x. It follows that c'w* ≤ c'x for all x ∈ P, and the basic feasible solution w* is optimal. □
For a general linear programming problem, if the feasible set has no extreme points, then Theorem 2.8 does not apply directly. On the other hand, any linear programming problem can be transformed into an equivalent problem in standard form to which Theorem 2.8 does apply. This establishes the following corollary.

Corollary 2.3 Consider the linear programming problem of minimizing c'x over a nonempty polyhedron. Then, either the optimal cost is equal to −∞ or there exists an optimal solution.

The result in Corollary 2.3 should be contrasted with what may happen in optimization problems with a nonlinear cost function. For example, in the problem of minimizing 1/x subject to x ≥ 1, the optimal cost is not −∞, but an optimal solution does not exist.
2.7 Representation of bounded polyhedra*
So far, we have been representing polyhedra in terms of their defining inequalities. In this section, we provide an alternative, by showing that a bounded polyhedron can also be represented as the convex hull of its extreme points. The proof that we give here is elementary and constructive, and its main idea is summarized in Figure 2.16. There is a similar representation of unbounded polyhedra involving extreme points and "extreme rays" (edges that extend to infinity). This representation can be developed using the tools that we already have, at the expense of a more complicated proof. A more elegant argument, based on duality theory, will be presented in Section 4.9 and will also result in an alternative proof of Theorem 2.9 below.
Figure 2.16: Given the vector z, we express it as a convex combination of y and u. The vector u belongs to the polyhedron Q whose dimension is lower than that of P. Using induction on dimension, we can express the vector u as a convex combination of extreme points of Q. These are also extreme points of P.

Theorem 2.9 A nonempty and bounded polyhedron is the convex hull of its extreme points.

Proof. Every convex combination of extreme points is an element of the polyhedron, since polyhedra are convex sets. Thus, we only need to prove the converse result and show that every element of a bounded polyhedron can be represented as a convex combination of extreme points.

We define the dimension of a polyhedron P ⊂ ℝ^n as the smallest integer k such that P is contained in some k-dimensional affine subspace of ℝ^n. (Recall from Section 1.5, that a k-dimensional affine subspace is a translation of a k-dimensional subspace.) Our proof proceeds by induction on the dimension of the polyhedron P. If P is zero-dimensional, it consists of a single point. This point is an extreme point of P and the result is true.

Let us assume that the result is true for all polyhedra of dimension less than k. Let P = {x ∈ ℝ^n | a_i'x ≥ b_i, i = 1, ..., m} be a nonempty bounded k-dimensional polyhedron. Then, P is contained in a k-dimensional affine subspace S of ℝ^n, which can be assumed to be of the form

    S = {x^0 + λ_1 x^1 + ⋯ + λ_k x^k | λ_1, ..., λ_k ∈ ℝ},

where x^1, ..., x^k are some vectors in ℝ^n. Let f_1, ..., f_{n−k} be n − k linearly independent vectors that are orthogonal to x^1, ..., x^k. Let g_i = f_i'x^0, for i = 1, ..., n − k. Then, every element x of S satisfies

    f_i'x = g_i,   i = 1, ..., n − k.      (2.3)

Since P ⊂ S, the same must be true for every element of P. Let z be an element of P. If z is an extreme point of P, then z is a trivial convex combination of the extreme points of P and there is nothing more to be proved. If z is not an extreme point of P, let us choose an arbitrary extreme point y of P and form the half-line consisting of all points of the form z + λ(z − y), where λ is a nonnegative scalar. Since P is bounded, this half-line must eventually exit P and violate one of the constraints, say the constraint a_i*'x ≥ b_i*. By considering what happens when this constraint is just about to be violated, we find some λ* ≥ 0 and u ∈ P, such that

    u = z + λ*(z − y),

and a_i*'u = b_i*. Since the constraint a_i*'x ≥ b_i* is violated if λ grows beyond λ*, it follows that a_i*'(z − y) < 0.
2.8 Projections of polyhedra: Fourier-Motzkin elimination*

Elimination algorithm

1. Rewrite each constraint Σ_{j=1}^n a_ij x_j ≥ b_i in the form

       a_in x_n ≥ − Σ_{j=1}^{n−1} a_ij x_j + b_i,   i = 1, ..., m;

   if a_in ≠ 0, divide both sides by a_in. By letting x̄ = (x_1, ..., x_{n−1}), we obtain an equivalent representation of P involving the following constraints:

       x_n ≥ d_i + f_i'x̄,    if a_in > 0,      (2.4)
       d_j + f_j'x̄ ≥ x_n,    if a_jn < 0,      (2.5)
       0 ≥ d_k + f_k'x̄,      if a_kn = 0.      (2.6)

   Here, each d_i, d_j, d_k is a scalar, and each f_i, f_j, f_k is a vector in ℝ^{n−1}.

2. Let Q be the polyhedron in ℝ^{n−1} defined by the constraints

       d_j + f_j'x̄ ≥ d_i + f_i'x̄,   if a_in > 0 and a_jn < 0,      (2.7)
       0 ≥ d_k + f_k'x̄,              if a_kn = 0.                   (2.8)

Example 2.7 Consider the polyhedron defined by the constraints

    x_1 + x_2 ≥ 1
    x_1 + x_2 + 2x_3 ≥ 2
    2x_1 + 3x_3 ≥ 3
    x_1 − 4x_3 ≥ 4
    −2x_1 + x_2 − x_3 ≥ 5.

We rewrite these constraints in the form

    0 ≥ 1 − x_1 − x_2
    x_3 ≥ 1 − (x_1/2) − (x_2/2)
    x_3 ≥ 1 − (2x_1/3)
    −1 + (x_1/4) ≥ x_3
    −5 − 2x_1 + x_2 ≥ x_3.

Then, the set Q is defined by the constraints

    0 ≥ 1 − x_1 − x_2
    −1 + (x_1/4) ≥ 1 − (x_1/2) − (x_2/2)
    −1 + (x_1/4) ≥ 1 − (2x_1/3)
    −5 − 2x_1 + x_2 ≥ 1 − (x_1/2) − (x_2/2)
    −5 − 2x_1 + x_2 ≥ 1 − (2x_1/3).
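The elimination step described above can be written out compactly. The sketch below is a plain Python illustration (not part of the text, and not an efficient implementation); constraints are stored as pairs (a, b) meaning a'x ≥ b, and applying it to the data of Example 2.7 reproduces the constraints of Q.

    from fractions import Fraction

    def eliminate_last(constraints):
        """One Fourier-Motzkin elimination step.
        Each constraint is a pair (a, b) meaning a'x >= b (a is a list, b a number).
        Returns constraints in the same format on (x_1, ..., x_{n-1})."""
        lower, upper, projected = [], [], []
        for a, b in constraints:
            an, head = a[-1], a[:-1]
            if an == 0:
                projected.append((head, b))          # already free of x_n
            else:
                # Divide by an; for an > 0 this gives a lower bound on x_n,
                # for an < 0 the inequality flips and gives an upper bound.
                scaled = ([c / an for c in head], b / an)
                (lower if an > 0 else upper).append(scaled)
        for lo_coef, lo_rhs in lower:
            for up_coef, up_rhs in upper:
                # Require (upper bound) >= (lower bound):
                # up_rhs - up_coef'x >= lo_rhs - lo_coef'x, rewritten as a'x >= b.
                coef = [lc - uc for lc, uc in zip(lo_coef, up_coef)]
                projected.append((coef, lo_rhs - up_rhs))
        return projected

    # Example 2.7, written as a'x >= b with x = (x1, x2, x3).
    F = Fraction
    P = [([F(1),  F(1), F(0)],  F(1)),
         ([F(1),  F(1), F(2)],  F(2)),
         ([F(2),  F(0), F(3)],  F(3)),
         ([F(1),  F(0), F(-4)], F(4)),
         ([F(-2), F(1), F(-1)], F(5))]

    for a, b in eliminate_last(P):
        print(a, ">=", b)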
Theorem 2.10 The polyhedron Q constructed by the elimination algorithm is equal to the projection Π_{n−1}(P) of P.
Proof. Ifx E IIn-1(P), there exists some Xn such that (x,xn) E P. In particular, the vector x = (x, xn) satisfies Eqs. (2.4)-(2.6), from which it follows immediately that x satisfies Eqs. (2.7)-(2.8), and x E Q. This shows that IIn-I(P) C Q.
We will now prove that Q C IIn-I(P). Let x E Q. It follows from Eq. (2.7) that
Let Xn be any number between the two sides of the above inequality. It then follows that (x, xn) satisfies Eqs. (2.4)-(2.6) and, therefore, belongs to the polyhedron P. D
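The elimination step in Eqs. (2.4)-(2.8) is mechanical enough to be coded directly. The following Python sketch is not part of the original text; the function and variable names are ours. It eliminates the last variable from a system of constraints a'x ≥ b and, applied to the data of Example 2.7, reproduces (up to positive scaling) the five constraints defining Q listed above.

# A minimal sketch of one Fourier-Motzkin elimination step.
# Each constraint is a pair (a, b), meaning a[0]*x_1 + ... + a[n-1]*x_n >= b.

def eliminate_last(constraints):
    """Return the constraints (in n-1 variables) describing Q = Pi_{n-1}(P)."""
    pos, neg, zero = [], [], []          # split by the sign of the coefficient of x_n
    for a, b in constraints:
        (pos if a[-1] > 0 else neg if a[-1] < 0 else zero).append((a, b))

    new = [(a[:-1], b) for a, b in zero]             # constraints of type (2.6)/(2.8)
    for ai, bi in pos:                               # type (2.4): x_n bounded below
        for aj, bj in neg:                           # type (2.5): x_n bounded above
            ci, cj = ai[-1], -aj[-1]                 # positive scalars clearing x_n
            a = [cj * u + ci * v for u, v in zip(ai[:-1], aj[:-1])]
            new.append((a, cj * bi + ci * bj))       # combined constraint of type (2.7)
    return new

# Example 2.7 data.
P = [([1, 1, 0], 1), ([1, 1, 2], 2), ([2, 0, 3], 3), ([1, 0, -4], 4), ([-2, 1, -1], 5)]
for a, b in eliminate_last(P):
    print(a, ">=", b)

Applying eliminate_last repeatedly projects away one variable at a time, which is exactly the repeated elimination discussed below; the number of constraints can grow rapidly with each call.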
Notice that for any vector x = (x_1, . . . , x_n), we have

    Π_{n-2}(x) = Π_{n-2}(Π_{n-1}(x)).

Accordingly, for any polyhedron P, we also have

    Π_{n-2}(P) = Π_{n-2}(Π_{n-1}(P)).
By generalizing this observation, we see that if we apply the elimination algorithm k times, we end up with the set Π_{n-k}(P); if we apply it n - 1 times, we end up with Π_1(P). Unfortunately, each application of the elimination algorithm can increase the number of constraints substantially, leading to a polyhedron Π_1(P) described by a very large number of constraints. Of course, since Π_1(P) is one-dimensional, almost all of these constraints will be redundant, but this is of no help: in order to decide which ones are redundant, we must, in general, enumerate them.
The elimination algorithm has an important theoretical consequence: since the projection Π_k(P) can be generated by repeated application of the elimination algorithm, and since the elimination algorithm always produces a polyhedron, it follows that a projection Π_k(P) of a polyhedron is also a polyhedron. This fact might be considered obvious, but a proof simpler than the one we gave is not apparent. We now restate it in somewhat different language.
Corollary 2.4 Let P ⊂ R^{n+k} be a polyhedron. Then, the set
    {x ∈ R^n | there exists y ∈ R^k such that (x, y) ∈ P}
is also a polyhedron.
A variation of Corollary 2.4 states that the image of a polyhedron under a linear mapping is also a polyhedron.
Corollary 2.5 Let P ⊂ R^n be a polyhedron and let A be an m × n matrix. Then, the set Q = {Ax | x ∈ P} is also a polyhedron.

Proof. We have Q = {y ∈ R^m | there exists x ∈ R^n such that Ax = y, x ∈ P}. Therefore, Q is the projection of the polyhedron {(x, y) ∈ R^{n+m} | Ax = y, x ∈ P} onto the y coordinates. □
Corollary 2.6 The convex hull of a finite number of vectors is a polyhedron.

Proof. The convex hull

    { Σ_{i=1}^k λ_i x^i  |  Σ_{i=1}^k λ_i = 1, λ_i ≥ 0 }

of a finite number of vectors x^1, . . . , x^k is the image of the polyhedron

    { (λ_1, . . . , λ_k)  |  Σ_{i=1}^k λ_i = 1, λ_i ≥ 0 }

under the linear mapping that maps (λ_1, . . . , λ_k) to Σ_{i=1}^k λ_i x^i and is, therefore, a polyhedron. □
We finally indicate how the elimination algorithm can be used to solve linear programming problems. Consider the problem of minimizing c'x subject to x belonging to a polyhedron P. We define a new variable x_0 and introduce the constraint x_0 = c'x. If we use the elimination algorithm n times to eliminate the variables x_1, . . . , x_n, we are left with the set

    Q = {x_0 | there exists x ∈ P such that x_0 = c'x},
and the optimal cost is equal to the smallest element of Q. An optimal solution x can be recovered by backtracking (Exercise 2.21).
2.9 Summary
We summarize our main conclusions so far regarding the solutions to linear programming problems.
(a) If the feasible set is nonempty and bounded, there exists an optimal solution. Furthermore, there exists an optimal solution which is an extreme point.
(b) I f the feasible set i s unbounded, there are the following possibilities:
(i) There exists an optimal solution which is an extreme point.
(ii) There exists an optimal solution, but no optimal solution is an extreme point. (This can only happen if the feasible set has no extreme points; it never happens when the problem is in standard form.)
(iii) The optimal cost is -00.
Suppose now that the optimal cost is finite and that the feasible set contains at least one extreme point. Since there are only finitely many extreme points, the problem can be solved in a finite number of steps, by enumerating all extreme points and evaluating the cost of each one. This is hardly a practical algorithm because the number of extreme points can increase exponentially with the number of variables and constraints. In the next chapter, we will exploit the geometry of the feasible set and develop the simplex method, a systematic procedure that moves from one extreme point to another, without having to enumerate all extreme points.
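As an illustration of this brute-force approach (and only as an illustration; it is not an algorithm advocated by the text, and the names and data below are ours), the following Python sketch enumerates all basic feasible solutions of a tiny standard form problem by trying every choice of m columns of A, and keeps the cheapest one.

# Brute-force enumeration of basic feasible solutions of min c'x s.t. Ax = b, x >= 0.
# Only workable for very small problems: the number of bases grows combinatorially.
import itertools
import numpy as np

def enumerate_bfs(A, b, c):
    m, n = A.shape
    best_x, best_cost = None, np.inf
    for cols in itertools.combinations(range(n), m):
        B = A[:, cols]
        if abs(np.linalg.det(B)) < 1e-9:        # the chosen columns do not form a basis
            continue
        x_B = np.linalg.solve(B, b)
        if np.any(x_B < -1e-9):                 # basic solution, but not feasible
            continue
        x = np.zeros(n)
        x[list(cols)] = x_B                     # a basic feasible solution (extreme point)
        cost = c @ x
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x, best_cost

# A made-up two-constraint example with four variables.
A = np.array([[1.0, 1, 1, 1], [2, 0, 3, 4]])
b = np.array([2.0, 2])
c = np.array([2.0, 0, 0, 0])
print(enumerate_bfs(A, b, c))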
An interesting aspect of the material in this chapter is the distinction between geometric (representation independent) properties of a polyhedron and those properties that depend on a particular representation. In that respect, we have established the following:
(a) Whether or not a point is an extreme point (equivalently, vertex, or basic feasible solution) is a geometric property.
(b) Whether or not a point is a basic solution may depend on the way that a polyhedron is represented.
(c) Whether or not a basic or basic feasible solution is degenerate may depend on the way that a polyhedron is represented.
2.10 Exercises
Exercise 2.1 For each one ofthe following sets, determine whether it is a poly hedron.
(a) The set of all (x, y) ∈ R^2 satisfying the constraints
    x cos θ + y sin θ ≤ 1,   for all θ ∈ [0, π/2],
    x ≥ 0,
    y ≥ 0.
(b) The set of all x ∈ R satisfying the constraint x^2 - 8x + 15 ≤ 0.
(c) The empty set.

Exercise 2.2 Let f : R^n → R be a convex function and let c be some constant. Show that the set S = {x ∈ R^n | f(x) ≤ c} is convex.
Exercise 2.3 (Basic feasible solutions in standard form polyhedra with upper bounds) Consider a polyhedron defined by the constraints Ax = b and 0 ≤ x ≤ u, and assume that the matrix A has linearly independent rows. Provide a procedure analogous to the one in Section 2.3 for constructing basic solutions, and prove an analog of Theorem 2.4.
Exercise 2.4 We know that every linear programming problem can be converted to an equivalent problem in standard form. We also know that nonempty polyhedra in standard form have at least one extreme point. We are then tempted to conclude that every nonempty polyhedron has at least one extreme point. Explain what is wrong with this argument.
Exercise 2.5 (Extreme points of isomorphic polyhedra) A mapping f is called affine if it is of the form f(x) = Ax + b, where A is a matrix and b is a vector. Let P and Q be polyhedra in R^n and R^m, respectively. We say that P and Q are isomorphic if there exist affine mappings f : P → Q and g : Q → P such that g(f(x)) = x for all x ∈ P, and f(g(y)) = y for all y ∈ Q. (Intuitively, isomorphic polyhedra have the same shape.)
(a) If P and Q are isomorphic, show that there exists a one-to-one correspondence between their extreme points. In particular, if f and g are as above, show that x is an extreme point of P if and only if f(x) is an extreme point of Q.
(b) (Introducing slack variables leads to an isomorphic polyhedron) Let P = {x ∈ R^n | Ax ≥ b, x ≥ 0}, where A is a matrix of dimensions k × n. Let Q = {(x, z) ∈ R^{n+k} | Ax - z = b, x ≥ 0, z ≥ 0}. Show that P and Q are isomorphic.

Exercise 2.6 (Caratheodory's theorem) Let A_1, . . . , A_n be a collection of vectors in R^m.
(a) Let
    C = { Σ_{i=1}^n λ_i A_i  |  λ_1, . . . , λ_n ≥ 0 }.
Show that any element of C can be expressed in the form Σ_{i=1}^n λ_i A_i, with λ_i ≥ 0, and with at most m of the coefficients λ_i being nonzero. Hint: Consider the polyhedron
    { (λ_1, . . . , λ_n) ∈ R^n  |  Σ_{i=1}^n λ_i A_i = z, λ_1, . . . , λ_n ≥ 0 },
where z is a given element of C.
(b) Let P be the convex hull of the vectors A_i:
    P = { Σ_{i=1}^n λ_i A_i  |  Σ_{i=1}^n λ_i = 1, λ_i ≥ 0 }.
Show that any element of P can be expressed in the form Σ_{i=1}^n λ_i A_i, where Σ_{i=1}^n λ_i = 1 and λ_i ≥ 0 for all i, with at most m + 1 of the coefficients λ_i being nonzero.
Exercise 2.7 Suppose that {x ∈ R^n | a_i'x ≥ b_i, i = 1, . . . , m} and {x ∈ R^n | g_i'x ≥ h_i, i = 1, . . . , k} are two representations of the same nonempty polyhedron. Suppose that the vectors a_1, . . . , a_m span R^n. Show that the same must be true for the vectors g_1, . . . , g_k.
Exercise 2.8 Consider the standard form polyhedron {x | Ax = b, x ≥ 0}, and assume that the rows of the matrix A are linearly independent. Let x be a basic solution, and let J = {i | x_i ≠ 0}. Show that a basis is associated with the basic solution x if and only if every column A_i, i ∈ J, is in the basis.
Exercise 2.9 Consider the standard form polyhedron {x | Ax = b, x ≥ 0}, and assume that the rows of the matrix A are linearly independent.
(a) Suppose that two different bases lead to the same basic solution. Show that the basic solution is degenerate.
(b) Consider a degenerate basic solution. Is it true that it corresponds to two or more distinct bases? Prove or give a counterexample.
(c) Suppose that a basic solution is degenerate. Is it true that there exists an adjacent basic solution which is degenerate? Prove or give a counterexam ple .
Exercise 2.10 Consider the standard form polyhedron P = {x I Ax = b, x ::: O}. Suppose that the matrix A has dimensions m x n and that its rows are linearly independent. For each one of the following statements, state whether it is true or false. If true, provide a proof, else, provide a counterexample.
(a) If n = m + 1, then P has at most two basic feasible solutions.
(b) The set of all optimal solutions is bounded.
(c) At every optimal solution, no more than m variables can be positive.
(d) If there is more than one optimal solution, then there are uncountably
many optimal solutions.
(e) If there are several optimal solutions, then there exist at least two basic feasible solutions that are optimal.
(f) Consider the problem of minimizing max{c'x, d'x} over the set P. If this problem has an optimal solution, it must have an optimal solution which is an extreme point of P.
Exercise 2.11 Let P = {x ∈ R^n | Ax ≥ b}. Suppose that at a particular basic feasible solution, there are k active constraints, with k > n. Is it true that there exist exactly (k choose n) bases that lead to this basic feasible solution? Here (k choose n) = k!/(n!(k-n)!) is the number of ways that we can choose n out of k given items.
Exercise 2.12 Consider a nonempty polyhedron P and suppose that for each variable x_i we have either the constraint x_i ≥ 0 or the constraint x_i ≤ 0. Is it true that P has at least one basic feasible solution?
Exercise 2.13 Consider the standard form polyhedron P = {x I Ax = b, x 2: o}. Suppose that the matrix A, of dimensions m x n, has linearly independent rows, and that all basic feasible solutions are nondegenerate. Let x be an element of P that has exactly m positive components.
(a) Show that x is a basic feasible solution.
(b) Show that the result of part (a) is false if the nondegeneracy assumption is
removed.
Exercise 2.14 Let P be a bounded polyhedron in )Rn, let a be a vector in )Rn,
and let b be some scalar. We define
Q = {x E P l a’x = b}.
Show that every extreme point of Q is either an extreme point of P or a convex combination of two adjacent extreme points of P.
Exercise 2.15 (Edges joining adjacent vertices) Consider the polyhedron P = {x ∈ R^n | a_i'x ≥ b_i, i = 1, . . . , m}. Suppose that u and v are distinct basic feasible solutions that satisfy a_i'u = a_i'v = b_i, i = 1, . . . , n - 1, and that the vectors a_1, . . . , a_{n-1} are linearly independent. (In particular, u and v are adjacent.) Let L = {λu + (1 - λ)v | 0 ≤ λ ≤ 1} be the segment that joins u and v. Prove that L = {z ∈ P | a_i'z = b_i, i = 1, . . . , n - 1}.
Exercise 2.16 Consider the set {x ∈ R^n | x_1 = · · · = x_{n-1} = 0, 0 ≤ x_n ≤ 1}. Could this be the feasible set of a problem in standard form?

Exercise 2.17 Consider the polyhedron {x ∈ R^n | Ax ≤ b, x ≥ 0} and a nondegenerate basic feasible solution x*. We introduce slack variables z and construct a corresponding polyhedron {(x, z) | Ax + z = b, x ≥ 0, z ≥ 0} in standard form. Show that (x*, b - Ax*) is a nondegenerate basic feasible solution for the new polyhedron.

Exercise 2.18 Consider a polyhedron P = {x | Ax ≥ b}. Given any ε > 0, show that there exists some b̃ with the following two properties:
(a) The absolute value of every component of b - b̃ is bounded by ε.
(b) Every basic feasible solution in the polyhedron P̃ = {x | Ax ≥ b̃} is nondegenerate.

Exercise 2.19* Let P ⊂ R^n be a polyhedron in standard form whose definition involves m linearly independent equality constraints. Its dimension is defined as the smallest integer k such that P is contained in some k-dimensional affine subspace of R^n.
(a) Explain why the dimension of P is at most n - m.
(b) Suppose that P has a nondegenerate basic feasible solution. Show that the dimension of P is equal to n - m.
(c) Suppose that x is a degenerate basic feasible solution. Show that x is degenerate under every standard form representation of the same polyhedron (in the same space R^n). Hint: Using parts (a) and (b), compare the number of equality constraints in two representations of P under which x is degenerate and nondegenerate, respectively. Then, count active constraints.
Exercise 2.20* Consider the Fourier-Motzkin elimination algorithm.
(a) Suppose that the number m of constraints defining a polyhedron P is even. Show, by means of an example, that the elimination algorithm may produce a description of the polyhedron IIn-l (P) involving as many as m2/4 linear constraints, but no more than that.
(b) Show that the elimination algorithm produces a description of the one-dimensional polyhedron Π_1(P) involving no more than m^(2^(n-1)) / 2^(2^n - 2) constraints.
(c) Let n = 2^p + p + 2, where p is a nonnegative integer. Consider a polyhedron in R^n defined by the 8·(n choose 3) constraints

    ±x_i ± x_j ± x_k ≤ 1,        1 ≤ i < j < k ≤ n,

where all possible sign combinations are present. Show that after p eliminations, we have at least 2^(2^(p+2)) constraints. (Note that this number increases exponentially with n.)
Exercise 2.21 Suppose that Fourier-Motzkin elimination is used in the manner described at the end of Section 2.8 to find the optimal cost in a linear programming problem. Show how this approach can be augmented to obtain an optimal solution as well.
Exercise 2.22 Let P and Q be polyhedra in R^n. Let P + Q = {x + y | x ∈ P, y ∈ Q}.
(a) Show that P + Q is a polyhedron.
(b) Show that every extreme point of P + Q is the sum of an extreme point of
P and an extreme point of Q.
2.11 Notes and sources
The relation between algebra and geometry goes far back in the history of mathematics, but was limited to two and three-dimensional spaces. The insight that the same relation goes through in higher dimensions only came in the middle of the nineteenth century.
2.2. Our algebraic definition of basic (feasible) solutions for general poly hedra, in terms of the number of linearly independent active con straints, is not common. Nevertheless, we consider it to be quite central, because it provides the main bridge between the algebraic and geometric viewpoint, it allows for a unified treatment, and shows that there is not much that is special about standard form problems.
2.8. Fourier-Motzkin elimination is due to Fourier (1827), Dines (1918), and Motzkin (1936).
Chapter 3
The simplex method
Contents
3.1. Optimality conditions
3.2. Development of the simplex method
3.3. Implementations of the simplex method
3.4. Anticycling: lexicography and Bland's rule
3.5. Finding an initial basic feasible solution
3.6. Column geometry and the simplex method
3.7. Computational efficiency of the simplex method
3.8. Summary
3.9. Exercises
3.10. Notes and sources
We saw in Chapter 2, that if a linear programming problem in standard form has an optimal solution, then there exists a basic feasible solution that is optimal. The simplex method is based on this fact and searches for an op timal solution by moving from one basic feasible solution to another, along the edges of the feasible set, always in a cost reducing direction. Eventu ally, a basic feasible solution is reached at which none of the available edges leads to a cost reduction; such a basic feasible solution is optimal and the algorithm terminates. In this chapter, we provide a detailed development of the simplex method and discuss a few different implementations, includ ing the simplex tableau and the revised simplex method. We also address some difficulties that may arise in the presence of degeneracy. We provide an interpretation of the simplex method in terms of column geometry, and we conclude with a discussion of its running time, as a function of the dimension of the problem being solved.
Throughout this chapter, we consider the standard form problem

    minimize    c'x
    subject to  Ax = b
                x ≥ 0,

and we let P be the corresponding feasible set. We assume that the dimensions of the matrix A are m × n and that its rows are linearly independent. We continue using our previous notation: A_i is the ith column of the matrix A, and a_i' is its ith row.
3.1 Optimality conditions
Many optimization algorithms are structured as follows: given a feasible solution, we search its neighborhood to find a nearby feasible solution with lower cost. If no nearby feasible solution leads to a cost improvement, the algorithm terminates and we have a locally optimal solution. For general optimization problems, a locally optimal solution need not be (globally) optimal. Fortunately, in linear programming, local optimality implies global optimality; this is because we are minimizing a convex function over a convex set (cf. Exercise 3.1). In this section, we concentrate on the problem of searching for a direction of cost decrease in a neighborhood of a given basic feasible solution, and on the associated optimality conditions.
Suppose that we are at a point x E P and that we contemplate moving away from x, in the direction of a vector d E �n. Clearly, we should only consider those choices of d that do not immediately take us outside the feasible set. This leads to the following definition, illustrated in Figure 3.1.
Figure 3.1: Feasible directions at different points of a polyhedron.
Definition 3.1 Let x be an element of a polyhedron P. A vector d ∈ R^n is said to be a feasible direction at x, if there exists a positive scalar θ for which x + θd ∈ P.
Let x be a basic feasible solution to the standard form problem, let B(1), . . . , B(m) be the indices of the basic variables, and let B = [A_{B(1)} · · · A_{B(m)}] be the corresponding basis matrix. In particular, we have x_i = 0 for every nonbasic variable, while the vector x_B = (x_{B(1)}, . . . , x_{B(m)}) of basic variables is given by

    x_B = B⁻¹b.

We consider the possibility of moving away from x, to a new vector x + θd, by selecting a nonbasic variable x_j (which is initially at zero level), and increasing it to a positive value θ, while keeping the remaining nonbasic variables at zero. Algebraically, d_j = 1, and d_i = 0 for every nonbasic index i other than j. At the same time, the vector x_B of basic variables changes to x_B + θd_B, where d_B = (d_{B(1)}, d_{B(2)}, . . . , d_{B(m)}) is the vector with those components of d that correspond to the basic variables.
Given that we are only interested in feasible solutions, we require A(x + θd) = b, and since x is feasible, we also have Ax = b. Thus, for the equality constraints to be satisfied for θ > 0, we need Ad = 0. Recall now that d_j = 1, and that d_i = 0 for all other nonbasic indices i. Then,

    0 = Ad = Σ_{i=1}^n A_i d_i = Σ_{i=1}^m A_{B(i)} d_{B(i)} + A_j = Bd_B + A_j.

Since the basis matrix B is invertible, we obtain

    d_B = -B⁻¹A_j.        (3.1)
The direction vector d that we have just constructed will be referred to as the jth basic direction. We have so far guaranteed that the equality constraints are respected as we move away from x along the basic direction d. How about the nonnegativity constraints? We recall that the variable Xj is increased, and all other nonbasic variables stay at zero level. Thus, we need only worry about the basic variables. We distinguish two cases:
(a) Suppose that x is a nondegenerate basic feasible solution. Then, x_B > 0, from which it follows that x_B + θd_B ≥ 0, and feasibility is maintained, when θ is sufficiently small. In particular, d is a feasible direction.
(b) Suppose now that x is degenerate. Then, d is not always a feasible direction. Indeed, it is possible that a basic variable x_{B(i)} is zero, while the corresponding component d_{B(i)} of d_B = -B⁻¹A_j is negative. In that case, if we follow the jth basic direction, the nonnegativity constraint for x_{B(i)} is immediately violated, and we are led to infeasible solutions; see Figure 3.2.
We now study the effects on the cost function if we move along a basic direction. If d is the jth basic direction, then the rate c'd of cost change along the direction d is given by c_B'd_B + c_j, where c_B = (c_{B(1)}, . . . , c_{B(m)}). Using Eq. (3.1), this is the same as c_j - c_B'B⁻¹A_j. This quantity is important enough to warrant a definition. For an intuitive interpretation, c_j is the cost per unit increase in the variable x_j, and the term -c_B'B⁻¹A_j is the cost of the compensating change in the basic variables necessitated by the constraint Ax = b.
Definition 3.2 Let x be a basic solution, let B be an associated basis matrix, and let c_B be the vector of costs of the basic variables. For each j, we define the reduced cost c̄_j of the variable x_j according to the formula

    c̄_j = c_j - c_B'B⁻¹A_j.
Example 3.1 Consider the linear programming problem

    minimize    c_1 x_1 + c_2 x_2 + c_3 x_3 + c_4 x_4
    subject to  x_1 + x_2 + x_3 + x_4 = 2
                2x_1 + 3x_3 + 4x_4 = 2
                x_1, x_2, x_3, x_4 ≥ 0.
The first two columns of the matrix A are A_1 = (1, 2) and A_2 = (1, 0). Since they are linearly independent, we can choose x_1 and x_2 as our basic variables. The corresponding basis matrix is

    B = [1 1; 2 0].

We set x_3 = x_4 = 0, and solve for x_1, x_2, to obtain x_1 = 1 and x_2 = 1. We have thus obtained a nondegenerate basic feasible solution.

A basic direction corresponding to an increase in the nonbasic variable x_3 is constructed as follows. We have d_3 = 1 and d_4 = 0. The direction of change of the basic variables is obtained using Eq. (3.1):

    d_B = (d_{B(1)}, d_{B(2)}) = -B⁻¹A_3,  where B⁻¹ = [0 1/2; 1 -1/2] and A_3 = (1, 3),  so that d_B = (-3/2, 1/2).

The cost of moving along this basic direction is c'd = -3c_1/2 + c_2/2 + c_3. This is the same as the reduced cost of the variable x_3.

Figure 3.2: Let n = 5, n - m = 2. As discussed in Section 1.4, we can visualize the feasible set by standing on the two-dimensional set defined by the constraint Ax = b, in which case, the edges of the feasible set are associated with the nonnegativity constraints x_i ≥ 0. At the nondegenerate basic feasible solution E, the variables x_1 and x_3 are at zero level (nonbasic) and x_2, x_4, x_5 are positive basic variables. The first basic direction is obtained by increasing x_1, while keeping the other nonbasic variable x_3 at zero level. This is the direction corresponding to the edge EF. Consider now the degenerate basic feasible solution F and let x_3, x_5 be the nonbasic variables. Note that x_4 is a basic variable at zero level. A basic direction is obtained by increasing x_3, while keeping the other nonbasic variable x_5 at zero level. This is the direction corresponding to the line FG and it takes us outside the feasible set. Thus, this basic direction is not a feasible direction.

Consider now Definition 3.2 for the case of a basic variable. Since B is the matrix [A_{B(1)} · · · A_{B(m)}], we have B⁻¹[A_{B(1)} · · · A_{B(m)}] = I, where
I is the m × m identity matrix. In particular, B⁻¹A_{B(i)} is the ith column of the identity matrix, which is the ith unit vector e_i. Therefore, for every basic variable x_{B(i)}, we have

    c̄_{B(i)} = c_{B(i)} - c_B'B⁻¹A_{B(i)} = c_{B(i)} - c_B'e_i = c_{B(i)} - c_{B(i)} = 0,

and we see that the reduced cost of every basic variable is zero.
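Definition 3.2 translates directly into a short computation. The numpy sketch below is our own illustration (not the book's code); it uses the data of Example 3.1 with the basis {x_1, x_2}, computes the reduced costs c̄' = c' - c_B'B⁻¹A, and confirms that the reduced costs of the basic variables come out equal to zero.

# Reduced costs for the data of Example 3.1 with basic variables x1, x2.
import numpy as np

A = np.array([[1.0, 1, 1, 1], [2, 0, 3, 4]])
b = np.array([2.0, 2])
c = np.array([2.0, 0, 0, 0])          # an arbitrary cost vector; matches Example 3.2 below
basic = [0, 1]                        # indices of the basic variables x1, x2

B = A[:, basic]
x_B = np.linalg.solve(B, b)           # basic variable values: (1, 1)
p = np.linalg.solve(B.T, c[basic])    # p' = c_B' B^{-1}, computed without forming B^{-1}
reduced = c - p @ A                   # reduced costs c_bar' = c' - c_B' B^{-1} A

print(x_B)       # [1. 1.]
print(reduced)   # [0. 0. -3. -4.]: basic reduced costs are zero; c_bar_3 = -3c1/2 + c2/2 + c3 = -3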
Our next result provides us with optimality conditions. Given our interpretation of the reduced costs as rates of cost change along certain
directions, this result is intuitive.
Theorem 3.1 Consider a basic feasible solution x associated with a basis matrix B, and let c̄ be the corresponding vector of reduced costs.
(a) If c̄ ≥ 0, then x is optimal.
(b) If x is optimal and nondegenerate, then c̄ ≥ 0.
Proof.
(a) We assume that c̄ ≥ 0, we let y be an arbitrary feasible solution, and we define d = y - x. Feasibility implies that Ax = Ay = b and, therefore, Ad = 0. The latter equality can be rewritten in the form

    Bd_B + Σ_{i∈N} A_i d_i = 0,

where N is the set of indices corresponding to the nonbasic variables under the given basis. Since B is invertible, we obtain

    d_B = - Σ_{i∈N} B⁻¹A_i d_i,

and

    c'd = c_B'd_B + Σ_{i∈N} c_i d_i = Σ_{i∈N} (c_i - c_B'B⁻¹A_i) d_i = Σ_{i∈N} c̄_i d_i.

For any nonbasic index i ∈ N, we must have x_i = 0 and, since y is feasible, y_i ≥ 0. Thus, d_i ≥ 0 and c̄_i d_i ≥ 0, for all i ∈ N. We conclude that c'(y - x) = c'd ≥ 0, and since y was an arbitrary feasible solution, x is optimal.

(b) Suppose that x is a nondegenerate basic feasible solution and that c̄_j < 0 for some j. Since the reduced cost of a basic variable is always zero, x_j must be a nonbasic variable and c̄_j is the rate of cost change along the jth basic direction. Since x is nondegenerate, the jth basic direction is a feasible direction of cost decrease, as discussed earlier. By moving in that direction, we obtain feasible solutions whose cost is less than that of x, and x is not optimal. □
Note that Theorem 3.1 allows the possibility that x is a (degenerate) optimal basic feasible solution, but that Cj <0 for some nonbasic index j. There is an analog of Theorem 3.1 that provides conditions under which a basic feasible solution x is a unique optimal solution; see Exercise 3.6. A related view of the optimality conditions is developed in Exercises 3.2 and 3.3.
According to Theorem 3.1, in order to decide whether a nondegenerate basic feasible solution is optimal, we need only check whether all reduced costs are nonnegative, which is the same as examining the n - m basic directions. If x is a degenerate basic feasible solution, an equally simple computational test for determining whether x is optimal is not available (see Exercises 3.7 and 3.8). Fortunately, the simplex method, as developed in subsequent sections, manages to get around this difficulty in an effective manner .
Note that in order to use Theorem 3.1 and assert that a certain ba sic solution is optimal, we need to satisfy two conditions: feasibility, and nonnegativity of the reduced costs. This leads us to the following definition.
Definition 3.3 A basis matrix B is said to be optimal if:
(a) B⁻¹b ≥ 0, and
(b) c̄' = c' - c_B'B⁻¹A ≥ 0'.
Clearly, if an optimal basis is found, the corresponding basic solution is feasible, satisfies the optimality conditions, and is therefore optimal. On the other hand, in the degenerate case, having an optimal basic feasible solution does not necessarily mean that the reduced costs are nonnegative.
3.2 Development of the simplex method
We will now complete the development of the simplex method. Our main task is to work out the details of how to move to a better basic feasible solution, whenever a profitable basic direction is discovered.
Let us assume that every basic feasible solution is nondegenerate. This assumption will remain in effect until it is explicitly relaxed later in this section. Suppose that we are at a basic feasible solution x and that we have computed the reduced costs Cj of the nonbasic variables. If all of them are nonnegative, Theorem 3.1 shows that we have an optimal solution, and we stop. If on the other hand, the reduced cost Cj of a nonbasic variable Xj is negative, the jth basic direction d is a feasible direction of cost decrease. [This is the direction obtained by letting dj = 1, di = 0 for i -=I- B(1), . . . , B(m),j, and dB = -B-1Aj.] While moving along this direction d, the nonbasic variable Xj becomes positive and all other nonbasic
variables remain at zero. We describe this situation by saying that Xj (or Aj ) enters or is brought into the basis.
Once we start moving away from x along the direction d, we are tracing points of the form x + θd, where θ ≥ 0. Since costs decrease along the direction d, it is desirable to move as far as possible. This takes us to the point x + θ*d, where

    θ* = max{θ ≥ 0 | x + θd ∈ P}.

The resulting cost change is θ*c'd, which is the same as θ*c̄_j.
We now derive a formula for θ*. Given that Ad = 0, we have A(x + θd) = Ax = b for all θ, and the equality constraints will never be violated. Thus, x + θd can become infeasible only if one of its components becomes negative. We distinguish two cases:
(a) If d ≥ 0, then x + θd ≥ 0 for all θ ≥ 0, the vector x + θd never becomes infeasible, and we let θ* = ∞.
(b) If d_i < 0 for some i, the constraint x_i + θd_i ≥ 0 becomes θ ≤ -x_i/d_i. This constraint on θ must be satisfied for every i with d_i < 0. Since the nonbasic components of d are nonnegative, only the basic variables can decrease, and the largest possible value of θ is

    θ* = min_{{i=1,...,m | d_{B(i)}<0}} ( -x_{B(i)} / d_{B(i)} ).
Example 3.2 This is a continuation of Example 3.1 from the previous section, dealing with the linear programming problem
    minimize    c_1 x_1 + c_2 x_2 + c_3 x_3 + c_4 x_4
    subject to  x_1 + x_2 + x_3 + x_4 = 2
                2x_1 + 3x_3 + 4x_4 = 2
                x_1, x_2, x_3, x_4 ≥ 0.

Let us again consider the basic feasible solution x = (1, 1, 0, 0) and recall that the reduced cost c̄_3 of the nonbasic variable x_3 was found to be -3c_1/2 + c_2/2 + c_3. Suppose that c = (2, 0, 0, 0), in which case, we have c̄_3 = -3. Since c̄_3 is negative, we form the corresponding basic direction, which is d = (-3/2, 1/2, 1, 0), and consider vectors of the form x + θd, with θ ≥ 0. As θ increases, the only component of x that decreases is the first one (because d_1 < 0). The largest possible value of θ is θ* = -x_1/d_1 = 2/3.
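The computation of θ* in Example 3.2 is just the minimum-ratio rule applied to the components with d_i < 0. The small numpy sketch below is ours, not the book's, and only illustrates that single step.

# Ratio test for Example 3.2: x = (1, 1, 0, 0), basic direction d = (-3/2, 1/2, 1, 0).
import numpy as np

x = np.array([1.0, 1.0, 0.0, 0.0])
d = np.array([-1.5, 0.5, 1.0, 0.0])

neg = d < 0                                       # only decreasing components constrain theta
theta_star = np.min(-x[neg] / d[neg]) if neg.any() else np.inf

print(theta_star)          # 2/3, as computed above
print(x + theta_star * d)  # the new point (0, 4/3, 2/3, 0), still satisfying Ax = b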
The simplex method is initialized with an arbitrary basic feasible solution, which, for feasible standard form problems, is guaranteed to exist. The following theorem states that, in the nondegenerate case, the simplex method works correctly and terminates after a finite number of iterations.
Theorem 3.3 Assume that the feasible set is nonempty and that ev ery basic feasible solution is nondegenerate. Then, the simplex method terminates after a finite number of iterations. At termination, there are the following two possibilities:
(a) We have an optimal basis B and an associated basic feasible solution which is optimal.
(b) We have found a vector d satisfying Ad = 0, d ≥ 0, and c'd < 0, and the optimal cost is -∞.
Proof. If the algorithm terminates due to the stopping criterion in Step 2, then the optimality conditions in Theorem 3.1 have been met, B is an optimal basis, and the current basic feasible solution is optimal.
If the algorithm terminates because the criterion in Step 3 has been met, then we are at a basic feasible solution x and we have discovered a nonbasic variable x_j such that c̄_j < 0 and such that the corresponding basic direction d satisfies Ad = 0 and d ≥ 0. In particular, x + θd ∈ P for all θ > 0. Since c'd = c̄_j < 0, by taking θ arbitrarily large, the cost can be made arbitrarily negative, and the optimal cost is -∞.

At each iteration, the algorithm moves by a positive amount θ* along a direction d that satisfies c'd < 0. Therefore, the cost of every successive basic feasible solution visited by the algorithm is strictly less than the cost of the previous one, and no basic feasible solution can be visited twice. Since there is a finite number of basic feasible solutions, the algorithm must eventually terminate. □

Theorem 3.3 provides an independent proof of some of the results of Chapter 2 for nondegenerate standard form problems. In particular, it shows that for feasible and nondegenerate problems, either the optimal
cost is -00, or there exists a basic feasible solution which is optimal (cf. Theorem 2.8 in Section 2.6). While the proof given here might appear more elementary, its extension to the degenerate case is not as simple.
The simplex method for degenerate problems
We have been working so far under the assumption that all basic feasible solutions are nondegenerate. Suppose now that the exact same algorithm is used in the presence of degeneracy. Then, the following new possibilities may be encountered in the course of the algorithm.
(a) If the current basic feasible solution x is degenerate, θ* can be equal to zero, in which case, the new basic feasible solution y is the same as x. This happens if some basic variable x_{B(ℓ)} is equal to zero and the corresponding component d_{B(ℓ)} of the direction vector d is negative. Nevertheless, we can still define a new basis B̄, by replacing A_{B(ℓ)} with A_j [cf. Eqs. (3.3)-(3.4)], and Theorem 3.2 is still valid.
(b) Even if θ* is positive, it may happen that more than one of the original basic variables becomes zero at the new point x + θ*d. Since only one of them exits the basis, the others remain in the basis at zero level, and the new basic feasible solution is degenerate.
Basis changes while staying at the same basic feasible solution are not in vain. As illustrated in Figure 3.3, a sequence of such basis changes may lead to the eventual discovery of a cost reducing feasible direction. On the other hand, a sequence of basis changes might lead back to the initial basis, in which case the algorithm may loop indefinitely. This undesirable phenomenon is called cycling. An example of cycling is given in Section 3.3, after we develop some bookkeeping tools for carrying out the mechanics of the algorithm. It is sometimes maintained that cycling is an exceptionally rare phenomenon. However, for many highly structured linear program ming problems, most basic feasible solutions are degenerate, and cycling is a real possibility. Cycling can be avoided by judiciously choosing the variables that will enter or exit the basis (see Section 3.4). We now discuss the freedom available in this respect.
Pivot Selection
The simplex algorithm, as we described it, has certain degrees of freedom: in Step 2, we are free to choose any j whose reduced cost Cj is negative; also, in Step 5, there may be several indices £ that attain the minimum in the definition of ()*, and we are free to choose any one of them. Rules for making such choices are called pivoting rules.
Regarding the choice of the entering column, the following rules are some natural candidates:
Figure 3.3: We visualize a problem in standard form, with n - m = 2, by standing on the two-dimensional plane defined by the equality constraints Ax = b. The basic feasible solution x is degenerate. If x_4 and x_5 are the nonbasic variables, then the two corresponding basic directions are the vectors g and f. For either of these two basic directions, we have θ* = 0. However, if we perform a change of basis, with x_4 entering the basis and x_6 exiting, the new nonbasic variables are x_5 and x_6, and the two basic directions are h and -g. (The direction -g is the one followed if x_6 is increased while x_5 is kept at zero.) In particular, we can now follow direction h to reach a new basic feasible solution y with lower cost.
(a) Choose a column Aj, with Cj <0, whose reduced cost is the most negative. Since the reduced cost is the rate of change of the cost function, this rule chooses a direction along which costs decrease at the fastest rate. However, the actual cost decrease depends on how far we move along the chosen direction. This suggests the next rule.
(b) Choose a column with c̄_j < 0 for which the corresponding cost decrease θ*|c̄_j| is largest. This rule offers the possibility of reaching optimality after a smaller number of iterations. On the other hand, the computational burden at each iteration is larger, because we need to compute θ* for each column with c̄_j < 0.
(b) We divide the ℓth row by u_ℓ. This replaces u_ℓ by one.

In words, we are adding to each row a multiple of the ℓth row to replace the ℓth column u by the ℓth unit vector e_ℓ. This sequence of elementary row operations is equivalent to left-multiplying B⁻¹B̄ by a certain invertible matrix Q. Since the result is the identity, we have QB⁻¹B̄ = I, which yields QB⁻¹ = B̄⁻¹. The last equation shows that if we apply the same sequence of row operations to the matrix B⁻¹ (equivalently, left-multiply by Q), we obtain B̄⁻¹. We conclude that all it takes to generate B̄⁻¹ is to start with B⁻¹ and apply the sequence of elementary row operations described above.

Example 3.4 Let B⁻¹ be a given 3 × 3 matrix, let u = B⁻¹A_j, and suppose that ℓ = 3. Thus, our objective is to transform the vector u to the unit vector e_3 = (0, 0, 1). To this effect, we add a suitable multiple of the third row to the first row, add a suitable multiple of the third row to the second row, and, finally, divide the third row by u_3. Applying the same sequence of row operations to B⁻¹ yields the new inverse B̄⁻¹.
When the matrix B-1 is updated in the manner we have described, we ob tain an implementation of the simplex method known as the revised simplex method, which we summarize below.
An iteration of the revised simplex method

1. In a typical iteration, we start with a basis consisting of the basic columns A_{B(1)}, . . . , A_{B(m)}, an associated basic feasible solution x, and the inverse B⁻¹ of the basis matrix.
2. Compute the row vector p' = c_B'B⁻¹ and then compute the reduced costs c̄_j = c_j - p'A_j. If they are all nonnegative, the current basic feasible solution is optimal, and the algorithm terminates; else, choose some j for which c̄_j < 0.
3. Compute u = B⁻¹A_j. If no component of u is positive, the optimal cost is -∞, and the algorithm terminates.
4. If some component of u is positive, let

    θ* = min_{{i=1,...,m | u_i>0}} x_{B(i)}/u_i.

5. Let ℓ be such that θ* = x_{B(ℓ)}/u_ℓ. Form a new basis by replacing A_{B(ℓ)} with A_j. If y is the new basic feasible solution, the values of the new basic variables are y_j = θ* and y_{B(i)} = x_{B(i)} - θ*u_i, i ≠ ℓ.
6. Form the m × (m + 1) matrix [B⁻¹ | u]. Add to each one of its rows a multiple of the ℓth row to make the last column equal to the unit vector e_ℓ. The first m columns of the result is the matrix B̄⁻¹.
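The following Python sketch carries out one such iteration for a dense problem. It is our own illustration under our own naming (the book gives no code), it ignores anti-cycling rules, and it breaks ties arbitrarily.

# One iteration of the revised simplex method (a sketch; dense numpy, no anti-cycling rule).
import numpy as np

def revised_simplex_iteration(A, b, c, basis, B_inv):
    """basis: list of m column indices; B_inv: current inverse basis matrix.
    Returns (status, basis, B_inv) with status in {'optimal', 'unbounded', 'continue'}."""
    m, n = A.shape
    x_B = B_inv @ b
    p = c[basis] @ B_inv                       # p' = c_B' B^{-1}
    reduced = c - p @ A                        # reduced costs
    j = next((k for k in range(n) if reduced[k] < -1e-9), None)
    if j is None:
        return 'optimal', basis, B_inv
    u = B_inv @ A[:, j]                        # pivot column
    if np.all(u <= 1e-9):
        return 'unbounded', basis, B_inv       # cost decreases forever along this direction
    ratios = np.full(m, np.inf)
    pos = u > 1e-9
    ratios[pos] = x_B[pos] / u[pos]
    ell = int(np.argmin(ratios))               # exiting position; theta* = ratios[ell]
    basis = list(basis)
    basis[ell] = j
    # Step 6: row operations that turn column u into e_ell, applied to [B_inv | u].
    M = np.hstack([B_inv, u.reshape(-1, 1)])
    M[ell] /= M[ell, -1]
    for i in range(m):
        if i != ell:
            M[i] -= M[i, -1] * M[ell]
    return 'continue', basis, M[:, :m]

Iterating until the status becomes 'optimal' or 'unbounded' recovers the full method, assuming nondegeneracy; degenerate problems would additionally need one of the pivoting rules of Section 3.4.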
The full tableau implementation
We finally describe the implementation of the simplex method in terms of the so-called full tableau. Here, instead of maintaining and updating the matrix B⁻¹, we maintain and update the m × (n + 1) matrix

    B⁻¹[b | A],

with columns B⁻¹b and B⁻¹A_1, . . . , B⁻¹A_n. This matrix is called the simplex tableau. Note that the column B⁻¹b, called the zeroth column, contains the values of the basic variables. The column B⁻¹A_i is called the ith column of the tableau. The column u = B⁻¹A_j corresponding to the variable that enters the basis is called the pivot column. If the ℓth basic variable exits the basis, the ℓth row of the tableau is called the pivot row. Finally, the element belonging to both the pivot row and the pivot column is called the pivot element. Note that the pivot element is u_ℓ and is always positive (unless u ≤ 0, in which case the algorithm has met the termination condition in Step 3).
The information contained in the rows of the tableau admits the following interpretation. The equality constraints are initially given to us in the form b = Ax. Given the current basis matrix B, these equality constraints can also be expressed in the equivalent form

    B⁻¹b = B⁻¹Ax,

which is precisely the information in the tableau. In other words, the rows of the tableau provide us with the coefficients of the equality constraints B⁻¹b = B⁻¹Ax.

At the end of each iteration, we need to update the tableau B⁻¹[b | A] and compute B̄⁻¹[b | A]. This can be accomplished by left-multiplying the simplex tableau with a matrix Q satisfying QB⁻¹ = B̄⁻¹. As explained earlier, this is the same as performing those elementary row operations that turn B⁻¹ into B̄⁻¹; that is, we add to each row a multiple of the pivot row to set all entries of the pivot column to zero, with the exception of the pivot element, which is set to one.
Regarding the determination of the exiting column A_{B(ℓ)} and the stepsize θ*, Steps 4 and 5 in the summary of the simplex method amount to the following: x_{B(i)}/u_i is the ratio of the ith entry in the zeroth column of the tableau to the ith entry in the pivot column of the tableau. We only consider those i for which u_i is positive. The smallest ratio is equal to θ* and determines ℓ.
It is customary to augment the simplex tableau by including a top row, to be referred to as the zeroth row. The entry at the top left corner contains the value -c_B'x_B, which is the negative of the current cost. (The reason for the minus sign is that it allows for a simple update rule, as will be seen shortly.) The rest of the zeroth row is the row vector of reduced costs, that is, the vector c̄' = c' - c_B'B⁻¹A. Thus, the structure of the tableau is:

    -c_B'B⁻¹b | c' - c_B'B⁻¹A
    B⁻¹b      | B⁻¹A

or, in more detail,

    -c_B'x_B  | c̄_1  · · ·  c̄_n
    x_{B(1)}  |
       ⋮      | B⁻¹A_1  · · ·  B⁻¹A_n
    x_{B(m)}  |
The rule for updating the zeroth row turns out to be identical to the rule used for the other rows of the tableau: add a multiple of the pivot row to the zeroth row to set the reduced cost of the entering variable to zero. We will now verify that this update rule produces the correct results for the zeroth row.
At the beginning of a typical iteration, the zeroth row is of the form
    [0 | c'] - g'[b | A],

where g' = c_B'B⁻¹. Hence, the zeroth row is equal to [0 | c'] plus a linear combination of the rows of [b | A]. Let column j be the pivot column, and row ℓ be the pivot row. Note that the pivot row is of the form h'[b | A], where the vector h' is the ℓth row of B⁻¹. Hence, after a multiple of the pivot row is added to the zeroth row, that row is again equal to [0 | c'] plus a (different) linear combination of the rows of [b | A], and is of the form

    [0 | c'] - p'[b | A],

for some vector p. Recall that our update rule is such that the pivot column entry of the zeroth row becomes zero, that is,

    c_{B̄(ℓ)} - p'A_{B̄(ℓ)} = c_j - p'A_j = 0.

Consider now the B(i)th column for i ≠ ℓ. (This is a column corresponding to a basic variable that stays in the basis.) The zeroth row entry of that column is zero, before the change of basis, since it is the reduced cost of a basic variable. Because B⁻¹A_{B(i)} is the ith unit vector and i ≠ ℓ, the entry in the pivot row for that column is also equal to zero. Hence, adding a multiple of the pivot row to the zeroth row of the tableau does not affect the zeroth row entry of that column, which is left at zero. We conclude that the vector p satisfies c_{B̄(i)} - p'A_{B̄(i)} = 0 for every column A_{B̄(i)} in the new basis. This implies that p' = c_{B̄}'B̄⁻¹; hence, with our update rule, the updated zeroth row of the tableau is equal to

    [0 | c'] - c_{B̄}'B̄⁻¹[b | A],

as desired.
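Since the zeroth row is updated by exactly the same elementary row operations as the other rows, a single routine suffices for the whole tableau. The sketch below is ours (names and tolerances are assumptions, not from the text).

# Pivot operation on a full simplex tableau T of shape (m+1) x (n+1):
# T[0] is the zeroth row (negative cost and reduced costs), T[1:] hold [B^{-1}b | B^{-1}A].
import numpy as np

def pivot(T, row, col):
    """Pivot on T[row, col] (row >= 1): apply the same elementary row operations
    to every row, including the zeroth one, as described in the text."""
    T = np.asarray(T, dtype=float).copy()
    T[row] /= T[row, col]                     # make the pivot element equal to one
    for i in range(T.shape[0]):
        if i != row:
            T[i] -= T[i, col] * T[row]        # zero out the rest of the pivot column
    return T

For instance, applying this routine at the starred pivot elements of the tableaus in Example 3.8 below should reproduce each successive tableau.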
We can now summarize the mechanics of the full tableau implementation.
An iteration of the full tableau implementation
1. A typical iteration starts with the tableau associated with a basis
matrix B and the corresponding basic feasible solution x.
2. Examine the reduced costs in the zeroth row of the tableau. If they are all nonnegative, the current basic feasible solution is optimal, and the algorithm terminates; else, choose some j for which c̄_j < 0.
(0,4,5,0) < (1,2,1,2).
Lexicographic pivoting rule
1. Choose an entering column A_j arbitrarily, as long as its reduced cost c̄_j is negative. Let u = B⁻¹A_j be the jth column of the tableau.
2. For each i with u_i > 0, divide the ith row of the tableau (including the entry in the zeroth column) by u_i and choose the lexicographically smallest row. If row ℓ is lexicographically smallest, then the ℓth basic variable x_{B(ℓ)} exits the basis.
Example 3.7 Consider the following tableau (the zeroth row is omitted), and suppose that the pivot column is the third one (j = 3).
    1 | 0  5  3
    2 | 4  6  -1
    3 | 0  7  9
Note that there is a tie in trying to determine the exiting variable because XB(l)/Ul = 1/3 and XB(3)/U3 = 3/9 = 1/3. We divide the first and third rows of the tableau by Ul = 3 and U3 = 9, respectively, to obtain:
    1/3 | 0  5/3  1
     *  | *   *   *
    1/3 | 0  7/9  1
The tie between the first and third rows is resolved by performing a lexicographic comparison. Since 7/9 < 5/3, the third row is chosen to be the pivot row, and the variable XB(3) exits the basis.
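The tie-breaking computation in Example 3.7 can be written out directly; the sketch below is ours, and it simply relies on Python's built-in lexicographic ordering of tuples.

# Lexicographic pivoting rule: choose the exiting row for a given pivot column.
import numpy as np

def lexicographic_exit_row(T, col):
    """T: the m x (n+1) tableau without the zeroth row; col: index of the pivot column.
    Returns the index of the pivot row under the lexicographic rule."""
    candidates = [i for i in range(T.shape[0]) if T[i, col] > 1e-9]
    # divide each candidate row by its pivot-column entry; pick the lexicographically smallest
    return min(candidates, key=lambda i: tuple(T[i] / T[i, col]))

T = np.array([[1.0, 0, 5, 3],
              [2.0, 4, 6, -1],
              [3.0, 0, 7, 9]])
print(lexicographic_exit_row(T, 3))   # 2: the third row is chosen, as in Example 3.7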
We note that the lexicographic pivoting rule always leads to a unique choice for the exiting variable. Indeed, if this were not the case, two of the rows in the tableau would have to be proportional. But if two rows of the matrix B-1A are proportional, the matrix B-1A has rank smaller than m and, therefore, A also has rank less than m, which contradicts our standing assumption that A has linearly independent rows.
Theorem 3.4 Suppose that the simplex algorithm starts with all the rows in the simplex tableau, other than the zeroth row, lexicographi cally positive. Suppose that the lexicographic pivoting rule is followed. Then:
(a) Every row of the simplex tableau, other than the zeroth row, remains lexicographically positive throughout the algorithm.
(b) The zeroth row strictly increases lexicographically at each itera tion .
(c) The simplex method terminates after a finite number of itera tions.
Proof.
(a) Suppose that all rows of the simplex tableau, other than the zeroth row, are lexicographically positive at the beginning of a simplex iteration. Suppose that x_j enters the basis and that the pivot row is the ℓth row. According to the lexicographic pivoting rule, we have u_ℓ > 0 and

    (ℓth row)/u_ℓ < (ith row)/u_i,   lexicographically, if i ≠ ℓ and u_i > 0.        (3.5)
To determine the new tableau, the .eth row is divided by the positive pivot element Uc and, therefore, remains lexicographically positive. Consider the ith row and suppose that Ui < O. In order to zero the (i, j)th entry of the tableau, we need to add a positive multiple of the pivot row to the ith row. Due to the lexicographic positivity of both rows, the ith row will remain lexicographically positive after this addition. Finally, consider the ith row for the case where Ui > 0 and i =I- .e. We have
U-
(new ith row) = (old ith row) – � (old .eth row) .
Because of the lexicographic inequality (3.5), which is satisfied by the old rows, the new ith row is also lexicographically positive.
(b) At the beginning of an iteration, the reduced cost in the pivot column is negative. In order to make it zero, we need to add a positive multiple of the pivot row. Since the latter row is lexicographically positive, the zeroth row increases lexicographically.
(c) Since the zeroth row increases lexicographically at each iteration, it never returns to a previous value. Since the zeroth row is determined completely by the current basis, no basis can be repeated twice and the simplex method must terminate after a finite number of iterations.
o
The lexicographic pivoting rule is straightforward to use if the simplex method is implemented in terms of the full tableau. It can also be used
in conjunction with the revised simplex method, provided that the inverse basis matrix B-1 is formed explicitly (see Exercise 3.16). On the other hand, in sophisticated implementations of the revised simplex method, the matrix B-1 is never computed explicitly, and the lexicographic rule is not really suitable.
We finally note that in order to apply the lexicographic pivoting rule, an initial tableau with lexicographically positive rows is required. Let us assume that an initial tableau is available (methods for obtaining an initial tableau are discussed in the next section). We can then rename the variables so that the basic variables are the first m ones. This is equivalent to rearranging the tableau so that the first m columns of B⁻¹A are the m unit vectors. The resulting tableau has lexicographically positive rows, as desired.

Bland's rule

The smallest subscript pivoting rule, also known as Bland's rule, is as follows.

Smallest subscript pivoting rule
1. Find the smallest j for which the reduced cost c̄_j is negative and have the column A_j enter the basis.
2. Out of all variables x_i that are tied in the test for choosing an exiting variable, select the one with the smallest value of i.
This pivoting rule is compatible with an implementation of the re vised simplex method in which the reduced costs of the nonbasic variables are computed one at a time, in the natural order, until a negative one is discovered. Under this pivoting rule, it is known that cycling never occurs and the simplex method is guaranteed to terminate after a finite number of iterations.
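Both parts of the rule reduce to very small pieces of code. The following sketch is ours (the data in the calls are made up), and it only illustrates the two selection steps, not a full simplex iteration.

# Smallest subscript (Bland's) rule: entering and exiting choices.
def bland_entering(reduced_costs):
    """Return the smallest index j with a negative reduced cost, or None if all are nonnegative."""
    for j, cbar in enumerate(reduced_costs):
        if cbar < 0:
            return j
    return None

def bland_exiting(tied_rows, basis):
    """Among rows tied in the ratio test, pick the one whose basic variable has the smallest index."""
    return min(tied_rows, key=lambda i: basis[i])

print(bland_entering([0.0, 2.0, -1.0, -3.0]))      # 2: the first negative reduced cost in natural order
print(bland_exiting([0, 3], basis=[5, 1, 4, 2]))   # 3: its basic variable (index 2) beats index 5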
3.5 Finding an initial basic feasible solution
In order to start the simplex method, we need to find an initial basic feasible solution. Sometimes this is straightforward. For example, suppose that we are dealing with a problem involving constraints of the form Ax ≤ b, where b ≥ 0. We can then introduce nonnegative slack variables s and rewrite the constraints in the form Ax + s = b. The vector (x, s) defined by x = 0 and s = b is a basic feasible solution and the corresponding basis matrix is the identity. In general, however, finding an initial basic feasible solution is not easy and requires the solution of an auxiliary linear programming problem, as will be seen shortly.
Consider the problem

    minimize    c'x
    subject to  Ax = b
                x ≥ 0.

By possibly multiplying some of the equality constraints by -1, we can assume, without loss of generality, that b ≥ 0. We now introduce a vector y ∈ R^m of artificial variables and use the simplex method to solve the auxiliary problem

    minimize    y_1 + y_2 + · · · + y_m
    subject to  Ax + y = b
                x ≥ 0
                y ≥ 0.
Initialization is easy for the auxiliary problem: by letting x = 0 and y = b, we have a basic feasible solution and the corresponding basis matrix is the identity.
If x is a feasible solution to the original problem, this choice of x together with y = 0, yields a zero cost solution to the auxiliary problem. Therefore, if the optimal cost in the auxiliary problem is nonzero, we conclude that the original problem is infeasible. If, on the other hand, we obtain a zero cost solution to the auxiliary problem, it must satisfy y = 0, and x is a feasible solution to the original problem.
At this point, we have accomplished our objectives only partially. We have a method that either detects infeasibility or finds a feasible solution to the original problem. However, in order to initialize the simplex method for the original problem, we need a basic feasible solution, an associated basis matrix B, and – depending on the implementation – the corresponding tableau. All this is straightforward if the simplex method, applied to the auxiliary problem, terminates with a basis matrix B consisting exclusively of columns of A. We can simply drop the columns that correspond to the artificial variables and continue with the simplex method on the original problem, using B as the starting basis matrix.
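Setting up the auxiliary problem is mostly bookkeeping. The following numpy sketch is our own helper (not from the text) and simply builds the data of the auxiliary problem together with the initial artificial basis.

# Build the Phase I auxiliary problem: min sum(y) s.t. Ax + y = b, x >= 0, y >= 0.
import numpy as np

def phase_one_data(A, b):
    """Return (A_aux, b_aux, c_aux, basis), with the artificial variables as the initial basis."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).copy()
    m, n = A.shape
    sign = np.where(b < 0, -1.0, 1.0)          # flip rows so that b >= 0
    A, b = sign[:, None] * A, sign * b
    A_aux = np.hstack([A, np.eye(m)])          # append the artificial columns
    c_aux = np.concatenate([np.zeros(n), np.ones(m)])
    basis = list(range(n, n + m))              # y = b is the initial basic feasible solution
    return A_aux, b, c_aux, basis

If the simplex method applied to this data terminates with zero cost, the x-part of the solution is feasible for the original problem; otherwise the original problem is infeasible, exactly as argued above.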
Driving artificial variables out of the basis
The situation is more complex if the original problem is feasible, the simplex method applied to the auxiliary problem terminates with a feasible solution x* to the original problem, but some of the artificial variables are in the final basis. (Since the final value of the artificial variables is zero, this implies that we have a degenerate basic feasible solution to the auxiliary problem.) Let k be the number of columns of A that belong to the final basis (k < m) and, without loss of generality, assume that these are the columns AB(l),...,AB(k). (In particular, XB(l)'...'XB(k) are the only variables
that can be at nonzero level.) Note that the columns A_{B(1)}, . . . , A_{B(k)} must be linearly independent since they are part of a basis. Under our standard assumption that the matrix A has full rank, the columns of A span R^m, and we can choose m - k additional columns A_{B(k+1)}, . . . , A_{B(m)} of A, to obtain a set of m linearly independent columns, that is, a basis consisting exclusively of columns of A. With this basis, all nonbasic components of x* are at zero level, and it follows that x* is the basic feasible solution associated with this new basis as well. At this point, the artificial variables and the corresponding columns of the tableau can be dropped.

The procedure we have just described is called driving the artificial variables out of the basis, and depends crucially on the assumption that the matrix A has rank m. After all, if A has rank less than m, constructing a basis for R^m using the columns of A is impossible and there exist redundant equality constraints that must be eliminated, as described by Theorem 2.5 in Section 2.3. All of the above can be carried out mechanically, in terms of the simplex tableau, in the following manner.

Suppose that the ℓth basic variable is an artificial variable, which is in the basis at zero level. We examine the ℓth row of the tableau and find some j such that the ℓth entry of B⁻¹A_j is nonzero. We claim that A_j is linearly independent from the columns A_{B(1)}, . . . , A_{B(k)}. To see this, note that B⁻¹A_{B(i)} = e_i, i = 1, . . . , k, and since k < ℓ, the ℓth entry of these vectors is zero. It follows that the ℓth entry of any linear combination of the vectors B⁻¹A_{B(1)}, . . . , B⁻¹A_{B(k)} is also equal to zero. Since the ℓth entry of B⁻¹A_j is nonzero, this vector is not a linear combination of the vectors B⁻¹A_{B(1)}, . . . , B⁻¹A_{B(k)}. Equivalently, A_j is not a linear combination of the vectors A_{B(1)}, . . . , A_{B(k)}, which proves our claim. We now bring A_j into the basis and have the ℓth basic variable exit the basis. This is accomplished in the usual manner: perform those elementary row operations that replace B⁻¹A_j by the ℓth unit vector. The only difference from the usual mechanics of the simplex method is that the pivot element (the ℓth entry of B⁻¹A_j) could be negative. Because the ℓth basic variable was zero, adding a multiple of the ℓth row to the other rows does not change the values of the basic variables. This means that after the change of basis, we are still at the same basic feasible solution to the auxiliary problem, but we have reduced the number of basic artificial variables by one. We repeat this procedure as many times as needed until all artificial variables are driven out of the basis.

Let us now assume that the ℓth row of B⁻¹A is zero, in which case the above described procedure fails. Note that the ℓth row of B⁻¹A is equal to g'A, where g' is the ℓth row of B⁻¹. Hence, g'A = 0' for some nonzero vector g, and the matrix A has linearly dependent rows. Since we are dealing with a feasible problem, we must also have g'b = 0. Thus, the constraint g'Ax = g'b is redundant and can be eliminated (cf. Theorem 2.5 in Section 2.3). Since this constraint is the information provided by the ℓth row of the tableau, we can eliminate that row and continue from there.
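In tableau terms, one round of this procedure can be sketched as follows. This is our own illustration (names and tolerances are assumptions); it either pivots an artificial variable out or reports that the corresponding row is redundant.

# Drive one artificial basic variable (at zero level) out of the basis, or detect redundancy.
import numpy as np

def drive_out(T, basis, row, n_original):
    """T: (m+1) x (n+1) float tableau, modified in place; basis: list of basic variable indices;
    row: position (1..m) of the artificial basic variable within T; n_original: number of
    original (non-artificial) variables. Returns 'pivoted' or 'redundant row'."""
    entries = T[row, 1:n_original + 1]
    nonzero = np.flatnonzero(np.abs(entries) > 1e-9)
    if nonzero.size == 0:
        return 'redundant row'                 # the row of B^{-1}A is zero: drop the constraint
    j = int(nonzero[0])                        # any original column with a nonzero entry will do
    col = j + 1                                # offset by the zeroth column
    T[row] /= T[row, col]                      # the pivot element may be negative; that is fine
    for i in range(T.shape[0]):                # here, because the exiting variable is at zero level
        if i != row:
            T[i] -= T[i, col] * T[row]
    basis[row - 1] = j
    return 'pivoted'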
Example 3.8 Consider the linear programming problem:
    minimize    x_1 + x_2 + x_3
    subject to  x_1 + 2x_2 + 3x_3            = 3
               -x_1 + 2x_2 + 6x_3            = 2
                      4x_2 + 9x_3            = 5
                             3x_3 + x_4      = 1
                x_1, . . . , x_4 ≥ 0.

In order to find a feasible solution, we form the auxiliary problem

    minimize    x_5 + x_6 + x_7 + x_8
    subject to  x_1 + 2x_2 + 3x_3 + x_5           = 3
               -x_1 + 2x_2 + 6x_3 + x_6           = 2
                      4x_2 + 9x_3 + x_7           = 5
                             3x_3 + x_4 + x_8     = 1
                x_1, . . . , x_8 ≥ 0.

A basic feasible solution to the auxiliary problem is obtained by letting (x_5, x_6, x_7, x_8) = b = (3, 2, 5, 1). The corresponding basis matrix is the identity. Furthermore, we have c_B = (1, 1, 1, 1). We evaluate the reduced cost of each one of the original variables x_i, which is -c_B'A_i, and form the initial tableau:

              x_1   x_2   x_3   x_4   x_5   x_6   x_7   x_8
    -11   |    0    -8   -21    -1     0     0     0     0
 x_5 = 3  |    1     2     3     0     1     0     0     0
 x_6 = 2  |   -1     2     6     0     0     1     0     0
 x_7 = 5  |    0     4     9     0     0     0     1     0
 x_8 = 1  |    0     0     3    1*     0     0     0     1

We bring x_4 into the basis and have x_8 exit the basis. The basis matrix B is still the identity and only the zeroth row of the tableau changes. We obtain:

              x_1   x_2   x_3   x_4   x_5   x_6   x_7   x_8
    -10   |    0    -8   -18     0     0     0     0     1
 x_5 = 3  |    1     2     3     0     1     0     0     0
 x_6 = 2  |   -1     2     6     0     0     1     0     0
 x_7 = 5  |    0     4     9     0     0     0     1     0
 x_4 = 1  |    0     0    3*     1     0     0     0     1
We now bring x_3 into the basis and have x_4 exit the basis. The new tableau is:

               x_1   x_2   x_3   x_4   x_5   x_6   x_7   x_8
    -4    |     0    -8     0     6     0     0     0     7
 x_5 = 2  |     1     2     0    -1     1     0     0    -1
 x_6 = 0  |    -1    2*     0    -2     0     1     0    -2
 x_7 = 2  |     0     4     0    -3     0     0     1    -3
 x_3 = 1/3|     0     0     1    1/3    0     0     0    1/3
We now bring x_2 into the basis and x_6 exits. Note that this is a degenerate pivot with θ* = 0. The new tableau is:

               x_1   x_2   x_3   x_4   x_5   x_6   x_7   x_8
    -4    |    -4     0     0    -2     0     4     0    -1
 x_5 = 2  |    2*     0     0     1     1    -1     0     1
 x_2 = 0  |   -1/2    1     0    -1     0    1/2    0    -1
 x_7 = 2  |     2     0     0     1     0    -2     1     1
 x_3 = 1/3|     0     0     1    1/3    0     0     0    1/3
We now have x_1 enter the basis and x_5 exit the basis. We obtain the following tableau:

               x_1   x_2   x_3   x_4   x_5   x_6   x_7   x_8
     0    |     0     0     0     0     2     2     0     1
 x_1 = 1  |     1     0     0    1/2   1/2  -1/2    0    1/2
 x_2 = 1/2|     0     1     0   -3/4   1/4   1/4    0   -3/4
 x_7 = 0  |     0     0     0     0    -1    -1     1     0
 x_3 = 1/3|     0     0     1    1/3    0     0     0    1/3
Note that the cost in the auxiliary problem has dropped to zero, indicating that we have a feasible solution to the original problem. However, the artificial variable x_7 is still in the basis, at zero level. In order to obtain a basic feasible solution to the original problem, we need to drive x_7 out of the basis. Note that x_7 is the third basic variable and that the third entry of the columns B⁻¹A_j, j = 1, . . . , 4, associated with the original variables, is zero. This indicates that the matrix A has linearly dependent rows. At this point, we remove the third row of the tableau, because it corresponds to a redundant constraint, and also remove all of the artificial variables. This leaves us with the following initial tableau for the original problem:

               x_1   x_2   x_3   x_4
     *    |     *     *     *     *
 x_1 = 1  |     1     0     0    1/2
 x_2 = 1/2|     0     1     0   -3/4
 x_3 = 1/3|     0     0     1    1/3
We may now compute the reduced costs of the original variables, fill in the zeroth row of the tableau, and start executing the simplex method on the original problem.
We observe that in this example, the artificial variable x_8 was unnecessary. Instead of starting with x_8 = 1, we could have started with x_4 = 1, thus eliminating the need for the first pivot. More generally, whenever there is a variable that appears in a single constraint and with a positive coefficient (slack variables being the typical example), we can always let that variable be in the initial basis and we do not have to associate an artificial variable with that constraint.
The two-phase simplex method

We can now summarize a complete algorithm for linear programming problems in standard form.
Phase I:
1. By multiplying some of the constraints by −1, change the problem so that b ≥ 0.
2. Introduce artificial variables y1, ..., ym, if necessary, and apply the simplex method to the auxiliary problem with cost Σ_{i=1}^{m} yi.
3. If the optimal cost in the auxiliary problem is positive, the original problem is infeasible and the algorithm terminates.
4. If the optimal cost in the auxiliary problem is zero, a feasible solution to the original problem has been found. If no artificial variable is in the final basis, the artificial variables and the corresponding columns are eliminated, and a feasible basis for the original problem is available.
5. If the ℓth basic variable is an artificial one, examine the ℓth entry of the columns B⁻¹Aj, j = 1, ..., n. If all of these entries are zero, the ℓth row represents a redundant constraint and is eliminated. Otherwise, if the ℓth entry of the jth column is nonzero, apply a change of basis (with this entry serving as the pivot element): the ℓth basic variable exits and xj enters the basis. Repeat this operation until all artificial variables are driven out of the basis.
Phase II:
1. Let the final basis and tableau obtained from Phase I be the
initial basis and tableau for Phase II.
2. Compute the reduced costs of all variables for this initial basis, using the cost coefficients of the original problem.
3. Apply the simplex method to the original problem.
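As a compressed illustration of Phase I, the following sketch (assuming NumPy and SciPy are available; it uses an off-the-shelf LP solver rather than the tableau mechanics described above) forms the auxiliary problem for a standard form program and checks feasibility by minimizing the sum of the artificial variables. The data are those of Example 3.8, for which b ≥ 0 already holds, so no sign flips are needed.

import numpy as np
from scipy.optimize import linprog

A = np.array([[ 1, 2, 3, 0],
              [-1, 2, 6, 0],
              [ 0, 4, 9, 0],
              [ 0, 0, 3, 1]], dtype=float)
b = np.array([3, 2, 5, 1], dtype=float)
m, n = A.shape

# Auxiliary (Phase I) problem: minimize the sum of artificials y, subject to Ax + y = b.
A_aux = np.hstack([A, np.eye(m)])
c_aux = np.concatenate([np.zeros(n), np.ones(m)])

res = linprog(c_aux, A_eq=A_aux, b_eq=b,
              bounds=[(0, None)] * (n + m), method="highs")

if res.fun > 1e-9:
    print("original problem is infeasible")
else:
    print("feasible point of the original problem:", np.round(res.x[:n], 6))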
The above two-phase algorithm is a complete method, in the sense that it can handle all possible outcomes. As long as cycling is avoided (due to either nondegeneracy, an anticycling rule, or luck), one of the following possibilities will materialize:
(a) If the problem is infeasible, this is detected at the end of Phase I.
(b) If the problem is feasible but the rows of A are linearly dependent, this is detected and corrected at the end of Phase I, by eliminating redundant equality constraints.
(c) If the optimal cost is equal to −∞, this is detected while running Phase II.
(d) Else, Phase II terminates with an optimal solution.

The big-M method
We close by mentioning an alternative approach, the big-M method, that combines the two phases into a single one. The idea is to introduce a cost function of the form

    Σ_{j=1}^{n} cj xj + M Σ_{i=1}^{m} yi,

where M is a large positive constant, and where yi are the same artificial variables as in the Phase I simplex. For a sufficiently large choice of M, if the original problem is feasible and its optimal cost is finite, all of the artificial variables are eventually driven to zero (Exercise 3.26), which takes us back to the minimization of the original cost function. In fact, there is no reason for fixing a numerical value for M. We can leave M as an undetermined parameter and let the reduced costs be functions of M. Whenever M is compared to another number (in order to determine whether a reduced cost is negative), M will always be treated as being larger.
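Treating M as larger than any fixed number amounts to comparing expressions of the form aM + b first by the coefficient of M and then by the constant term. A minimal sketch of this comparison (a hypothetical helper class, not part of the text):

from functools import total_ordering

@total_ordering
class BigM:
    """A quantity a*M + b, with M treated as arbitrarily large and positive."""
    def __init__(self, a=0.0, b=0.0):
        self.a, self.b = a, b
    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)
    def __lt__(self, other):
        # compare the M-coefficients first; ties are broken by the constant terms
        return (self.a, self.b) < (other.a, other.b)
    def __repr__(self):
        return f"{self.a}*M + {self.b}"

# Reduced cost of x3 in the initial tableau of Example 3.9 below is -18M + 1:
print(BigM(-18, 1) < BigM(0, 0))   # True, so x3 is eligible to enter the basis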
Example 3.9 We consider the same linear programming problem as in Example 3.8:

    minimize    x1 + x2 + x3
    subject to   x1 + 2x2 + 3x3        = 3
                -x1 + 2x2 + 6x3        = 2
                       4x2 + 9x3       = 5
                             3x3 + x4  = 1
                 x1, ..., x4 ≥ 0.
We use the big-M method in conjunction with the following auxiliary problem, in which the unnecessary artificial variable x8 is omitted:

    minimize    x1 + x2 + x3 + Mx5 + Mx6 + Mx7
    subject to   x1 + 2x2 + 3x3       + x5             = 3
                -x1 + 2x2 + 6x3            + x6        = 2
                       4x2 + 9x3                + x7   = 5
                             3x3 + x4                  = 1
                 x1, ..., x7 ≥ 0.

A basic feasible solution to the auxiliary problem is obtained by letting (x5, x6, x7, x4) = b = (3, 2, 5, 1). The corresponding basis matrix is the identity. Furthermore, we have c_B = (M, M, M, 0). We evaluate the reduced cost of each one of the original variables x_i, which is c_i − c_B'A_i, and form the initial tableau:

            x1      x2        x3      x4   x5   x6   x7
  -10M       1    -8M+1    -18M+1      0    0    0    0
x5=  3       1       2         3       0    1    0    0
x6=  2      -1       2         6       0    0    1    0
x7=  5       0       4         9       0    0    0    1
x4=  1       0       0         3*      1    0    0    0
The reduced cost of X3 is negative when M is large enough. We therefore bring X3 into the basis and have X4 exit. Note that in order to set the reduced cost of X3 to zero, we need to multiply the pivot row by 6M - 1/3 and add it to the zeroth row. The new tableau is:
               x1      x2      x3       x4      x5   x6   x7
-4M - 1/3       1    -8M+1      0    6M - 1/3    0    0    0
x5=  2          1       2       0      -1        1    0    0
x6=  0         -1       2*      0      -2        0    1    0
x7=  2          0       4       0      -3        0    0    1
x3= 1/3         0       0       1      1/3       0    0    0
The reduced cost of x2 is negative when M is large enough. We therefore bring x2 into the basis and x6 exits. Note that this is a degenerate pivot with θ* = 0.
The new tableau is:
                 x1       x2   x3       x4      x5      x6      x7
-4M - 1/3    -4M + 3/2     0    0    -2M + 2/3   0    4M - 1/2   0
x5=  2            2*       0    0        1       1       -1      0
x2=  0          -1/2       1    0       -1       0       1/2     0
x7=  2            2        0    0        1       0       -2      1
x3= 1/3           0        0    1       1/3      0        0      0

We now have x1 enter and x5 exit the basis. We obtain the following tableau:
            x1   x2   x3     x4        x5         x6      x7
  -11/6      0    0    0    -1/12   2M - 3/4   2M + 1/4    0
x1=  1       1    0    0     1/2      1/2       -1/2       0
x2= 1/2      0    1    0    -3/4      1/4        1/4       0
x7=  0       0    0    0      0       -1         -1        1
x3= 1/3      0    0    1     1/3*      0          0        0

We now bring x4 into the basis and x3 exits. The new tableau is:

            x1   x2   x3     x4        x5         x6      x7
   -7/4      0    0   1/4     0     2M - 3/4   2M + 1/4    0
x1= 1/2      1    0  -3/2     0       1/2       -1/2       0
x2= 5/4      0    1   9/4     0       1/4        1/4       0
x7=  0       0    0    0      0       -1         -1        1
x4=  1       0    0    3      1        0          0        0

With M large enough, all of the reduced costs are nonnegative and we have an optimal solution to the auxiliary problem. In addition, all of the artificial variables have been driven to zero, and we have an optimal solution to the original problem.

3.6 Column geometry and the simplex method

In this section, we introduce an alternative way of visualizing the workings of the simplex method. This approach provides some insights into why the simplex method appears to be efficient in practice.
We consider the problem
    minimize    c'x
    subject to  Ax = b                          (3.6)
                e'x = 1
                x ≥ 0,

where A is an m × n matrix and e is the n-dimensional vector with all components equal to one. Although this might appear to be a special type of a linear programming problem, it turns out that every problem with a bounded feasible set can be brought into this form (Exercise 3.28). The constraint e'x = 1 is called the convexity constraint. We also introduce an auxiliary variable z defined by z = c'x. If A1, A2, ..., An are the n columns of A, we are dealing with the problem of minimizing z subject to the nonnegativity constraints x ≥ 0, the convexity constraint Σ_{i=1}^{n} xi = 1, and the constraint

    Σ_{i=1}^{n} xi (Ai, ci) = (b, z).

In order to capture this problem geometrically, we view the horizontal plane as an m-dimensional space containing the columns of A, and we view the vertical axis as the one-dimensional space associated with the cost components ci. Then, each point in the resulting three-dimensional space corresponds to a point (Ai, ci); see Figure 3.5.

In this geometry, our objective is to construct a vector (b, z), which is a convex combination of the vectors (Ai, ci), such that z is as small as possible. Note that the vectors of the form (b, z) lie on a vertical line, which we call the requirement line, and which intersects the horizontal plane at b. If the requirement line does not intersect the convex hull of the points (Ai, ci), the problem is infeasible. If it does intersect it, the problem is feasible and an optimal solution corresponds to the lowest point in the intersection of the convex hull and the requirement line. For example, in Figure 3.6, the requirement line intersects the convex hull of the points (Ai, ci); the point G corresponds to an optimal solution, and its height is the optimal cost.
We now need some terminology.
Definition 3.6
(a) A collection of vectors y^1, ..., y^{k+1} in ℝ^n are said to be affinely independent if the vectors y^1 − y^{k+1}, y^2 − y^{k+1}, ..., y^k − y^{k+1} are linearly independent. (Note that we must have k ≤ n.)
(b) The convex hull of k + 1 affinely independent vectors in ℝ^n is called a k-dimensional simplex.
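Affine independence is easy to test numerically: subtract the last vector from the others and check whether the resulting vectors are linearly independent, for example via the matrix rank. A small sketch (assuming NumPy):

import numpy as np

def affinely_independent(points):
    """True if the given k+1 vectors in R^n are affinely independent."""
    pts = np.asarray(points, dtype=float)
    diffs = pts[:-1] - pts[-1]              # y^1 - y^{k+1}, ..., y^k - y^{k+1}
    return np.linalg.matrix_rank(diffs) == len(diffs)

# Three non-collinear points determine a two-dimensional simplex (a triangle):
print(affinely_independent([[0, 0], [1, 0], [0, 1]]))   # True
# Three collinear points are not affinely independent:
print(affinely_independent([[0, 0], [1, 1], [2, 2]]))   # False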
Figure 3.5: The column geometry.
Thus, three points are either collinear or they are affinely independent and determine a two-dimensional simplex (a triangle) . Similarly, four points either lie on the same plane, or they are affinely independent and determine a three-dimensional simplex (a pyramid).
Let us now give an interpretation of basic feasible solutions to problem (3.6) in this geometry. Since we have added the convexity constraint, we have a total of m + 1 equality constraints. Thus, a basic feasible solution is associated with a collection of m + 1 linearly independent columns (Ai, 1) of the linear programming problem (3.6). These are in turn associated with m + 1 of the points (Ai, ci), which we call basic points; the remaining points (Ai, ci) are called the nonbasic points. It is not hard to show that the m + 1 basic points are affinely independent (Exercise 3.29) and, therefore, their convex hull is an m-dimensional simplex, which we call the basic simplex. Let the requirement line intersect the m-dimensional basic simplex at some point (b, z). The vector of weights xi used in expressing (b, z) as a convex combination of the basic points is the current basic feasible solution, and z represents its cost. For example, in Figure 3.6, the shaded triangle CDF is the basic simplex, and the point H corresponds to a basic feasible solution associated with the basic points C, D, and F.

Let us now interpret a change of basis geometrically. In a change of basis, a new point (Aj, cj) becomes basic, and one of the currently basic points is to become nonbasic. For example, in Figure 3.6, if C, D, F, are the current basic points, we could make point B basic, replacing F

Figure 3.6: Feasibility and optimality in the column geometry.
(even though this turns out not to be profitable). The new basic simplex would be the convex hull of B, C, D, and the new basic feasible solution would correspond to point I. Alternatively, we could make point E basic, replacing C, and the new basic feasible solution would now correspond to point G. After a change of basis, the intercept of the requirement line with the new basic simplex is lower, and hence the cost decreases, if and only if the new basic point is below the plane that passes through the old basic points; we refer to the latter plane as the dual plane. For example, point E is below the dual plane and having it enter the basis is profitable; this is not the case for point B. In fact, the vertical distance from the dual plane to a point (Aj, Cj) is equal to the reduced cost of the associated variable Xj (Exercise 3.30); requiring the new basic point to be below the dual plane is therefore equivalent to requiring the entering column to have negative reduced cost.
We discuss next the selection of the basic point that will exit the basis. Each possible choice of the exiting point leads to a different basic simplex. These m basic simplices, together with the original basic simplex (before the change of basis) form the boundary (the faces) of an (m + 1) dimensional simplex. The requirement line exits this (m + 1)-dimensional simplex through its top face and must therefore enter it by crossing some other face. This determines which one of the potential basic simplices will be obtained after the change of basis. In reference to Figure 3.6, the basic
points C , D , F , determine a two-dimensional basic simplex. If point E is to become basic, we obtain a three-dimensional simplex (pyramid) with vertices C, D, E, F. The requirement line exits the pyramid through its top face with vertices C, D, F. It enters the pyramid through the face with vertices D, E, F; this is the new basic simplex.
We can now visualize pivoting through the following physical analogy. Think of the original basic simplex with vertices C, D, F, as a solid object anchored at its vertices. Grasp the corner of the basic simplex at the vertex C leaving the basis, and pull the corner down to the new basic point E. While so moving, the simplex will hinge, or pivot, on its anchor and stretch down to the lower position. The somewhat peculiar terms (e.g. , “simplex” , “pivot” ) associated with the simplex method have their roots in this column geometry.
Example 3.10 Consider the problem illustrated in Figure 3.7, in which m = 1, and the following pivoting rule: choose a point (Ai,Ci) below the dual plane to become basic, whose vertical distance from the dual plane is largest. According to Exercise 3.30, this is identical to the pivoting rule that selects an entering variable with the most negative reduced cost. Starting from the initial basic simplex consisting of the points (A3, C3), (A6, C6), the next basic simplex is determined by the points (A3, C3), (A5, C5), and the next one by the points (A5, C5), (As, cs). In particular, the simplex method only takes two pivots in this case. This example indicates why the simplex method may require a rather small number of pivots, even when the number of underlying variables is large.
Figure 3.7: The simplex method finds the optimal basis after two iterations. Here, the point indicated by a number i corresponds to the vector (Ai, ci).
3.7 Computational efficiency of the simplex method
The computational efficiency of the simplex method is determined by two factors:
(a) the computational effort at each iteration;
(b) the number of iterations.
The computational requirements of each iteration have already been dis cussed in Section 3.3. For example, the full tableau implementation needs O(mn) arithmetic operations per iteration; the same is true for the revised simplex method in the worst case. We now turn to a discussion of the number of iterations.
The number of iterations in the worst case
Although the number of extreme points of the feasible set can increase exponentially with the number of variables and constraints, it has been observed in practice that the simplex method typically takes only O(m) pivots to find an optimal solution. Unfortunately, however, this practical observation is not true for every linear programming problem. We will describe shortly a family of problems for which an exponential number of pivots may be required.
Recall that for nondegenerate problems, the simplex method always moves from one vertex to an adjacent one, each time improving the value of the cost function. We will now describe a polyhedron that has an ex ponential number of vertices, along with a path that visits all vertices, by taking steps from one vertex to an adjacent one that has lower cost. Once such a polyhedron is available, then the simplex method – under a pivoting rule that traces this path – needs an exponential number of pivots.
Consider the unit cube in ℝ^n, defined by the constraints

    0 ≤ xi ≤ 1,    i = 1, ..., n.

The unit cube has 2^n vertices (for each i, we may let either one of the two constraints 0 ≤ xi or xi ≤ 1 become active). Furthermore, there exists a path that travels along the edges of the cube and which visits each vertex exactly once; we call such a path a spanning path. It can be constructed according to the procedure illustrated in Figure 3.8.
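The recursive construction of Figure 3.8 is the familiar reflected ordering of 0/1 activity patterns (each step changes which bound is active for a single coordinate). A short sketch in plain Python, listing the vertices of the n-cube in the order of such a spanning path:

def spanning_path(n):
    """Vertices of the n-cube, as 0/1 tuples, in the order of the recursive spanning path."""
    if n == 1:
        return [(0,), (1,)]
    shorter = spanning_path(n - 1)
    # traverse one (n-1)-dimensional face, jump to the other, traverse it in reverse order
    return [v + (0,) for v in shorter] + [v + (1,) for v in reversed(shorter)]

print(spanning_path(2))
# [(0, 0), (1, 0), (1, 1), (0, 1)] -- consecutive vertices differ in a single coordinate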
Let us now introduce the cost function −xn. Half of the vertices of the cube have zero cost and the other half have a cost of −1. Thus, the cost cannot decrease strictly with each move along the spanning path, and we do not yet have the desired example. However, if we choose some ε ∈ (0, 1/2) and consider the perturbation of the unit cube defined by the constraints

    ε ≤ x1 ≤ 1,                                          (3.7)
    ε x_{i-1} ≤ xi ≤ 1 − ε x_{i-1},    i = 2, ..., n,     (3.8)
Figure 3.8: (a) A spanning path p2 in the two-dimensional cube. (b) A spanning path p3 in the three-dimensional cube. Notice that this path is obtained by splitting the three-dimensional cube into two two-dimensional cubes, following path p2 in one of them, moving to the other cube, and following p2 in the reverse order. This construction generalizes and provides a recursive definition of a spanning path for the general n-dimensional cube.
then it can be verified that the cost function decreases strictly with each move along a suitably chosen spanning path. If we start the simplex method at the first vertex on that spanning path and if our pivoting rule is to always move to the next vertex on that path, then the simplex method will require 2^n − 1 pivots. We summarize this discussion in the following theorem whose proof is left as an exercise (Exercise 3.32).
Theorem 3.5 Consider the linear programming problem of minimizing −xn subject to the constraints (3.7)-(3.8). Then:
(a) The feasible set has 2^n vertices.
(b) The vertices can be ordered so that each one is adjacent to and has lower cost than the previous one.
(c) There exists a pivoting rule under which the simplex method requires 2^n − 1 changes of basis before it terminates.
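Theorem 3.5 can be spot-checked numerically for small n. The sketch below (plain Python; ε and n are chosen arbitrarily, and the spanning path is generated as in the earlier sketch) computes the vertex of the perturbed cube associated with each activity pattern and verifies that the cost −xn strictly decreases at each of the 2^n − 1 steps.

def spanning_path(n):
    if n == 1:
        return [(0,), (1,)]
    p = spanning_path(n - 1)
    return [v + (0,) for v in p] + [v + (1,) for v in reversed(p)]

def vertex(bits, eps):
    """Vertex of the perturbed cube (3.7)-(3.8): bit i selects the lower or upper bound on x_{i+1}."""
    x = [1.0 if bits[0] else eps]                     # eps <= x1 <= 1
    for b in bits[1:]:
        lo, hi = eps * x[-1], 1.0 - eps * x[-1]       # eps*x_{i-1} <= x_i <= 1 - eps*x_{i-1}
        x.append(hi if b else lo)
    return x

eps, n = 0.25, 4
costs = [-vertex(bits, eps)[-1] for bits in spanning_path(n)]
print(all(c2 < c1 for c1, c2 in zip(costs, costs[1:])))   # True: the cost drops at every step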
We observe in Figure 3.8 that the first and the last vertex in the span ning path are adjacent. This property persists in the perturbed polyhedron as well. Thus, with a different pivoting rule, the simplex method could terminate with a single pivot. We are thus led to the following question: is it true that for every pivoting rule there are examples where the simplex
method takes an exponential number of iterations? For several popular pivoting rules, such examples have been constructed. However, these ex amples cannot exclude the possibility that some other pivoting rule might fare better. This is one of the most important open problems in the theory of linear programming. In the next subsection, we address a closely related issue .
The diameter of polyhedra and the Hirsch conjecture
The preceding discussion leads us to the notion of the diameter of a polyhedron P, which is defined as follows. Suppose that from any vertex of the polyhedron, we are only allowed to jump to an adjacent vertex. We define the distance d(x, y) between two vertices x and y as the minimum number of such jumps required to reach y starting from x. The diameter D(P) of the polyhedron P is then defined as the maximum of d(x, y) over all pairs (x, y) of vertices. Finally, we define Δ(n, m) as the maximum of D(P) over all bounded polyhedra in ℝ^n that are represented in terms of m inequality constraints. The quantity Δ_u(n, m) is defined similarly, except that general, possibly unbounded, polyhedra are allowed. For example, we have
    Δ(2, m) = ⌊m/2⌋    and    Δ_u(2, m) = m − 2;

see Figure 3.9.

Figure 3.9: Let n = 2 and m = 8. (a) A bounded polyhedron with diameter ⌊m/2⌋ = 4. (b) An unbounded polyhedron with diameter m − 2 = 6.
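The definitions of d(x, y) and D(P) are purely graph-theoretic: treat the vertices of the polyhedron as nodes and its edges as arcs, and take shortest-path distances. The sketch below (plain Python) does this for the n-cube, whose vertex-edge structure we know explicitly, and recovers the expected diameter n.

from collections import deque
from itertools import product

def cube_diameter(n):
    """Graph diameter of the n-cube: vertices are 0/1 tuples, edges join vertices differing in one coordinate."""
    vertices = list(product([0, 1], repeat=n))
    def neighbors(v):
        return [v[:i] + (1 - v[i],) + v[i+1:] for i in range(n)]
    diameter = 0
    for s in vertices:
        dist = {s: 0}
        queue = deque([s])
        while queue:                                  # breadth-first search from s
            u = queue.popleft()
            for w in neighbors(u):
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        diameter = max(diameter, max(dist.values()))
    return diameter

print(cube_diameter(3))   # 3: opposite corners are three edge-moves apart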
Suppose that the feasible set in a linear programming problem has diameter d and that the distance between vertices x and y is equal to d. If the simplex method (or any other method that proceeds from one vertex to an adjacent vertex) is initialized at x, and if y happens to be the unique optimal solution, then at least d steps will be required. Now, if Δ(n, m) or Δ_u(n, m) increases exponentially with n and m, this implies that there exist examples for which the simplex method takes an exponentially increasing number of steps, no matter which pivoting rule is used. Thus, in order to have any hope of developing pivoting rules under which the simplex method requires a polynomial number of iterations, we must first establish that Δ(n, m) or Δ_u(n, m) grows with n and m at the rate of some polynomial. The practical success of the simplex method has led to the conjecture that indeed Δ(n, m) and Δ_u(n, m) do not grow exponentially fast. In fact, the following, much stronger, conjecture has been advanced:
Hirsch Conjecture: Δ(n, m) ≤ m − n.
Despite the significance of Δ(n, m) and Δ_u(n, m), we are far from establishing the Hirsch conjecture or even from establishing that these quantities exhibit polynomial growth. It is known (Klee and Walkup, 1967) that the Hirsch conjecture is false for unbounded polyhedra and, in particular, that

    Δ_u(n, m) ≥ m − n + ⌊n/5⌋.
Unfortunately, this is the best lower bound known; even though it disproves the Hirsch conjecture for unbounded polyhedra, it does not provide any insights as to whether the growth of Δ_u(n, m) is polynomial or exponential.
Regarding upper bounds, it has been established (Kalai and Kleitman, 1993) that the worst-case diameter grows slower than exponentially, but the available upper bound grows faster than any polynomial. In particular, the following bounds are available:

    Δ(n, m) ≤ Δ_u(n, m) < m^(1 + log2 n) = (2n)^(log2 m).
The average case behavior of the simplex method
Our discussion has been focused on the worst-case behavior of the simplex method, but this is only part of the story. Even if every pivoting rule re quires an exponential number of iterations in the worst case, this is not necessarily relevant to the typical behavior of the simplex method. For this reason, there has been a fair amount of research aiming at an under standing of the typical or average behavior of the simplex method, and an explanation of its observed behavior.
The main difficulty in studying the average behavior of any algorithm lies in defining the meaning of the term “average.” Basically, one needs to define a probability distribution over the set of all problems of a given size, and then take the mathematical expectation of the number of iterations
required by the algorithm, when applied to a random problem drawn according to the postulated probability distribution. Unfortunately, there is no natural probability distribution over the set of linear programming problems. Nevertheless, a fair number of positive results have been obtained for a few different types of probability distributions. In one such result, a set of vectors c, a1, ..., am ∈ ℝ^n and scalars b1, ..., bm is given. For i = 1, ..., m, we introduce either constraint ai'x ≤ bi or ai'x ≥ bi, with equal probability. We then have 2^m possible linear programming problems, and suppose that L of them are feasible. Haimovich (1983) has established that under a rather special pivoting rule, the simplex method requires no more than n/2 iterations, on the average over those L feasible problems. This linear dependence on the size of the problem agrees with observed behavior; some empirical evidence is discussed in Chapter 12.
3.8 Summary
This chapter was centered on the development of the simplex method, which is a complete algorithm for solving linear programming problems in stan dard form. The cornerstones of the simplex method are:
(a) the optimality conditions (nonnegativity of the reduced costs) that allow us to test whether the current basis is optimal;
(b) a systematic method for performing basis changes whenever the op timality conditions are violated.
At a high level, the simplex method simply moves from one extreme point of the feasible set to another, each time reducing the cost, until an optimal solution is reached. However, the lower level details of the simplex method, relating to the organization of the required computations and the associated bookkeeping, play an important role. We have described three different implementations: the naive one, the revised simplex method, and the full tableau implementation. Abstractly, they are all equivalent, but their mechanics are quite different. Practical implementations of the sim plex method follow our general description of the revised simplex method, but the details are different, because an explicit computation of the inverse basis matrix is usually avoided.
We have seen that degeneracy can cause substantial difficulties, in cluding the possibility of nonterminating behavior (cycling) . This is because in the presence of degeneracy, a change of basis may keep us at the same basic feasible solution, with no cost improvement resulting. Cycling can be avoided if suitable rules for choosing the entering and exiting variables (pivoting rules) are applied (e.g., Bland's rule or the lexicographic pivoting rule).
Starting the simplex method requires an initial basic feasible solution, and an associated tableau. These are provided by the Phase I simplex algorithm, which is nothing but the simplex method applied to an auxiliary
problem. We saw that the changeover from Phase I to Phase II involves some delicate steps whenever some artificial variables are in the final basis constructed by the Phase I algorithm.
The simplex method is a rather efficient algorithm and is incorporated in most of the commercial codes for linear programming. While the number of pivots can be an exponential function of the number of variables and constraints in the worst case, its observed behavior is a lot better, hence the practical usefulness of the method.
3 . 9 Exercises
Exercise 3.1 (Local minima of convex functions) Let f : ℝ^n → ℝ be a convex function and let S ⊂ ℝ^n be a convex set. Let x* be an element of S. Suppose that x* is a local optimum for the problem of minimizing f(x) over S; that is, there exists some ε > 0 such that f(x*) ≤ f(x) for all x ∈ S for which ‖x − x*‖ ≤ ε. Prove that x* is globally optimal; that is, f(x*) ≤ f(x) for all x ∈ S.
Exercise 3.2 (OptiIllality conditions) Consider the problem of minimizing c/x over a polyhedron P. Prove the following:
(a) A feasible solution x is optimal if and only if c’d ;: 0 for every feasible direction d at x.
(b) A feasible solution x is the unique optimal solution if and only if c’d > 0 for every nonzero feasible direction d at x.
Exercise 3.3 Let x be an element of the standard form polyhedron P = {x ∈ ℝ^n | Ax = b, x ≥ 0}. Prove that a vector d ∈ ℝ^n is a feasible direction at x if and only if Ad = 0 and di ≥ 0 for every i such that xi = 0.
Exercise 3.4 Consider the problem of minimizing c'x over the set P = {x ∈ ℝ^n | Ax = b, Dx ≤ f, Ex ≤ g}. Let x* be an element of P that satisfies Dx* = f, Ex* < g. Show that the set of feasible directions at the point x* is the set {d ∈ ℝ^n | Ad = 0, Dd ≤ 0}.

Exercise 3.5 Let P = {x ∈ ℝ^3 | x1 + x2 + x3 = 1, x ≥ 0} and consider the vector x = (0, 0, 1). Find the set of feasible directions at x.
Exercise 3.6 (Conditions for a unique optiIllUIll) Let x be a basic feasible solution associated with some basis matrix B. Prove the following:
(a) If the reduced cost of every nonbasic variable is positive, then x is the unique optimal solution.
(b) If x is the unique optimal solution and is nondegenerate, then the reduced cost of every nonbasic variable is positive.
130 Chap. 3 The simplex method
Exercise 3.7 (Optimality conditions) Consider a feasible solution x to a standard form problem, and let Z = {i I Xi = O}. Show that x is an optimal solution if and only if the linear programming problem
minimize c'd subject to Ad = 0
di � 0, i E Z,
has an optimal cost of zero. (In this sense, deciding optimality is equivalent to
solving a new linear programming problem.)
Exercise 3.8 * This exercise deals with the problem of deciding whether a given degenerate basic feasible solution is optimal and shows that this is essentially as hard as solving a general linear programming problem.
Consider the linear programming problem of minimizing c'x over all x ∈ P, where P ⊂ ℝ^n is a given bounded polyhedron. Let

    Q = {(tx, t) ∈ ℝ^{n+1} | x ∈ P, t ∈ [0, 1]}.

(a) Show that Q is a polyhedron.
(b) Give an example of P and Q, with n = 2, for which the zero vector (in ℝ^{n+1}) is a degenerate basic feasible solution in Q; show the example in a figure.
(c) Show that the zero vector (in ℝ^{n+1}) minimizes (c, 0)'y over all y ∈ Q if and only if the optimal cost in the original linear programming problem is greater than or equal to zero.

Exercise 3.9 (Necessary and sufficient conditions for a unique optimum) Consider a linear programming problem in standard form and suppose that x* is an optimal basic feasible solution. Consider an optimal basis associated with x*. Let B and N be the set of basic and nonbasic indices, respectively. Let I be the set of nonbasic indices i for which the corresponding reduced costs are zero.
(a) Show that if I is empty, then x* is the only optimal solution.
(b) Show that x* is the unique optimal solution if and only if the following linear programming problem has an optimal value of zero:

    maximize    Σ_{i∈I} xi
    subject to  Ax = b
                xi = 0,    i ∈ N \ I,
                xi ≥ 0,    i ∈ B ∪ I.

Exercise 3.10 * Show that if n − m = 2, then the simplex method will not cycle, no matter which pivoting rule is used.

Exercise 3.11 * Construct an example with n − m = 3 and a pivoting rule under which the simplex method will cycle.
Exercise 3.12 Consider the problem

    minimize    -2x1 - x2
    subject to   x1 - x2 ≤ 2
                 x1 + x2 ≤ 6
                 x1, x2 ≥ 0.

(a) Convert the problem into standard form and construct a basic feasible solution at which (x1, x2) = (0, 0).
(b) Carry out the full tableau implementation of the simplex method, starting with the basic feasible solution of part (a).
(c) Draw a graphical representation of the problem in terms of the original variables x1, x2, and indicate the path taken by the simplex algorithm.

Exercise 3.13 This exercise shows that our efficient procedures for updating a tableau can be derived from a useful fact in numerical linear algebra.
(a) (Matrix inversion lemma) Let C be an m × m invertible matrix and let u, v be vectors in ℝ^m. Show that

    (C + uv')⁻¹ = C⁻¹ − (C⁻¹uv'C⁻¹)/(1 + v'C⁻¹u).

(Note that uv' is an m × m matrix.) Hint: Multiply both sides by (C + uv').
(b) Assuming that C⁻¹ is available, explain how to obtain (C + uv')⁻¹ using only O(m²) arithmetic operations.
(c) Let B and B̄ be basis matrices before and after an iteration of the simplex method. Let A_{B(ℓ)}, A_{B̄(ℓ)} be the exiting and entering column, respectively. Show that

    B̄ − B = (A_{B̄(ℓ)} − A_{B(ℓ)}) e_ℓ',

where e_ℓ is the ℓth unit vector.
(d) Note that e_i'B̄⁻¹ is the ith row of B̄⁻¹ and e_ℓ'B̄⁻¹ is the pivot row. Show that

    e_i'B̄⁻¹ = e_i'B⁻¹ + g_i e_ℓ'B⁻¹,    i = 1, ..., m,

for suitable scalars g_i. Provide a formula for g_i. Interpret the above equation in terms of the mechanics for pivoting in the revised simplex method.
(e) Multiply both sides of the equation in part (d) by [b | A] and obtain an interpretation of the mechanics for pivoting in the full tableau implementation.

Exercise 3.14 Suppose that a feasible tableau is available. Show how to obtain a tableau with lexicographically positive rows. Hint: Permute the columns.

Exercise 3.15 (Perturbation approach to lexicography) Consider a standard form problem, under the usual assumption that the rows of A are linearly independent. Let ε be a scalar and define

    b(ε) = b + (ε, ε², ..., ε^m)'.
For every ε > 0, we define the ε-perturbed problem to be the linear programming problem obtained by replacing b with b(ε).
(a) Given a basis matrix B, show that the corresponding basic solution xB(ε) in the ε-perturbed problem is equal to

    xB(ε) = B⁻¹b(ε) = B⁻¹b + B⁻¹(ε, ε², ..., ε^m)'.

(b) Show that there exists some ε* > 0 such that all basic solutions to the ε-perturbed problem are nondegenerate, for 0 < ε < ε*.
(c) Suppose that all rows of B⁻¹[b | I] are lexicographically positive. Show that xB(ε) is a basic feasible solution to the ε-perturbed problem for ε positive and sufficiently small.
(d) Consider a feasible basis for the original problem, and assume that all rows of B⁻¹[b | I] are lexicographically positive. Let some nonbasic variable xj enter the basis, and define u = B⁻¹Aj. Let the exiting variable be determined as follows. For every row i such that ui is positive, divide the ith row of B⁻¹[b | I] by ui, compare the results lexicographically, and choose the exiting variable to be the one corresponding to the lexicographically smallest row. Show that this is the same choice of exiting variable as in the original simplex method applied to the ε-perturbed problem, when ε is sufficiently small.
(e) Explain why the revised simplex method, with the lexicographic rule described in part (d), is guaranteed to terminate even in the face of degeneracy.
Exercise 3.16 (Lexicography and the revised simplex method) Suppose that we have a basic feasible solution and an associated basis matrix B such that every row of B⁻¹ is lexicographically positive. Consider a pivoting rule that chooses the entering variable xj arbitrarily (as long as its reduced cost is negative) and the exiting variable as follows. Let u = B⁻¹Aj. For each i with ui > 0, divide the ith row of [B⁻¹b | B⁻¹] by ui and choose the row which is lexicographically smallest. If row ℓ was lexicographically smallest, then the ℓth basic variable x_{B(ℓ)} exits the basis. Prove the following:
(a) The row vector (−c_B'B⁻¹b, −c_B'B⁻¹) increases lexicographically at each iteration.
(b) Every row of B⁻¹ is lexicographically positive throughout the algorithm.
(c) The revised simplex method terminates after a finite number of steps.
Exercise 3.17 Solve completely (i.e., both Phase I and Phase II) via the simplex method the following problem:

    minimize    2x1 + 3x2 + 3x3 + x4 - 2x5
    subject to   x1 + 3x2       + 4x4 + x5 = 2
                 x1 + 2x2       - 3x4 + x5 = 2
                -x1 - 4x2 + 3x3            = 1
                 x1, ..., x5 ≥ 0.
Exercise 3.18 Consider the simplex method applied to a standard form prob lem and assume that the rows of the matrix A are linearly independent. For each of the statements that follow, give either a proof or a counterexample.
(a) An iteration of the simplex method may move the feasible solution by a positive distance while leaving the cost unchanged.
(b) A variable that has just left the basis cannot reenter in the very next iteration.
(c) A variable that has just entered the basis cannot leave in the very next iteration.
(d) If there is a nondegenerate optimal basis, then there exists a unique optimal basis.
(e) If x is an optimal solution, no more than m of its components can be positive, where m is the number of equality constraints.
Exercise 3.19 While solving a standard form problem, we arrive at the following tableau, with x3, x4, and x5 being the basic variables:

          x1   x2   x3   x4   x5
   -10     δ   -2    0    0    0
x3=  4    -1    η    1    0    0
x4=  1     α   -4    0    1    0
x5=  β     γ    3    0    0    1

The entries α, β, γ, δ, η in the tableau are unknown parameters. For each one of the following statements, find some parameter values that will make the statement true.
(a) The current solution is optimal and there are multiple optimal solutions.
(b) The optimal cost is – 00 .
(c) The current solution is feasible but not optimal.
Exercise 3.20 Consider a linear programming problem in standard form, described in terms of the following initial tableau:

          x1   x2   x3   x4   x5   x6   x7
     0     0    0    0    δ    3    γ    θ
x2=  β     0    1    0    α    1    0    3
x3=  2     0    0    1   -2    2    η   -1
x1=  3     1    0    0    0   -1    2    1

The entries α, β, γ, δ, η, θ in the tableau are unknown parameters. Furthermore, let B be the basis matrix corresponding to having x2, x3, and x1 (in that order) be the basic variables. For each one of the following statements, find the ranges of values of the various parameters that will make the statement true.
(a) Phase II of the simplex method can be applied using this as an initial tableau.
(b) The first row in the present tableau indicates that the problem is infeasible.
(c) The corresponding basic solution is feasible, but we do not have an optimal basis.
(d) The corresponding basic solution is feasible and the first simplex iteration indicates that the optimal cost is −∞.
(e) The corresponding basic solution is feasible, x6 is a candidate for entering the basis, and when x6 is the entering variable, x3 leaves the basis.
(f) The corresponding basic solution is feasible, X7 is a candidate for enter ing the basis, but if it does, the solution and the objective value remain unchanged.
Exercise 3.21 Consider the oil refinery problem in Exercise 1.16.
(a) Use the simplex method to find an optimal solution.
(b) Suppose that the selling price of heating oil is sure to remain fixed over the next month, but the selling price of gasoline may rise. How high can it go without causing the optimal solution to change?
(c) The refinery manager can buy crude oil B on the spot market at $40/barrel, in unlimited quantities. How much should be bought?
Exercise 3.22 Consider the following linear programming problem with a single constraint:

    minimize    Σ_{i=1}^{n} ci xi
    subject to  Σ_{i=1}^{n} ai xi = b
                xi ≥ 0,    i = 1, ..., n.
(a) Derive a simple test for checking the feasibility of this problem.
(b) Assuming that the optimal cost is finite, develop a simple method for ob taining an optimal solution directly.
Exercise 3.23 While solving a linear programming problem by the simplex method, the following tableau is obtained at some iteration.
oo
1
o
AssumethatinthistableauwehaveCj::0forj=m+1,…,n- 1,andcn
(d) Assuming that every basic feasible solution is nondegenerate, show that the cost strictly decreases with each iteration and the method terminates.
Exercise 3.26 (The big-M method) Consider the variant ofthe big-M meth od in which M is treated as an undetermined large parameter. Prove the follow ing.
Ui > 0 for all i.
Assume that
(a) If the simplex method terminates with a solution (x, y) for which y = 0, then x is an optimal solution to the original problem.
(b) If the simplex method terminates with a solution (x, y) for which y ≠ 0, then the original problem is infeasible.
(c) If the simplex method terminates with an indication that the optimal cost in the auxiliary problem is −∞, show that the original problem is either infeasible or its optimal cost is −∞. Hint: When the simplex method terminates, it has discovered a feasible direction d = (dx, dy) of cost decrease. Show that dy = 0.
(d) Provide examples to show that both alternatives in part (c) are possible.

Exercise 3.27 *
(a) Suppose that we wish to find a vector x E Rn that satisfies Ax = 0 and x ::: 0, and such that the number of positive components of x is maximized. Show that this can be accomplished by solving the linear programming problem
    maximize    Σ_{i=1}^{n} yi
    subject to  A(z + y) = 0
                yi ≤ 1,    for all i,
                z, y ≥ 0.
(b) Suppose that we wish to find a vector x E Rn that satisfies Ax = b and x ::: 0, and such that the number of positive components of x is maximized. Show how this can be accomplished by solving a single linear programming problem.
Exercise 3.28 Consider a linear programming problem in standard form with a bounded feasible set. Furthermore, suppose that we know the value of a scalar U such that any feasible solution satisfies xi ≤ U, for all i. Show that the problem can be transformed into an equivalent one that contains the constraint Σ_{i=1}^{n} xi = 1.
Exercise 3.29 Consider the simplex method, viewed in terms of column geom etry. Show that the m + 1 basic points (Ai, Ci), as defined in Section 3.6, are affinely independent.
Exercise 3.30 Consider the simplex method, viewed in terms of column geom etry. In the terminology of Section 3.6, show that the vertical distance from the dual plane to a point (Aj , Cj ) is equal to the reduced cost of the variable Xj .
Exercise 3.31 Consider the linear programming problem
    minimize    x1 + 3x2 + 2x3 + 2x4
    subject to  2x1 + 3x2 + x3 + x4 = b1
                 x1 + 2x2 + x3 + 3x4 = b2
                 x1 + x2 + x3 + x4 = 1
                 x1, ..., x4 ≥ 0,
where bl , b2 are free parameters. Let P(bl , b2) be the feasible set. Use the column geometry of linear programming to answer the following questions.
(a) Characterize explicitly (preferably with a picture) the set of all (bl ‘ b2) for whichP(h,b2) isnonempty.
(b) Characterize explicitly (preferably with a picture) the set of all (b1 , b2) for
which some basic feasible solution is degenerate.
(c) There are four bases in this problem; in the ith basis, all variables except for Xi are basic. For every (b1 , b2) for which there exists a degenerate basic feasible solution, enumerate all bases that correspond to each degenerate basic feasible solution.
(d) Fori= 1,…,4,let8i= {(b1,b2)Itheithbasisisoptimal}. Identify, preferably with a picture, the sets 81 , . . . , 84 .
(e) For which values of (b1 , b2 ) is the optimal solution degenerate?
(f) Let b1 = 9/5 and b2 = 7/5. Suppose that we start the simplex method with X2 , X3 , X4 as the basic variables. Which path will the simplex method follow?
Exercise 3.32 * Prove Theorem 3.5.
Exercise 3.33 Consider a polyhedron in standard form, and let x, y be two different basic feasible solutions . If we are allowed to move from any basic feasible solution to an adjacent one in a single step, show that we can go from x to y in a finite number of steps.
3.10 Notes and sources
3.2. The simplex method was pioneered by Dantzig in 1947, who later wrote a comprehensive text on the subject (Dantzig, 1963).
3.3. For more discussion of practical implementations of the simplex method based on products of sparse matrices, instead of B⁻¹, see the books by Gill, Murray, and Wright (1981), Chvatal (1983), Murty (1983), and Luenberger (1984). An excellent introduction to numerical linear algebra is the text by Golub and Van Loan (1983). Example 3.6, which shows the possibility of cycling, is due to Beale (1955).
If we have upper bounds for all or some of the variables, instead of converting the problem to standard form, we can use a suitable adaptation of the simplex method. This is developed in Exercise 3.25 and in the textbooks that we mentioned earlier.
3.4. The lexicographic anticycling rule is due to Dantzig, Orden, and Wolfe (1955). It can be viewed as an outgrowth of a perturbation method developed by Orden and also by Charnes (1952). For an exposition of the perturbation method, see Chvatal (1983) and Murty (1983), as well as Exercise 3.15. The smallest subscript rule is due to Bland (1977). A proof that Bland’s rule avoids cycling can also be found in Papadimitriou and Steiglitz (1982), Chvatal (1983), or Murty (1983).
3.6. The column geometry interpretation o f the simplex method i s due t o Dantzig (1963). For further discussion, see Stone and Tovey (1991).
3.7. The example showing that the simplex method can take an exponential number of iterations is due to Klee and Minty (1972). The Hirsch conjecture was made by Hirsch in 1957. The first results on the average case behavior of the simplex method were obtained by Borgwardt (1982) and Smale (1983). Schrijver (1986) contains an overview of the early research in this area, as well as a proof of the n/2 bound on the number of pivots due to Haimovich (1983).
3.9. The results in Exercises 3.10 and 3.11, which deal with the smallest examples of cycling, are due to Marshall and Suurballe (1969). The matrix inversion lemma [Exercise 3.13(a)] is known as the Sherman-Morrison formula.
Chapter 4 Duality theory
Contents
4.1. Motivation
4.2. The dual problem
4.3. The duality theorem
4.4. Optimal dual variables as marginal costs
4.5. Standard form problems and the dual simplex method
4.6. Farkas’ lemma and linear inequalities
4.7. From separating hyperplanes to duality*
4.8. Cones and extreme rays
4.9. Representation of polyhedra
4.10. General linear programming duality*
4.11. Summary
4.12. Exercises
4.13. Notes and sources
139
140 Chap. 4 Duality theory
In this chapter, we start with a linear programming problem, called the pri mal, and introduce another linear programming problem, called the dual. Duality theory deals with the relation between these two problems and un covers the deeper structure of linear programming. It is a powerful theoret ical tool that has numerous applications, provides new geometric insights, and leads to another algorithm for linear programming (the dual simplex method) .
4.1 Motivation
Duality theory can be motivated as an outgrowth of the Lagrange multiplier method, often used in calculus to minimize a function subject to equality constraints. For example, in order to solve the problem

    minimize    x² + y²
    subject to  x + y = 1,

we introduce a Lagrange multiplier p and form the Lagrangean L(x, y, p) defined by

    L(x, y, p) = x² + y² + p(1 − x − y).

While keeping p fixed, we minimize the Lagrangean over all x and y, subject to no constraints, which can be done by setting ∂L/∂x and ∂L/∂y to zero. The optimal solution to this unconstrained problem is

    x = y = p/2,

and depends on p. The constraint x + y = 1 gives us the additional relation p = 1, and the optimal solution to the original problem is x = y = 1/2.
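The calculation is easy to reproduce symbolically. A small sketch, assuming SymPy is available:

import sympy as sp

x, y, p = sp.symbols('x y p', real=True)
L = x**2 + y**2 + p * (1 - x - y)              # the Lagrangean

# Minimize over x and y with p fixed: set the partial derivatives to zero.
stationary = sp.solve([sp.diff(L, x), sp.diff(L, y)], [x, y])
print(stationary)                               # {x: p/2, y: p/2}

# Enforce the constraint x + y = 1 to pin down the multiplier.
p_star = sp.solve(sp.Eq(stationary[x] + stationary[y], 1), p)[0]
print(p_star, stationary[x].subs(p, p_star))    # 1, 1/2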
The main idea in the above example is the following. Instead of enforcing the hard constraint x + y = 1, we allow it to be violated and associate a Lagrange multiplier, or price, p with the amount 1 − x − y by which it is violated. This leads to the unconstrained minimization of x² + y² + p(1 − x − y). When the price is properly chosen (p = 1, in our example), the optimal solution to the constrained problem is also optimal for the unconstrained problem. In particular, under that specific value of p, the presence or absence of the hard constraint does not affect the optimal cost.

The situation in linear programming is similar: we associate a price variable with each constraint and start searching for prices under which the presence or absence of the constraints does not affect the optimal cost. It turns out that the right prices can be found by solving a new linear programming problem, called the dual of the original. We now motivate the form of the dual problem.
Consider the standard form problem

    minimize    c'x
    subject to  Ax = b
                x ≥ 0,

which we call the primal problem, and let x* be an optimal solution, assumed to exist. We introduce a relaxed problem in which the constraint Ax = b is replaced by a penalty p'(b − Ax), where p is a price vector of the same dimension as b. We are then faced with the problem

    minimize    c'x + p'(b − Ax)
    subject to  x ≥ 0.

Let g(p) be the optimal cost for the relaxed problem, as a function of the price vector p. The relaxed problem allows for more options than those present in the primal problem, and we expect g(p) to be no larger than the optimal primal cost c'x*. Indeed,

    g(p) = min_{x≥0} [c'x + p'(b − Ax)] ≤ c'x* + p'(b − Ax*) = c'x*,

where the last inequality follows from the fact that x* is a feasible solution to the primal problem, and satisfies Ax* = b. Thus, each p leads to a lower bound g(p) for the optimal cost c'x*. The problem

    maximize    g(p)
    subject to  no constraints

can then be interpreted as a search for the tightest possible lower bound of this type, and is known as the dual problem. The main result in duality theory asserts that the optimal cost in the dual problem is equal to the optimal cost c'x* in the primal. In other words, when the prices are chosen according to an optimal solution for the dual problem, the option of violating the constraints Ax = b is of no value.

Using the definition of g(p), we have

    g(p) = min_{x≥0} [c'x + p'(b − Ax)]
         = p'b + min_{x≥0} (c' − p'A)x.

Note that

    min_{x≥0} (c' − p'A)x = { 0,    if c' − p'A ≥ 0',
                            { −∞,   otherwise.

In maximizing g(p), we only need to consider those values of p for which g(p) is not equal to −∞. We therefore conclude that the dual problem is the same as the linear programming problem

    maximize    p'b
    subject to  p'A ≤ c'.
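Numerically, every dual feasible p does give a lower bound, and solving the dual recovers the primal optimal cost. The sketch below (assuming NumPy and SciPy; the small data set is chosen arbitrarily for illustration) solves a standard form primal and its dual, max p'b subject to A'p ≤ c with p free, and compares the two optimal values.

import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-1.0, -2.0, 0.0, 0.0])

# Primal: minimize c'x subject to Ax = b, x >= 0.
primal = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * A.shape[1], method="highs")

# Dual: maximize p'b subject to A'p <= c, p free; linprog minimizes, so negate the objective.
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(None, None)] * A.shape[0], method="highs")

print(primal.fun, -dual.fun)   # both equal -5.0: the optimal costs coincide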
In the preceding example, we started with the equality constraint Ax = b and we ended up with no constraints on the sign of the price vector p. If the primal problem had instead inequality constraints of the form Ax ≥ b, they could be replaced by Ax − s = b, s ≥ 0. The equality constraint can be written in the form

    [A | −I] (x, s) = b,

which leads to the dual constraints

    p'[A | −I] ≤ [c' | 0'],

or, equivalently,

    p'A ≤ c',    p ≥ 0.

Also, if the vector x is free rather than sign-constrained, we use the fact

    min_x (c' − p'A)x = { 0,    if c' − p'A = 0',
                        { −∞,   otherwise,

to end up with the constraint p'A = c' in the dual problem. These considerations motivate the general form of the dual problem which we introduce in the next section.
In summary, the construction of the dual of a primal minimization problem can be viewed as follows. We have a vector of parameters (dual variables) p, and for every p we have a method for obtaining a lower bound on the optimal primal cost. The dual problem is a maximization problem that looks for the tightest such lower bound. For some vectors p, the corresponding lower bound is equal to -00, and does not carry any useful information. Thus, we only need to maximize over those p that lead to nontrivial lower bounds, and this is what gives rise to the dual constraints.
4.2 The dual problem
Let A be a matrix with rows ai' and columns Aj. Given a primal problem with the structure shown on the left, its dual is defined to be the maximization problem shown on the right:

    minimize   c'x                         maximize   p'b
    subject to ai'x ≥ bi,   i ∈ M1,        subject to pi ≥ 0,       i ∈ M1,
               ai'x ≤ bi,   i ∈ M2,                   pi ≤ 0,       i ∈ M2,
               ai'x = bi,   i ∈ M3,                   pi free,      i ∈ M3,
               xj ≥ 0,      j ∈ N1,                   p'Aj ≤ cj,    j ∈ N1,
               xj ≤ 0,      j ∈ N2,                   p'Aj ≥ cj,    j ∈ N2,
               xj free,     j ∈ N3,                   p'Aj = cj,    j ∈ N3.
Notice that for each constraint in the primal (other than the sign con straints), we introduce a variable in the dual problem; for each variable in the primal, we introduce a constraint in the dual. Depending on whether the primal constraint is an equality or inequality constraint, the corre sponding dual variable is either free or sign-constrained, respectively. In addition, depending on whether a variable in the primal problem is free or sign-constrained, we have an equality or inequality constraint, respectively, in the dual problem. We summarize these relations in Table 4.1.
Table 4.1: Relation between primal and dual variables and constraints.

    PRIMAL          minimize       maximize       DUAL
    constraints     ≥ bi           ≥ 0            variables
                    ≤ bi           ≤ 0
                    = bi           free
    variables       ≥ 0            ≤ cj           constraints
                    ≤ 0            ≥ cj
                    free           = cj

If we start with a maximization problem, we can always convert it into an equivalent minimization problem, and then form its dual according to the rules we have described. However, to avoid confusion, we will adhere to the convention that the primal is a minimization problem, and its dual is a maximization problem. Finally, we will keep referring to the objective function in the dual problem as a "cost" that is being maximized.

A problem and its dual can be stated more compactly, in matrix notation, if a particular form is assumed for the primal. We have, for example, the following pairs of primal and dual problems:

    minimize   c'x               maximize   p'b
    subject to Ax = b            subject to p'A ≤ c',
               x ≥ 0,

and

    minimize   c'x               maximize   p'b
    subject to Ax ≥ b,           subject to p'A = c',
                                            p ≥ 0.

Example 4.1 Consider the primal problem shown on the left and its dual shown on the right:
free
Let us assume temporarily that the rows of A are linearly independent and that there exists an optimal solution. Let us apply the simplex method to this problem. As long as cycling is avoided, e.g., by using the lexicographic pivoting rule, the simplex method terminates with an optimal solution x and an optimal basis B. Let x_B = B⁻¹b be the corresponding vector of basic variables. When the simplex method terminates, the reduced costs must be nonnegative and we obtain

    c' − c_B'B⁻¹A ≥ 0',

where c_B is the vector with the costs of the basic variables. Let us define a vector p by letting p' = c_B'B⁻¹. We then have p'A ≤ c', which shows that p is a feasible solution to the dual problem

    maximize   p'b
    subject to p'A ≤ c'.

In addition,

    p'b = c_B'B⁻¹b = c_B'x_B = c'x.

It follows that p is an optimal solution to the dual (cf. Corollary 4.2), and the optimal dual cost is equal to the optimal primal cost.
If we are dealing with a general linear programming problem Π1 that has an optimal solution, we first transform it into an equivalent standard form problem Π2, with the same optimal cost, and in which the rows of the matrix A are linearly independent. Let D1 and D2 be the duals of Π1 and Π2, respectively. By Theorem 4.2, the dual problems D1 and D2 have the same optimal cost. We have already proved that Π2 and D2 have the same optimal cost. It follows that Π1 and D1 have the same optimal cost (see Figure 4.1).
Figure 4.1: Proof of the duality theorem for general linear programming problems.
The preceding proof shows that an optimal solution to the dual prob lem is obtained as a byproduct of the simplex method as applied to a primal problem in standard form. It is based on the fact that the simplex method is guaranteed to terminate and this, in turn, depends on the existence of pivoting rules that prevent cycling. There is an alternative derivation of the duality theorem, which provides a geometric, algorithm-independent view of the subject, and which is developed in Section 4.7. At this point, we provide an illustration that conveys most of the content of the geometric proof.
Example 4.4 Consider a solid ball constrained to lie in a polyhedron defined by inequality constraints of the form a�x 2: bi. If left under the influence of gravity, this ball reaches equilibrium at the lowest corner x* of the polyhedron; see Figure 4.2. This corner is an optimal solution to the problem
    minimize    c'x
    subject to  ai'x ≥ bi,    for all i,
where c is a vertical vector pointing upwards. At equilibrium, gravity is counterbalanced by the forces exerted on the ball by the "walls" of the polyhedron. The latter forces are normal to the walls, that is, they are aligned with the vectors ai. We conclude that c = Σi pi ai, for some nonnegative coefficients pi; in particular, the vector p is a feasible solution to the dual problem

    maximize    p'b
    subject to  p'A = c'
                p ≥ 0.
Given that forces can only be exerted by the walls that touch the ball, we must have pi = 0 whenever ai'x* > bi. Consequently, pi(bi − ai'x*) = 0 for all i. We therefore have p'b = Σi pi bi = Σi pi ai'x* = c'x*. It follows (Corollary 4.2) that p is an optimal solution to the dual, and the optimal dual cost is equal to the optimal primal cost.

Figure 4.2: A mechanical analogy of the duality theorem.

Recall that in a linear programming problem, exactly one of the following three possibilities will occur:
(a) There is an optimal solution.
(b) The problem is "unbounded"; that is, the optimal cost is −∞ (for minimization problems), or +∞ (for maximization problems).
(c) The problem is infeasible.
This leads to nine possible combinations for the primal and the dual, which are shown in Table 4.2. By the strong duality theorem, if one problem has an optimal solution, so does the other. Furthermore, as discussed earlier, the weak duality theorem implies that if one problem is unbounded, the other must be infeasible. This allows us to mark some of the entries in Table 4.2 as "impossible."
                     Finite optimum   Unbounded    Infeasible
    Finite optimum    Possible         Impossible   Impossible
    Unbounded         Impossible       Impossible   Possible
    Infeasible        Impossible       Possible     Possible

Table 4.2: The different possibilities for the primal and the dual.
The case where both problems are infeasible can indeed occur, as shown by
the following example.
Example 4.5 Consider the infeasible primal

    minimize    x1 + 2x2
    subject to  x1 + x2 = 1
                2x1 + 2x2 = 3.

Its dual is

    maximize    p1 + 3p2
    subject to  p1 + 2p2 = 1
                p1 + 2p2 = 2,

which is also infeasible.
There is another interesting relation between the primal and the dual which is known as Clark’s theorem (Clark, 1961). It asserts that unless both problems are infeasible, at least one of them must have an unbounded feasible set (Exercise 4.21).
Complementary slackness
An important relation between primal and dual optimal solutions is pro vided by the complementary slackness conditions, which we present next.
Theorem 4.5 (Complementary slackness) Let x and p be feasible solutions to the primal and the dual problem, respectively. The vectors x and p are optimal solutions for the two respective problems if and only if:

    pi(ai'x − bi) = 0,    for all i,
    (cj − p'Aj)xj = 0,    for all j.
Proof. In the proof of Theorem 4.3, we defined ui = pi(ai'x − bi) and vj = (cj − p'Aj)xj, and noted that for x primal feasible and p dual feasible, we have ui ≥ 0 and vj ≥ 0 for all i and j. In addition, we showed that

    c'x − p'b = Σi ui + Σj vj.

By the strong duality theorem, if x and p are optimal, then c'x = p'b, which implies that ui = vj = 0 for all i, j. Conversely, if ui = vj = 0 for all i, j, then c'x = p'b, and Corollary 4.2 implies that x and p are optimal. ∎
The first complementary slackness condition is automatically satis fied by every feasible solution to a problem in standard form. If the pri mal problem is not in standard form and has a constraint like a�x � bi, the corresponding complementary slackness condition asserts that the dual variable Pi is zero unless the constraint is active. An intuitive explanation is that a constraint which is not active at an optimal solution can be re moved from the problem without affecting the optimal cost, and there is no point in associating a nonzero price with such a constraint. Note also the analogy with Example 4.4, where “forces” were only exerted by the active constraints.
Ifthe primal problem is in standard form and a nondegenerate optimal basic feasible solution is known, the complementary slackness conditions determine a unique solution to the dual problem. We illustrate this fact in the next example.
Example 4.6 Consider a problem in standard form and its dual:
    minimize    13x1 + 10x2 + 6x3          maximize    8p1 + 3p2
    subject to  5x1 + x2 + 3x3 = 8          subject to  5p1 + 3p2 ≤ 13
                3x1 + x2       = 3                       p1 + p2  ≤ 10
                x1, x2, x3 ≥ 0,                          3p1      ≤ 6.

As will be verified shortly, the vector x* = (1, 0, 1) is a nondegenerate optimal solution to the primal problem. Assuming this to be the case, we use the complementary slackness conditions to construct the optimal solution to the dual. The condition p_i(a_i'x* - b_i) = 0 is automatically satisfied for each i, since the primal is in standard form. The condition (c_j - p'A_j)x_j = 0 is clearly satisfied for j = 2, because x2* = 0. However, since x1* > 0 and x3* > 0, we obtain

    5p1 + 3p2 = 13

and

    3p1 = 6,

which we can solve to obtain p1 = 2 and p2 = 1. Note that this is a dual feasible solution whose cost is equal to 19, which is the same as the cost of x*. This verifies that x* is indeed an optimal solution, as claimed earlier.
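The computation in Example 4.6 is easy to reproduce numerically. The sketch below assumes numpy and scipy are available; it solves the two complementary slackness equations for p and cross-checks the resulting costs against a direct solution of the primal.

import numpy as np
from scipy.optimize import linprog

A = np.array([[5.0, 1.0, 3.0],
              [3.0, 1.0, 0.0]])
b = np.array([8.0, 3.0])
c = np.array([13.0, 10.0, 6.0])

x_star = np.array([1.0, 0.0, 1.0])       # the claimed nondegenerate optimal solution
positive = [0, 2]                        # x1 > 0 and x3 > 0

# Complementary slackness: p'A_j = c_j for every j with x_j > 0.
p = np.linalg.solve(A[:, positive].T, c[positive])
print(p)                                 # [2. 1.]
print(c @ x_star, b @ p)                 # both equal 19.0

res = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 3)
print(res.fun)                           # 19.0, confirming the optimality of x*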
We now generalize the above example. Suppose that x_j is a basic variable in a nondegenerate optimal basic feasible solution to a primal problem in standard form. Then, the complementary slackness condition (c_j - p'A_j)x_j = 0 yields p'A_j = c_j for every such j. Since the basic columns A_j are linearly independent, we obtain a system of equations for p which has a unique solution, namely, p' = c_B'B^{-1}. A similar conclusion can also be drawn for problems not in standard form (Exercise 4.12). On the other hand, if we are given a degenerate optimal basic feasible solution to the primal, complementary slackness may be of very little help in determining an optimal solution to the dual problem (Exercise 4.17).

We finally mention that if the primal constraints are of the form Ax ≥ b, x ≥ 0, and the primal problem has an optimal solution, then there exist optimal solutions to the primal and the dual which satisfy strict complementary slackness; that is, a variable in one problem is nonzero if and only if the corresponding constraint in the other problem is active (Exercise 4.20). This result has some interesting applications in discrete optimization, but these lie outside the scope of this book.
A geometric view
We now develop a geometric view that allows us to visualize pairs of primal and dual vectors without having to draw the dual feasible set.
We consider the primal problem

    minimize    c'x
    subject to  a_i'x ≥ b_i,    i = 1, ..., m,

where the dimension of x is equal to n. We assume that the vectors a_i span ℝ^n. The corresponding dual problem is

    maximize    p'b
    subject to  Σ_{i=1}^m p_i a_i = c
                p ≥ 0.

Let I be a subset of {1, ..., m} of cardinality n, such that the vectors a_i, i ∈ I, are linearly independent. The system a_i'x = b_i, i ∈ I, has a unique solution, denoted by x^I, which is a basic solution to the primal problem (cf. Definition 2.9 in Section 2.2). We assume that x^I is nondegenerate, that is, a_i'x^I ≠ b_i for i ∉ I.

Let p ∈ ℝ^m be a dual vector (not necessarily dual feasible), and let us consider what is required for x^I and p to be optimal solutions to the primal and the dual problem, respectively. We need:
(a) a_i'x^I ≥ b_i, for all i,            (primal feasibility),
(b) p_i = 0, for all i ∉ I,              (complementary slackness),
(c) Σ_{i=1}^m p_i a_i = c,               (dual feasibility),
(d) p ≥ 0,                               (dual feasibility).
Given the complementary slackness condition (b), condition (c) becomes

    Σ_{i∈I} p_i a_i = c.

Since the vectors a_i, i ∈ I, are linearly independent, the latter equation has a unique solution that we denote by p^I. In fact, it is readily seen that the vectors a_i, i ∈ I, form a basis for the dual problem (which is in standard form) and p^I is the associated basic solution. For the vector p^I to be dual feasible, we also need it to be nonnegative. We conclude that once the complementary slackness condition (b) is enforced, feasibility of the resulting dual vector p^I is equivalent to c being a nonnegative linear combination of the vectors a_i, i ∈ I, associated with the active primal constraints. This allows us to visualize dual feasibility without having to draw the dual feasible set; see Figure 4.3.

Figure 4.3: Consider a primal problem with two variables and five inequality constraints (n = 2, m = 5), and suppose that no two of the vectors a_i are collinear. Every two-element subset I of {1, 2, 3, 4, 5} determines basic solutions x^I and p^I of the primal and the dual, respectively.
If I = {1, 2}, x^I is primal infeasible (point A) and p^I is dual infeasible, because c cannot be expressed as a nonnegative linear combination of the vectors a1 and a2.
If I = {1, 3}, x^I is primal feasible (point B) and p^I is dual infeasible.
If I = {1, 4}, x^I is primal feasible (point C) and p^I is dual feasible, because c can be expressed as a nonnegative linear combination of the vectors a1 and a4. In particular, x^I and p^I are optimal.
If I = {1, 5}, x^I is primal infeasible (point D) and p^I is dual feasible.
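The mechanics behind Figure 4.3 are easy to carry out numerically. The data below are hypothetical (the figure is not specified by numbers); the point of the sketch, which assumes numpy, is how x^I, p^I, and the two feasibility checks are obtained from a chosen index set I.

import numpy as np

a = np.array([[1.0, 0.0],      # a_1
              [0.0, 1.0],      # a_2
              [1.0, 1.0]])     # a_3
b = np.array([0.0, 0.0, 1.0])
c = np.array([2.0, 1.0])

I = [0, 2]                                  # an index set with linearly independent a_i
x_I = np.linalg.solve(a[I], b[I])           # basic solution: a_i'x = b_i for i in I
p_I = np.linalg.solve(a[I].T, c)            # solves sum_{i in I} p_i a_i = c

primal_feasible = np.all(a @ x_I >= b - 1e-9)
dual_feasible = np.all(p_I >= -1e-9)        # c is a nonnegative combination of a_i, i in I
print(x_I, p_I, primal_feasible, dual_feasible)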
If x* is a degenerate basic solution to the primal, there can be several subsets I such that x^I = x*. Using different choices for I, and by solving the system Σ_{i∈I} p_i a_i = c, we may obtain several dual basic solutions p^I. It may then well be the case that some of them are dual feasible and some are not; see Figure 4.4. Still, if p^I is dual feasible (i.e., all p_i are nonnegative) and if x* is primal feasible, then they are both optimal, because we have been enforcing complementary slackness and Theorem 4.5 applies.

Figure 4.4: The vector x* is a degenerate basic feasible solution of the primal. If we choose I = {1, 2}, the corresponding dual basic solution p^I is infeasible, because c is not a nonnegative linear combination of a1, a2. On the other hand, if we choose I = {1, 3} or I = {2, 3}, the resulting dual basic solution p^I is feasible and, therefore, optimal.

4.4 Optimal dual variables as marginal costs

In this section, we elaborate on the interpretation of the dual variables as prices. This theme will be revisited, in more depth, in Chapter 5.
Consider the standard form problem

    minimize    c'x
    subject to  Ax = b
                x ≥ 0.

We assume that the rows of A are linearly independent and that there
is a nondegenerate basic feasible solution x* which is optimal. Let B be the corresponding basis matrix and let x_B = B^{-1}b be the vector of basic variables, which is positive, by nondegeneracy. Let us now replace b by b + d, where d is a small perturbation vector. Since B^{-1}b > 0, we also have B^{-1}(b + d) > 0, as long as d is small. This implies that the same basis leads to a basic feasible solution of the perturbed problem as well. Perturbing the right-hand side vector b has no effect on the reduced costs associated with this basis. By the optimality of x* in the original problem, the vector of reduced costs c' - c_B'B^{-1}A is nonnegative and this establishes that the same basis is optimal for the perturbed problem as well. Thus, the optimal cost in the perturbed problem is

    c_B'B^{-1}(b + d) = p'(b + d),

where p' = c_B'B^{-1} is an optimal solution to the dual problem. Therefore, a small change of d in the right-hand side vector b results in a change of p'd in the optimal cost. We conclude that each component p_i of the optimal dual vector can be interpreted as the marginal cost (or shadow price) per unit increase of the ith requirement b_i.
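This marginal cost interpretation is easy to observe numerically. In the sketch below (the data are invented; numpy and scipy are assumed), we read p' = c_B'B^{-1} off the optimal basis and compare p'd with the actual change in the optimal cost after perturbing b.

import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-2.0, -3.0, 0.0, 0.0])

base = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 4)
basic = np.flatnonzero(base.x > 1e-9)          # basic variables (nondegeneracy assumed)
p = np.linalg.solve(A[:, basic].T, c[basic])   # p' = c_B' B^{-1}

d = np.array([0.05, -0.02])                    # a small perturbation of the requirements
pert = linprog(c, A_eq=A, b_eq=b + d, bounds=[(0, None)] * 4)

print(pert.fun - base.fun)                     # observed change in the optimal cost
print(p @ d)                                   # predicted change p'd (the two agree: -0.065)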
We conclude with yet another interpretation of duality, for standard form problems. In order to develop some concrete intuition, we phrase our discussion in terms of the diet problem (Example 1.3 in Section 1.1). We interpret each vector A_j as the nutritional content of the jth available food, and view b as the nutritional content of an ideal food that we wish to synthesize. Let us interpret p_i as the "fair" price per unit of the ith nutrient. A unit of the jth food has a value of c_j at the food market, but it also has a value of p'A_j if priced at the nutrient market. Complementary slackness asserts that every food which is used (at a nonzero level) to synthesize the ideal food, should be consistently priced at the two markets. Thus, duality is concerned with two alternative ways of cost accounting. The value of the ideal food, as computed in the food market, is c'x*, where x* is an optimal solution to the primal problem; the value of the ideal food, as computed in the nutrient market, is p'b. The duality relation c'x* = p'b states that when prices are chosen appropriately, the two accounting methods should give the same results.
4.5 Standard form problems and the dual simplex method
In this section, we concentrate on the case where the primal problem is in standard form. We develop the dual simplex method, which is an alternative to the simplex method of Chapter 3. We also comment on the relation between the basic feasible solutions to the primal and the dual, including a discussion of dual degeneracy.
In the proof of the strong duality theorem, we considered the simplex method applied to a primal problem in standard form and defined a dual vector p by letting p' = c_B'B^{-1}. We then noted that the primal optimality condition c' - c_B'B^{-1}A ≥ 0' is the same as the dual feasibility condition p'A ≤ c'. We can thus think of the simplex method as an algorithm that maintains primal feasibility and works towards dual feasibility. A method with this property is generally called a primal algorithm. An alternative is to start with a dual feasible solution and work towards primal feasibility. A method of this type is called a dual algorithm. In this section, we present a dual simplex method, implemented in terms of the full tableau. We argue that it does indeed solve the dual problem, and we show that it moves from one basic feasible solution of the dual problem to another. An alternative implementation that only keeps track of the matrix B^{-1}, instead of the entire tableau, is called a revised dual simplex method (Exercise 4.23).
The dual simplex method
Let us consider a problem in standard form, under the usual assumption that the rows of the matrix A are linearly independent. Let B be a basis matrix, consisting of m linearly independent columns of A, and consider the corresponding tableau

    -c_B'B^{-1}b | c' - c_B'B^{-1}A
      B^{-1}b    |     B^{-1}A

or, in more detail,

    -c_B'x_B | c̄_1  ...  c̄_n
    ---------+--------------------------
     x_B(1)  |
        .    |  B^{-1}A_1 ... B^{-1}A_n
        .    |
     x_B(m)  |

We do not require B^{-1}b to be nonnegative, which means that we have a basic, but not necessarily feasible solution to the primal problem. However, we assume that c̄ ≥ 0; equivalently, the vector p' = c_B'B^{-1} satisfies p'A ≤ c', and we have a feasible solution to the dual problem.
The cost of this dual feasible solution is p'b = c_B'B^{-1}b = c_B'x_B, which is the negative of the entry at the upper left corner of the tableau. If the inequality B^{-1}b ≥ 0 happens to hold, we also have a primal feasible solution with the same cost, and optimal solutions to both problems have been found. If the inequality B^{-1}b ≥ 0 fails to hold, we perform a change of basis in a manner we describe next.
We find some ℓ such that x_B(ℓ) < 0 and consider the ℓth row of the tableau, called the pivot row; this row is of the form (x_B(ℓ), v_1, ..., v_n), where v_i is the ℓth component of B^{-1}A_i. For each i with v_i < 0 (if such i exist), we form the ratio c̄_i/|v_i| and let j be an index for which this ratio is smallest; that is, v_j < 0 and

    c̄_j/|v_j| = min_{i : v_i < 0} c̄_i/|v_i|.                        (4.2)

(We call the corresponding entry v_j the pivot element. Note that x_j must be a nonbasic variable, since the jth column in the tableau contains the negative element v_j.) We then perform a change of basis: column A_j enters the basis and column A_B(ℓ) exits. This change of basis (or pivot) is effected exactly as in the primal simplex method: we add to each row of the tableau a multiple of the pivot row so that all entries in the pivot column are set to zero, with the exception of the pivot element which is set to 1. In particular, in order to set the reduced cost in the pivot column to zero, we multiply the pivot row by c̄_j/|v_j| and add it to the zeroth row. For every i, the new value of c̄_i is equal to

    c̄_i + (c̄_j/|v_j|) v_i,

which is nonnegative because of the way that j was selected [cf. Eq. (4.2)]. We conclude that the reduced costs in the new tableau will also be nonnegative and dual feasibility has been maintained.
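A single iteration of this method is short enough to write out in full. The following sketch (numpy assumed; it is an illustration of the rule above, not any particular production implementation) performs one dual simplex pivot on a full tableau whose zeroth row holds the reduced costs and whose remaining rows hold [x_B(i) | B^{-1}A]. Applied to the tableau of Example 4.7 below, one call of this function should reproduce the pivot carried out there.

import numpy as np

def dual_simplex_pivot(T):
    """One pivot of the dual simplex method on the full tableau T (modified in place).
    Returns False if no pivot is needed (primal feasible) or possible (primal infeasible)."""
    rhs = T[1:, 0]
    if np.all(rhs >= 0):
        return False                                   # optimal: primal feasible already
    ell = 1 + int(np.argmin(rhs))                      # a pivot row with x_B(ell) < 0
    row = T[ell, 1:]
    negatives = np.flatnonzero(row < 0)
    if negatives.size == 0:
        return False                                   # dual cost unbounded, primal infeasible
    ratios = T[0, 1:][negatives] / np.abs(row[negatives])
    j = 1 + negatives[int(np.argmin(ratios))]          # entering column, cf. Eq. (4.2)
    T[ell, :] /= T[ell, j]                             # scale the pivot row
    for i in range(T.shape[0]):
        if i != ell:
            T[i, :] -= T[i, j] * T[ell, :]             # zero out the rest of the pivot column
    return True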
Example 4.7 Consider the tableau

                x1    x2    x3    x4    x5
       0    |    2     6    10     0     0
  x4 = 2    |   -2     4     1     1     0
  x5 = -1   |    4    -2*   -3     0     1

Since x_B(2) < 0, we choose the second row to be the pivot row. Negative entries of the pivot row are found in the second and third column. We compare the corresponding ratios 6/|-2| and 10/|-3|. The smallest ratio is 6/|-2| = 3 and, therefore, the second column enters the basis. (The pivot element is indicated by an asterisk.) We multiply the pivot row by 3 and add it to the zeroth row. We multiply the pivot row by 2 and add it to the first row. We then divide the pivot row by -2. The new tableau is

                x1    x2    x3    x4    x5
      -3    |   14     0     1     0     3
  x4 = 0    |    6     0    -5     1     2
  x2 = 1/2  |   -2     1    3/2    0   -1/2
and note that the first is the dual of the second. The maximization problem is infeasible, which implies that the minimization problem is either unbounded (the optimal cost is -∞) or infeasible. Since p = 0 is a feasible solution to the minimization problem, it follows that the minimization problem is unbounded. Therefore, there exists some p which is feasible, that is, p'A ≥ 0', and whose cost is negative, that is, p'b < 0. □

We now provide a geometric illustration of Farkas' lemma (see Figure 4.7). Let A_1, ..., A_n be the columns of the matrix A and note that Ax = Σ_{i=1}^n A_i x_i. Therefore, the existence of a vector x ≥ 0 satisfying Ax = b is the same as requiring that b lies in the set of all nonnegative linear combinations of the vectors A_1, ..., A_n, which is the shaded region in Figure 4.7. If b does not belong to the shaded region (in which case the first alternative in Farkas' lemma does not hold), we expect intuitively that we can find a vector p and an associated hyperplane {z | p'z = 0} such that b lies on one side of the hyperplane while the shaded region lies on the other side. We then have p'b < 0 and p'A_i ≥ 0 for all i, or, equivalently, p'A ≥ 0', and the second alternative holds.
Farkas' lemma predates the development of linear programming, but duality theory leads to a simple proof. A different proof, based on the geometric argument we just gave, is provided in the next section. Finally, there is an equivalent statement of Farkas' lemma which is sometimes more convenient .
Corollary 4.3 Let A_1, ..., A_n and b be given vectors and suppose that any vector p that satisfies p'A_i ≥ 0, i = 1, ..., n, must also satisfy p'b ≥ 0. Then, b can be expressed as a nonnegative linear combination of the vectors A_1, ..., A_n.

Our next result is of a similar character.
Theorem 4.7 Suppose that the system of linear inequalities Ax ≤ b has at least one solution, and let d be some scalar. Then, the following are equivalent:
(a) Every feasible solution to the system Ax ≤ b satisfies c'x ≤ d.
(b) There exists some p ≥ 0 such that p'A = c' and p'b ≤ d.

Proof. Consider the following pair of problems

    maximize    c'x            minimize    p'b
    subject to  Ax ≤ b,        subject to  p'A = c'
                                           p ≥ 0,

and note that the first is the dual of the second. If the system Ax ≤ b has a feasible solution and if every feasible solution satisfies c'x ≤ d, then the first problem has an optimal solution and the optimal cost is bounded above by d. By the strong duality theorem, the second problem also has an optimal solution p whose cost is bounded above by d. This optimal solution satisfies p'A = c', p ≥ 0, and p'b ≤ d.
Conversely, if some p satisfies p'A = c', p ≥ 0, and p'b ≤ d, then the weak duality theorem asserts that every feasible solution to the first problem must also satisfy c'x ≤ d. □

Results such as Theorems 4.6 and 4.7 are often called theorems of the alternative. There are several more results of this type; see, for example, Exercises 4.26, 4.27, and 4.28.

Figure 4.7: If the vector b does not belong to the set of all nonnegative linear combinations of A_1, ..., A_n, then we can find a hyperplane {z | p'z = 0} that separates it from that set.
Applications of Farkas' lemma to asset pricing
Consider a market that operates for a single period, and in which n different assets are traded. Depending on the events during that single period, there are m possible states of nature at the end of the period. If we invest one dollar in some asset i and the state of nature turns out to be s, we receive a payoff of r_si. Thus, each asset i is described by a payoff vector (r_1i, ..., r_mi). The following m × n payoff matrix gives the payoffs of each of the n assets for each of the m states of nature:

        | r_11  ...  r_1n |
    R = |  .          .   |
        | r_m1  ...  r_mn |

Let x_i be the amount held of asset i. A portfolio of assets is then a vector x = (x_1, ..., x_n). The components of a portfolio x can be either positive or negative. A positive value of x_i indicates that one has bought x_i units of asset i and is thus entitled to receive r_si x_i if state s materializes. A negative value of x_i indicates a "short" position in asset i: this amounts to selling |x_i| units of asset i at the beginning of the period, with a promise to buy them back at the end. Hence, one must pay out r_si|x_i| if state s occurs, which is the same as receiving a payoff of r_si x_i.
The wealth in state s that results from a portfolio x is given by

    w_s = Σ_{i=1}^n r_si x_i.

We introduce the vector w = (w_1, ..., w_m), and we obtain w = Rx.
Let p_i be the price of asset i in the beginning of the period, and let p = (p_1, ..., p_n) be the vector of asset prices. Then, the cost of acquiring a portfolio x is given by p'x.
The central problem in asset pricing is to determine what the prices p_i should be. In order to address this question, we introduce the absence of arbitrage condition, which underlies much of finance theory: asset prices should always be such that no investor can get a guaranteed nonnegative payoff out of a negative investment. In other words, any portfolio that pays off nonnegative amounts in every state of nature, must be valuable to investors, so it must have nonnegative cost. Mathematically, the absence of arbitrage condition can be expressed as follows:

    if Rx ≥ 0, then we must have p'x ≥ 0.

Given a particular set of assets, as described by the payoff matrix R, only certain prices p are consistent with the absence of arbitrage. What characterizes such prices? What restrictions does the assumption of no arbitrage impose on asset prices? The answer is provided by Farkas' lemma.
Theorem 4.8 The absence of arbitrage condition holds if and only if there exists a nonnegative vector q = (q_1, ..., q_m), such that the price of each asset i is given by

    p_i = Σ_{s=1}^m q_s r_si.

Proof. The absence of arbitrage condition states that there exists no vector x such that x'R' ≥ 0' and x'p < 0. This is of the same form as condition (b) in the statement of Farkas' lemma (Theorem 4.6). (Note that here p plays the role of b, and R' plays the role of A.) Therefore, by Farkas' lemma, the absence of arbitrage condition holds if and only if there exists some nonnegative vector q such that R'q = p, which is the same as the condition in the theorem's statement. □
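Theorem 4.8 also suggests a simple computational test. The sketch below (scipy assumed; the payoff matrix and prices are invented) looks for nonnegative state prices q with R'q = p; by Farkas' lemma, such a q exists exactly when the given prices admit no arbitrage.

import numpy as np
from scipy.optimize import linprog

R = np.array([[1.0, 2.0],        # rows are states of nature, columns are assets
              [1.0, 0.5]])
p = np.array([0.9, 1.1])         # asset prices to be tested

# Feasibility problem: find q >= 0 with R'q = p (any objective will do).
res = linprog(c=np.zeros(R.shape[0]), A_eq=R.T, b_eq=p,
              bounds=[(0, None)] * R.shape[0])
if res.status == 0:
    print("no arbitrage; state prices q =", res.x)
else:
    print("these prices admit an arbitrage opportunity")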
Theorem 4.8 asserts that whenever the market works efficiently enough to eliminate the possibility of arbitrage, there must exist "state prices" q_s that can be used to value the existing assets. Intuitively, it establishes a nonnegative price q_s for an elementary asset that pays one dollar if the state of nature is s, and nothing otherwise. It then requires that every asset must be consistently priced, its total value being the sum of the values of the elementary assets from which it is composed. There is an alternative interpretation of the variables q_s as being (unnormalized) probabilities of the different states s, which, however, we will not pursue. In general, the state price vector q will not be unique, unless the number of assets equals or exceeds the number of states.
The no arbitrage condition is very simple, and yet very powerful. It is the key element behind many important results in financial economics, but these lie beyond the scope of this text. (See, however, Exercise 4.33 for an application in options pricing.)
4.7 From separating hyperplanes to duality*
Let us review the path followed in our development of duality theory. We started from the fact that the simplex method, in conjunction with an anticycling rule, is guaranteed to terminate. We then exploited the termination conditions of the simplex method to derive the strong duality theorem. We finally used the duality theorem to derive Farkas' lemma, which we interpreted in terms of a hyperplane that separates b from the columns of A. In this section, we show that the reverse line of argument is also possible. We start from first principles and prove a general result on separating hyperplanes. We then establish Farkas' lemma, and conclude by showing that the duality theorem follows from Farkas' lemma. This line of argument is more elegant and fundamental because instead of relying on the rather complicated development of the simplex method, it only involves a small number of basic geometric concepts. Furthermore, it can be naturally generalized to nonlinear optimization problems.
Closed sets and Weierstrass' theorem
Before we proceed any further, we need to develop some background material. A set S ⊂ ℝ^n is called closed if it has the following property: if x^1, x^2, ... is a sequence of elements of S that converges to some x ∈ ℝ^n, then x ∈ S. In other words, S contains the limit of any sequence of elements of S. Intuitively, the set S contains its boundary.

Theorem 4.9 Every polyhedron is closed.

Proof. Consider the polyhedron P = {x ∈ ℝ^n | Ax ≥ b}. Suppose that x^1, x^2, ... is a sequence of elements of P that converges to some x*. We have to show that x* ∈ P. For each k, we have x^k ∈ P and, therefore, Ax^k ≥ b. Taking the limit, we obtain Ax* = A(lim_{k→∞} x^k) = lim_{k→∞}(Ax^k) ≥ b, and x* belongs to P. □

The following is a fundamental result from real analysis that provides us with conditions for the existence of an optimal solution to an optimization problem. The proof lies beyond the scope of this book and is omitted.

Theorem 4.10 (Weierstrass' theorem) If f : ℝ^n → ℝ is a continuous function, and if S is a nonempty, closed, and bounded subset of ℝ^n, then there exists some x* ∈ S such that f(x*) ≤ f(x) for all x ∈ S. Similarly, there exists some y* ∈ S such that f(y*) ≥ f(x) for all x ∈ S.

Weierstrass' theorem is not valid if the set S is not closed. Consider, for example, the set S = {x ∈ ℝ | x > 0}. This set is not closed because we can form a sequence of elements of S that converge to zero, but x = 0 does not belong to S. We then observe that the cost function f(x) = x is not minimized at any point in S; for every x > 0, there exists another positive number with smaller cost, and no feasible x can be optimal. Ultimately, the reason that S is not closed is that the feasible set was defined by means of strict inequalities. The definition of polyhedra and linear programming problems does not allow for strict inequalities in order to avoid situations of this type.
The separating hyperplane theorem
The result that follows is "geometrically obvious" but nevertheless extremely important in the study of convex sets and functions. It states that if we are given a closed and nonempty convex set S and a point x* ∉ S, then we can find a hyperplane, called a separating hyperplane, such that S and x* lie in different halfspaces (Figure 4.8).

Figure 4.8: A hyperplane that separates the point x* from the convex set S.

Theorem 4.11 (Separating hyperplane theorem) Let S be a nonempty closed convex subset of ℝ^n and let x* ∈ ℝ^n be a vector that does not belong to S. Then, there exists some vector c ∈ ℝ^n such that c'x* < c'x for all x ∈ S.

Proof. Let ||·|| be the Euclidean norm defined by ||x|| = (x'x)^{1/2}. Let us fix some element w of S, and let

    B = {x | ||x - x*|| ≤ ||w - x*||},

and D = S ∩ B [Figure 4.9(a)]. The set D is nonempty, because w ∈ D.
Furthermore, D is the intersection of the closed set S with the closed set B and is also closed. Finally, D is a bounded set because B is bounded. Consider the quantity ||x - x*||, where x ranges over the set D. This is a continuous function of x. Since D is nonempty, closed, and bounded, Weierstrass' theorem implies that there exists some y ∈ D such that

    ||y - x*|| ≤ ||x - x*||,    for all x ∈ D.

For any x ∈ S that does not belong to D, we have ||x - x*|| > ||w - x*|| ≥ ||y - x*||. We conclude that y minimizes ||x - x*|| over all x ∈ S.
We have so far established that there exists an element y of S which is closest to x*. We now show that the vector c = y - x* has the desired property [see Figure 4.9(b)].
Let x ∈ S. For any λ satisfying 0 < λ ≤ 1, we have y + λ(x - y) ∈ S, because S is convex. Since y minimizes ||x - x*|| over all x ∈ S, we obtain

    ||y - x*||^2 ≤ ||y + λ(x - y) - x*||^2 = ||y - x*||^2 + 2λ(y - x*)'(x - y) + λ^2 ||x - y||^2,

which yields

    2λ(y - x*)'(x - y) + λ^2 ||x - y||^2 ≥ 0.

We divide by λ and then take the limit as λ decreases to zero. We obtain

    (y - x*)'(x - y) ≥ 0.

[This inequality states that the angle θ in Figure 4.9(b) is no larger than 90 degrees.] Thus,

    (y - x*)'x ≥ (y - x*)'y = (y - x*)'x* + (y - x*)'(y - x*) > (y - x*)'x*.

Setting c = y - x* proves the theorem. □

Figure 4.9: Illustration of the proof of the separating hyperplane theorem.

Farkas' lemma revisited
We now show that Farkas' lemma is a consequence of the separating hyperplane theorem.
We will only be concerned with the difficult half of Farkas' lemma. In particular, we will prove that if the system Ax = b, x ≥ 0, does not have a solution, then there exists a vector p such that p'A ≥ 0' and p'b < 0.
Let

    S = {Ax | x ≥ 0} = {y | there exists x such that y = Ax, x ≥ 0},

and suppose that the vector b does not belong to S. The set S is clearly convex; it is also nonempty because 0 ∈ S. Finally, the set S is closed; this may seem obvious, but is not easy to prove. For one possible proof, note that S is the projection of the polyhedron {(x, y) | y = Ax, x ≥ 0} onto the y coordinates, is itself a polyhedron (see Section 2.8), and is therefore closed. An alternative proof is outlined in Exercise 4.37.
We now invoke the separating hyperplane theorem to separate b from S and conclude that there exists a vector p such that p'b < p'y for every y ∈ S. Since 0 ∈ S, we must have p'b < 0. Furthermore, for every column A_i of A and every λ > 0, we have λA_i ∈ S and p'b < λp'A_i. We divide both sides of the latter inequality by λ and then take the limit as λ tends to infinity, to conclude that p'A_i ≥ 0. Since this is true for every i, we obtain p'A ≥ 0' and the proof is complete.
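The certificate p whose existence was just established can also be computed by linear programming. The sketch below (scipy assumed; the data are a small invented instance in which Ax = b, x >= 0 is infeasible) minimizes p'b over p'A >= 0' with the harmless normalization -1 <= p_i <= 1; a strictly negative optimal value exhibits the desired p.

import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1.0],
              [2.0, 2.0]])
b = np.array([1.0, 3.0])          # the system Ax = b, x >= 0 has no solution

m, n = A.shape
# p'A >= 0' is written as -A'p <= 0 for linprog.
res = linprog(c=b, A_ub=-A.T, b_ub=np.zeros(n), bounds=[(-1, 1)] * m)
print(res.x, res.fun)             # here p = (1, -1/2), with p'A = 0' and p'b = -1/2 < 0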
The duality theorem revisited
We will now derive the duality theorem as a corollary of Farkas' lemma. We only provide the proof for the case where the primal constraints are of the form Ax ≥ b. The proof for the general case can be constructed along the same lines at the expense of more notation (Exercise 4.38). We also note that the proof given here is very similar to the line of argument used in the heuristic explanation of the duality theorem in Example 4.4.
We consider the following pair of primal and dual problems

    minimize    c'x            maximize    p'b
    subject to  Ax ≥ b,        subject to  p'A = c'
                                           p ≥ 0,

and we assume that the primal has an optimal solution x*. We will show that the dual problem also has a feasible solution with the same cost. Once this is done, the strong duality theorem follows from weak duality (cf. Corollary 4.2).
Let I = {i | a_i'x* = b_i} be the set of indices of the constraints that are active at x*. We will first show that any vector d that satisfies a_i'd ≥ 0 for every i ∈ I, must also satisfy c'd ≥ 0. Consider such a vector d and let ε be a positive scalar. We then have a_i'(x* + εd) ≥ a_i'x* = b_i for all i ∈ I. In addition, if i ∉ I and if ε is sufficiently small, the inequality a_i'x* > b_i implies that a_i'(x* + εd) > b_i. We conclude that when ε is sufficiently small, x* + εd is a feasible solution. By the optimality of x*, we obtain c'd ≥ 0, which establishes our claim. By Farkas' lemma (cf. Corollary 4.3), c can be expressed as a nonnegative linear combination of the vectors a_i, i ∈ I, and there exist nonnegative scalars p_i, i ∈ I, such that

    c = Σ_{i∈I} p_i a_i.                                            (4.3)

For i ∉ I, we define p_i = 0. We then have p ≥ 0 and Eq. (4.3) shows that the vector p satisfies the dual constraint p'A = c'. In addition,

    p'b = Σ_{i∈I} p_i b_i = Σ_{i∈I} p_i a_i'x* = c'x*,

which shows that the cost of this dual feasible solution p is the same as the optimal primal cost. The duality theorem now follows from Corollary 4.2.
Figure 4.10: Examples of cones.

In conclusion, we have accomplished the goals that were set out in the beginning of this section. We proved the separating hyperplane theorem, which is a very intuitive and seemingly simple result, but with many important ramifications in optimization and other areas in mathematics. We used the separating hyperplane theorem to establish Farkas' lemma, and finally showed that the strong duality theorem is an easy consequence of Farkas' lemma.
4.8 Cones and extreme rays
We have seen in Chapter 2, that if the optimal cost in a linear programming problem is finite, then our search for an optimal solution can be restricted to finitely many points, namely, the basic feasible solutions, assuming one exists. In this section, we wish to develop a similar result for the case where the optimal cost is -∞. In particular, we will show that the optimal cost is -∞ if and only if there exists a cost reducing direction along which we can move without ever leaving the feasible set. Furthermore, our search for such a direction can be restricted to a finite set of suitably defined "extreme rays."

Cones

The first step in our development is to introduce the concept of a cone.

Definition 4.1 A set C ⊂ ℝ^n is a cone if λx ∈ C for all λ ≥ 0 and all x ∈ C.

Notice that if C is a nonempty cone, then 0 ∈ C. To see this, consider an arbitrary element x of C and set λ = 0 in the definition of a cone; see also Figure 4.10. A polyhedron of the form P = {x ∈ ℝ^n | Ax ≥ 0} is easily seen to be a nonempty cone and is called a polyhedral cone.
Let x be a nonzero element of a polyhedral cone C. We then have 3x/2 ∈ C and x/2 ∈ C. Since x is the average of 3x/2 and x/2, it is not an extreme point and, therefore, the only possible extreme point is the zero vector. If the zero vector is indeed an extreme point, we say that the cone is pointed. Whether this will be the case or not is determined by the criteria provided by our next result.
Theorem 4.12 Let C ⊂ ℝ^n be the polyhedral cone defined by the constraints a_i'x ≥ 0, i = 1, ..., m. Then, the following are equivalent:
(a) The zero vector is an extreme point of C.
(b) The cone C does not contain a line.
(c) There exist n vectors out of the family a_1, ..., a_m, which are linearly independent.

Proof. This result is a special case of Theorem 2.6 in Section 2.5. □

Rays and recession cones
Consider a nonempty polyhedron

    P = {x ∈ ℝ^n | Ax ≥ b},

and let us fix some y ∈ P. We define the recession cone at y as the set of all directions d along which we can move indefinitely away from y, without leaving the set P. More formally, the recession cone is defined as the set

    {d ∈ ℝ^n | A(y + λd) ≥ b, for all λ ≥ 0}.

It is easily seen that this set is the same as

    {d ∈ ℝ^n | Ad ≥ 0},

and is a polyhedral cone. This shows that the recession cone is independent of the starting point y; see Figure 4.11. The nonzero elements of the recession cone are called the rays of the polyhedron P.
For the case of a nonempty polyhedron P = {x ∈ ℝ^n | Ax = b, x ≥ 0} in standard form, the recession cone is seen to be the set of all vectors d that satisfy

    Ad = 0,    d ≥ 0.

Figure 4.11: The recession cone at different elements of a polyhedron.

Extreme rays
We now define the extreme rays of a polyhedron. Intuitively, these are the directions associated with “edges” of the polyhedron that extend to infinity; see Figure 4.12 for an illustration.
Definition 4.2
(a) A nonzero element x of a polyhedral cone C ⊂ ℝ^n is called an extreme ray if there are n - 1 linearly independent constraints that are active at x.
(b) An extreme ray of the recession cone associated with a nonempty polyhedron P is also called an extreme ray of P.

Note that a positive multiple of an extreme ray is also an extreme ray. We say that two extreme rays are equivalent if one is a positive multiple of the other. Note that for this to happen, they must correspond to the same n - 1 linearly independent active constraints. Any n - 1 linearly independent constraints define a line and can lead to at most two nonequivalent extreme rays (one being the negative of the other). Given that there is a finite number of ways that we can choose n - 1 constraints to become active, and as long as we do not distinguish between equivalent extreme rays, we conclude that the number of extreme rays of a polyhedron is finite. A finite collection of extreme rays will be said to be a complete set of extreme rays if it contains exactly one representative from each equivalence class.
Sec. 4.8 Cones and extreme rays 177
(a) (b)
Figure 4.12: Extreme rays of polyhedral cones. (a) The vector y is an extreme ray because n = 2 and the constraint a�x = 0 is active at y. (b) A polyhedral cone defined by three linearly
independent constraints of the form a�x 2: O. The vector
an extreme ray because n = 3 and the two linearly independent
is
The definition of extreme rays mimics the definition of basic feasible solutions. An alternative and equivalent definition, resembling the defini tion of extreme points of polyhedra, is explored in Exercise 4.39.
Characterization of unbounded linear programming problems
We now derive conditions under which the optimal cost in a linear programming problem is equal to -∞, first for the case where the feasible set is a cone, and then for the general case.

Theorem 4.13 Consider the problem of minimizing c'x over a pointed polyhedral cone C = {x ∈ ℝ^n | a_i'x ≥ 0, i = 1, ..., m}. The optimal cost is equal to -∞ if and only if some extreme ray d of C satisfies c'd < 0.

Proof. One direction of the result is trivial because if some extreme ray has negative cost, then the cost becomes arbitrarily negative by moving along this ray.
For the converse, suppose that the optimal cost is -∞. In particular, there exists some x ∈ C whose cost is negative and, by suitably scaling x, we can assume that c'x = -1. In particular, the polyhedron

    P = {x ∈ ℝ^n | a_1'x ≥ 0, ..., a_m'x ≥ 0, c'x = -1}

is nonempty. Since C is pointed, the vectors a_1, ..., a_m span ℝ^n and this implies that P has at least one extreme point; let d be one of them. At d, we have n linearly independent active constraints, which means that n - 1 linearly independent constraints of the form a_i'x ≥ 0 must be active. It follows that d is an extreme ray of C. □
By exploiting duality, Theorem 4.13 leads to a criterion for unboundedness in general linear programming problems. Interestingly enough, this criterion does not involve the right-hand side vector b.

Theorem 4.14 Consider the problem of minimizing c'x subject to Ax ≥ b, and assume that the feasible set has at least one extreme point. The optimal cost is equal to -∞ if and only if some extreme ray d of the feasible set satisfies c'd < 0.
Proof. One direction of the result is trivial because if an extreme ray has negative cost, then the cost becomes arbitrarily negative by starting at a feasible solution and moving along the direction of this ray.
For the proof of the reverse direction, we consider the dual problem:
    maximize    p'b
    subject to  p'A = c'
                p ≥ 0.

If the primal problem is unbounded, the dual problem is infeasible. Then, the related problem

    maximize    p'0
    subject to  p'A = c'
                p ≥ 0,

is also infeasible. This implies that the associated primal problem

    minimize    c'x
    subject to  Ax ≥ 0,

is either unbounded or infeasible. Since x = 0 is one feasible solution, it must be unbounded. Since the primal feasible set has at least one extreme point, the rows of A span ℝ^n, where n is the dimension of x. It follows that the recession cone {x | Ax ≥ 0} is pointed and, by Theorem 4.13, there exists an extreme ray d of the recession cone satisfying c'd < 0. By definition, this is an extreme ray of the feasible set. □
The unboundedness criterion in the simplex method
We end this section by pointing out that if we have a standard form problem in which the optimal cost is -∞, the simplex method provides us at termination with an extreme ray.
Indeed, consider what happens when the simplex method terminates with an indication that the optimal cost is -∞. At that point, we have a basis matrix B, a nonbasic variable x_j with negative reduced cost, and the jth column B^{-1}A_j of the tableau has no positive elements. Consider the jth basic direction d, which is the vector that satisfies d_B = -B^{-1}A_j, d_j = 1, and d_i = 0 for every nonbasic index i other than j. Then, the vector d satisfies Ad = 0 and d ≥ 0, and belongs to the recession cone. It is also a direction of cost decrease, since the reduced cost c̄_j of the entering variable is negative.
Out of the constraints defining the recession cone, the jth basic direction d satisfies n - 1 linearly independent such constraints with equality: these are the constraints Ad = 0 (m of them) and the constraints d_i = 0 for i nonbasic and different than j (n - m - 1 of them). We conclude that d is an extreme ray.
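This extraction of an extreme ray is easy to carry out on a small instance. In the sketch below (numpy assumed; the data are made up so that the optimal cost is -∞ and the basis consisting of x3 and x4 already exhibits the termination condition), the direction d is formed exactly as described.

import numpy as np

A = np.array([[-1.0, 1.0, 1.0, 0.0],
              [-2.0, 1.0, 0.0, 1.0]])
b = np.array([2.0, 3.0])
c = np.array([-1.0, 0.0, 0.0, 0.0])

basic = [2, 3]                       # current basis; here B is the identity and x_B = b >= 0
j = 0                                # nonbasic variable with negative reduced cost (equal to -1)
u = np.linalg.solve(A[:, basic], A[:, j])    # the column B^{-1}A_j of the tableau
assert np.all(u <= 0)                # no positive entries: the optimal cost is -infinity

d = np.zeros(A.shape[1])
d[j] = 1.0
d[basic] = -u                        # d_B = -B^{-1}A_j
print(d, A @ d, c @ d)               # d = (1, 0, 1, 2), Ad = 0, c'd = -1 < 0: an extreme ray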
4.9 Representation of polyhedra
In this section, we establish one of the fundamental results of linear programming theory. In particular, we show that any element of a polyhedron that has at least one extreme point can be represented as a convex combination of extreme points plus a nonnegative linear combination of extreme rays. A precise statement is given by our next result. A generalization to the case of general polyhedra is developed in Exercise 4.47.

Theorem 4.15 (Resolution theorem) Let

    P = {x ∈ ℝ^n | Ax ≥ b}

be a nonempty polyhedron with at least one extreme point. Let x^1, ..., x^k be the extreme points, and let w^1, ..., w^r be a complete set of extreme rays of P. Let

    Q = { Σ_{i=1}^k λ_i x^i + Σ_{j=1}^r θ_j w^j | λ_i ≥ 0, θ_j ≥ 0, Σ_{i=1}^k λ_i = 1 }.

Then, Q = P.
Proof. We first prove that Q ⊂ P. Let

    x = Σ_{i=1}^k λ_i x^i + Σ_{j=1}^r θ_j w^j

be an element of Q, where the coefficients λ_i and θ_j are nonnegative, and Σ_{i=1}^k λ_i = 1. The vector y = Σ_{i=1}^k λ_i x^i is a convex combination of elements of P. It therefore belongs to P and satisfies Ay ≥ b. We also have Aw^j ≥ 0 for every j, which implies that the vector z = Σ_{j=1}^r θ_j w^j satisfies Az ≥ 0. It then follows that the vector x = y + z satisfies Ax ≥ b and belongs to P.
For the reverse inclusion, we assume that P is not a subset of Q and we will derive a contradiction. Let z be an element of P that does not belong to Q. Consider the linear programming problem

    maximize    Σ_{i=1}^k 0λ_i + Σ_{j=1}^r 0θ_j
    subject to  Σ_{i=1}^k λ_i x^i + Σ_{j=1}^r θ_j w^j = z
                Σ_{i=1}^k λ_i = 1                                   (4.4)
                λ_i ≥ 0,    i = 1, ..., k,
                θ_j ≥ 0,    j = 1, ..., r,

which is infeasible because z ∉ Q. This problem is the dual of the problem

    minimize    p'z + q
    subject to  p'x^i + q ≥ 0,    i = 1, ..., k,                    (4.5)
                p'w^j ≥ 0,        j = 1, ..., r.

Because the latter problem has a feasible solution, namely, p = 0 and q = 0, the optimal cost is -∞, and there exists a feasible solution (p, q) whose cost p'z + q is negative. On the other hand, p'x^i + q ≥ 0 for all i and this implies that p'z < p'x^i for all i. We also have p'w^j ≥ 0 for all j.(1)
Having fixed p as above, we now consider the linear programming problem

    minimize    p'x
    subject to  Ax ≥ b.

If the optimal cost is finite, there exists an extreme point x^i which is optimal. Since z is a feasible solution, we obtain p'x^i ≤ p'z, which is a contradiction. If the optimal cost is -∞, Theorem 4.14 implies that there exists an extreme ray w^j such that p'w^j < 0, which is again a contradiction. □

(1) For an intuitive view of this proof, the purpose of this paragraph was to construct a hyperplane that separates z from Q.
Example 4.10 Consider the unbounded polyhedron defined by the constraints

    x1 - x2 ≥ -2
    x1 + x2 ≥ 1
    x1, x2 ≥ 0

(see Figure 4.13). This polyhedron has three extreme points, namely, x^1 = (0, 2), x^2 = (0, 1), and x^3 = (1, 0). The recession cone C is described by the inequalities d1 - d2 ≥ 0, d1 + d2 ≥ 0, and d1, d2 ≥ 0. We conclude that C = {(d1, d2) | 0 ≤ d2 ≤ d1}. This cone has two extreme rays, namely, w^1 = (1, 1) and w^2 = (1, 0). The vector y = (2, 2) is an element of the polyhedron and can be represented as

    y = (2, 2) = (1/2)x^2 + (1/2)x^3 + (3/2)w^1.

However, this representation is not unique; for example, we also have

    y = (1/2)x^1 + (1/2)x^2 + (1/2)w^1 + (3/2)w^2.

Figure 4.13: The polyhedron of Example 4.10.

We note that the set Q in Theorem 4.15 is the image of the polyhedron

    H = { (λ_1, ..., λ_k, θ_1, ..., θ_r) | Σ_i λ_i = 1, λ_i ≥ 0, θ_j ≥ 0 }

under the linear mapping

    (λ_1, ..., λ_k, θ_1, ..., θ_r) ↦ Σ_{i=1}^k λ_i x^i + Σ_{j=1}^r θ_j w^j.

Thus, one corollary of the resolution theorem is that every polyhedron is the image, under a linear mapping, of a polyhedron H with this particular structure.
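A representation of the kind used in Example 4.10 can also be computed by solving the feasibility problem (4.4). The sketch below (scipy assumed) does this for z = y = (2, 2): it looks for λ, θ ≥ 0 with Σ λ_i x^i + Σ θ_j w^j = y and Σ λ_i = 1; the solver returns one valid (not necessarily the same) choice of coefficients.

import numpy as np
from scipy.optimize import linprog

X = np.array([[0.0, 0.0, 1.0],       # extreme points x^1, x^2, x^3 as columns
              [2.0, 1.0, 0.0]])
W = np.array([[1.0, 1.0],            # extreme rays w^1, w^2 as columns
              [1.0, 0.0]])
y = np.array([2.0, 2.0])

A_eq = np.vstack([np.hstack([X, W]),                      # the combination must equal y
                  np.hstack([np.ones(3), np.zeros(2)])])  # the lambda's must sum to one
b_eq = np.append(y, 1.0)

res = linprog(c=np.zeros(5), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 5)
print(res.x)    # one admissible (lambda_1, lambda_2, lambda_3, theta_1, theta_2)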
We now specialize Theorem 4.15 to the case of bounded polyhedra, to recover a result that was also proved in Section 2.7, using a different line of argument.

Corollary 4.4 A nonempty bounded polyhedron is the convex hull of its extreme points.

Proof. Let P = {x | Ax ≥ b} be a nonempty bounded polyhedron. If d is a nonzero element of the cone C = {x | Ax ≥ 0} and x is an element of P, we have x + λd ∈ P for all λ ≥ 0, contradicting the boundedness of P. We conclude that C consists of only the zero vector and does not have any extreme rays. The result then follows from Theorem 4.15. □

There is another corollary of Theorem 4.15 that deals with cones, and which is proved by noting that a cone can have no extreme points other than the zero vector.

Corollary 4.5 Assume that the cone C = {x | Ax ≥ 0} is pointed. Then, every element of C can be expressed as a nonnegative linear combination of the extreme rays of C.

Converse to the resolution theorem

Let us say that a set Q is finitely generated if it is specified in the form

    Q = { Σ_{i=1}^k λ_i x^i + Σ_{j=1}^r θ_j w^j | λ_i ≥ 0, θ_j ≥ 0, Σ_{i=1}^k λ_i = 1 },      (4.6)

where x^1, ..., x^k and w^1, ..., w^r are some given elements of ℝ^n. The resolution theorem states that a polyhedron with at least one extreme point is a finitely generated set (this is also true for general polyhedra; see Exercise 4.47). We now discuss a converse result, which states that every finitely generated set is a polyhedron.
As observed earlier, a finitely generated set Q can be viewed as the image of the polyhedron

    H = { (λ_1, ..., λ_k, θ_1, ..., θ_r) | Σ_{i=1}^k λ_i = 1, λ_i ≥ 0, θ_j ≥ 0 }

under a certain linear mapping. Thus, the results of Section 2.8 apply and establish that a finitely generated set is indeed a polyhedron. We record this result and also present a proof based on duality.

Theorem 4.16 A finitely generated set is a polyhedron. In particular, the convex hull of finitely many vectors is a (bounded) polyhedron.

Proof. Consider the linear programming problem (4.4) that was used in the proof of Theorem 4.15. A given vector z belongs to a finitely generated set Q of the form (4.6) if and only if the problem (4.4) has a feasible solution. Using duality, this is the case if and only if problem (4.5) has finite optimal cost. We convert problem (4.5) to standard form by introducing nonnegative variables p+, p-, q+, q-, such that p = p+ - p- and q = q+ - q-, as well as surplus variables. Since standard form polyhedra contain no lines, Theorem 4.13 shows that the optimal cost in the standard form problem is finite if and only if

    (p+)'z - (p-)'z + q+ - q- ≥ 0,

for each one of its finitely many extreme rays. Hence, z ∈ Q if and only if z satisfies a finite collection of linear inequalities. This shows that Q is a polyhedron. □

In conclusion, we have two ways of representing a polyhedron:
(a) in terms of a finite set of linear constraints;
(b) as a finitely generated set, in terms of its extreme points and extreme rays.
These two descriptions are mathematically equivalent, but can be quite different from a practical viewpoint. For example, we may be able to describe a polyhedron in terms of a small number of linear constraints. If on the other hand, this polyhedron has many extreme points, a description as a finitely generated set can be much more complicated. Furthermore, passing from one type of description to the other is, in general, a complicated computational task.
4.10 General linear programming duality*

In the definition of the dual problem (Section 4.2), we associated a dual variable p_i with each constraint of the form a_i'x = b_i, a_i'x ≥ b_i, or a_i'x ≤ b_i.
However, no dual variables were associated with constraints of the form x_i ≥ 0 or x_i ≤ 0. In the same spirit, and in a more general approach to linear programming duality, we can choose arbitrarily which constraints will be associated with price variables and which ones will not. In this section, we develop a general duality theorem that covers such a situation.
Consider the primal problem

    minimize    c'x
    subject to  Ax ≥ b
                x ∈ P,

where P is the polyhedron

    P = {x | Dx ≥ d}.

We associate a dual vector p with the constraint Ax ≥ b. The constraint x ∈ P is a generalization of constraints of the form x_i ≥ 0 or x_i ≤ 0 and dual variables are not associated with it.
As in Section 4.1, we define the dual objective g(p) by

    g(p) = min_{x∈P} [c'x + p'(b - Ax)].                             (4.7)

The dual problem is then defined as

    maximize    g(p)
    subject to  p ≥ 0.

We first provide a generalization of the weak duality theorem.
Theorem 4.17 (Weak duality) If x is primal feasible (Ax ≥ b and x ∈ P), and p is dual feasible (p ≥ 0), then g(p) ≤ c'x.

Proof. If x and p are primal and dual feasible, respectively, then p'(b - Ax) ≤ 0, which implies that

    g(p) = min_{y∈P} [c'y + p'(b - Ay)] ≤ c'x + p'(b - Ax) ≤ c'x.  □

We also have the following generalization of the strong duality theorem.

Theorem 4.18 (Strong duality) If the primal problem has an optimal solution, so does the dual, and the respective optimal costs are equal.
Proof. Since P = {x | Dx ≥ d}, the primal problem is of the form

    minimize    c'x
    subject to  Ax ≥ b
                Dx ≥ d,

and we assume that it has an optimal solution. Its dual, which is

    maximize    p'b + q'd
    subject to  p'A + q'D = c'                                       (4.8)
                p ≥ 0
                q ≥ 0,

must then have the same optimal cost. For any fixed p, the vector q should be chosen optimally in the problem (4.8). Thus, the dual problem (4.8) can also be written as

    maximize    p'b + f(p)
    subject to  p ≥ 0,

where f(p) is the optimal cost in the problem

    maximize    q'd
    subject to  q'D = c' - p'A                                       (4.9)
                q ≥ 0.

[If the latter problem is infeasible, we set f(p) = -∞.] Using the strong duality theorem for problem (4.9), we obtain

    f(p) = min_{Dx≥d} (c'x - p'Ax).

We conclude that the dual problem (4.8) has the same optimal cost as the problem

    maximize    p'b + min_{Dx≥d} (c'x - p'Ax)
    subject to  p ≥ 0.

By comparing with Eq. (4.7), we see that this is the same as maximizing g(p) over all p ≥ 0. □
The idea of selectively assigning dual variables to some of the constraints is often used in order to treat "simpler" constraints differently than more "complex" ones, and has numerous applications in large scale optimization. (Applications to integer programming are discussed in Section 11.4.) Finally, let us point out that the approach in this section extends to certain nonlinear optimization problems. For example, if we replace the linear cost function c'x by a general convex function c(x), and the polyhedron P by a general convex set, we can again define the dual objective according to the formula

    g(p) = min_{x∈P} [c(x) + p'(b - Ax)].

It turns out that the strong duality theorem remains valid for such nonlinear problems, under suitable technical conditions, but this lies beyond the scope of this book.
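For a fixed p, the dual objective g(p) is itself the optimal cost of a linear program over P, so it can be evaluated directly. The sketch below (scipy assumed; the two-variable data are invented) does this for a few values of p; by Theorem 4.17, every finite value printed is a lower bound on the optimal primal cost.

import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 1.0])
A = np.array([[1.0, 2.0]]); b = np.array([4.0])          # the dualized constraint Ax >= b
D = np.array([[1.0, 0.0],
              [0.0, 1.0]]); d = np.array([0.0, 0.0])     # here P = {x | Dx >= d} is the orthant

def g(p):
    # inner problem: minimize (c - A'p)'x over x in P, then add the constant p'b
    res = linprog(c - A.T @ p, A_ub=-D, b_ub=-d, bounds=[(None, None)] * 2)
    return p @ b + res.fun if res.status == 0 else -np.inf

for p in [np.array([0.0]), np.array([0.4]), np.array([0.5]), np.array([0.6])]:
    print(p, g(p))        # the largest value, 2.0 at p = 0.5, equals the optimal primal cost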
4.11 Summary
We summarize here the main ideas that have been developed in this chapter. Given a (primal) linear programming problem, we can associate with it another (dual) linear programming problem, by following a set of mechanical rules. The definition of the dual problem is consistent, in the sense that
the duals of equivalent primal problems are themselves equivalent.
Each dual variable is associated with a particular primal constraint and can be viewed as a penalty for violating that constraint. By replacing the primal constraints with penalty terms, we increase the set of available options, and this allows us to construct primal solutions whose cost is less than the optimal cost. In particular, every dual feasible vector leads to a lower bound on the optimal cost of the primal problem (this is the essence of the weak duality theorem). The maximization in the dual problem is then a search for the tightest such lower bound. The strong duality theorem asserts that the tightest such lower bound is equal to the optimal primal
cost .
An optimal dual variable can also be interpreted as a marginal cost,
that is, as the rate of change of the optimal primal cost when we perform a small perturbation of the right-hand side vector b, assuming nondegeneracy. A useful relation between optimal primal and dual solutions is pro vided by the complementary slackness conditions. Intuitively, these con ditions require that any constraint that is inactive at an optimal solution carries a zero price, which is compatible with the interpretation of prices
as marginal costs.
We saw that every basis matrix in a standard form problem deter
mines not only a primal basic solution, but also a basic dual solution. This observation is at the heart of the dual simplex method. This method is similar to the primal simplex method in that it generates a sequence of primal basic solutions, together with an associated sequence of dual basic solutions. It is different, however, in that the dual basic solutions are dual feasible, with ever improving costs, while the primal basic solutions are in feasible (except for the last one) . We developed the dual simplex method by simply describing its mechanics and by providing an algebraic justification.
Nevertheless, the dual simplex method also has a geometric interpretation. It keeps moving from one dual basic feasible solution to an adjacent one and, in this respect, it is similar to the primal simplex method applied to the dual problem.
All of duality theory can be developed by exploiting the termination conditions of the simplex method, and this was our initial approach to the subject. We also pursued an alternative line of development that proceeded from first principles and used geometric arguments. This is a more direct and more general approach, but requires more abstract reasoning.
Duality theory provided us with some powerful tools based on which we were able to enhance our geometric understanding of polyhedra. We derived a few theorems of the alternative (like Farkas' lemma), which are surprisingly powerful and have applications in a wide variety of contexts. In fact, Farkas' lemma can be viewed as the core of linear programming duality theory. Another major result that we derived is the resolution theorem, which allows us to express any element of a nonempty polyhedron with at least one extreme point as a convex combination of its extreme points plus a nonnegative linear combination of its extreme rays; in other words, every polyhedron is "finitely generated." The converse is also true, and every finitely generated set is a polyhedron (can be represented in terms of linear inequality constraints). Results of this type play a key role in confirming our intuitive geometric understanding of polyhedra and linear programming. They allow us to develop alternative views of certain situations and lead to deeper understanding. Many such results have an "obvious" geometric content and are often taken for granted. Nevertheless, as we have seen, rigorous proofs can be quite elaborate.
4.12 Exercises
Exercise 4.1 Consider the linear programming problem:

    minimize    x1 - x2
    subject to  2x1 + 3x2 - x3 + x4 ≤ 0
                3x1 + x2 + 4x3 - 2x4 ≥ 3
                -x1 - x2 + 2x3 + x4 = 6
                x1 ≤ 0
                x2, x3 ≥ 0.

Write down the corresponding dual problem.
Exercise 4.2 Consider the primal problem
    minimize    c'x
    subject to  Ax ≥ b
                x ≥ 0.

Form the dual problem and convert it into an equivalent minimization problem. Derive a set of conditions on the matrix A and the vectors b, c, under which the dual is identical to the primal, and construct an example in which these conditions
are satisfied.
Exercise 4.3 The purpose of this exercise is to show that solving linear programming problems is no harder than solving systems of linear inequalities.
Suppose that we are given a subroutine which, given a system of linear inequality constraints, either produces a solution or decides that no solution exists. Construct a simple algorithm that uses a single call to this subroutine and which finds an optimal solution to any linear programming problem that has an optimal solution.

Exercise 4.4 Let A be a symmetric square matrix. Consider the linear programming problem

    minimize    c'x
    subject to  Ax ≥ c
                x ≥ 0.

Prove that if x* satisfies Ax* = c and x* ≥ 0, then x* is an optimal solution.
Exercise 4.5 Consider a linear programming problem in standard form and assume that the rows of A are linearly independent. For each one of the following statements, provide either a proof or a counterexample.
(a) Let x* be a basic feasible solution. Suppose that for every basis corresponding to x*, the associated basic solution to the dual is infeasible. Then, the optimal cost must be strictly less than c'x*.
(b) The dual of the auxiliary primal problem considered in Phase I of the simplex method is always feasible.
(c) Let Pi be the dual variable associated with the ith equality constraint in the primal. Eliminating the ith primal equality constraint is equivalent to introducing the additional constraint Pi = 0 in the dual problem.
(d) If the unboundedness criterion in the primal simplex algorithm is satisfied, then the dual problem is infeasible.
Exercise 4.6 * (Duality in Chebychev approximation) Let A be an m × n matrix and let b be a vector in ℝ^m. We consider the problem of minimizing ||Ax - b||_∞ over all x ∈ ℝ^n. Here ||·||_∞ is the vector norm defined by ||y||_∞ = max_i |y_i|. Let v be the value of the optimal cost.
(a) Let p be any vector in ℝ^m that satisfies Σ_{i=1}^m |p_i| = 1 and p'A = 0'. Show that p'b ≤ v.
(b) In order to obtain the best possible lower bound of the form considered in part (a), we form the linear programming problem

    maximize    p'b
    subject to  p'A = 0'
                Σ_{i=1}^m |p_i| ≤ 1.

Show that the optimal cost in this problem is equal to v.
Sec. 4.12 Exercises 189 Exercise 4.7 (Duality in piecewise linear convex optimization) Con
v
(a) Consider any vector p E )Rm that satisfies p'A = 0', p 2: 0, and 2:::1Pi = 1. Show that -p'b :s; v.
(b) In order to obtain the best possible lower bound of the form considered in part (a), we form the linear programming problem
maximize - p ' b subject to p'A 0'
1 p > 0,
where e is the vector with all components equal to 1. Show that the optimal
v.
Exercise 4.8 Consider the linear programming problem of minimizing c'x subject to Ax = b, x ≥ 0. Let x* be an optimal solution, assumed to exist, and let p* be an optimal solution to the dual.
(a) Let x̄ be an optimal solution to the primal, when c is replaced by some c̄. Show that (c̄ - c)'(x̄ - x*) ≤ 0.
(b) Let the cost vector be fixed at c, but suppose that we now change b to b̄, and let x̄ be a corresponding optimal solution to the primal. Prove that (p*)'(b̄ - b) ≤ c'(x̄ - x*).

Exercise 4.9 (Back-propagation of dual variables in a multiperiod problem) A company makes a product that can be either sold or stored to meet future demand. Let t = 1, ..., T denote the periods of the planning horizon. Let b_t be the production volume during period t, which is assumed to be known in advance. During each period t, a quantity x_t of the product is sold, at a unit price of d_t. Furthermore, a quantity y_t can be sent to long-term storage, at a unit transportation cost of c. Alternatively, a quantity w_t can be retrieved from storage, at zero cost. We assume that when the product is prepared for long-term storage, it is partly damaged, and only a fraction f of the total survives. Demand is assumed to be unlimited. The main question is whether it is profitable to store some of the production, in anticipation of higher prices in the future. This leads us to the following problem, where z_t stands for the amount kept in long-term storage, at the end of period t:
sider the problem of minimizing maxi=1,…,m(a�x – bi) over all x E )Rn. Let be the value of the optimal cost, assumed finite. Let A be the matrix with rows a1 , . . . , am , and let b be the vector with components b1 , . . . , bm .
cost in this problem is equal to
T
subjectto Xt+Yt-Wt= bt,
Zt + Wt – Zt-1 – fYt = 0,
Zo = 0, Xt,Yt,Wt,Zt 2: O.
maximize :l:>�t-\dtXt – cYt) + aTdT+1ZT t=1
t= 1,…,T, t = 1, . . . , T,
Here, dT+1 is the salvage prive for whatever inventory is left at the end of period T. Furthermore, a is a discount factor, with 0 < a < 1, reflecting the fact that future revenues are valued less than current ones.
p'e
190 Chap. 4 Duality theory
(a) Let Pt and qt be dual variables associated with the first and second equality
constraint, respectively. Write down the dual problem.
(b) Assumethat0< f< 1,bt2:0,ande2:O. Showthatthefollowing formulae provide an optimal solution to the dual problem:
PT
max{qt+l,at-Idt}, t= 1,...,T-1,
Lagrangean by
L(x,p) = c'x+p'(b-Ax).
Pt
t= 1,...,T-1.
(c) Explain how the result in part (b) can be used to compute an optimal solution to the original problem. Primal and dual nondegeneracy can be assumed.
Exercise 4.10 (Saddle points of the Lagrangean) Consider the standard form problem of minimizing c'x subject to Ax = b and x 2: 0. We define the
Consider the following "game" : player 1 chooses some x 2: 0, and player 2 chooses some p; then, player 1 pays to player 2 the amount L(x, p). Player 1 would like to minimize L(x, p), while player 2 would like to maximize it.
A pair (x* , p* ) , with x* 2: 0, is called an equilibrium point (or a saddle point, or a Nash equilibrium) if
L(x*,p)::;L(x*,p*)::;L(x,p*), vx2:0, Vp.
(Thus, we have an equilibrium ifno player is able to improve her performance by unilaterally modifying her choice.)
Show that a pair (x*, p*) is an equilibrium if and only if x* and p* are optimal solutions to the standard form problem under consideration and its dual, respectively.
Exercise 4.11 Consider a linear programming problem in standard form which is infeasible, but which becomes feasible and has finite optimal cost when the last equality constraint is omitted. Show that the dual of the original (infeasible) problem is feasible and the optimal cost is infinite.
Exercise 4.12* (Degeneracy and uniqueness - I) Consider a general linear programming problem and suppose that we have a nondegenerate basic feasible solution to the primal. Show that the complementary slackness conditions lead to a system of equations for the dual vector that has a unique solution.
Exercise 4.13* (Degeneracy and uniqueness - II) Consider the following pair of problems that are duals of each other:
minimize c'x maximize p'b subject to Ax b subject to p'A ::; c'.
x 2: 0,
max {a?-ldT, fqT - aT-Ie}, max {at-Idt, fqt - at-Ie},
qt
Sec. 4.12 Exercises 191
(a) Prove that ifone problem has a nondegenerate and unique optimal solution,
so does the other.
(b) Suppose that we have a nondegenerate optimal basis for the primal and that the reduced cost for one of the basic variables is zero. What does the result of part (a) imply? Is it true that there must exist another optimal basis?
Exercise 4.14 (Degeneracy and uniqueness - III) Give an example in which the primal problem has a degenerate optimal basic feasible solution, but the dual has a unique optimal solution. (The example need not be in standard form. )
Exercise 4.15 (Degeneracy and uniqueness -
minimize X2 subject to X2 = 1 Xl ?: 0
X2 ?:o.
Write down its dual. For both the primal and the dual problem determine whether they have unique optimal solutions and whether they have nondegenerate optimal solutions. Is this example in agreement with the statement that nondegeneracy of an optimal basic feasible solution in one problem implies uniqueness of optimal solutions for the other? Explain.
Exercise 4.16 Give an example of a pair (primal and dual) of linear program ming problems, both of which have multiple optimal solutions.
Exercise 4.17 This exercise is meant to demonstrate that knowledge of a pri mal optimal solution does not necessarily contain information that can be ex ploited to determine a dual optimal solution. In particular, determining an opti mal solution to the dual is as hard as solving a system of linear inequalities, even if an optimal solution to the primal is available.
Consider the problem of minimizing c'x subject to Ax ?: 0, and suppose that we are told that the zero vector is optimal. Let the dimensions of A be m x n, and suppose that we have an algorithm that determines a dual optimal solution and whose running time O((m+n)k), for some constant k. (Note that if x = 0 is not an optimal primal solution, the dual has no feasible solution, and we assume that in this case our algorithm exits with an error message.) Assuming the availability of the above algorithm, construct a new algorithm that takes as input a system of m linear inequalities in n variables, runs for 0 ( (m + n) k ) time, and either finds a feasible solution or determines that no feasible solution exists.
Exercise 4.18 Consider a problem in standard form. Suppose that the matrix A has dimensions m x n and its rows are linearly independent. Suppose that all basic solutions to the primal and to the dual are nondegenerate. Let x be a feasible solution to the primal and let p be a dual vector (not necessarily feasible), such that the pair (x, p) satisfies complementary slackness.
(a) Show that there exist m columns of A that are linearly independent and such that the corresponding components of x are all positive.
IV) Consider the problem
192
(b) (c)
Chap. 4 Duality theory Show that x and p are basic solutions to the primal and the dual, respec
tively.
Show that the result of part (a) is false if the nondegeneracy assumption is removed.
Exercise 4.19 Let P = {x E Rn I Ax = b, x ::: O} be a nonempty polyhedron, and let m be the dimension of the vector b. We call Xj a null variable if Xj = 0 whenever x E P.
(a) Suppose that there exists some p E Rm for which p'A ::: 0', p'b = 0, and such that the jth component of p'A is positive. Prove that Xj is a null variable.
(b) Prove the converse of (a): if Xj is a null variable, then there exists some p E Rm with the properties stated in part (a).
(c) If Xj is not a null variable, then by definition, there exists some y E P for which Yj > o. Use the results in parts (a) and (b) to prove that there exist x E P and p E Rm such that:
p’A ::: O’, p’b = O, x + A’p > O.
Exercise 4.20 * (Strict complementary slackness)
(a) Consider the following linear programming problem and its dual
minimize c’x maximize p’b subject to Ax b subject to p’A :S c’,
x ::: 0,
and assume that both problems have an optimal solution. Fix some j. Suppose that every optimal solution to the primal satisfies Xj = O. Show that there exists an optimal solution p to the dual such that p’Aj < Cj. (Here, Aj is the jth column of A.) Hint: Let d be the optimal cost. Consider the problem of minimizing -Xj subject to Ax = b, x ::: 0, and -c'x ::: -d, and form its dual.
(b) Show that there exist optimal solutions x and p to the primal and to the dual, respectively, such that for every j we have either xj > 0 or p’Aj < Cj . Hint: Use part (a) for each j, and then take the average of the vectors obtained.
(c) Consider now the following linear programming problem and its dual:
minimize c'x subject to Ax ::: b
x ::: 0,
maximize p'b subject to p'A < c'
p ::: O.
Assume that both problems have an optimal solution. Show that there exist optimal solutions to the primal and to the dual, respectively, that satisfy strict complementary slackness, that is:
(i) ForeveryjwehaveeitherXj >0orp’Aj < Cj.
(ii) For every i, we have either a;x > bi or Pi > O. (Here, a; is the ith row of A.) Hint: Convert the primal to standard form and apply part (b).
Sec. 4.12 Exercises 193 (d) Consider the linear programming problem
minimize 5Xl + 5X2 subjectto Xl + X2 2 2 2Xl X2 2 0
Xl,X2 2 O.
Does the optimal primal solution (2/3, 4/3) , together with the correspond ing dual optimal solution, satisfy strict complementary slackness? Deter mine all primal and dual optimal solutions and identify the set of all strictly complementary pairs.
Exercise 4.21 * (Clark’s theorem) Consider the following pair of linear pro gramming problems:
minimize c’x subject to Ax > b
x 2 0,
maximize p’b subject to p’A < c'
p 2 o.
Suppose that at least one of these two problems has a feasible solution. Prove that the set of feasible solutions to at least one of the two problems is unbounded. Hint: Interpret boundedness of a set in terms of the finiteness of the optimal cost of some linear programming problem.
Exercise 4.22 Consider the dual simplex method applied to a standard form problem with linearly independent rows. Suppose that we have a basis which is primal infeasible, but dual feasible, and let i be such that XB(i) < O. Suppose that all entries in the ith row in the tableau (other than XB(i)) are nonnegative. Show that the optimal dual cost is +00.
Exercise4.23 Describeindetailthemechanicsofa1reviseddualsimplexmeth od that works in terms of the inverse basis matrix B - instead of the full simplex tableau.
Exercise 4.24 Consider the lexicographic pivoting rule for the dual simplex method and suppose that the algorithm is initialized with each column of the tableau being lexicographically positive. Prove that the dual simplex method does not cycle.
Exercise 4.25 This exercise shows that if we bring the dual problem into stan dard form and then apply the primal simplex method, the resulting algorithm is not identical to the dual simplex method.
Consider the following standard form problem and its dual.
maximize PI + p2 subjectto PI::;1
Here, there is only one possible basis and the dual simplex method must terminate immediately. Show that if the dual problem is converted into standard form and the primal simplex method is applied to it, one or more changes of basis may be required.
minimize Xl + X2 subjectto Xl=1 X2 = 1
Xl,X2 2 0
P2 ::; 1.
194 Chap. 4 Duality theory Exercise 4.26 Let A be a given matrix. Show that exactly one of the following
alternatives must hold.
(a) Thereexistssomex=I-0suchthatAx=0,x?:o. (b) There exists some p such that p'A > 0′.
Exercise 4.27 Let A be a given matrix. Show that the following two state ments are equivalent.
(a) EveryvectorsuchthatAx?:0andx?:0mustsatisfyXl=O.
(b) Thereexistssomepsuchthatp’A:S0,p?:0,andp’A1<0,whereA1
is the first column of A.
Exercise 4.28 Let a and a1, . . . ,am be given vectors in Rn. Prove that the
following two statements are equivalent:
(a) For all x ?: 0, we have a'x :S maXia�x.
(b) There exist nonnegative coefficients Ai that sum to 1 and such that a :S 2::1 Aiai.
Exercise 4.29 (Inconsistent systems of linear inequalities) Let a1 , . . . , am be some vectors in Rn , with m > n + 1 . Suppose that the system of inequalities a�x?:bi,i=1,…,m, doesnothaveanysolutions. Showthatwecanchoose n + 1 of these inequalities, so that the resulting system of inequalities has no solutions.
Exercise 4.30 (Helly’s theorem)
(a) Let :F be a finite family of polyhedra in Rn such that every n + 1 polyhedra in :F have a point in common. Prove that all polyhedra in :F have a point in common. Hint: Use the result in Exercise 4.29.
(b) Forn=2,part(a)assertsthatthepolyhedraH,P2,…,PK(K?:3)in the plane have a point in common if and only if every three of them have a point in common. Is the result still true with “three” replaced by “two” ?
Exercise 4.31 (Unit eigenvectors of stochastic matrices) We say that an n x n matrix P, with entries Pij, is stochastic if all of its entries are nonnegative
and
Ln P i j = 1 , V i , j=l
that is, the sum of the entries of each row is equal to 1 .
Use duality to show that if P is a stochastic matrix, then the system of
equations
p’P = p’, p ?: 0,
has a nonzero solution. (Note that the vector p can be normalized so that its components sum to one. Then, the result in this exercise establishes that every finite state Markov chain has an invariant probability distribution.)
Sec. 4.12 Exercises 195
Exercise 4.32 * (Leontief systems and Samuelson’s substitution the orem) A Leontief matrix is an m x n matrix A in which every column has at most one positive element. For an interpretation, each column Aj corresponds to a production process. If aij is negative, laijI represents the amount of goods of type i consumed by the process. If aij is positive, it represents the amount of goods of type i produced by the process. If Xj is the intensity with which process j is used, then Ax represents the net output of the different goods. The matrix A is called productive if there exists some x � 0 such that Ax > O.
(a) Let A be a square productive Leontief matrix (m = n). Show that every vector z that satisfies Az � 0 must be nonnegative. Hint: If z satisfies Az � 0 but has a negative component, consider the smallest nonnega tive () such that some component of x + (}z becomes zero, and derive a contradiction.
(b) Show that every square productive Leontief matrix is invertible and that all entries of the inverse matrix are nonnegative. Hint: Use the result in part (a).
(c) We now consider the general case where n � m, and we introduce a con straint of the form e’x ::; 1, where e = (1, . . . , 1). (Such a constraint could capture, for example, a bottleneck due to the finiteness of the labor force.) An “output” vector y E )Rm is said to be achievable if y � 0 and there exists some x � 0 such that Ax = y and e’y ::; 1. An achievable vector y is said to be efficient if there exists no achievable vector z such that z � y and z of- y. (Intuitively, an output vector y which is not efficient can be im proved upon and is therefore uninteresting.) Suppose that A is productive. Show that there exists a positive efficient vector y. Hint: Given a positive achievable vector y. , consider maximizing I::i Yi over all achievable vectors y that are larger than y• .
(d) Suppose that A is productive. Show that there exists a set of m production processes that are capable of generating all possible efficient output vectors y. That is, there exist indices B(l), . . . , B(m), such that every efficient output vector y can be expressed in the form y = I:::1AB(i)XB(i)’ for some nonnegative coefficients XB(i) whose sum is bounded by 1. Hint: Consider the problem of minimizing e’x subject to Ax = y, x � 0, and show that we can use the same optimal basis for all efficient vectors y.
Exercise4.33 (Optionspricing)Consideramarketthatoperatesforasingle period, and which involves three assets: a stock, a bond, and an option. Let S be the price of the stock, in the beginning of the period. Its price S at the end of the period is random and is assumed to be equal to either Su, with probability
(3. Here u and d are scalars that satisfy d < 1 < u.
(3, or Sd, with probability 1
Bonds are assumed riskless. Investing one dollar in a bond results in a payoff
r,
S
-
r
K. If on the other hand we have S < K, there is no advantage in exercising the option, and we receive zero payoff. Thus, the value of the option at the end
of
option gives us the right to purchase, at the end of the period, one stock at a fixed price of K. If the realized price S of the stock is greater than K, we exercise the option and then immediately sell the stock in the stock market, for a payoff of
-
at the end of the period. (Here,
is a scalar greater than 1 . ) Finally, the
of the period is equal to max{O, S
-
K } . Since the option is itself an asset , it
196 Chap. 4 Duality theory should have a value in the beginning of the time period. Show that under the
absence of arbitrage condition, the value of the option must be equal to "y max{O, Su - K} + 8 max{O, Sd - K},
where "y and 8 are a solution to the following system of linear equations:
u"y+d8 1
1
r
Exercise 4.34 (Finding separating hyperplanes) Consider a polyhedron P that has at least one extreme point.
(a) Suppose that we are given the extreme points xi and a complete set of extreme rays wi of P. Create a linear programming problem whose solution provides us with a separating hyperplane that separates P from the origin, or allows us to conclude that none exists.
(b) Suppose now that P is given to us in the form P = {x I a�x � bi, i = 1 , . . . , m } . Suppose that 0 1- P. Explain how a separating hyperplane can be found.
Exercise 4.35 (Separation of disjoint polyhedra) Consider two nonempty polyhedraP= {xERnIAx::b}andQ= {xERnIDx::d}. Weare interested in finding out whether the two polyhedra have a point in common.
(a) Devise a linear programming problem such that: if P n Q is nonempty, it returns a point in p n Q ; if p n Q is empty, the linear programming problem is infeasible.
(b) Suppose that P n Q is empty. Use the dual of the problem you have constructed in part (a) to show that there exists a vector c such that c'x< c'yforallxEPandyEQ.
Exercise 4.36 (Containment of polyhedra)
(a) Let P and Q be two polyhedra in Rn described in terms of linear inequality
constraints. Devise an algorithm that decides whether P is a subset of Q.
(b) Repeat part (a) i f the polyhedra are described i n terms o f their extreme
points and extreme rays.
Exercise 4.37 (Closedness of finitely generated cones) Let AI , . . . , An be given vectors in Rm. Consider the cone C = { L:�=l Aixi I Xi � O} and let
"y+8
Hint: Write down the payoff matrix R and use Theorem 4.8.
yk,k= 1,2,..
that y E C (and hence C is closed), using the following argument. With y fixed as above, consider the problem of minimizing Ily - L:�=l Aixdloo, subject to the
.
, be a sequence of elements of C that converges to some y. Show
, X � O. Here 00 stands for the maximum norm, defined by n 11 · 11
constraints Xl,
Ilxlioo = maxi IXil. Explain why the above minimization problem has an optimal solution, find the value of the optimal cost, and prove that y E C.
•.•
Sec. 4.12 Exercises 197
Exercise 4.38 (From Farkas' lemma to duality) Use Farkas' lemma to prove the duality theorem for a linear programming problem involving constraints of the form a�x = bi, a�x � bi, and nonnegativity constraints for some of the variables xj . Hint: Start by deriving the form of the set of feasible directions at an optimal solution.
Exercise 4.39 (Extreme rays of cones) Let us define a nonzero element d of a pointed polyhedral cone C to be an extreme ray if it has the following property: ifthereexistvectorsfE C andg E C andsome.>..E (0,1) satisfyingd=f+g, then both f and g are scalar multiples of d. Prove that this definition of extreme rays is equivalent to Definition 4.2.
Exercise 4.40 (Extreme rays of a cone are extreme points of its sec tions) Consider the cone C = {x E Rn I �x � 0, i = 1, . . . , m} and assume that the first n constraint vectors al , . . . , an are linearly independent . For any
r,
nonnegative scalar
we define the polyhedron Pr by
(a) Show that the polyhedron Pr is bounded for every r � o.
(b) Letr >O. ShowthatavectorxE PrisanextremepointofPrifandonly
if x is an extreme ray of the cone C.
Exercise 4.41 (Caratheodory’s theorem) Show that every element x of a bounded polyhedron P C Rn can be expressed as a convex combination of at most n + 1 extreme points of P. Hint: Consider an extreme point of the set of all possible representations of x.
Exercise 4.42 (Problems with side constraints) Consider the linear pro gramming problem of minimizing c’x over a bounded polyhedron P C Rn and subject to additional constraints �x = bi, i = 1, . . . , L. Assume that the prob lem has a feasible solution. Show that there exists an optimal solution which is a convex combination of L + 1 extreme points of P. Hint: Use the resolution theorem to represent P.
Exercise 4.43
(a) Consider the minimization of C1Xl + C2X2 subject to the constraints
Find necessary and sufficient conditions on (Cl’ C2) for the optimal cost to
be finite.
(b) For a general feasible linear programming problem, consider the set of all cost vectors for which the optimal cost is finite. Is it a polyhedron? Prove your answer.
198 Chap. 4 Duality theory Exercise 4.44
(a) LetP= {(Xl,X2)IXl-X2= 0, Xl+X2= o}.Whataretheextreme points and the extreme rays of P?
(b) LetP= {(Xl,X2)I4XI+2X22:8,2XI+X2::8}.Whataretheextreme points and the extreme rays of P?
(c) For the polyhedron of part (b), is it possible to express each one of its elements as a convex combination of its extreme points plus a nonnega tive linear combination of its extreme rays? Is this compatible with the resolution theorem?
Exercise 4.45 Let P be a polyhedron with at least one extreme point. Is it possible to express an arbitrary element of P as a convex combination of its extreme points plus a nonnegative multiple of a single extreme ray?
Exercise 4.46 (Resolution theorem for polyhedral cones) Let C be a nonempty polyhedral cone.
(a) Show that C can be expressed as the union of a finite number CI, . . . ,Ck of pointed polyhedral cones. Hint: Intersect with orthants.
(b) Show that an extreme ray of C must be an extreme ray of one of the cones
CI , . . . , Ck •
(c) Show that there exists a finite number of elements WI , . . .
that
,w
r
of C such
Exercise 4.47 (Resolution theorem for general polyhedra) Let P be a
polyhedron. Show that there exist vectors Xl, . . . , xk and WI , . . . Hint: Generalize the steps in the preceding exercise.
,w
r
such that
Exercise 4.48 * (Polar, finitely generated, and polyhedral cones) For any cone C, we define its polar CJ.. by
CJ..= {pIpiX::0, forallxE c}.
(a) Let F be a finitely generated cone, of the form
ShowthatFJ.. = {pIpiWi::0, i=1,…,r},whichisapolyhedralcone.
(b) Show that the polar of FJ.. is F and conclude that the polar of a polyhedral cone is finitely generated. Hint: Use Farkas’ lemma.
Sec. 4.13 Notes and sources 199 (c) Show that a finitely generated pointed cone F is a polyhedron. Hint: Con
sider the polar of the polar.
(d) (Polar cone theorem) Let 0 be a closed, nonempty, and convex cone. Show that (0.1).1 = O. Hint: Mimic the derivation of Farkas’ lemma using the separating hyperplane theorem (Section 4.7) .
(e) Is the polar cone theorem true when 0 is the empty set?
Exercise 4.49 Consider a polyhedron, and let x, y be two basic feasible solu tions. If we are only allowed to make moves from any basic feasible solution to an adjacent one, show that we can go from x to y in a finite number of steps. Hint: Generalize the simplex method to nonstandard form problems: starting from a nonoptimal basic feasible solution, move along an extreme ray of the cone of feasible directions .
Exercise 4.50 We are interested in the problem of deciding whether a polyhe-
dron
is nonempty. We assume that the polyhedron P = {x E �n I Ax :S b, x 2: O} is
Q= {xE�nIAx:Sb, Dx2:d, x2:O}
nonempty and bounded. For any vector p, of the same dimension as d, we define
g(p) = -p’d + max p’Dx. xEP
(a) Show that ifQ is nonempty, then g(p) 2: 0 for all p 2: o.
(b) Show that if Q is empty, then there exists some p 2: 0, such that g(p) < o.
(c) If Q is empty, what is the minimum of g(p) over all p 2: O?
4.13 Notes and sources
4.3. The duality theorem is due to von Neumann (1947), and Gale, Kuhn, and Tucker (1951).
4.6. Farkas' lemma is due to Farkas (1894) and Minkowski (1896). See Schrijver (1986) for a comprehensive presentation of related results. The connection between duality theory and arbitrage was developed by Ross (1976, 1978).
4.7. Weierstrass' Theorem and its proof can be found in most texts on real analysis; see, for example, Rudin (1976). While the simplex method is only relevant to linear programming problems with a finite number of variables, the approach based on the separating hyperplane theorem leads to a generalization of duality theory that covers more general convex optimization problems, as well as infinite-dimensional linear programming problems, that is, linear programming problems with infinitely many variables and constraints; see, e.g., Luenberger (1969) and Rockafellar (1970) .
4.9. The resolution theorem and its converse are usually attributed to Farkas, Minkowski, and Weyl.
200 4.10.
4.12
Chap. 4 Duality theory
For extensions of duality theory to problems involving general convex functions and constraint sets, see Rockafellar (1970) and Bertsekas (1995b).
Exercises4.6and4.7areadaptedfromBoydandVandenberghe(1995). The result on strict complementary slackness (Exercise 4.20) was proved by 'IUcker (1956). The result in Exercise 4.21 is due to Clark (1961). The result in Exercise 4.30 is due to Helly (1923). Input output macroeconomic models of the form considered in Exercise 4.32, have been introduced by Leontief, who was awarded the 1973 Nobel prize in economics. The result in Exercise 4.41 is due to Caratheodory (1907).
Chapter 5 Sensitivity analysis
Contents
5.1. Local sensitivity analysis
5.2. Global dependence on the right-hand side vector
5.3. The set of all dual optimal solutions*
5.4. Global dependence on the cost vector
5.5. Parametric programming
5.6. Summary
5.7. Exercises
5.8. Notes and sources
201
202 Chap. 5 Consider the standard form problem
Sensitivity analysis
and its dual
minimize c'x subject to Ax b
x > 0, maximize p’b
subject to p’A ::; c’.
In this chapter, we study the dependence of the optimal cost and the opti mal solution on the coefficient matrix A, the requirement vector b, and the cost vector c. This is an important issue in practice because we often have incomplete knowledge of the problem data and we may wish to predict the effects of certain parameter changes.
In the first section of this chapter, we develop conditions under which the optimal basis remains the same despite a change in the problem data, and we examine the consequences on the optimal cost. We also discuss how to obtain an optimal solution if we add or delete some constraints. In subsequent sections, we allow larger changes in the problem data, resulting in a new optimal basis, and we develop a global perspective of the depen dence of the optimal cost on the vectors b and c. The chapter ends with a brief discussion of parametric programming, which is an extension of the simplex method tailored to the case where there is a single scalar unknown parameter.
Many of the results in this chapter can be extended to cover general linear programming problems. Nevertheless, and in order to simplify the presentation, our standing assumption throughout this chapter will be that we are dealing with a standard form problem and that the rows of the m x n matrix A are linearly independent.
5.1 Local sensitivity analysis
In this section, we develop a methodology for performing sensitivity anal ysis. We consider a linear programming problem, and we assume that we already have an optimal basis B and the associated optimal solution x*. We then assume that some entry of A, b, or c has been changed, or that a new constraint is added, or that a new variable is added. We first look for conditions under which the current basis is still optimal. If these con ditions are violated, we look for an algorithm that finds a new optimal solution without having to solve the new problem from scratch. We will see that the simplex method can be quite useful in this respect.
Having assumed that B is an optimal basis for the original problem, the following two conditions are satisfied:
(feasibility)
Sec. 5. 1
Local sensitivity analysis ‘ ‘ B-1A > 0’,
203
c-c
B
–
(optimality) .
When the problem is changed, we check to see how these conditions are affected. By insisting that both conditions (feasibility and optimality) hold for the modified problem, we obtain the conditions under which the basis matrix B remains optimal for the modified problem. In what follows, we apply this approach to several examples.
A new variable is added
Suppose that we introduce a new variable Xn+1, together with a corre sponding column An+h and obtain the new problem
minimize c’x + Cn+1Xn+1 subject to Ax + An+!Xn+l b
x 2 0.
We wish to determine whether the current basis B is still optimal.
We note that (x,Xn+1) = (x*,0) is a basic feasible solution to the new problem associated with the basis B, and we only need to examine the optimality conditions. For the basis B to remain optimal, it is necessary
and sufficient that the reduced cost of Xn+1 be nonnegative, that is, cn+! = en+! – C�B-1An+1 2 o.
If this condition is satisfied, (x* , 0) is an optimal solution to the new prob lem. If, however, cn+! < 0, then (x*,0) is not necessarily optimal. In order to find an optimal solution, we add a column to the simplex tableau, associated with the new variable, and apply the primal simplex algorithm starting from the current basis B. Typically, an optimal solution to the new problem is obtained with a small number of iterations, and this approach is usually much faster than solving the new problem from scratch.
Example 5.1 Consider the problem
minimize
subject to 3XI + 2X2 + X3
5XI + 3X2
Xl , . . . , X4 2 o.
10 + X4 16
An optimal solution to this problem is given by x = (2, 2, 0, 0) and the corre sponding simplex tableau is given by
XlX2 X4
12 0 0 2 7 2 1 0 -3 2 2 0 1 5 -3
-5XI X2 + 12x3
X3
Xl = X2 =
204 Chap. 5 Sensitivity analysis Note that B-1 is given by the last two columns of the tableau.
Let us now introduce a variable X5 and consider the new problem
=-4.
minimize X2 + 12x3 X5 subjectto + + X3 +X5 10
+
Xl , . . . , X5 2: O.
WehaveA5= (1,1)and C5=C5-c'aB-lA5=-1-[-5 -1]
+ X4 + X5 16 [-�_�][�]
Since C5 is negative, introducing the new variable to the basis can be beneficial. We observe that B-1A5 = (-1,2) and augment the tableau by introducing a column associated with X5:
Xl X2 X3 X4 X5 12 0 0 2 7 -4 2 1 0 -3 2 -1 2 0 1 5 -3 2
We then bring X5 into the basis; X2 exits and we obtain the following tableau, which happens to be optimal:
Xl X2 X3 X4 X5 16 0 2 12 1 0 3 1 0.5 -0.5 0.5 0 1 0 0.5 2.5 -1.5 1
An optimal solution is given by x = (3, 0, 0, 0, 1).
A new inequality constraint is added
Let us now introduce a new constraint a�+lx � bm+1, where am+l and bm+1 are given. If the optimal solution x* to the original problem satisfies this constraint, then x* is an optimal solution to the new problem as well. If the new constraint is violated, we introduce a nonnegative slack variable Xn+l, and rewrite the new constraint in the form a�+1x - Xn+l = bm+1. We obtain a problem in standard form, in which the matrix A is replaced by
-5Xl
3Xl 2X2 5Xl 3X2
X5 =
Sec. 5. 1 Local sensitivity analysis 205
Let B be an optimal basis for the original problem. We form a basis for the new problem by selecting t[he origina]l basic variables together with xn+1. The new basis matrix B is of the form
- B=
B0
a' -1 '
where the row vector a' contains those components of a�+l associated with the original basic columns. (The determinant of this matrix is the negative of the determinant of B, hence nonzero, and we therefore have a true basis matrix.) The basic solution associated with this basis is (x*, a�+1x* - bm+d, and is infeasible because of our assumption that x* violates the new constraint. Note that the ne[w inverse basis]matrix is readily available because
B = a'B-1 -1 .
(To see this, note that the product B-1B is equal to the identity matrix.)
B
B- 0 --1 1
Let C
ables in the original problem. Then, the vector of reduced costs associated
be the m-dimensional vector with the costs of the basic vari
with the basis B for the new prob]lem, is given by [c'0]-[c'a0] a' _ a+1_ ' -1
and is nonnegative due to the optimality of B for the original problem. Hence, B is a dual feasible basis and we are in a position to apply the dual simplex method to[the new problem. Note that an initial simplex tableau for the new problem is readily constructed. For example, we have
A
a, a'B-1A - a' m+1 m+1
where B-1A is available from the final simplex tableau for the original problem .
Example 5.2 Consider again the problem in Example 5.1:
[�=� �[t �]
c-cBA0, =['a]
--1 B
minimize -5Xl X2 + 12x3 subject to 3Xl + 2X2 + X3
5Xl + 3X2
Xl , · · · , X4 2: 0,
and recall the optimal simplex tableau:
10 + X4 16
Xl X2 X3
12 0 0 2 7 2102
2015
X4
-3
Xl =
-3
206 Chap. 5 Sensitivity analysis
We introduce the additional constraint Xl + X2 ?: 5, which is violated by the optimal solution x· = (2,2,0,0). We have am+1 = (1,1,0,0), bm+1 = 5, and a�+lx· < bm+l. We form the standard form problem
minimize -5XI X2 + 12x3
10
subject to 3X1 + 2X2 + 5X1 + 3X2
X3
+ X4
16
-X5 5
Let a consist of the co[mponents of am+1 associated with the basic variables.
Wethenhavea= (1,1)and
aIB-1A-aI =[11]01 5_�]
1 o -3
The tableau for the new problem is of the form
m+1
-[1 1 0 0]= [0 0 2 -1].
Xl+X2 Xl,...,X5 ?:o.
Xl X2 X3 X4 X5 12 0 0 2 7 0 Xl= 210-320 X2= 2015-30 -1 0 0 2 -1 1
We now have all the information necessary to apply the dual simplex method to the new problem.
Our discussion has been focused on the case where an inequality con straint is added to the primal problem. Suppose now that we introduce a new constraint piAn+1 � en+! in the dual. This is equivalent to intro ducing a new variable in the primal, and we are back to the case that was considered in the preceding subsection.
A new equality constraint is added
We now consider the case where the new constraint is of the form a�+lx = bm+1, and we assume that this new constraint is violated by the optimal solution x* to the original problem. The dual of the new problem is
subject to [pi Pm+ll [alA ]� c', m+l
where Pm+l is a dual variable associated with the new constraint. Let p* be an optimal basic feasible solution to the original dual problem. Then, (p*,O) is a feasible solution to the new dual problem.
maximize p/h + Pm+lbm+l
Sec. 5.1 Localsensitivityanalysis 207
Let m be the dimension of p, which is the same as the original num ber of constraints. Since p* is a basic feasible solution to the original dual problem, m of the constraints in (p*)'A :: c' are linearly independent and active. However, there is no guarantee that at (p*,0) we will have m+1 lin early independent active constraints of the new dual problem. In particular, (p*,O) need not be a basic feasible solution to the new dual problem and may not provide a convenient starting point for the dual simplex method on the new problem. While it may be possible to obtain a dual basic feasi ble solution by setting Pm+! to a suitably chosen nonzero value, we present here an alternative approach. l X
Let us assume, without loss of generality, that �+ * > bm+1• We introduce the auxiliary primal problem
minimize c’x + MXn+1 = b subject to Ax
x2:0, Xn+12:0,
where M is a large positive constant. A primal feasible basis for the aux iliary problem is obtained by picking the basic variables of the optimal solution to the original problem, together with the variable Xn+!. The re sulting basis matrix is the same as the matrix B of the preceding subsection. There is a difference, however. In the preceding subsection, B was a dual feasible basis, whereas here B is a primal feasible basis. For this reason, the primal simplex method can now be used to solve the auxiliary problem to optimality.
Suppose that an optimal solution to the auxiliary problem satisfies Xn+1 = 0; this will be the case if the new problem is feasible and the
coefficient M is large enough. Then, the additional constraint a�+!x bm+1 has been satisfied and we have an optimal solution to the new problem.
Changes in the requirement vector b
Suppose that some component bi of the requirement vector b is changed to bi + 8. Equivalently, the vector b is changed to b + 8ei , where ei is the ith unit vector. We wish to determine the range of values of 8 under which the current basis remains optimal. Note that the optimality conditions are not affected by the change in b. We therefore need to examine only the feasibility condition
(5.1) Letg= (f31i,f32i,…,f3mi)betheithcolumnofB-1.Equation(5.1)
becomes XB + 8g 2: 0,
or,
=
j= 1,…,m.
�+1X Xn+1 = bm+1
208
Chap. 5 Sensitivity analysis
Equivalently,
{jl,Bji>O}
( XBCO))- (3ji
–
( XBCO)) {jl,Bji
establishing the convexity of F. D
We now corroborate Theorem 5.1 by taking a different approach, involving the dual problem
maximize p’b subject to p’A :: e’,
which has been assumed feasible. For any b E S, F(b) is finite and, by strong duality, is equal to the optimal value of the dual objective. Let p1,p2,…,pNbetheextremepointsofthedualfeasibleset. (Ourstanding assumption is that the matrix A has linearly independent rows; hence its columns span �m . Equivalently, the rows of A’ span �m and Theorem 2 . 6 in Section 2.5 implies that the dual feasible set must have at least one
e’x . Fix a scalar >’ E [0, 1], >.x + ( 1 – >.)x is nonnegative and satisfies >’b + (1 – >.)b . In particular, y is a feasible solution to the linear programming problem obtained when the requirement vector b is set to
=
Ay
=
e’x and F(b )
=
=
12
=
>’b1 + (1 – >.)b2. Therefore,
214 Chap. 5 Sensitivity analysis 1(8)
I*+ 8d
fJ
Figure 5 . 1 : The optimal cost when the vector b is a function of a scalar parameter. Each linear piece is of the form (pi)'(b* + (Jd), where pi is the ith extreme point of the dual feasible set. Ineachoneoftheintervals(J< (JI,(JI< (J< (J2,and(J>(J2, we have different dual optimal solutions, namely, pI, p2, and p3, respectively. For (J = (JI or (J = (J2, the dual problem has multiple optimal solutions.
extreme point.) Since the optimum of the dual must be attained at an extreme point, we obtain
F(b) = max (pi)’b, b E S. (5.2) 2. = 1 , . . . , N
In particular, F is equal to the maximum of a finite collection of linear
functions. It is therefore a piecewise linear convex function, and we have a
=
(pI)'(b
)
new proof of Theorem 5.1. In addition, within a region where F is linear,
ii
(p )’b, where p is a corresponding dual optimal solution,
we have F(b)
in agreement with our earlier discussion.
For those values of b for which F is not differentiable, that is, at the junction of two or more linear pieces, the dual problem does not have a unique optimal solution and this implies that every optimal basic feasible solution to the primal is degenerate. (This is because, as shown earlier in this section, the existence of a nondegenerate optimal basic feasible solution
to the primal implies that F is locally linear.)
b Wenowrestrict attentiontochangesinbofaparticulartype, namely,
= b* + Od, where b* and d are fixed vectors and 0 is a scalar. Let 1(0) = F(b* + Od) be the optimal cost as a function of the scalar parameter O. Using Eq. (5.2), we obtain
1(0) = max (pi)'(b*+Od), b*+OdES.
. l=l,…,N
Sec. 5.3 The set ofall dual optimal solutions* 215
b*
F(b) F(b)
b b* b
Figure 5.2: Illustration of subgradients of a function F at a point b*. A subgradient p is the gradient of a linear function F(b* ) + p’ (b – b* ) that lies below the function F(b) and agrees with it for b = b* .
This is essentially a “section” of the function Fj it is again a piecewise linear convex functionj see Figure 5.1. Once more, at breakpoints of this function, every optimal basic feasible solution to the primal must be degenerate.
5 . 3 The set of all dual optimal solutions*
We have seen that· if the function F is defined, finite, and linear in the vicinity of a certain vector b*, then there is a unique optimal dual solution, equal to the gradient of F at that point, which leads to the interpretation of dual optimal solutions as marginal costs. We would like to extend this interpretation so that it remains valid at the breakpoints of F. This is indeed possible: we will show shortly that any dual optimal solution can be viewed as a “generalized gradient” of F. We first need the following definition, which is illustrated in Figure 5.2.
Definition 5.1 Let F be a convex function defined on a convex set S. Let b* be an element ofS. We say that a vector p is a subgradient ofF at b* if
F(b*) + p'(b – b*) s:; F(b), V b E S.
Note that if b* is a breakpoint of the function F, then there are several subgradients. On the other hand, if F is linear near b*, there is a unique subgradient, equal to the gradient of F.
216 Chap. 5 Sensitivity analysis Theorem5.2 Supposethatthelinearprogrammingproblemofmin
=
b* and x :: ° is feasible and that the is an optimal solution to the
imizing c’x subject to Ax
optimal cost is finite. Then, a vector p
dual problem if and only if it is a subgradient of the optimal cost function F at the point b* .
Proof. Recall that the function F is defined on the set S, which is the set of vectors b for which the set P(b) of feasible solutions to the primal problem is nonempty. Suppose that p is an optimal solution to the dual
problem. Then, strong duality implies that p’b*
some arbitrary b E S. For any feasible solution x E PCb), weak duality yields p’b � e’x. Taking the minimum over all x E PCb), we obtain p’b � F(b). Hence, p’b – p’b* � F(b) – F(b*), and we conclude that p is a subgradient of F at b*.
We now prove the converse. Let p be a subgradient of F at b*; that F(b*)+p'(b-b*)� F(b), VbES. (5.3)
is,
Pick some x :: 0, let b = Ax, and note that x E PCb). In particular,
F(b) � e’x. Using Eq. (5.3), we obtain
p’Ax= p’b� F(b)-F(b*)+p’b*� e’x-F(b*)+p’b*.
Since this is true for all x :: 0, we must have p’A � e’, which shows that p
0, we obtain F(b*) � p’b*. 5.4 Global dependence on the cost vector
In the last two sections, we fixed the matrix A and the vector e, and we considered the effect of changing the vector b. The key to our development was the fact that the set of dual feasible solutions remains the same as b varies. In this section, we study the case where A and b are fixed, but the vector e varies. In this case, the primal feasible set remains unaffected; our standing assumption will be that it is nonempty.
We define the dual feasible set
is a dual feasible solution. Also, by letting x
Using weak duality, every dual feasible solution q must satisfy q’b* � F(b*) � p’b*, which shows that p is a dual optimal solution. D
Q(e)= {pI p’A� e},
T = {e I Q(e) is nonempty}.
and let
If el E2T and e2 E T, then there exist pI and p2 such that (pI)’A � e’
and (p )’A� e’. For any scalar ,\E [0,1], we have
(,\(pl)’ + (1 – ,\)(p2)’)A � ‘\el + (1 – ‘\)e2,
=
=
F(b*). Consider now
Sec. 5.5 Parametricprogramming 217
and this establishes that ‘\cl + (1 – ‘\)c2 E T. We have therefore shown that T is a convex set.
If c 1’- T, the infeasibility of the dual problem implies that the optimal primal cost is – 00. On the other hand, if c E T, the optimal primal cost must be finite. Thus, the optimal primal cost, which we will denote by G(c), is finite if and only if c E T.
Thus, G(c) is the minimum of a finite collection of linear functions and is
Let Xl,x2, N
, x
be the basic feasible solutions in the primal feasible set; clearly, these do not depend on c. Since an optimal solution to a standard form problem can always be found at an extreme point, we have
G(c)=.min C’xi. t=l,…,N
•••
a piecewise linear concave function. If for some value c* of c, the primal
has a unique optimal solution xi, we have (C*)’Xi < (c*)'xj, for all j =I- i.
For c very close to c*, the inequalities c'xi < c'xj, j =I- i, continue to hold,
implying that xi is still a unique primal optimal solution with cost C'xi.
i
C'X . On the other hand, at those values
We conclude that, locally, G(c)
of c that lead to multiple primal optimal solutions, the function G has a breakpoint .
=
We summarize the main points of the preceding discussion.
Theorem5.3 Considerafeasiblelinearprogrammingprobleminstan dard form.
(a) The set T of all c for which the optimal cost is finite, is convex.
(b)
(c)
5.5
The optimal cost G(c) is a concave function ofc on the set T.
If for some value of c the primal problem has a unique optimal solution x*, then G is linear in the vicinity ofc and its gradient is equal to x*.
Parametric programming
Let us fix A, b, c, and a vector d of the same dimension as c. For any scalar (), we consider the problem
minimize (c + (}d)'x subject to Ax b x > 0,
and let g((}) be the optimal cost as a function of (). Naturally, we assume that the feasible set is nonempty. For those values of () for which the optimal cost is finite, we have
g((}) = . min (c + (}d)’Xi, t=l,…,N
218 Chap. 5 Sensitivity analysis
where xl, . . . , xN are the extreme points of the feasible set; see Figure 5.3. In particular, g(e) is a piecewise linear and concave function of the param eter e. In this section, we discuss a systematic procedure, based on the simplex method, for obtaining g(e) for all values of e. We start with an example .
.. Xl optimal x20ptimal x30ptimal x40ptimal 6
Figure 5.3: The optimal cost g((}) as a function of (). Example 5.5 Consider the problem
minimize (-3 + 2(})XI + (3 – (})X2 + subjectto Xl + 2X2 2XI + X2
XI,X2,X3 2′: o.
We introduce slack variables in order to bring the problem into standard form, and then let the slack variables be the basic variables. This determines a basic feasible solution and leads to the following tableau.
Xl X2 X3 X4 X5 0 -3+2() 3-(} 1 0 0 5 12-310 X5= 7 2 1 0 1
If-3+2(}2′:0and3- ()2′:0,allreducedcostsarenonnegativeandwe have an optimal basic feasible solution. In particular,
g((})=0, if �<()<3. 2--
3X3 5 X3 :::
< 7
4X3
-4
Sec. 5.5 Parametricprogramming 219
If 8 is increased slightly above 3, the reduced cost of X2 becomes negative and we no longer have an optimal basic feasible solution. We let X2 enter the basis, X4 exits, and we obtain the new tableau:
-7.5 + 2.58 -4.5 + 2.58 2.5 0.5 1.5
0 5.5 - 1.58 1 -1.5 0 -2.5
Xl X2
X3
X4 X5
-1.5 + 0.58 0 0.5 0 -0.5 1
We note that all reduced costs are nonnegative if and only if 3 � 8 � 5.5/1.5. The optimal cost for that range o f values o f 8 is
g(8)=7.5-2.58, if 3<8< 5.5. - -1.5
If 8 is increased beyond 5.5/1.5, the reduced cost of X3 becomes negative. If we attempt to bring X3 into the basis, we cannot find a positive pivot element in the third column of the tableau, and the problem is unbounded, with g(8) = - 00 .
Let us now go back to the original tableau and suppose that 8 is decreased to a value slightly below 3/2. Then, the reduced cost of Xl becomes negative, we let Xl enter the basis, and X5 exits. The new tableau is:
Xl X2 X3 X4 X5
10.5 - 78 0 4.5 - 28 -5 + 48 0 1.5 - 8 X4= 1.501.5-11-0.5 1 0.5 -2 0 0.5
Wenotethatallofthereducedcostsarenonnegativeifandonlyif5/4� 8� 3/2. For these values of 8, we have an optimal solution, with an optimal cost of
g(8) = -10.5 + 78, 'f � < 8 < �
Finally, for 8 < 5/4, the reduced cost of X3 is negative, but the optimal cost is
We now generalize the steps in the preceding example, in order to obtain a broader methodology. The key observation is that once a basis is fixed, the reduced costs are affine (linear plus a constant) functions of e. Then, if we require that all reduced costs be nonnegative, we force e to belong to some interval. (The interval could be empty but if it is nonempty, its endpoints are also included.) We conclude that for any given basis, the set of () for which this basis is optimal is a closed interval.
-00,
equal to
We plot the optimal cost in Figure 5.4.
1
-- 42'
because all entries in the third column of the tableau are negative.
4.5
3.5
220
Chap. 5 Sensitivity analysis
g(8)
Figure 5.4: The optimal cost g(9) as a function of 9, in Example
-00.
5.5. For 9 outside the interval [5/4, 11/3], g(9) is equal to
Let us now assume that we have chosen a basic feasible solution and an associated basis matrix B, and suppose that this basis is optimal for () satisfying ()1 � () � ()2. Let Xj be a variable whose reduced cost becomes negative for () > ()2. Since this reduced cost is nonnegative for ()1 � () � ()2,
it must be equal to zero when ()
the basis and consider separately the different cases that may arise.
=
() . We now attempt to bring Xj into 2
Suppose that no entry of the jth column B-1Aj of the simplex tableau is positive. For () > ()2, the reduced cost of Xj is negative, and this implies that the optimal cost is -00 in that range.
If the jth column of the tableau has at least one positive element, we
carry out a change of basis and obtain a new basis matrix B. For ()
the reduced cost of the entering variable is zero and, therefore, the cost associated with the new basis is the same as the cost associated with the
8
old basis. Since the old basis was optimal for ()
true for the new basis. On the other hand, for () < ()2, the entering variable Xj had a positive reduced cost. According to the pivoting mechanics, and for () < ()2, a negative multiple of the pivot row is added to the pivot row, and this makes the reduced cost of the exiting variable negative. This implies that the new basis cannot be optimal for () < ()2. We conclude that the range of values of () for which the new basis is optimal is of the form ()2 � () � ()3, for some ()3. By continuing similarly, we obtain a sequence of bases, with the ith basis being optimal for ()i � () � ()Hl.
Note that a basis which is optimal for () E [()i , ()i+ 1 ] cannot be optimal for values of () greater than ()Hl. Thus, if ()i+1 > ()i for all i, the same basis cannot be encountered more than once and the entire range of values of () will be traced in a finite number of iterations, with each iteration leading to a new breakpoint of the optimal cost function g(()). (The number of breakpoints may increase exponentially with the dimension of the problem.)
=
() , the same must be 2
=
(), 2
Sec. 5.6 Summary 221
The situation i s more complicated if for some basis w e have O i = 0H l . In this case, it is possible that the algorithm keeps cycling between a finite number of different bases, all of which are optimal only for 0 = Oi = Oi+1′ Such cycling can only happen in the presence of degeneracy in the primal problem (Exercise 5.17), but can be avoided if an appropriate anticycling rule is followed. In conclusion, the procedure we have outlined, together with an anticycling rule, partitions the range of possible values of 0 into consecutive intervals and, for each interval, provides us with an optimal basis and the optimal cost function as a function of O.
There is another variant of parametric programming that can be used
c
5.6 Summary
In this chapter, we have studied the dependence of optimal solutions and of the optimal cost on the problem data, that is, on the entries of A, b, and
c.
(a) If a new variable is added, we check its reduced cost and if it is negative, we add a new column to the tableau and proceed from there.
(b) If a new constraint is added, we check whether it is violated and if so, we form an auxiliary problem and its tableau, and proceed from there.
(c) If an entry of b or c is changed by 8, we obtain an interval of values of 8 for which the same basis remains optimal.
(d) If an entry of A is changed by 8, a similar analysis is possible. How ever, this case is somewhat complicated if the change affects an entry of a basic column.
(e) Assuming that the dual problem is feasible, the optimal cost is a piecewise linear convex function of the vector b (for those b for which the primal is feasible) . Furthermore, subgradients of the optimal cost function correspond to optimal solutions to the dual problem.
is kept fixed but b is replaced by b + Od, where d is a given vector
when
and 0 is a scalar. In this case, the zeroth column of the tableau depends on O. Whenever 0 reaches a value at which some basic variable becomes negative, we apply the dual simplex method in order to recover primal feasibility.
For many of the cases that we have examined, a common methodology was used. Subsequent to a change in the problem data, we first examine its effects on the feasibility and optimality conditions. Ifwe wish the same basis to remain optimal, this leads us to certain limitations on the magnitude of the changes in the problem data. For larger changes, we no longer have an optimal basis and some remedial action (involving the primal or dual simplex method) is typically needed.
We close with a summary of our main results.
222
Chap. 5 Sensitivity analysis
(f) (g)
5 . 7
Assuming that the primal problem is feasible, the optimal cost is a piecewise linear concave function of the vector c (for those c for which the primal has finite cost).
If the cost vector is an affine function of a scalar parameter e, there is a systematic procedure (parametric programming) for solving the problem for all values of e. A similar procedure is possible if the vector b is an affine function of a scalar parameter.
Exercises
Exercise 5.1 Consider the same problem as in Example 5.1, for which we al ready have an optimal basis. Let us introduce the additional constraint Xl + X2 = 3. Form the auxiliary problem described in the text, and solve it using the pri mal simplex method. Whenever the “large” constant M is compared to another number, M should be treated as being the larger one.
Exercise 5.2 (Sensitivity with respect to changes in a basic column of A) In this problem (and the next two) we study the change in the value of the optimal cost when an entry of the matrix A is perturbed by a small amount. We consider a linear programming problem in standard form, under the usual assumption that A has linearly independent rows. Suppose that we have an optimal basis B that leads to a nondegenerate optimal solution x*, and a nondegenerate dual optimal solution p. We assume that the first column is basic.
an
We will now change the first entry of Ai from
scalar. Let E be a matrix of dimensions m x m (where m is the number of rows
en,
of A) , whose entries are all zero except for the top left entry to 1.
which is equal (a) Show that if8 is small enough, B+8E is a basis matrix for the new problem.
(b) Show that under the basis B + 8E, the vector XB of basic variables in the new problem is equal to (I + 8B-1E)-lB-lb.
(c) Show that if 8 is sufficiently small, B + 8E is an optimal basis for the new problem.
(d) We use the symbol ;: to denote equality when second order terms in18 are1 ig nored. The following approximation is known to be true: (1+8B- E)- ;: 1 – 8B-1E. Using this approximation, show that
where x! (respectively, pd is the first component of the optimal solution to
XB
to
an +8, where8isasmall
the original primal (respectively, dual) problem, and in part (b).
has been defined
Exercise 5.3 (Sensitivity with respect to changes in a basic column of A) Consider a linear programming problem in standard form under the usual assumption that the rows of the matrix A are linearly independent. Suppose that the columns Ai , . . . , Am form an optimal basis. Let Ao be some vector and suppose that we change Ai to Ai + 8Ao . Consider the matrix B (8) consisting of
Sec. 5.7 Exercises 223
thecolumnsAo+DAl,A2,…,Am,. Let[151,152]beaclosedintervalofvaluesof 15 that contains zero and in which the determinant of B (D) is nonzero. Show that the subset of [151 , 152] for which B (o) is an optimal basis is also a closed interval.
Exercise 5.4 Consider the problem in Example 5.1, with a11 changed from 3to3+o. LetuskeepXlandX2asthebasicvariablesandletB(o)bethe corresponding basis matrix, as a function of o.
(a) Compute B(O)-lb. For which values of 0 is B(o) a feasible basis?
(b) Compute c�B(O)-l. For which values of 0 is B(o) an optimal basis?
(c) Determine the optimal cost, as a function of 0, when 0 is restricted to those values for which B(o) is an optimal basis matrix.
Exercise 5.5 While solving a standard form linear programming problem using the simplex method, we arrive at the following tableau:
Xl X2 X3 X4 X5 0 0 C3 0 C5
f3
‘Y
Suppose also that the last three columns of the matrix A form an identity matrix.
(a) Give necessary and sufficient conditions for the basis described by this
tableau to be optimal (in terms of the coefficients in the tableau) .
(b) Assume that this basis is optimal and that C3 = O. Find an optimal basic feasible solution, other than the one described by this tableau.
(c) Suppose that ‘Y 2: o. Show that there exists an optimal basic feasible solution, regardless of the values of C3 and C5 .
(d) Assume that the basis associated with this tableau is optimal. Suppose
E.
lower bounds on
so that this basis remains optimal.
E E
1 0 1 -1 0 20021 310400
also that bl in the original problem is replaced by bl +
Give upper and (e) Assume that the basis associated with this tableau is optimal. Suppose
lower bounds on
also that Cl in the original problem is replaced by Cl +
so that this basis remains optimal.
Exercise 5.6 Company A has agreed to supply the following quantities of spe cial lamps to Company B during the next 4 months:
Month January February March April Units 150 160 225 180
Company A can produce a maximum of 160 lamps per month at a cost of $35 per unit. Additional lamps can be purchased from Company C at a cost of $50
E.
Give upper and
224 Chap. 5 Sensitivity analysis
per lamp. Company A incurs an inventory holding cost of $5 per month for each lamp held in inventory.
(a) Formulate the problem that Company A is facing as a linear programming problem.
(b) Solve the problem using a linear programming package.
(c) Company A is considering some preventive maintenance during one of the first three months. If maintenance is scheduled for January, the company can manufacture only 151 units (instead of 160); similarly, the maximum possible production if maintenance is scheduled for February or March is 153 and 155 units, respectively. What maintenance schedule would you recommend and why?
(d) Company D has offered to supply up to 50 lamps (total) to Company A during either January, February or March. Company D charges $45 per lamp. Should Company A buy lamps from Company D? If yes, when and how many lamps should Company A purchase, and what is the impact of this decision on the total cost?
(e) Company C has offered to lower the price of units supplied to Company A during February. What is the maximum decrease that would make this offer attractive to Company A?
(f) Because of anticipated increases in interest rates, the holding cost per lamp is expected to increase to $8 per unit in February. How does this change affect the total cost and the optimal solution?
(g) Company B has just informed Company A that it requires only 90 units in January (instead of 150 requested previously). Calculate upper and lower bounds on the impact of this order on the optimal cost using information from the optimal solution to the original problem.
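For parts (a) and (b) of Exercise 5.6, one possible formulation can be coded directly with an off-the-shelf LP solver. The sketch below, in Python with scipy.optimize.linprog, is only one way to set the problem up; the variable names, the zero starting inventory, and the assumption that all demand must be met on time are modeling choices made for this sketch.

    import numpy as np
    from scipy.optimize import linprog

    demand = [150, 160, 225, 180]          # January through April
    T = len(demand)

    # Variables for each month t: p[t] = lamps produced by Company A,
    # q[t] = lamps bought from Company C, s[t] = lamps in inventory at the end of month t.
    # Ordering of the variable vector: [p0..p3, q0..q3, s0..s3].
    c = [35.0] * T + [50.0] * T + [5.0] * T

    # Inventory balance: s[t-1] + p[t] + q[t] - demand[t] = s[t], with s[-1] = 0 assumed.
    A_eq = np.zeros((T, 3 * T))
    b_eq = np.array(demand, dtype=float)
    for t in range(T):
        A_eq[t, t] = 1.0               # p[t]
        A_eq[t, T + t] = 1.0           # q[t]
        A_eq[t, 2 * T + t] = -1.0      # -s[t]
        if t > 0:
            A_eq[t, 2 * T + t - 1] = 1.0   # +s[t-1]

    # Production is limited to 160 lamps per month; purchases and inventory are nonnegative.
    bounds = [(0, 160)] * T + [(0, None)] * T + [(0, None)] * T

    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    print(res.fun)   # total cost
    print(res.x)     # production, purchase, and inventory levels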
Exercise 5.7 A paper company manufactures three basic products: pads of paper, 5-packs of paper, and 20-packs of paper. The pad of paper consists of a single pad of 25 sheets of lined paper. The 5-pack consists of 5 pads of paper, together with a small notebook. The 20-pack of paper consists of 20 pads of paper, together with a large notebook. The small and large notebooks are not sold separately.
Production of each pad of paper requires 1 minute of paper-machine time, 1 minute of supervisory time, and $.10 in direct costs. Production of each small notebook takes 2 minutes of paper-machine time, 45 seconds of supervisory time, and $.20 in direct costs. Production of each large notebook takes 3 minutes of paper-machine time, 30 seconds of supervisory time, and $.30 in direct costs. To package the 5-pack takes 1 minute of packager's time and 1 minute of supervisory time. To package the 20-pack takes 3 minutes of packager's time and 2 minutes of supervisory time. The amounts of available paper-machine time, supervisory time, and packager's time are constants b1, b2, b3, respectively. Any of the three products can be sold to retailers in any quantity at the prices $.30, $1.60, and $7.00, respectively.
Provide a linear programming formulation of the problem of determining an optimal mix of the three products. (You may ignore the constraint that only integer quantities can be produced.) Try to formulate the problem in such a
way that the following questions can be answered by looking at a single dual variable or reduced cost in the final tableau. Also, for each question, give a brief explanation of why it can be answered by looking at just one dual price or reduced cost.
(a) What is the marginal value of an extra unit of supervisory time?
(b) What is the lowest price at which it is worthwhile to produce single pads
of paper for sale?
(c) Suppose that part-time supervisors can be hired at $8 per hour. Is it worthwhile to hire any?
(d) Suppose that the direct cost of producing pads of paper increases from $.10 to $.12. What is the profit decrease?
Exercise 5.8 A pottery manufacturer can make four different types of dining room service sets: English, Currier, Primrose, and Bluetail. Furthermore, Primrose can be made by two different methods. Each set uses clay, enamel, dry room time, and kiln time, and results in a profit shown in Table 5.3. (Here, lbs is the abbreviation for pounds.)
    Resources            E     C    P1    P2     B   Total
    Clay (lbs)          10    15    10    10    20     130
    Enamel (lbs)         1     2     2     1     1      13
    Dry room (hours)     3     1     6     6     3      45
    Kiln (hours)         2     4     2     5     3      23
    Profit              51   102    66    66    89

Table 5.3: The rightmost column in the table gives the manufacturer's resource availability for the remainder of the week. Notice that Primrose can be made by two different methods. They both use the same amount of clay (10 lbs.) and dry room time (6 hours). But the second method uses one pound less of enamel and three more hours in the kiln.

The manufacturer is currently committed to making the same amount of Primrose using methods 1 and 2. The formulation of the profit maximization problem is given below. The decision variables E, C, P1, P2, B are the number of sets of type English, Currier, Primrose Method 1, Primrose Method 2, and Bluetail, respectively. We assume, for the purposes of this problem, that the number of sets of each type can be fractional.
    maximize    51E + 102C + 66P1 + 66P2 + 89B
    subject to  10E +  15C + 10P1 + 10P2 + 20B ≤ 130
                  E +   2C +  2P1 +   P2 +   B ≤  13
                 3E +    C +  6P1 +  6P2 +  3B ≤  45
                 2E +   4C +  2P1 +  5P2 +  3B ≤  23
                               P1 -   P2       =   0
                 E, C, P1, P2, B ≥ 0.

The optimal solution to the primal and the dual, respectively, together with sensitivity information, is given in Tables 5.4 and 5.5. Use this information to answer the questions that follow.
             Optimal    Reduced    Objective      Allowable    Allowable
             Value      Cost       Coefficient    Increase     Decrease
    E          0        -3.571         51            3.571         ∞
    C          2         0            102           16.667        12.5
    P1         0         0             66           37.571         ∞
    P2         0       -37.571         66           37.571         ∞
    B          5         0             89           47            12.5

Table 5.4: The optimal primal solution and its sensitivity with respect to changes in coefficients of the objective function. The last two columns describe the allowed changes in these coefficients for which the same solution remains optimal.

    Constr.     Slack     Dual         RHS    Allowable    Allowable
                Value     Variable            Increase     Decrease
    Clay         130       1.429       130      23.33        43.75
    Enamel         9       0            13       ∞            4
    Dry Rm.       17       0            45       ∞           28
    Kiln          23      20.143        23       5.60         3.50
    Prim.          0      11.429         0       3.50         0

Table 5.5: The optimal dual solution and its sensitivity. The column labeled "slack value" gives us the optimal values of the slack variables associated with each of the primal constraints. The third column simply repeats the right-hand side vector b, while the last two columns describe the allowed changes in the components of b for which the optimal dual solution remains the same.
(a) What is the optimal quantity of each service set, and what is the total profit?
(b) Give an economic (not mathematical) interpretation of the optimal dual variables appearing in the sensitivity report, for each of the five constraints.
(c) Should the manufacturer buy an additional 20 lbs. of Clay at $1.1 per pound?
(d) Suppose that the number of hours available in the dry room decreases by 30. Give a bound for the decrease in the total profit.
(e) In the current model, the number of Primrose produced using method 1 was required to be the same as the number of Primrose produced by method 2. Consider a revision of the model in which this constraint is replaced by the constraint P1 - P2 ≥ 0. In the reformulated problem, would the amount of Primrose made by method 1 be positive?
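The entries of Tables 5.4 and 5.5 can be reproduced (up to rounding) by re-solving the profit maximization problem above with any LP package. A minimal sketch in Python with scipy.optimize.linprog is given below; the attribute names used to read the dual values, and their sign conventions, are those of recent SciPy releases and should be treated as assumptions of the sketch.

    import numpy as np
    from scipy.optimize import linprog

    # Decision variables, in this order: E, C, P1, P2, B.
    profit = np.array([51.0, 102.0, 66.0, 66.0, 89.0])

    # Resource constraints (clay, enamel, dry room, kiln) and the constraint P1 - P2 = 0.
    A_ub = np.array([[10, 15, 10, 10, 20],    # clay
                     [ 1,  2,  2,  1,  1],    # enamel
                     [ 3,  1,  6,  6,  3],    # dry room
                     [ 2,  4,  2,  5,  3]])   # kiln
    b_ub = np.array([130.0, 13.0, 45.0, 23.0])
    A_eq = np.array([[0.0, 0.0, 1.0, -1.0, 0.0]])
    b_eq = np.array([0.0])

    # linprog minimizes, so the profit vector is negated.
    res = linprog(-profit, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * 5, method="highs")

    print(res.x)       # optimal numbers of sets (E, C, P1, P2, B)
    print(-res.fun)    # total profit
    # Dual values (marginals) of the constraints, as exposed by the HiGHS-based interface.
    print(res.ineqlin.marginals)
    print(res.eqlin.marginals)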
Exercise 5.9 Using the notation of Section 5.2, show that for any positive scalar λ and any b ∈ S, we have F(λb) = λF(b). Assume that the dual feasible set is nonempty, so that F(b) is finite.
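A quick numerical illustration of this homogeneity property (not a proof) is to solve a small standard form problem with right-hand sides b and λb and compare the optimal costs. The data below are hypothetical and serve only to illustrate the claim.

    import numpy as np
    from scipy.optimize import linprog

    # Hypothetical standard form data: minimize c'x subject to Ax = b, x >= 0.
    c = np.array([2.0, 3.0, 1.0, 4.0])
    A = np.array([[1.0, 1.0, 1.0, 0.0],
                  [0.0, 1.0, 2.0, 1.0]])
    b = np.array([4.0, 6.0])

    def F(rhs):
        # Optimal cost as a function of the right-hand side vector.
        res = linprog(c, A_eq=A, b_eq=rhs, bounds=[(0, None)] * len(c), method="highs")
        return res.fun

    lam = 2.5
    print(F(lam * b), lam * F(b))   # the two values should coincide: F(lam*b) = lam*F(b)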
Exercise 5.10 Consider the linear programming problem:

    minimize    x1 + x2
    subject to  x1 + 2x2 = θ,
                x1, x2 ≥ 0.

(a) Find (by inspection) an optimal solution, as a function of θ.
(b) Draw a graph showing the optimal cost as a function of θ.
(c) Use the picture in part (b) to obtain the set of all dual optimal solutions, for every value of θ.
Exercise 5.11 Consider the function g(θ), as defined in the beginning of Section 5.5. Suppose that g(θ) is linear for θ ∈ [θ1, θ2]. Is it true that there exists a unique optimal solution when θ1 < θ < θ2? Prove or provide a counterexample.
Exercise 5.12 Consider the parametric programming problem discussed in Section 5.5.
(a) Suppose that for some value of θ, there are exactly two distinct basic feasible solutions that are optimal. Show that they must be adjacent.
(b) Let θ* be a breakpoint of the function g(θ). Let x1, x2, x3 be basic feasible solutions, all of which are optimal for θ = θ*. Suppose that x1 is a unique optimal solution for θ < θ*, x3 is a unique optimal solution for θ > θ*, and x1, x2, x3 are the only optimal basic feasible solutions for θ = θ*. Provide an example to show that x1 and x3 need not be adjacent.
Exercise 5.13 Consider the following linear programming problem:

    minimize     4x1        + 5x3
    subject to   2x1 +  x2  - 5x3        = 1
                -3x1        + 4x3 +  x4  = 2
                 x1, x2, x3, x4 ≥ 0.
(a) Write down a simplex tableau and find an optimal solution. Is it unique?
(b) Write down the dual problem and find an optimal solution. Is it unique?
(c) Suppose now that we change the vector b from b = (1, 2) to b = (1 - 2θ, 2 - 3θ), where θ is a scalar parameter. Find an optimal solution and the value of the optimal cost, as a function of θ. (For all θ, both positive and negative.)
Exercise 5.14 Consider the problem

    minimize    (c + θd)'x
    subject to  Ax = b + θf
                x ≥ 0,

where A is an m × n matrix with linearly independent rows. We assume that the problem is feasible and that the optimal cost is finite for all values of θ in some interval [θ1, θ2].
(a) Suppose that a certain basis is optimal for θ = -10 and for θ = 10. Prove that the same basis is optimal for θ = 5.
(b) Show that the optimal cost is a piecewise quadratic function of θ. Give an upper bound on the number of "pieces."
(c) Let b = 0 and c = 0. Suppose that a certain basis is optimal for θ = 1. For what other nonnegative values of θ is that same basis optimal?
(d) Is the optimal cost, as a function of θ, convex, concave or neither?
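Part (b) can be made concrete numerically: for a fixed small problem, evaluating the optimal cost on a grid of θ values traces out a piecewise quadratic curve, since on any interval where a single basis stays optimal the cost equals (c_B + θd_B)'B⁻¹(b + θf). The sketch below uses hypothetical data chosen only for this illustration.

    import numpy as np
    from scipy.optimize import linprog

    # Hypothetical data for: minimize (c + theta*d)'x subject to Ax = b + theta*f, x >= 0.
    c = np.array([1.0, 2.0, 0.0])
    d = np.array([0.5, -1.0, 1.0])
    A = np.array([[1.0, 1.0, 1.0]])
    b = np.array([3.0])
    f = np.array([1.0])

    def optimal_cost(theta):
        res = linprog(c + theta * d, A_eq=A, b_eq=b + theta * f,
                      bounds=[(0, None)] * 3, method="highs")
        return res.fun if res.success else float("nan")

    # For this data, the cost is theta*(3 + theta) on [0, 1] and (2 - theta)*(3 + theta)
    # on [1, 2]: two quadratic pieces joined at a breakpoint.
    for theta in np.linspace(0.0, 2.0, 9):
        print(round(float(theta), 2), optimal_cost(theta))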
Exercise 5.15 Consider the problem

    minimize    c'x
    subject to  Ax = b + θd
                x ≥ 0,

and let f(θ) be the optimal cost, as a function of θ.
(a) Let X(θ) be the set of all optimal solutions, for a given value of θ. For any nonnegative scalar t, define X(0, t) to be the union of the sets X(θ), 0 ≤ θ ≤ t. Is X(0, t) a convex set? Provide a proof or a counterexample.
(b) Suppose that we remove the nonnegativity constraints x ≥ 0 from the problem under consideration. Is X(0, t) a convex set? Provide a proof or a counterexample.
(c) Suppose that x1 and x2 belong to X(0, t). Show that there is a continuous path from x1 to x2 that is contained within X(0, t). That is, there exists a continuous function g(λ) such that g(λ1) = x1, g(λ2) = x2, and g(λ) ∈ X(0, t) for all λ ∈ (λ1, λ2).
Exercise 5.16 Consider the parametric programming problem of Section 5.5. Suppose that some basic feasible solution is optimal if and only if θ is equal to some θ*.
(a) Suppose that the feasible set is unbounded. Is it true that there exist at least three distinct basic feasible solutions that are optimal when θ = θ*?
(b) Answer the question in part (a) for the case where the feasible set is bounded.
Exercise 5.17 Consider the parametric programming problem. Suppose that every basic solution encountered by the algorithm is nondegenerate. Prove that the algorithm does not cycle.
5.8 Notes and sources
The material in this chapter, with the exception of Section 5.3, is standard, and can be found in any text on linear programming.
5.1. A more detailed discussion of the results of the production planning case study can be found in Freund and Shannahan (1992).
5.3. The results in this section have beautiful generalizations to the case of nonlinear convex optimization; see, e.g., Rockafellar (1970).
5.5. Anticycling rules for parametric programming can be found in Murty (1983).
References
AHUJA, R. K., T. L. MAGNANTI, and J. B. ORLIN. 1993. Network Flows, Prentice Hall, Englewood Cliffs, NJ.
ANDERSEN, E., J. GONDZIO, C. MESZAROS, and X. XU. 1996. Implementation of interior point methods for large scale linear programming, in Interior Point Methods in Mathematical Programming, T. Terlaky (ed.), Kluwer Academic Publishers, Boston, MA.
APPLEGATE, D., and W. COOK. 1991. A computational study of the job shop scheduling problem, ORSA Journal on Computing, 3, 149-156.
BALAS, E., S. CERIA, G. CORNUEJOLS, and N. NATRAJ. 1995. Gomory cuts revisited, working paper, Carnegie-Mellon University, Pittsburgh, PA.
BALAS, E., S. CERIA, and G. CORNUEJOLS. 1995. Mixed 0-1 programming by lift-and-project in a branch and cut environment, working paper, Carnegie-Mellon University, Pittsburgh, PA.
BARAHONA, F., and E. TARDOS. 1989. Note on Weintraub’s minimum cost circulation algorithm, SIAM Journal on Computing, 18, 579-583.
BARNES, E. R. 1986. A variation on Karmarkar’s algorithm for solving linear programming problems, Mathematical Programming, 36, 174-182.
BARR, R. S., F. GLOVER, and D. KLINGMAN. 1977. The alternating path basis algorithm for the assignment problem, Mathematical Programming, 13, 1- 13.
BARTHOLDI, J. J., J. B. ORLIN, and H. D. RATLIFF. 1980. Cyclic scheduling via integer programs with circular ones, Operations Research, 28, 1074-1085.
BAZARAA, M. S., J. J. JARVIS, and H. D. SHERALI. 1990. Linear Programming and Network Flows, 2nd edition, Wiley, New York, NY.
BEALE, E. M. L. 1955. Cycling in the dual simplex algorithm, Naval Research Logistics Quarterly, 2, 269-275.
BELLMAN, R. E. 1958. On a routing problem, Quarterly of Applied Mathematics, 16, 87-90.
BENDERS, J. F. 1962. Partitioning procedures for solving mixed-variables pro gramming problems, Numerische Mathematik, 4, 238-252.
BERTSEKAS, D. P. 1979. A distributed algorithm for the assignment problem, working paper, Laboratory for Information and Decision Systems, M.I.T., Cambridge, MA.
BERTSEKAS, D. P. 1981. A new algorithm for the assignment problem, Mathematical Programming, 21, 152-171.
BERTSEKAS, D. P. 1991. Linear Network Optimization, M.I.T. Press, Cambridge, MA.
BERTSEKAS, D. P. 1995a. Dynamic Programming and Optimal Control, Athena Scientific, Belmont, MA.
BERTSEKAS, D. P. 1995b. Nonlinear Programming, Athena Scientific, Belmont, MA.
BERTSEKAS, D. P., and J. N. TSITSIKLIS. 1989. Parallel and Distributed Com putation: Numerical Methods, Prentice Hall, Englewood Cliffs, NJ.
BERTSIMAS, D., and L. HSU. 1997. A branch and cut algorithm for the job shop scheduling problem, working paper, Operations Research Center, M.I.T., Cambridge, MA.
BERTSIMAS, D., and X. Luo. 1997. On the worst case complexity of potential reduction algorithms for linear programming, Mathematical Programming, to appear.
BERTSIMAS, D., and S. STOCK. 1997. The air traffic flow management problem with enroute capacities, Operations Research, to appear.
BLAND, R. G. 1977. New finite pivoting rules for the simplex method, Mathe matics of Operations Research, 2, 103-107.
BLAND, R. G., D. GOLDFARB, and M. J. TODD. 1981. The ellipsoid method: a survey, Operations Research, 29, 1039-1091.
BORGWARDT, K.-H. 1982. The average number of pivot steps required by the simplex-method is polynomial, Zeitschrift für Operations Research, 26, 157-177.
BOYD, S., and L. VANDENBERGHE. 1995. Introduction to convex optimization with engineering applications, lecture notes, Stanford University, Stanford, CA.
BRADLEY, S. P., A. C. HAX, and T. L. MAGNANTI. 1977. Applied Mathematical Programming, Addison-Wesley, Reading, MA.
CARATHEODORY, C. 1907. Über den Variabilitätsbereich der Koeffizienten von Potenzreihen, die gegebene Werte nicht annehmen, Mathematische Annalen, 64, 95-115.
CERIA, S., C. CORDIER, H. MARCHAND, and L. A. WOLSEY. 1995. Cutting planes for integer programs with general integer variables, working paper, Columbia University, New York, NY.
CHARNES, A. 1952. Optimality and degeneracy in linear programming, Econo metrica, 20, 160-170.
CHRISTOFIDES, N. 1975. Worst-case analysis of a new heuristic for the traveling salesman problem, Report 388, Graduate School of Industrial Administra tion, Carnegie-Mellon University, Pittsburgh, PA.
CHVATAL, V. 1983. Linear Programming, W. H. Freeman, New York, NY.
CLARK, F. E. 1961. Remark on the constraint sets in linear programming, American Mathematical Monthly, 68, 351-352.
COBHAM, A. 1965. The intrinsic computational difficulty of functions, in Logic,
Methodology and Philosophy of Science, Y. Bar-Hillel (ed.), North-Holland, Amsterdam, The Netherlands, 24-30.
COOK, S. A. 1971. The complexity of theorem proving procedures, in Proceedings of the 3rd ACM Symposium on the Theory of Computing, 151-158.
CORMEN, T. H., C. E. LEISERSON, and R. L. RIVEST. 1990. Introduction to Algorithms, McGraw-Hill, New York, NY.
CUNNINGHAM, W. H. 1976. A network simplex method, Mathematical Program ming, 11, 105-116.
DAHLEH, M. A., and I. DIAZ-BOBILLO. 1995. Control of Uncertain Systems: A Linear Programming Approach, Prentice Hall, Englewood Cliffs, NJ.
DANTZIG, G. B. 1951. Application of the simplex method to a transportation problem, in Activity Analysis of Production and Allocation, T. C. Koop mans (ed.), Wiley, New York, NY, 359-373.
DANTZIG, G. B. 1963. Linear Programming and Extensions, Princeton University Press, Princeton, NJ.
DANTZIG, G. B. 1992. An ε-precise feasible solution to a linear program with a convexity constraint in 1/ε² iterations independent of problem size, working paper, Stanford University, Stanford, CA.
DANTZIG, G. B., A. ORDEN, and P. WOLFE. 1955. The generalized simplex method for minimizing a linear form under linear inequality constraints, Pacific Journal of Mathematics, 5, 183-195.
DANTZIG, G. B., and P. WOLFE. 1960. The decomposition principle for linear programs, Operations Research, 8, 101-111.
DIJKSTRA, E. 1959. A note on two problems in connexion with graphs, Numerische Mathematik, 1, 269-271.
DIKIN, I. I. 1967. Iterative solutions of problems of linear and quadratic programming, Soviet Mathematics Doklady, 8, 674-675.
DIKIN, I. I. 1974. On the convergence of an iterative process, Upravlyaemye Sistemi, 12, 54-60. (In Russian.)
DINES, L. L. 1918. Systems of linear inequalities, Annals of Mathematics, 20, 191-199.
DUDA, R. O., and P. E. HART. 1973. Pattern Classification and Scene Analysis, Wiley, New York, NY.
EDMONDS, J. 1965a. Paths, trees, and flowers, Canadian Journal ofMathematics, 17, 449-467.
EDMONDS, J. 1965b. Maximum matching and a polyhedron with 0 – 1 vertices, Journal of Research of the National Bureau of Standards, 69B, 125-130.
EDMONDS, J. 1971. Matroids and the greedy algorithm, Mathematical Program ming, 1, 127-136.
EDMONDS, J., and R M. KARP. 1972. Theoretical improvements in algorithmic efficiency for network flow problems, Journal of the ACM, 19, 248-264.
ELIAS, P., A. FEINSTEIN, and C. E. SHANNON. 1956. Note on maximum flow through a network, IRE Transactions on Information Theory, 2, 117-119.
FARKAS, G. 1894. On the applications of the mechanical principle of Fourier, Mathematikai es Termeszettudomanyi Ertesito, 12, 457-472. (In Hungarian.)
FIACCO, A. V., and G. P. MCCORMICK. 1968. Nonlinear programming: sequen tial unconstrained minimization techniques, Wiley, New York, NY.
FEDERGRUEN, A., and H. GROENEVELT. 1986. Preemptive scheduling of uniform machines by ordinary network flow techniques, Management Science, 32, 341-349.
FISHER, H., and G. L. THOMPSON. 1963. Probabilistic learning combinations of local job shop scheduling rules, in Industrial Scheduling, J. F. Muth and G. L. Thompson (eds.), Prentice Hall, Englewood Cliffs, NJ, 225-251.
FLOYD, R. W. 1962. Algorithm 97: shortest path. Communications of ACM, 5, 345.
FORD, L. R. 1956. Network flow theory, report P-923, Rand Corp., Santa Monica, CA.
FORD, L. R., and D. R. FULKERSON. 1956a. Maximal flow through a network, Canadian Journal of Mathematics, 8, 399-404.
FORD, L. R., and D. R. FULKERSON. 1956b. Solving the transportation problem, Management Science, 3, 24-32.
FORD, L. R., and D. R. FULKERSON. 1962. Flows in Networks, Princeton University Press, Princeton, NJ.
FOURIER, J. B. J. 1827. Analyse des Travaux de l'Académie Royale des Sciences, pendant l'année 1824, Partie mathématique, Histoire de l'Académie Royale des Sciences de l'Institut de France, 7, xlvii-lv.
FREUND, R. M. 1991. Polynomial-time algorithms for linear programming based only on primal affine scaling and projected gradients of a potential function, Mathematical Programming, 51, 203-222.
FREUND, R. M., and B. SHANNAHAN. 1992. Short-run manufacturing problems at DEC, report, Sloan School of Management, M.I.T., Cambridge, MA.
FRISCH, M. R. 1956. La résolution des problèmes de programme linéaire par la méthode du potentiel logarithmique, Cahiers du Séminaire d'Économétrie, 4, 7-20.
FULKERSON, D. R., and G. B. DANTZIG. 1955. Computation of maximum flow in networks, Naval Research Logistics Quarterly, 2, 277-283.
GALE, D., H. W. KUHN, and A. W. TUCKER. 1951. Linear programming and the theory of games, in Activity Analysis of Production and Allocation, T. C. Koopmans (ed.), Wiley, New York, NY, 317-329.
GALE, D., and L. S. SHAPLEY. 1962. College admissions and the stability of marriage, American Mathematical Monthly, 69, 9-15.
GAREY, M. R., and D. S. JOHNSON. 1979. Computers and Intractability: A Guide to the Theory of NP-completeness, W. H. Freeman, New York, NY.
GEMAN, S., and D. GEMAN. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721-741.
GILL, P. E., W. MURRAY, and M. H. WRIGHT. 1981. Practical Optimization, Academic Press, New York, NY.
GILMORE, P. C., and R. E. GOMORY. 1961. A linear programming approach to the cutting stock problem, Operations Research, 9, 849-859.
GILMORE, P. C., and R. E. GOMORY. 1963. A linear programming approach to the cutting stock problem - part II, Operations Research, 11, 863-888.
GOEMANS, M., and D. BERTSIMAS. 1993. Survivable networks, LP relaxations and the parsimonious property, Mathematical Programming, 60, 145-166.
GOEMANS, M., and D. WILLIAMSON. 1993. A new 3/4 approximation algorithm for MAX SAT, in Proceedings of the 3rd International Conference in Integer Programming and Combinatorial Optimization, 313-321.
GOLDBERG, A. V., and R. E. TARJAN. 1988. A new approach to the maximum flow problem, Journal of the ACM, 35, 921-940.
GOLDBERG, A. V., and R. E. TARJAN. 1989. Finding minimum-cost circulations by cancelling negative cycles, Journal of the ACM, 36, 873-886.
GOLDFARB, D., and J. K. REID. 1977. A practicable steepest-edge simplex algorithm, Mathematical Programming, 12, 361-371.
GOLUB, G. H., and C. F. VAN LOAN. 1983. Matrix Computations, The Johns Hopkins University Press, Baltimore, MD.
GOMORY, R. E. 1958. Outline of an algorithm for integer solutions to linear programs, Bulletin of the American Mathematical Society, 64, 275-278.
GONZAGA, C. 1989. An algorithm for solving linear programming in O(n³L) operations, in Progress in Mathematical Programming, N. Megiddo (ed.), Springer-Verlag, New York, NY, 1-28.
GONZAGA, C. 1990. Polynomial affine algorithms for linear programming, Mathematical Programming, 49, 7-21.
GROTSCHEL, M., L. LOVASZ, and A. SCHRIJVER. 1981. The ellipsoid method and its consequences in combinatorial optimization, Combinatorica, 1, 169-197.
GROTSCHEL, M., L. LOVASZ, and A. SCHRIJVER. 1988. Geometric Algorithms and Combinatorial Optimization, Springer-Verlag, New York, NY.
HAIMOVICH, M. 1983. The simplex method is very good! – on the expected number of pivot steps and related properties of random linear programs, preprint .
HAJEK, B. 1988. Cooling schedules for optimal annealing, Mathematics of Oper ations Research, 13, 311-329.
HALL, L. A., and R. J. VANDERBEI. 1993. Two thirds is sharp for affine scaling, Operations Research Letters, 13, 197-201.
HALL, L. A., A. S. SCHULTZ, D. B. SHMOYS, and J. WEIN. 1996. Scheduling to minimize average completion time; off-line and on-line approximation algorithms, working paper, Johns Hopkins University, Baltimore, MD.
HANE, C. A., C. BARNHART, E. L. JOHNSON, R. E. MARSTEN, G. L. NEMHAUSER, and G. SIGISMONDI. 1995. The fleet assignment problem: solving a large-scale integer program, Mathematical Programming, 70, 211-232.
HARRIS, P. M. J. 1973. Pivot selection methods of the Devex LP code, Mathe matical Programming, 5, 1-28.
HAYKIN, S. 1994. Neural Networks: A Comprehensive Foundation, McMillan, New York, NY.
HELD, M., and R. M. KARP. 1962. A dynamic programming approach to se quencing problems, SIAM Journal on Applied Mathematics, 10, 196-210.
HELD, M., and R. M. KARP. 1970. The traveling salesman problem and mini mum spanning trees, Operations Research, 18, 1138-1162.
HELD, M., and R. M. KARP. 1971. The traveling salesman problem and mini mum spanning trees: part II, Mathematical Programming, 1, 6-25.
HELLY, E. 1923. Über Mengen konvexer Körper mit gemeinschaftlichen Punkten, Jahresbericht der Deutschen Mathematiker-Vereinigung, 32, 175-176.
HOCHBAUM, D. (ed.). 1996. Approximation algorithms for NP-hard problems, Kluwer Academic Publishers, Boston, MA.
Hu, T. C. 1969. Integer Programming and Network Flows, Addison-Wesley, Reading, MA.
IBARRA, O. H., and C. E. KIM. 1975. Fast approximation algorithms for the knapsack and sum of subset problems, Journal of the ACM, 22, 463-468.
INFANGER, G. 1993. Planning under uncertainty: solving large-scale stochastic linear programs, Boyd & Fraser, Danvers, MA.
JOHNSON, D. S., C. ARAGON, L. MCGEOCH, and C. SCHEVON. 1990. Optimiza tion by simulated annealing: an experimental evaluation, part I: graph partitioning, Operations Research, 37, 865-892.
JOHNSON, D. S., C. ARAGON, L. MCGEOCH, and C. SCHEVON. 1992. Optimiza tion by simulated annealing: an experimental evaluation, part II: graph coloring and number partitioning, Operations Research, 39, 378-406.
KALAI, G., and D. KLEITMAN. 1992. A quasi-polynomial bound for the diameter of graphs of polyhedra, Bulletin of the American Mathematical Society, 26, 315-316.
KALL, P., and S. W. WALLACE. 1994. Stochastic Programming, Wiley, New York, NY.
KARMARKAR, N. 1984. A new polynomial-time algorithm for linear program ming, Combinatorica, 4, 373-395.
KARP, R. M. 1972. Reducibility among combinatorial problems, in Complexity of Computer Computations, R. E. Miller and J. W. Thacher (eds.), Plenum Press, New York, NY, 85-103.
KARP, R. M. 1978. A characterization of the minimum cycle mean in a digraph, Discrete Mathematics, 23, 309-311.
KARP, R. M., and C. H. PAPADIMITRIOU. 1982. On linear characterizations of combinatorial optimization problems, SIAM Journal on Computing, 11, 620-632.
KHACHIAN, L. G. 1979. A polynomial algorithm in linear programming, Soviet Mathematics Doklady, 20, 191-194.
KIRKPATRICK, S., C. D. GELATT, JR., and M. P. VECCHI. 1983. Optimization by simulated annealing, Science, 220, 671-680.
KLEE, V., and G. J. MINTY. 1972. How good is the simplex algorithm?, in Inequalities – III, O. Shisha (ed.), Academic Press, New York, NY, 159- 175.
KLEE, V., and D. W. WALKUP. 1967. The d-step conjecture for polyhedra of dimension d < 6, Acta Mathematica, 117, 53-78.
KLEIN, M. 1967. A primal method for minimal cost flows with application to the assignment and transportation problems, Management Science, 14, 205- 220.
KOJIMA, M., S. MIZUNO, and A. YOSHISE. 1989. A primal-dual interior point algorithm for linear programming, in Progress in Mathematical Program ming, N. Megiddo (ed.) , Springer-Verlag, New York, NY, 29-47.
KUHN, H. W. 1955. The Hungarian method for the assignment problem, Naval Research Logistics Quarterly, 2, 83-97.
LAWLER, E. L. 1976. Combinatorial Optimization: Networks and Matroids, Holt, Rinehart, and Winston, New York, NY.
LAWLER, E. L., J. K. LENSTRA, A. H. G. RINNOOY KAN, and D. B. SHMOYS (eds.). 1985. The Traveling Salesman Problem: a Guided Tour of Combi natorial Optimization, Wiley, New York, NY.
LENSTRA, J. K., A. H. G. RINNOOY KAN, and A. SCHRIJVER (eds.). 1991. History of Mathematical Programming: A Collection of Personal Reminis cences, Elsevier, Amsterdam, The Netherlands.
LEVIN, A. Y. 1965. On an algorithm for the minimization of convex functions, Soviet Mathematics Doklady, 6, 286-290.
LEVIN, L. A. 1973. Universal sorting problems, Problemy Peredachi Informatsii, 9, 265-266. (In Russian.)
LEWIS, H. R., and C. H. PAPADIMITRIOU. 1981. Elements of the Theory of Computation, Prentice Hall, Englewood Cliffs, NJ.
LUENBERGER, D. G. 1969. Optimization by Vector Space Methods, Wiley, New York, NY.
LUENBERGER, D. G. 1984. Linear and Nonlinear Programming, 2nd ed., Addison Wesley, Reading, MA.
LUSTIG, I., R. E. MARSTEN, and D. SHANNO. 1994. Interior point methods: computational state of the art, ORSA Journal on Computing, 6, 1-14.
MAGNANTI, T. L., and L. A. WOLSEY. 1995. Optimal Trees, in Handbook of Operations Research and Management Science, Volume 6, Network Models, M. O. Ball, C. L. Monma, T. L. Magnanti and G. L. Nemhauser (eds.),
North Holland, Amsterdam, The Netherlands, 503-615.
MARSHALL, K. T., and J. W. SUURBALLE. 1969. A note on cycling in the
simplex method, Naval Research Logistics Quarterly, 16, 121-137.
MARTIN, P., and D. B. SHMOYS. 1996. A new approach to computing opti mal schedules for the job-shop scheduling problem, in Proceedings of the 5th International Conference in Integer Programming and Combinatorial Optimization, 389-403.
MCSHANE, K. A., C. L. MONMA, and D. SHANNO. 1991. An implementation of a primal-dual interior point method for linear programming, ORSA Journal on Computing, 1, 70-83.
MEGIDDO, N. 1989. Pathways to the optimal set in linear programming, in Progress in Mathematical Programming, N. Megiddo (ed.), Springer-Verlag, New York, NY, 131-158.
MEGIDDO, N., and M. SHUB. 1989. Boundary behavior of interior point algorithms in linear programming, Mathematics of Operations Research, 14, 97-146.
MINKOWSKI, H. 1896. Geometrie der Zahlen, Teubner, Leipzig, Germany.
MIZUNO, S. 1996. Infeasible interior point algorithms, in Interior Point Algo rithms in Mathematical Programming, T. Terlaky (ed.), Kluwer Academic Publishers, Boston, MA.
MONTEIRO, R. D. C., and I. ADLER. 1989a. Interior path following primal-dual algorithms; part I: linear programming, Mathematical Programming, 44, 27-41.
MONTEIRO, R. D. C., and I. ADLER. 1989b. Interior path following primal dual algorithms; part II: convex quadratic programming, Mathematical Programming, 44, 43-66.
MOTZKIN, T. S. 1936. Beiträge zur Theorie der linearen Ungleichungen (Inaugural-Dissertation, Basel), Azriel, Jerusalem.
MURTY, K. G. 1983. Linear Programming, Wiley, New York, NY.
NEMHAUSER, G. L., and L. A. WOLSEY. 1988. Integer and Combinatorial Opti
mization, Wiley, New York, NY.
NESTEROV, Y., and A. NEMIROVSKII. 1994. Interior point polynomial algo rithms for convex programming, SIAM, Studies in Applied Mathematics, 13, Philadelphia, PA.
VON NEUMANN, J. 1947. Discussion of a maximum problem, unpublished working paper, Institute for Advanced Studies, Princeton, NJ.
VON NEUMANN, J. 1953. A certain zero-sum two-person game equivalent to the optimal assignment problem, in Contributions to the Theory of Games, II, H. W. Kuhn and A. W. Tucker (eds.) , Annals of Mathematics Studies, 28, Princeton University Press, Princeton, NJ, 5-12.
ORDEN, A. 1993. LP from the '40s to the '90s, Interfaces, 23, 2-12.
ORLIN, J. B. 1984. Genuinely polynomial simplex and non-simplex algorithms for the minimum cost flow problem, technical report 1615-84, Sloan School of Management, M.I.T., Cambridge, MA.
PADBERG, M. W., and M. R. RAO. 1980. The Russian method and integer programming, working paper, New York University, New York, NY.
PAPADIMITRIOU, C. H. 1994. Computational Complexity, Addison-Wesley, Read ing, MA.
PAPADIMITRIOU, C. H., and K. STEIGLITZ. 1982. Combinatorial Optimization: Algorithms and Complexity, Prentice Hall, Englewood Cliffs, NJ.
PLOTKIN, S., and E. TARDOS. 1990. Improved dual network simplex, in Proceedings of the First ACM-SIAM Symposium on Discrete Algorithms, 367-376.
POLJAK, B. T. 1987. Introduction to Optimization, Optimization Software Inc., New York, NY.
PRIM, R. C. 1957. Shortest connection networks and some generalizations, Bell System Technical Journal, 36, 1389-1401.
QUEYRANNE, M. 1993. Structure of a simple scheduling polyhedron, Mathemat ical Programming, 58, 263-285.
RECSKI, A. 1989. Matroid Theory and its Applications in Electric Network The ory and in Statics, Springer-Verlag, New York, NY.
RENEGAR, J. 1988. A polynomial time algorithm based on Newton's method for linear programming, Mathematical Programming, 40, 59-93.
ROCKAFELLAR, R. T. 1970. Convex Analysis, Princeton University Press, Prince ton, NJ.
ROCKAFELLAR, R. T. 1984. Network Flows and Monotropic Optimization, Wiley, New York, NY.
ROSS, S. 1976. Risk, return, and arbitrage, in Risk and Return in Finance, I. Friend and J. Bicksler (eds.), Ballinger, Cambridge, MA.
ROSS, S. 1978. A simple approach to the valuation of risky streams, Journal of Business, 51, 453-475.
RUDIN, W. 1976. Real Analysis, McGraw-Hill, New York, NY.
RUSHMEIER, R. A., and S. A. KONTOGIORGIS. 1997. Advances in the optimization of airline fleet assignment, Transportation Science, to appear.
SCHRIJVER, A. 1986. Theory of Linear and Integer Programming, Wiley, New York, NY.
SCHULTZ, A. S. 1996. Scheduling to minimize total weighted completion time: performance guarantees of LP-based heuristics and lower bounds, in Pro ceedings of the 5th International Conference in Integer Programming and Combinatorial Optimization, 301-315.
SHOR, N. Z. 1970. Utilization of the operation of space dilation in the minimization of convex functions, Cybernetics, 6, 7-15.
SMALE, S. 1983. On the average number of steps in the simplex method of linear programming, Mathematical Programming, 27, 241-262.
SMITH, W. E. 1956. Various optimizers for single-stage production, Naval Re search Logistics Quarterly, 3, 59-66.
STIGLER, G. 1945. The cost of subsistence, Journal of Farm Economics, 27, 303-314.
STOCK, S. 1996. Allocation of NSF graduate fellowships, report, Sloan School of Management, M.I.T., Cambridge, MA.
STONE, R. E., and C. A. TOVEY. 1991. The simplex and projective scaling algorithms as iteratively reweighted least squares, SIAM Review, 33, 220- 237.
STRANG, G. 1988. Linear Algebra and its Applications, 3rd ed., Academic Press, New York, NY.
TARDOS, E. 1985. A strongly polynomial minimum cost circulation algorithm, Combinatorica, 5, 247-255.
TEO, C. 1996. Constructing approximation algorithms via linear programming relaxations: primal dual and randomized rounding techniques, Ph.D. thesis, Operations Research Center, M.I.T., Cambridge, MA.
TSENG, P. 1989. A simple complexity proof for a polynomial-time linear pro gramming algorithm, Operations Research Letters, 8, 155-159.
TSENG, P., and Z.-Q. Luo. 1992. On the convergence of the affine scaling algorithm, Mathematical Programming, 56, 301-319.
TSUCHIYA, T. 1991. Global convergence of the affine scaling methods for de generate linear programming problems, Mathematical Programming, 52, 377-404.
TSUCHIYA, T., and M. MURAMATSU. 1995. Global convergence of a long-step affine scaling algorithm for degenerate linear programming problems, SIAM Journal on Optimization, 5, 525-551.
TUCKER, A. W. 1956. Dual systems of homogeneous linear relations, in Linear Inequalities and Related Systems, H. W. Kuhn and A. W. Tucker (eds.), Princeton University Press, Princeton, NJ, 3-18.
VANDERBEI, R. J., M. S. MEKETON, and B. A. FREEDMAN. 1986. A mod ification of Karmarkar's linear programming algorithm, Algorithmica, 1, 395-407.
VANDERBEI, R. J., and J. C. LAGARIAS. 1990. I. I. Dikin's convergence result for the affine-scaling algorithm, in Mathematical Developments Arising from Linear Programming, J. C. Lagarias and M. J. Todd (eds.), American Mathematical Society, Providence, RI, Contemporary Mathematics, 114, 109-119.
VRANAS, P. 1996. Optimal slot allocation for European air traffic flow manage ment, working paper, German aerospace research establishment, Berlin, Germany.
WAGNER, H. M. 1959. On a class of capacitated transportation problems, Man agement Science, 5, 304-318.
WARSHALL, S. 1962. A theorem on Boolean matrices, Journal of the ACM, 23, 11-12.
WEBER, R. 1995. Personal communication.
WEINTRAUB, A. 1974. A primal algorithm to solve network flow problems with
convex costs, Management Science, 21, 87-97.
WILLIAMS, H. P. 1990. Model Building in Mathematical Programming, Wiley,
New York, NY.
WILLIAMSON, D. 1994. On the design of approximation algorithms for a class of
graph problems, Ph.D. thesis, Department of EECS, M.I.T., Cambridge, MA.
YE, Y. 1991. An O(n3L) potential reduction algorithm for linear programming, Mathematical Programming, 50, 239-258.
YE, Y., M. J. TODD, and S. MIZUNO. 1994. An O(√nL)-iteration homogeneous and self-dual linear programming algorithm, Mathematics of Operations Research, 19, 53-67.
YUDIN, D. B., and A. NEMIROVSKII. 1977. Informational complexity and efficient methods for the solution of convex extremal problems, Matekon, 13, 25-45.
ZHANG, Y., and R. A. TAPIA. 1993. A superlinearly convergent polynomial primal-dual interior point algorithm for linear programming, SIAM Journal on Optimization, 3, 118-133.
Index
AB
Absolute values, problems with, 17-19, 35 Active constraint, 48
Adjacent
bases, 56
basic solutions, 53, 56 vertices, 78
Affine
function, 15, 34 independence, 120 subspace, 30-31 transformation, 364
Affine scaling algorithm, 394, 395-409, 440-441, 448, 449
initialization, 403
long-step, 401, 402-403, 440, 441 performance, 403-404
short-step, 401, 404-409, 440
Ball, 364
Barrier function, 419
Barrier problem, 420, 422, 431 Basic column, 55
Basic direction, 84
Basic feasible solution, 50, 52
existence, 62-65
existence of an optimum, 65-67 finite number of, 52
initial, see initialization magnitude bounds, 373
to bounded variable LP, 76
to general LP, 50
Basic solution, 50, 52
to network flow problems, 280-284 to standard form LP, 53-54
to dual, 154, 161-164
Basic indices, 55 Basic variable, 55 Basis, 55
adjacent, 56
degenerate, 59
optimal, 87
relation to spanning trees, 280-284
Basis matrix, 55, 87
Basis of a subspace, 29, 30
Bellman equation, 332, 336, 354 Bellman-Ford algorithm, 336-339, 354-355,
358
Benders decomposition, 254-260, 263, 264 Big-M method, 117-119, 135-136
Big O notation, 32
Binary search, 372
Binding constraint, 48
Bipartite matching problem, 326, 353, 358 Birkhoff-von Neumann theorem, 353
Bit model of computation, 362
Bland's rule, see smallest subscript rule Bounded polyhedra, representation, 67-70 Bounded set, 43
Branch and bound, 485-490, 524, 530,
542-544, 560-562
Branch and cut, 489-490, 530
C
Candidate list, 94 Capacity
of an arc, 272 of a cut, 309 of a node, 275
Caratheodory's theorem, 76, 197 Cardinality, 26
Caterer problem, 347
Air traffic flow management, 544-551, 567 Algorithm, 32-34, 40, 361
complexity of, see running time efficient, 363
polynomial time, 362, 515
Analytic center, 422 Anticycling
in dual simplex, 160
in network simplex, 357
in parametric programming, 229 in primal simplex, 108-111
Approximation algorithms, 480, 507-511, 528-530, 558
Arbitrage, 168, 199 Arc
backward, 269 balanced, 316 directed, 268 endpoint of, 267 forward, 269
in directed graphs, 268
in undirected graphs, 267 incident, 267, 268 incoming, 268
outgoing, 268
Arithmetic model of computation, 362 Artificial variables, 112
elimination of, 112-113
Asset pricing, 167-169
Assignment problem, 274, 320, 323,
325-332
with side constraint, 526-527
Auction algorithm, 270, 325-332, 354, 358 Augmenting path, 304
Average computational complexity,
127-128, 138
Central path, 420, 422, 444
Certificate of infeasibility, 165
Changes in data, see sensitivity analysis Chebychev approximation, 188 Chebychev center, 36
Cholesky factor, 440, 537
Clark's theorem, 151, 193
Classifier, 14
Closedness of finitely generated cones
172, 196 Circuits, 315
Circulation, 278 decomposition of, 350 simple, 278
Circulation problem, 275 Clique, 484
Closed set, 169
Column
of a matrix, notation, 27
zeroth, 98
Column generation, 236-238 Column geometry, 119-123, 137 Column space, 30
Column vector, 26 Combination
convex, 44
linear, 29
Communication network, 12-13 Complementary slackness, 151-155, 191
economic interpretation, 329
in assignment problem, 326-327 in network flow problems, 314 strict, 153, 192, 437
Complexity theory, 514-523 Computer manufacturing, 7-10 Concave function, 15
characterization, 503, 525 Cone, 174
containing a line, 175 pointed, 175 polyhedral, 175
Connected graph directed, 268
undirected, 267 Connectivity, 352
Convex combination, 44 Convex function, 15, 34, 40 Convex hull, 44, 68, 74, 183
of integer solutions, 464 Convex polyhedron, see polyhedron Convex set, 43
Convexity constraint, 120
Corner point, see extreme point Cost function, 3
Cramer's rule, 29
Crossover problem, 541-542 Currency conversion, 36
Cut, 309
capacity of, 309 minimum, 310, 390 s-t, 309
Cutset, 467
Cutset formulation
of minimum spanning tree problem, 467
of traveling salesman problem, 470 Cutting plane method
for integer programming, 480-484, 530 for linear programming, 236-239
for mixed integer programming, 524
Cutting stock problem, 234-236, 260, 263 Cycle
cost of, 278
directed, 269
in directed graphs, 269
in undirected graphs, 267 negative cost, 291 unsaturated, 301
Cyclic problems, 40 Cycling, 92
in primal simplex, 104-105, 130, 138 see also anticycling
D
DNA sequencing, 525
Dantzig-Wolfe decomposition, 239-254,
261-263, 264 Data fitting, 19-20
Decision variables, 3
Deep cuts, 380, 388 Degeneracy, 58-62, 536, 541
and interior point methods, 439 and uniqueness, 190-191
in assignment problems, 350
in dual, 163-164
in standard form, 59-60, 62
in transportation problems, 349 Degenerate basic solution, 58
Degree, 267
Delayed column generation, 236-238 Delayed constraint generation, 236, 263 Demand , 272
Determinant, 29
Devex rule, 94, 540
Diameter of a polyhedron, 126
Diet problem, 5, 40, 156, 260-261 Dijkstra's algorithm, 340-342, 343, 358 Dimension, 29, 30
of a polyhedron, 68
Disjunctive constraints, 454, 472-473 Dual algorithm, 157
Dual ascent
approximate, 266
in network flow problems, 266, 316-325, 357
steepest, 354
termination, 320
Dual plane, 122
Dual problem, 141, 142, 142-146
optimal solutions, 215-216
Dual simplex method, 156-164, 536-537,
540-544
for network flow problems, 266,
323-325, 354, 358 geometry, 160 revised, 157
Dual variables
in network flow problems, 285 interpretation, 155-156
Duality for general LP, 183-187
Duality gap, 399
Duality in integer programming, 494-507 Duality in network flow problems, 312-316 Duality theorem, 146-155, 173, 184, 197,
199
Dynamic programming, 490-493, 530
integer knapsack problem, 236 zero-one knapsack problem, 491-493 traveling salesman problem, 490
E
Edge of a polyhedron, 53, 78
Edge of an undirected graph, 267 Efficient algorithm, see algorithm Electric power, 10-11, 255-256, 564 Elementary direction, 316 Elementary row operation, 96 Ellipsoid, 364, 396
Ellipsoid method, 363-392
complexity, 377
for full-dimensional bounded polyhe-
dra, 371
for optimization, 378-380 practical performance, 380 sliding objective, 379, 389
Enter the basis, 88
Epsilon-relaxation method, 266, 358
Evaluation problem, 517
Exponential number of constraints, 462-464, 476, 518, 565
Exponential time, 33
Extreme point, 46, 50
see also basic feasible solution
Extreme ray, 67, 176-177, 197, 525
Euclidean norm, 27
F
Facility location problem, 380-387, 453-454, 465-472, 551-562
Farkas' lemma, 165, 172, 197, 199
Feasible direction, 83, 129
Feasible set, 3
Feasible solution, 3
Finitely generated
cone, 196, 198
set, 182
Fixed charge network design problem, 476,
566
Fleet assignment problem, 537-544, 567 Flow, 272
feasible, 272
Flow augmentation, 304
Flow conservation, 272
Flow decomposition theorem, 298-300, 351
for circulations, 350
Floyd-Warshall algorithm, 355-356 Forcing constraints, 453
Ford-Fulkerson algorithm, 305-312, 357 Fourier-Motzkin elimination, 70-74, 79 Fractional programming, 36
Free variable, 3
elimination of, 5
Full-dimensional polyhedron, see polyhe
dron
Full rank, 30, 57
Full tableau, 98
G
Gaussian elimination, 33, 363
Global minimum, 15
Gomory cutting plane algorithm, 482-484 Graph, 267-272
connected, 267, 268 directed, 268 undirected, 267
Graph coloring problem, 566-567 Graphical solution, 21-25
Greedy algorithm
for minimum spanning trees, 344, 356 Groundholding, 545
H
Halfspace, 43
Hamilton circuit, 521
Held-Karp lower bound, 502
Helly's theorem, 194
Heuristic algorithms, 480
Hirsch conjecture, 126-127
Hungarian method, 266, 320, 323, 358 Hyperplane, 43
I
Identity matrix, 28
Incidence matrix, 277, 457 truncated, 280
Independent set problem, 484 Initialization
affine scaling algorithm, 403 Dantzig-Wolfe decomposition, 250-251 negative cost cycle algorithm, 294 network flow problems, 352
network simplex algorithm, 286 potential reduction algorithm, 416-418 primal path following algorithm,
429-431
primal-dual path following algorithm,
435-437
primal simplex method, 111-119
Inner product, 27
Instance of a problem, 360-361
size, 361
Integer programming, 12, 452
mixed, 452, 524
zero-one, 452, 517, 518 Interior, 395
Interior point methods, 393-449, 537 computational aspects, 439-440,
536-537, 540-544 Intree, 333
Inverse matrix, 28 Invertible matrix, 28
J
Job shop scheduling problem, 476, 551-563, 565, 567
K
Karush-Kuhn-Tucker conditions, 421 Knapsack problem
approximation algorithms, 507-509, 530
complexity, 518, 522
dynamic programming, 491-493, 530 integer, 236
zero-one, 453
Konig-Egervary theorem, 352
L
Label correcting methods, 339-340 Labeling algorithm, 307-309, 357 Lagrange multiplier, 140, 494 Lagrangean, 140, 190
Lagrangean decomposition, 527-528 Lagrangean dual, 495
solution to, 502-507 Lagrangean relaxation, 496, 530 Leaf, 269
Length, of cycle, path, walk, 333
Leontief systems, 195, 200 Lexicographic pivoting rule, 108-111,
131-132, 137
in revised simplex, 132 in dual simplex, 160
Libraries, see optimization libraries Line, 63
Linear algebra, 26-31, 40, 137 Linear combination, 29
Linear inequalities, 165 inconsistent, 194
Linear programming, 2, 38 examples, 6-14
Linear programming relaxation, 12, 462 Linearly dependent vectors, 28
Linearly independent vectors, 28 Linearly independent constraints, 49 Local minimum, 15, 82, 131
Local search, 511-512, 530 Lot sizing problem, 475, 524
M
Marginal cost, 155-156
Marriage problem, 352
Matching problem, 470-472, 477-478
see also bipartite matching, stable matching
Matrix, 26 identity, 28
incidence, 277 inversion, 363 inverse, 28 invertible, 28 nonsingular, 28 positive definite, 364 rotation, 368, 388 square, 28
Matrix inversion lemma, 131, 138 Max-flow min-cut theorem, 310-311, 351,
Maximization problems, 3
Maximum flow problem, 273, 301-312 Maximum satisfiability, 529-530
Min-cut problem, see cut
Mean cycle cost minimization, 355, 358 Minimum spanning tree problem, 343-345,
356, 358, 466, 477 multicut formulation, 476
Modeling languages, 534-535, 567 Moment problem, 35 Multicommodity flow problem, 13 Multiperiod problems, 10-11, 189
N
NP, 518, 531
NP-complete, 519, 531
NP-hard, 518, 531, 556
NSF fellowships, 459-461, 477
Nash equilibrium, 190
Negative cost cycle algorithm, 291-301, 357
largest improvement rule, 301, 351, 357
mean cost rule, 301, 357 Network, 272
Network flow problem, 13, 551 capacitated, 273, 291 circulation, 275
complementary slackness, 314 dual, 312-313, 357
formulation, 272-278
integrality of optimal solutions,
289-290, 300
sensitivity, 313-314
shortest paths, relation to, 334
single source, 275
uncapacitated, 273, 286
with lower bounds, 276, 277
with piecewise linear convex costs, 347 see also primal-dual method
Network simplex algorithm, 278-291, 356-357, 536
anticycling, 357, 358
dual, 323-325, 354 Newton
direction, 424, 432 method, 432-433, 449 step, 422
Node, 267, 268 labeled, 307
scanned, 307 sink, 272 source, 272
Node-arc incidence matrix, 277 truncated, 280
Nonbasic variable, 55 Nonsingular matrix, 28 Null variable, 192 Nullspace, 30
Nurse scheduling, 11-12, 40
O
Objective function, 3 One-tree, 501
Operation count, 32-34 Optimal control, 20-21, 40 Optimal cost, 3
Optimal solution, 3 to dual, 215-216
Optimality conditions
for LP problems 82-87, 129, 130 for maximum flow problems, 310 for network flow problems, 298-300
Karush-Kuhn-Tucker, 421 Optimization libraries, 535-537, 567 Optimization problem, 517
Options pricing, 195
Order of magnitude, 32
Orthant, 65
Orthogonal vectors, 27
P
P, 515
Parametric programming, 217-221,
227-229 Path
augmenting, 304
directed, 269
in directed graphs, 269
in undirected graphs, 267 shortest, 333 unsaturated, 307
walk, 333
Path following algorithm, primal, 419-431 complexity, 431
initialization, 429-431
Path following algorithm, primal-dual, 431-438
complexity, 435
infeasible, 435-436
performance, 437-438
quadratic programming, 445-446 self-dual, 436-437
Path following algorithms, 395-396, 449, 542
Pattern classification, 14, 40 Perfect matching, 326, 353
see also matching problem Perturbation of constraints and degener
acy, 60, 131-132, 541
Piecewise linear convex optimization,
16-17 189, 347
Piecewise linear function, 15, 455 Pivot, 90, 158
Pivot column, 98
Pivot element, 98, 158
Pivot row, 98, 158
Pivot selection, 92-94
Pivoting rules, 92, 108-111
Polar cone, 198
Polar cone theorem, 198-199 Polyhedron, 42
containing a line, 63 full-dimensional, 365, 370, 375-377,
389
in standard form, 43, 53-58 isomorphic, 76
see also representation
Polynomial time, 33, 362, 515
Potential function, 409, 448 Potential reduction algorithm, 394,
409-419, 445, 448
complexity, 418, 442 initialization, 416-418 performance, 419
with line searches, 419, 443-444
Preemptive scheduling, 302, 357
Preflow-push methods, 266, 358 Preprocessing, 540
Price variable, 140
Primal algorithm, 157, 266
Primal problem, 141, 142
Primal-dual method, 266, 320, 321-323,
353, 357
Primal-dual path following method, see
path following algorithm
Probability consistency problem, 384-386 Problem, 360
Product of matrices, 28
Production and distribution problem, 475 Production planning, 7-10, 35, 40,
210-212, 229
Project management, 335-336 Projections of polyhedra, 70-74 Proper subset, 26
Pushing flow, 278
Q
Quadratic programming, 445-446
R
Rank, 30 Ray, 172
see also extreme ray Recession cone, 175 Recognition problem, 515, 517 Reduced cost, 84
in network flow problems, 285 Reduction (of a problem to another), 515 Redundant constraints, 57-58 Reinversion, 107
Relaxation, see linear programming relax-
ation
Relaxation algorithm, 266, 321, 358 Relaxed dual problem, 237 Representation
of bounded polyhedra, 67 of cones, 182, 198
of polyhedra, 179-183, 198
Requirement line, 122
Residual network, 295-297 Resolution theorem, 179, 198, 199 Restricted problem, 233
Revised dual simplex method, 157 Revised simplex method, 95-98,
105-107
lexicographic rule, 132
Rocket control, 21 Row
space, 30 vector, 26 zeroth, 99
Running time, 32, 362
S
Saddle point of Lagrangean, 190 Samuelson's substitution theorem, 195 Scaling
in auction algorithm, 332
in maximum flow problem, 352 in network flow problems, 358
Scanning a node, 307
Scheduling, 11-12, 302, 357, 551-563, 567
with precedence constraints, 556 Schwartz inequality, 27
Self-arc, 267
Sensitivity analysis, 201-215, 216-217
adding new equality constraint, 206-207
adding new inequality constraint, 204-206
adding new variable, 203-204
changes in a nonbasic column, 209 changes in a basic column, 210, 222-223 changes in b, 207-208, 212-215 changes in c , 208-209, 216-217
in network flow problems, 313-314
Separating hyperplane, 170 between disjoint polyhedra, 196 finding, 196
Separating hyperplane theorem, 170 Separation problem, 237, 382, 392, 555 Sequencing with setup times, 457-459, 518 Set covering problem, 456-457, 518
Set packing problem, 456-457, 518
Set partitioning problem, 456-457, 518 Setup times, 457-459, 518
Shadow price, 156
Shortest path problem, 273, 332-343
all-pairs, 333, 342-343, 355-356, 358 all-to-one, 333
relation to network flow problem, 333
Side constraints, 197, 526-527 Simplex, 120, 137
Simplex method, 90-91
average case behavior, 127-128, 138 column geometry, 119-123 computational efficiency, 124-128 dual, see dual simplex method
for degenerate problems, 92
for networks, see network simplex full tableau implementation, 98-105,
105-107
history, 38
implementations, 94-108 initialization, 111-119
naive implementation, 94-95 performance, 536-537, 54-541 revised, see revised simplex method termination, 91, 110
two-phase, 116-117
unbounded problems, 179
with upper bound constraints, 135
Simplex multipliers, 94, 161 Simplex tableau, 98
Simulated annealing, 512-514, 531 Size of an instance, 361
Total unimodularity, 357
Tour, 383, 469
Tournament problem, 347
Transformation (of a problem to another),
516
Transportation problem, 273, 274-275, 358
degeneracy, 349 Transpose, 27
Transshipment problem, 266 Traveling salesman problem, directed,
478, 518
branch and bound, 488-489
dynamic programming, 490
integer programming formulation, 477
Traveling salesman problem, undirected, 478, 565, 518, 526
approximation algorithm, 509-510, 528 integer programming formulation,
469-470, 476
local search, 511-512, 530
lower bound, 383-384, 501-502
with triangle inequality, 509-510, 521,
528 Tree, 269
Slack variable, 6, 76
Sliding objective ellipsoid method, 379, 389 Smallest subscript rule, 94, 111, 137
Span, of a set of vectors, 29
Spanning path, 124
Spanning tree, 271-272
see also minimum spanning trees Sparsity, 107, 108, 440, 536, 537 Square matrix, 28
Stable matching problem, 563, 567 Standard form, 4-5
of shortest paths, 333
reduction to, 5-6
visualization, 25
Steepest edge rule, 94, 540-543
Steiner tree problem, 391
Stochastic matrices, 194
Stochastic programming, 254-260, 264,
564
Strong duality, 148, 184
Strong formulations, 461-465 Strongly polynomial, 357 Subdifferential, 503
Subgradient, 215, 503, 504, 526 Subgradient algorithm, 505-506, 530 Submodular function minimization,
391-392 Subspace, 29
Subtour elimination
in the minimum spanning tree problem,
466
in the traveling salesman problem, 470
Supply, 272
Surplus variable, 6
Survivable network design problem, 391,
528-529
T
Theorems of the alternative, 166, 194
feasible, 280 Typography, 524
U
Unbounded cost, 3 Unbounded problem, 3
characterization, 177-179 Unique solution, 129, 130
to dual, 152, 190-191
Unit vector, 27
Unrestricted variable, see free variable
V
see also spanning tree Tree solution, 280
Vector, 26
Vehicle routing problem, 475 Vertex, 47, 50
see also basic feasible solution Volume, 364
of a simplex, 390
von Neumann algorithm, 446-448, 449 Vulnerability, 352
W
Walk
directed, 269
in directed graphs, 268
in undirected graphs, 267 Weak duality, 146, 184, 495
Weierstrass' theorem, 170, 199
Worst-case running time, 362
Z
Zeroth column, 98 Zeroth row, 99