
AN INTRODUCTION TO OPTIMIZATION
SOLUTIONS MANUAL
Fourth Edition
Edwin K. P. Chong and Stanislaw H. Żak
A JOHN WILEY & SONS, INC., PUBLICATION

1. Methods of Proof and Some Notation
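1.1
A  B | A ⇒ B | (not B) ⇒ (not A)
F  F |   T   |   T
F  T |   T   |   T
T  F |   F   |   F
T  T |   T   |   T
1.2
A  B | A ⇒ B | not (A and (not B))
F  F |   T   |   T
F  T |   T   |   T
T  F |   F   |   F
T  T |   T   |   T
1.3
A  B | not (A and B) | not A | not B | (not A) or (not B)
F  F |       T       |   T   |   T   |   T
F  T |       T       |   T   |   F   |   T
T  F |       T       |   F   |   T   |   T
T  T |       F       |   F   |   F   |   F
1.4
A  B | A and B | A and (not B) | (A and B) or (A and (not B))
F  F |    F    |       F       |   F
F  T |    F    |       F       |   F
T  F |    F    |       T       |   T
T  T |    T    |       F       |   T
1.5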
The cards that you should turn over are 3 and A. The remaining cards are irrelevant to ascertaining the truth or falsity of the rule. The card with S is irrelevant because S is not a vowel. The card with 8 is not relevant because the rule does not say that if a card has an even number on one side, then it has a vowel on the other side.
Turning over the A card directly verifies the rule, while turning over the 3 card verifies the contraposition.

2. Vector Spaces and Matrices
2.1
We show this by contradiction. Suppose $n < m$. Then, the number of columns of $A$ is $n$. Since $\operatorname{rank} A$ is the maximum number of linearly independent columns of $A$, $\operatorname{rank} A$ cannot be greater than $n < m$, which contradicts the assumption that $\operatorname{rank} A = m$.
2.2
$\Rightarrow$: Since there exists a solution, then by Theorem 2.1, $\operatorname{rank} A = \operatorname{rank}[A, b]$. So, it remains to prove that $\operatorname{rank} A = n$. For this, suppose that $\operatorname{rank} A < n$ (note that it is impossible for $\operatorname{rank} A > n$ since $A$ has only $n$ columns). Hence, there exists $y \in \mathbb{R}^n$, $y \neq 0$, such that $Ay = 0$ (this is because the columns of $A$ are linearly dependent, and $Ay$ is a linear combination of the columns of $A$). Let $x$ be a solution to $Ax = b$. Then clearly $x + y \neq x$ is also a solution. This contradicts the uniqueness of the solution. Hence, $\operatorname{rank} A = n$.
$\Leftarrow$: By Theorem 2.1, a solution exists. It remains to prove that it is unique. For this, let $x$ and $y$ be solutions, i.e., $Ax = b$ and $Ay = b$. Subtracting, we get $A(x - y) = 0$. Since $\operatorname{rank} A = n$ and $A$ has $n$ columns, then $x - y = 0$ and hence $x = y$, which shows that the solution is unique.
2.3
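Consider the vectors $\bar{a}_i = [1, a_i^\top]^\top \in \mathbb{R}^{n+1}$, $i = 1,\ldots,k$. Since $k \geq n + 2$, the vectors $\bar{a}_1,\ldots,\bar{a}_k$ must be linearly dependent in $\mathbb{R}^{n+1}$. Hence, there exist $\alpha_1,\ldots,\alpha_k$, not all zero, such that
$$\sum_{i=1}^{k} \alpha_i \bar{a}_i = 0.$$
The first component of the above vector equation is $\sum_{i=1}^{k} \alpha_i = 0$, while the last $n$ components have the form $\sum_{i=1}^{k} \alpha_i a_i = 0$, completing the proof.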
2.4
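a. We first postmultiply $M$ by the matrix
$$\begin{bmatrix} I_k & O \\ -M_{m-k,k} & I_{m-k} \end{bmatrix}$$
to obtain
$$\begin{bmatrix} M_{m-k,k} & I_{m-k} \\ M_{k,k} & O \end{bmatrix}\begin{bmatrix} I_k & O \\ -M_{m-k,k} & I_{m-k} \end{bmatrix} = \begin{bmatrix} O & I_{m-k} \\ M_{k,k} & O \end{bmatrix}.$$
Note that the determinant of the postmultiplying matrix is 1. Next we postmultiply the resulting product by
$$\begin{bmatrix} O & I_k \\ I_{m-k} & O \end{bmatrix}$$
to obtain
$$\begin{bmatrix} O & I_{m-k} \\ M_{k,k} & O \end{bmatrix}\begin{bmatrix} O & I_k \\ I_{m-k} & O \end{bmatrix} = \begin{bmatrix} I_{m-k} & O \\ O & M_{k,k} \end{bmatrix}.$$
Notice that
$$\det M \,\det\begin{bmatrix} O & I_k \\ I_{m-k} & O \end{bmatrix} = \det\begin{bmatrix} I_{m-k} & O \\ O & M_{k,k} \end{bmatrix},$$
where
$$\det\begin{bmatrix} O & I_k \\ I_{m-k} & O \end{bmatrix} = \pm 1.$$
The above easily follows from the fact that the determinant changes its sign if we interchange columns, as discussed in Section 2.2. Moreover,
$$\det\begin{bmatrix} I_{m-k} & O \\ O & M_{k,k} \end{bmatrix} = \det(I_{m-k})\det(M_{k,k}) = \det(M_{k,k}).$$
Hence,
$$\det M = \pm \det M_{k,k}.$$
b. We can see this on the following examples. We assume, without loss of generality, that $M_{m-k,k} = O$ and let $M_{k,k} = 2$. Thus $k = 1$. First consider the case when $m = 2$. Then we have
$$M = \begin{bmatrix} O & I_{m-k} \\ M_{k,k} & O \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 2 & 0 \end{bmatrix}.$$
Thus,
$$\det M = -2 = \det(-M_{k,k}).$$
Next consider the case when $m = 3$. Then
$$\det\begin{bmatrix} O & I_{m-k} \\ M_{k,k} & O \end{bmatrix} = \det\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 2 & 0 & 0 \end{bmatrix} = 2 \neq \det(-M_{k,k}).$$
Therefore, in general,
$$\det M \neq \det(-M_{k,k}).$$
However, when $k = m/2$, that is, when all submatrices are square and of the same dimension, then it is true that
$$\det M = \det(-M_{k,k}).$$
See [121].
2.5
Let
$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$
and suppose that each block is $k \times k$. John R. Silvester [121] showed that if at least one of the blocks is equal to $O$ (zero matrix), then the desired formula holds. Indeed, if a row or column block is zero, then the determinant is equal to zero, as follows from the properties of determinants discussed in Section 2.2. That is, if $A = B = O$, or $A = C = O$, and so on, then obviously $\det M = 0$. This includes the case when any three or all four block matrices are zero matrices.
If $B = O$ or $C = O$ then
$$\det M = \det\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \det(AD).$$
The only case left to analyze is when $A = O$ or $D = O$. We will show that in either case,
$$\det M = \det(-BC).$$
Without loss of generality suppose that $D = O$. Following arguments of John R. Silvester [121], we premultiply $M$ by the product of three matrices whose determinants are unity:
$$\begin{bmatrix} I_k & -I_k \\ O & I_k \end{bmatrix}\begin{bmatrix} I_k & O \\ I_k & I_k \end{bmatrix}\begin{bmatrix} I_k & -I_k \\ O & I_k \end{bmatrix}\begin{bmatrix} A & B \\ C & O \end{bmatrix} = \begin{bmatrix} -C & O \\ A & B \end{bmatrix}.$$
Hence,
$$\det\begin{bmatrix} A & B \\ C & O \end{bmatrix} = \det\begin{bmatrix} -C & O \\ A & B \end{bmatrix} = \det(-C)\det B = \det(-I_k)\det C \det B.$$
Thus we have
$$\det\begin{bmatrix} A & B \\ C & O \end{bmatrix} = \det(-BC) = \det(-CB).$$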
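The block-determinant identity of Exercise 2.5 can be spot-checked numerically. The following is a minimal sketch, not part of the original solution: the $2 \times 2$ blocks `A`, `B`, `C` are arbitrary illustrative values, and `det` and `matmul` are hypothetical helper names written here with exact rational arithmetic.

```python
from fractions import Fraction


def det(m):
    """Determinant by Gaussian elimination with exact Fraction arithmetic."""
    a = [[Fraction(v) for v in row] for row in m]
    n = len(a)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # a row interchange flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


# Illustrative 2x2 blocks (assumed values); D is the zero block.
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 2]]
C = [[2, 1], [1, 3]]
O = [[0, 0], [0, 0]]

# Assemble M = [A B; C O] row by row.
M = [A[0] + B[0], A[1] + B[1], C[0] + O[0], C[1] + O[1]]
minus_BC = [[-v for v in row] for row in matmul(B, C)]

lhs = det(M)
rhs = det(minus_BC)
print(lhs, rhs)
```

Both determinants agree for these blocks, as the identity $\det M = \det(-BC)$ predicts when $D = O$.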

2.6
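We represent the given system of equations in the form $Ax = b$, where
$$A = \begin{bmatrix} 1 & 1 & 2 & 1 \\ 1 & -2 & 0 & -1 \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}, \quad \text{and} \quad b = \begin{bmatrix} 1 \\ -2 \end{bmatrix}.$$
Using elementary row operations yields
$$A = \begin{bmatrix} 1 & 1 & 2 & 1 \\ 1 & -2 & 0 & -1 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 2 & 1 \\ 0 & -3 & -2 & -2 \end{bmatrix},$$
and
$$[A, b] = \begin{bmatrix} 1 & 1 & 2 & 1 & 1 \\ 1 & -2 & 0 & -1 & -2 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 2 & 1 & 1 \\ 0 & -3 & -2 & -2 & -3 \end{bmatrix},$$
from which $\operatorname{rank} A = 2$ and $\operatorname{rank}[A, b] = 2$. Therefore, by Theorem 2.1, the system has a solution. We next represent the system of equations as
$$\begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 - 2x_3 - x_4 \\ -2 + x_4 \end{bmatrix}.$$
Assigning arbitrary values to $x_3$ and $x_4$ ($x_3 = d_3$, $x_4 = d_4$), we get
$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}^{-1}\begin{bmatrix} 1 - 2d_3 - d_4 \\ -2 + d_4 \end{bmatrix} = -\frac{1}{3}\begin{bmatrix} -2 & -1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 1 - 2d_3 - d_4 \\ -2 + d_4 \end{bmatrix} = \begin{bmatrix} -\frac{4}{3}d_3 - \frac{1}{3}d_4 \\ 1 - \frac{2}{3}d_3 - \frac{2}{3}d_4 \end{bmatrix}.$$
Therefore, a general solution is
$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} -\frac{4}{3}d_3 - \frac{1}{3}d_4 \\ 1 - \frac{2}{3}d_3 - \frac{2}{3}d_4 \\ d_3 \\ d_4 \end{bmatrix} = \begin{bmatrix} -\frac{4}{3} \\ -\frac{2}{3} \\ 1 \\ 0 \end{bmatrix} d_3 + \begin{bmatrix} -\frac{1}{3} \\ -\frac{2}{3} \\ 0 \\ 1 \end{bmatrix} d_4 + \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix},$$
where $d_3$ and $d_4$ are arbitrary values.
2.7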
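1. Apply the definition of $|-a|$:
$$|-a| = \begin{cases} -a & \text{if } -a > 0 \\ 0 & \text{if } -a = 0 \\ -(-a) & \text{if } -a < 0 \end{cases} = \begin{cases} -a & \text{if } a < 0 \\ 0 & \text{if } a = 0 \\ a & \text{if } a > 0 \end{cases} = |a|.$$
2. If $a \geq 0$, then $|a| = a$. If $a < 0$, then $|a| = -a > 0 > a$. Hence $|a| \geq a$. On the other hand, $|-a| \geq -a$ by the above. Hence, $a \geq -|-a| = -|a|$ by property 1.
3. We have four cases to consider. First, if $a, b \geq 0$, then $a + b \geq 0$. Hence, $|a + b| = a + b = |a| + |b|$. Second, if $a, b \leq 0$, then $a + b \leq 0$. Hence $|a + b| = -(a + b) = -a - b = |a| + |b|$.
Third, if $a \geq 0$ and $b \leq 0$, then we have two further subcases:
1. If $a + b \geq 0$, then $|a + b| = a + b \leq |a| + |b|$.
2. If $a + b \leq 0$, then $|a + b| = -a - b \leq |a| + |b|$.
The fourth case, $a \leq 0$ and $b \geq 0$, is identical to the third case, with $a$ and $b$ interchanged.
4. We first show $|a - b| \leq |a| + |b|$. We have
$$|a - b| = |a + (-b)| \leq |a| + |-b| \ \text{(by property 3)} = |a| + |b| \ \text{(by property 1)}.$$
To show $\big||a| - |b|\big| \leq |a - b|$, we note that $|a| = |a - b + b| \leq |a - b| + |b|$, which implies $|a| - |b| \leq |a - b|$. On the other hand, from the above we have $|b| - |a| \leq |b - a| = |a - b|$ by property 1. Therefore, $\big||a| - |b|\big| \leq |a - b|$.
5. We have four cases. First, if $a, b \geq 0$, we have $ab \geq 0$ and hence $|ab| = ab = |a||b|$. Second, if $a, b \leq 0$, we have $ab \geq 0$ and hence $|ab| = ab = (-a)(-b) = |a||b|$. Third, if $a \leq 0$, $b \geq 0$, we have $ab \leq 0$ and hence $|ab| = -ab = (-a)b = |a||b|$. The fourth case, $a \geq 0$ and $b \leq 0$, is identical to the third case, with $a$ and $b$ interchanged.
6. We have
$$|a + b| \leq |a| + |b| \ \text{(by property 3)} \leq c + d.$$
7. $\Rightarrow$: By property 2, $-a \leq |a|$ and $a \leq |a|$. Therefore, $|a| < b$ implies $-a \leq |a| < b$ and $a \leq |a| < b$.
$\Leftarrow$: If $a \geq 0$, then $|a| = a < b$. If $a < 0$, then $|a| = -a < b$.
For the case when "$<$" is replaced by "$\leq$", we simply repeat the above proof with "$<$" replaced by "$\leq$".
8. This is simply the negation of property 7 (apply DeMorgan's Law).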
2.8
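Observe that we can represent $\langle x, y \rangle_2$ as
$$\langle x, y \rangle_2 = x^\top \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix} y = (Qx)^\top (Qy) = x^\top Q^2 y,$$
where
$$Q = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}.$$
Note that the matrix $Q = Q^\top$ is nonsingular.
1. Now, $\langle x, x \rangle_2 = (Qx)^\top (Qx) = \|Qx\|^2 \geq 0$, and
$$\langle x, x \rangle_2 = 0 \iff \|Qx\|^2 = 0 \iff Qx = 0 \iff x = 0$$
since $Q$ is nonsingular.
2. $\langle x, y \rangle_2 = (Qx)^\top (Qy) = (Qy)^\top (Qx) = \langle y, x \rangle_2$.
3. We have
$$\langle x + y, z \rangle_2 = (x + y)^\top Q^2 z = x^\top Q^2 z + y^\top Q^2 z = \langle x, z \rangle_2 + \langle y, z \rangle_2.$$
4. $\langle rx, y \rangle_2 = (rx)^\top Q^2 y = r x^\top Q^2 y = r\langle x, y \rangle_2$.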
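The four inner-product properties of Exercise 2.8 can be spot-checked on sample vectors. This is a minimal sketch under stated assumptions: the test vectors `x`, `y`, `z` and the scalar `r` are arbitrary illustrative values, and `matvec`, `dot`, and `inner2` are hypothetical helper names. A numerical check on a few vectors illustrates the properties; it does not replace the proof.

```python
Q = [[1, 1], [1, 2]]  # the symmetric factor with Q*Q = [[2, 3], [3, 5]]


def matvec(m, v):
    """Matrix-vector product for lists of rows."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def inner2(x, y):
    """<x, y>_2 computed as (Qx)^T (Qy)."""
    return dot(matvec(Q, x), matvec(Q, y))


# Q squared, to confirm it reproduces the matrix in the quadratic form.
Q_squared = [[dot(Q[i], [Q[k][j] for k in range(2)]) for j in range(2)]
             for i in range(2)]

# Arbitrary integer test data (assumed for illustration).
x, y, z, r = [1, -2], [3, 4], [-1, 5], 7

checks = {
    "symmetry": inner2(x, y) == inner2(y, x),
    "additivity": inner2([x[0] + y[0], x[1] + y[1]], z)
                  == inner2(x, z) + inner2(y, z),
    "homogeneity": inner2([r * x[0], r * x[1]], y) == r * inner2(x, y),
    "positivity": inner2(x, x) > 0,
}
print(Q_squared, checks)
```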
2.9
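We have $\|x\| = \|(x - y) + y\| \leq \|x - y\| + \|y\|$ by the Triangle Inequality. Hence, $\|x\| - \|y\| \leq \|x - y\|$. On the other hand, from the above we have $\|y\| - \|x\| \leq \|y - x\| = \|x - y\|$. Combining the two inequalities, we obtain $\big|\|x\| - \|y\|\big| \leq \|x - y\|$.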
2.10
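Let $\epsilon > 0$ be given. Set $\delta = \epsilon$. Hence, if $\|x - y\| < \delta$, then by Exercise 2.9, $\big|\|x\| - \|y\|\big| \leq \|x - y\| < \delta = \epsilon$.

3. Transformations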
3.1
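Let $v$ be the vector such that $x$ are the coordinates of $v$ with respect to $\{e_1, e_2, \ldots, e_n\}$, and $x'$ are the coordinates of $v$ with respect to $\{e'_1, e'_2, \ldots, e'_n\}$. Then,
$$v = x_1 e_1 + \cdots + x_n e_n = [e_1, \ldots, e_n]x,$$
and
$$v = x'_1 e'_1 + \cdots + x'_n e'_n = [e'_1, \ldots, e'_n]x'.$$
Hence,
$$[e_1, \ldots, e_n]x = [e'_1, \ldots, e'_n]x',$$
which implies
$$x' = [e'_1, \ldots, e'_n]^{-1}[e_1, \ldots, e_n]x = Tx.$$
3.2
a. We have
$$[e'_1, e'_2, e'_3] = [e_1, e_2, e_3]\begin{bmatrix} 1 & 2 & 4 \\ 3 & -1 & 5 \\ -4 & 5 & 3 \end{bmatrix}.$$
Therefore,
$$T = [e'_1, e'_2, e'_3]^{-1}[e_1, e_2, e_3] = \begin{bmatrix} 1 & 2 & 4 \\ 3 & -1 & 5 \\ -4 & 5 & 3 \end{bmatrix}^{-1} = \frac{1}{42}\begin{bmatrix} 28 & -14 & -14 \\ 29 & -19 & -7 \\ -11 & 13 & 7 \end{bmatrix}.$$
b. We have
$$[e_1, e_2, e_3] = [e'_1, e'_2, e'_3]\begin{bmatrix} 1 & 2 & 3 \\ 1 & -1 & 0 \\ 3 & 4 & 5 \end{bmatrix}.$$
Therefore,
$$T = \begin{bmatrix} 1 & 2 & 3 \\ 1 & -1 & 0 \\ 3 & 4 & 5 \end{bmatrix}.$$
3.3
We have
$$[e_1, e_2, e_3] = [e'_1, e'_2, e'_3]\begin{bmatrix} 2 & 2 & 3 \\ 1 & -1 & 0 \\ -1 & 2 & 1 \end{bmatrix}.$$
Therefore, the transformation matrix from $\{e'_1, e'_2, e'_3\}$ to $\{e_1, e_2, e_3\}$ is
$$T = \begin{bmatrix} 2 & 2 & 3 \\ 1 & -1 & 0 \\ -1 & 2 & 1 \end{bmatrix}.$$
Now, consider a linear transformation $L : \mathbb{R}^3 \to \mathbb{R}^3$, and let $A$ be its representation with respect to $\{e_1, e_2, e_3\}$, and $B$ its representation with respect to $\{e'_1, e'_2, e'_3\}$. Let $y = Ax$ and $y' = Bx'$. Then,
$$y' = Ty = T(Ax) = TA(T^{-1}x') = (TAT^{-1})x'.$$
Hence, the representation of the linear transformation with respect to $\{e'_1, e'_2, e'_3\}$ is
$$B = TAT^{-1} = \begin{bmatrix} 3 & -10 & -8 \\ -1 & 8 & 4 \\ 2 & -13 & -7 \end{bmatrix}.$$
3.4
We have
$$[e'_1, e'_2, e'_3, e'_4] = [e_1, e_2, e_3, e_4]\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
Therefore, the transformation matrix from $\{e_1, e_2, e_3, e_4\}$ to $\{e'_1, e'_2, e'_3, e'_4\}$ is
$$T = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
Now, consider a linear transformation $L : \mathbb{R}^4 \to \mathbb{R}^4$, and let $A$ be its representation with respect to $\{e_1, e_2, e_3, e_4\}$, and $B$ its representation with respect to $\{e'_1, e'_2, e'_3, e'_4\}$. Let $y = Ax$ and $y' = Bx'$. Then,
$$y' = Ty = T(Ax) = TA(T^{-1}x') = (TAT^{-1})x'.$$
Therefore,
$$B = TAT^{-1} = \begin{bmatrix} 5 & 3 & 4 & 3 \\ -3 & -2 & -1 & -2 \\ -1 & 0 & -1 & -2 \\ 1 & 1 & 1 & 4 \end{bmatrix}.$$
3.5
Let $\{v_1, v_2, v_3, v_4\}$ be a set of linearly independent eigenvectors of $A$ corresponding to the eigenvalues $\lambda_1$, $\lambda_2$, $\lambda_3$, and $\lambda_4$. Let $T = [v_1, v_2, v_3, v_4]$. Then,
$$AT = A[v_1, v_2, v_3, v_4] = [Av_1, Av_2, Av_3, Av_4] = [\lambda_1 v_1, \lambda_2 v_2, \lambda_3 v_3, \lambda_4 v_4] = [v_1, v_2, v_3, v_4]\begin{bmatrix} \lambda_1 & 0 & 0 & 0 \\ 0 & \lambda_2 & 0 & 0 \\ 0 & 0 & \lambda_3 & 0 \\ 0 & 0 & 0 & \lambda_4 \end{bmatrix}.$$
Hence,
$$AT = T\begin{bmatrix} \lambda_1 & 0 & 0 & 0 \\ 0 & \lambda_2 & 0 & 0 \\ 0 & 0 & \lambda_3 & 0 \\ 0 & 0 & 0 & \lambda_4 \end{bmatrix},$$
or
$$T^{-1}AT = \begin{bmatrix} \lambda_1 & 0 & 0 & 0 \\ 0 & \lambda_2 & 0 & 0 \\ 0 & 0 & \lambda_3 & 0 \\ 0 & 0 & 0 & \lambda_4 \end{bmatrix}.$$
Therefore, the linear transformation has a diagonal matrix form with respect to the basis formed by a linearly independent set of eigenvectors.
Because
$$\det(\lambda I - A) = (\lambda - 2)(\lambda - 3)(\lambda - 1)(\lambda + 1),$$
the eigenvalues are $\lambda_1 = 2$, $\lambda_2 = 3$, $\lambda_3 = 1$, and $\lambda_4 = -1$.
From $Av_i = \lambda_i v_i$, where $v_i \neq 0$ ($i = 1, 2, 3, 4$), the corresponding eigenvectors are
$$v_1 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 0 \\ -2 \\ 9 \\ 1 \end{bmatrix}, \quad \text{and} \quad v_4 = \begin{bmatrix} 24 \\ -12 \\ 1 \\ 9 \end{bmatrix}.$$
Therefore, the basis we are interested in is
$$\{v_1, v_2, v_3, v_4\} = \left\{ \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ -2 \\ 9 \\ 1 \end{bmatrix}, \begin{bmatrix} 24 \\ -12 \\ 1 \\ 9 \end{bmatrix} \right\}.$$
3.6
Suppose $v_1, \ldots, v_n$ are eigenvectors of $A$ corresponding to $\lambda_1, \ldots, \lambda_n$, respectively. Then, for each $i = 1, \ldots, n$, we have
$$(I_n - A)v_i = v_i - Av_i = v_i - \lambda_i v_i = (1 - \lambda_i)v_i,$$
which shows that $1 - \lambda_1, \ldots, 1 - \lambda_n$ are the eigenvalues of $I_n - A$.
Alternatively, we may write the characteristic polynomial of $I_n - A$ as
$$\pi_{I_n - A}(1 - \lambda) = \det\big((1 - \lambda)I_n - [I_n - A]\big) = \det(-[\lambda I_n - A]) = (-1)^n \pi_A(\lambda),$$
which shows the desired result.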
3.7
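Let $x, y \in V^\perp$, and $\alpha, \beta \in \mathbb{R}$. To show that $V^\perp$ is a subspace, we need to show that $\alpha x + \beta y \in V^\perp$. For this, let $v$ be any vector in $V$. Then,
$$v^\top(\alpha x + \beta y) = \alpha v^\top x + \beta v^\top y = 0,$$
since $v^\top x = v^\top y = 0$ by definition.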
3.8
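The null space of $A$ is $N(A) = \{x \in \mathbb{R}^3 : Ax = 0\}$. Using elementary row operations and back-substitution, we can solve the system of equations:
$$\begin{bmatrix} 4 & -2 & 0 \\ 2 & 1 & -1 \\ 2 & -3 & 1 \end{bmatrix} \to \begin{bmatrix} 4 & -2 & 0 \\ 0 & 2 & -1 \\ 0 & -2 & 1 \end{bmatrix} \to \begin{bmatrix} 4 & -2 & 0 \\ 0 & 2 & -1 \\ 0 & 0 & 0 \end{bmatrix} \implies \begin{aligned} 4x_1 - 2x_2 &= 0 \\ 2x_2 - x_3 &= 0 \end{aligned}$$
$$\implies x_2 = \frac{1}{2}x_3, \quad x_1 = \frac{1}{2}x_2 = \frac{1}{4}x_3 \implies x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} \frac{1}{4} \\ \frac{1}{2} \\ 1 \end{bmatrix} x_3.$$
Therefore,
$$N(A) = \left\{ \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix} c : c \in \mathbb{R} \right\}.$$
3.9
Let $x, y \in R(A)$, and $\alpha, \beta \in \mathbb{R}$. Then, there exist $v, u$ such that $x = Av$ and $y = Au$. Thus,
$$\alpha x + \beta y = \alpha Av + \beta Au = A(\alpha v + \beta u).$$
Hence, $\alpha x + \beta y \in R(A)$, which shows that $R(A)$ is a subspace.
Let $x, y \in N(A)$, and $\alpha, \beta \in \mathbb{R}$. Then, $Ax = 0$ and $Ay = 0$. Thus,
$$A(\alpha x + \beta y) = \alpha Ax + \beta Ay = 0.$$
Hence, $\alpha x + \beta y \in N(A)$, which shows that $N(A)$ is a subspace.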
3.10
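Let $v \in R(B)$, i.e., $v = Bx$ for some $x$. Consider the matrix $[A \ v]$. Then, $N(A^\top) = N([A \ v]^\top)$, since if $u \in N(A^\top)$, then $u \in N(B^\top)$ by assumption, and hence $u^\top v = u^\top Bx = x^\top B^\top u = 0$. Now,
$$\dim R(A) + \dim N(A^\top) = m$$
and
$$\dim R([A \ v]) + \dim N([A \ v]^\top) = m.$$
Since $\dim N(A^\top) = \dim N([A \ v]^\top)$, we have $\dim R(A) = \dim R([A \ v])$. Hence, $v$ is a linear combination of the columns of $A$, i.e., $v \in R(A)$, which completes the proof.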
3.11
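We first show $V \subset (V^\perp)^\perp$. Let $v \in V$, and $u$ any element of $V^\perp$. Then $u^\top v = v^\top u = 0$. Therefore, $v \in (V^\perp)^\perp$.
We now show $(V^\perp)^\perp \subset V$. Let $\{a_1, \ldots, a_k\}$ be a basis for $V$, and $\{b_1, \ldots, b_l\}$ a basis for $(V^\perp)^\perp$. Define $A = [a_1 \cdots a_k]$ and $B = [b_1 \cdots b_l]$, so that $V = R(A)$ and $(V^\perp)^\perp = R(B)$. Hence, it remains to show that $R(B) \subset R(A)$. Using the result of Exercise 3.10, it suffices to show that $N(A^\top) \subset N(B^\top)$. So let $x \in N(A^\top)$, which implies that $x \in R(A)^\perp = V^\perp$, since $R(A)^\perp = N(A^\top)$. Hence, for all $y$, we have $(By)^\top x = 0 = y^\top B^\top x$, which implies that $B^\top x = 0$. Therefore, $x \in N(B^\top)$, which completes the proof.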
3.12
Let $w \in W^\perp$, and $y$ be any element of $V$. Since $V \subset W$, then $y \in W$. Therefore, by definition of $w$, we have $w^\top y = 0$. Therefore, $w \in V^\perp$.
3.13
Let $r = \dim V$. Let $v_1, \ldots, v_r$ be a basis for $V$, and $V$ the matrix whose $i$th column is $v_i$. Then, clearly $V = R(V)$.
Let $u_1, \ldots, u_{n-r}$ be a basis for $V^\perp$, and $U$ the matrix whose $i$th row is $u_i^\top$. Then, $V^\perp = R(U^\top)$, and $V = (V^\perp)^\perp = R(U^\top)^\perp = N(U)$ (by Exercise 3.11 and Theorem 3.4).
3.14
a. Let $x \in V$. Then, $x = Px + (I - P)x$. Note that $Px \in V$, and $(I - P)x \in V^\perp$. Therefore, $x = Px + (I - P)x$ is an orthogonal decomposition of $x$ with respect to $V$. However, $x = x + 0$ is also an orthogonal decomposition of $x$ with respect to $V$. Since the orthogonal decomposition is unique, we must have $x = Px$.
b. Suppose $P$ is an orthogonal projector onto $V$. Clearly, $R(P) \subset V$ by definition. However, from part a, $x = Px$ for all $x \in V$, and hence $V \subset R(P)$. Therefore, $R(P) = V$.
3.15
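To answer the question, we have to represent the quadratic form with a symmetric matrix as
$$x^\top \begin{bmatrix} 1 & -8 \\ 1 & 1 \end{bmatrix} x = x^\top \left( \frac{1}{2}\begin{bmatrix} 1 & -8 \\ 1 & 1 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -8 & 1 \end{bmatrix} \right) x = x^\top \begin{bmatrix} 1 & -7/2 \\ -7/2 & 1 \end{bmatrix} x.$$
The leading principal minors are $\Delta_1 = 1$ and $\Delta_2 = -45/4$. Therefore, the quadratic form is indefinite.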
3.16
The leading principal minors are $\Delta_1 = 2$, $\Delta_2 = 0$, $\Delta_3 = 0$, which are all nonnegative. However, the eigenvalues of $A$ are $0$, $-1.4641$, $5.4641$ (for example, use Matlab to quickly check this). This implies that the matrix $A$ is indefinite (by Theorem 3.7). An alternative way to show that $A$ is not positive semidefinite is to find a vector $x$ such that $x^\top A x < 0$. So, let $x$ be an eigenvector of $A$ corresponding to its negative eigenvalue $\lambda = -1.4641$. Then, $x^\top A x = x^\top(\lambda x) = \lambda x^\top x = \lambda\|x\|^2 < 0$. For this example, we can take $x = [0.3251, 0.3251, -0.8881]^\top$, for which we can verify that $x^\top A x = -1.4643$.
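The value of $x^\top A x$ quoted in Exercise 3.16 is easy to reproduce. The sketch below is illustrative only: the matrix `A` does not appear in this solution text, so the entries used here are an assumption, chosen to be consistent with the stated leading principal minors (2, 0, 0) and eigenvalues (0, -1.4641, 5.4641). `quad_form` is a hypothetical helper name.

```python
# Assumed matrix (not given in the solution text); it is consistent with the
# stated leading principal minors 2, 0, 0 and eigenvalues 0, -1.4641, 5.4641.
A = [[2, 2, 2],
     [2, 2, 2],
     [2, 2, 0]]

# Eigenvector for the negative eigenvalue, as quoted in the solution.
x = [0.3251, 0.3251, -0.8881]


def quad_form(m, v):
    """Evaluate the quadratic form v^T m v."""
    n = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))


value = quad_form(A, x)
print(round(value, 4))
```

A negative value confirms that the quadratic form is not positive semidefinite even though all leading principal minors are nonnegative.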
3.17
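a. The matrix $Q$ is indefinite, since $\Delta_2 = -1$ and $\Delta_3 = 2$.
b. Let $x \in M$. Then, $x_2 + x_3 = -x_1$, $x_1 + x_3 = -x_2$, and $x_1 + x_2 = -x_3$. Therefore,
$$x^\top Q x = x_1(x_2 + x_3) + x_2(x_1 + x_3) + x_3(x_1 + x_2) = -(x_1^2 + x_2^2 + x_3^2).$$
This implies that the matrix $Q$ is negative definite on the subspace $M$.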
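The identity $x^\top Q x = -\|x\|^2$ on $M$ can be checked on a few sample points. This is a minimal sketch: the matrix `Q` below is the one implied by the expansion $x_1(x_2 + x_3) + x_2(x_1 + x_3) + x_3(x_1 + x_2)$, the subspace is taken to be $M = \{x : x_1 + x_2 + x_3 = 0\}$ as the relations in part b indicate, and the sample vectors are arbitrary assumptions.

```python
# Matrix implied by the expansion of the quadratic form in part b.
Q = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]


def quad_form(m, v):
    """Evaluate v^T m v."""
    return sum(v[i] * m[i][j] * v[j] for i in range(3) for j in range(3))


# Sample vectors whose components sum to zero (arbitrary choices in M).
samples = [[1, -1, 0], [2, 3, -5], [-4, 1, 3], [1, 1, -2]]

# Pair each quadratic-form value with -||x||^2 for comparison.
results = [(quad_form(Q, x), -sum(c * c for c in x)) for x in samples]
print(results)
```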
3.18
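a. We have
$$f(x_1, x_2, x_3) = x_2^2 = [x_1, x_2, x_3]\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.$$
Then,
$$Q = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
and the eigenvalues of $Q$ are $\lambda_1 = 0$, $\lambda_2 = 1$, and $\lambda_3 = 0$. Therefore, the quadratic form is positive semidefinite.
b. We have
$$f(x_1, x_2, x_3) = x_1^2 + 2x_2^2 - x_1 x_3 = [x_1, x_2, x_3]\begin{bmatrix} 1 & 0 & -\frac{1}{2} \\ 0 & 2 & 0 \\ -\frac{1}{2} & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.$$
Then,
$$Q = \begin{bmatrix} 1 & 0 & -\frac{1}{2} \\ 0 & 2 & 0 \\ -\frac{1}{2} & 0 & 0 \end{bmatrix}$$
and the eigenvalues of $Q$ are $\lambda_1 = 2$, $\lambda_2 = (1 - \sqrt{2})/2$, and $\lambda_3 = (1 + \sqrt{2})/2$. Therefore, the quadratic form is indefinite.
c. We have
$$f(x_1, x_2, x_3) = x_1^2 + x_3^2 + 2x_1 x_2 + 2x_1 x_3 + 2x_2 x_3 = [x_1, x_2, x_3]\begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.$$
Then,
$$Q = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$
and the eigenvalues of $Q$ are $\lambda_1 = 0$, $\lambda_2 = 1 - \sqrt{3}$, and $\lambda_3 = 1 + \sqrt{3}$. Therefore, the quadratic form is indefinite.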

3.19
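We have
$$f(x_1, x_2, x_3) = 4x_1^2 + x_2^2 + 9x_3^2 - 4x_1 x_2 - 6x_2 x_3 + 12x_1 x_3 = [x_1, x_2, x_3]\begin{bmatrix} 4 & -2 & 6 \\ -2 & 1 & -3 \\ 6 & -3 & 9 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.$$
Let
$$Q = \begin{bmatrix} 4 & -2 & 6 \\ -2 & 1 & -3 \\ 6 & -3 & 9 \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = x_1 e_1 + x_2 e_2 + x_3 e_3,$$
where $e_1$, $e_2$, and $e_3$ form the natural basis for $\mathbb{R}^3$.
Let $v_1$, $v_2$, and $v_3$ be another basis for $\mathbb{R}^3$. Then, the vector $x$ is represented in the new basis as $\tilde{x}$, where
$$x = [v_1, v_2, v_3]\tilde{x} = V\tilde{x}.$$
Now, $f(x) = x^\top Q x = (V\tilde{x})^\top Q (V\tilde{x}) = \tilde{x}^\top (V^\top Q V)\tilde{x} = \tilde{x}^\top \tilde{Q}\tilde{x}$, where
$$\tilde{Q} = \begin{bmatrix} \tilde{q}_{11} & \tilde{q}_{12} & \tilde{q}_{13} \\ \tilde{q}_{21} & \tilde{q}_{22} & \tilde{q}_{23} \\ \tilde{q}_{31} & \tilde{q}_{32} & \tilde{q}_{33} \end{bmatrix}$$
and $\tilde{q}_{ij} = v_i^\top Q v_j$ for $i, j = 1, 2, 3$.
We will find a basis $\{v_1, v_2, v_3\}$ such that $\tilde{q}_{ij} = 0$ for $i \neq j$, and is of the form
$$\begin{aligned} v_1 &= \alpha_{11} e_1 \\ v_2 &= \alpha_{21} e_1 + \alpha_{22} e_2 \\ v_3 &= \alpha_{31} e_1 + \alpha_{32} e_2 + \alpha_{33} e_3. \end{aligned}$$
Because
$$\tilde{q}_{ij} = v_i^\top Q v_j = v_i^\top Q(\alpha_{j1} e_1 + \cdots + \alpha_{jj} e_j) = \alpha_{j1}(v_i^\top Q e_1) + \cdots + \alpha_{jj}(v_i^\top Q e_j),$$
we deduce that if $v_i^\top Q e_j = 0$ for $j < i$, then $v_i^\top Q v_j = 0$. In this case,
$$\tilde{q}_{ii} = v_i^\top Q v_i = v_i^\top Q(\alpha_{i1} e_1 + \cdots + \alpha_{ii} e_i) = \alpha_{i1}(v_i^\top Q e_1) + \cdots + \alpha_{ii}(v_i^\top Q e_i) = \alpha_{ii}(v_i^\top Q e_i).$$
Our task therefore is to find $v_i$ ($i = 1, 2, 3$) such that
$$v_i^\top Q e_j = 0, \quad j < i,$$
$$v_i^\top Q e_i = 1,$$
and, in this case, we get
$$\tilde{Q} = \begin{bmatrix} \alpha_{11} & 0 & 0 \\ 0 & \alpha_{22} & 0 \\ 0 & 0 & \alpha_{33} \end{bmatrix}.$$
Case of $i = 1$. From $v_1^\top Q e_1 = 1$,
$$(\alpha_{11} e_1)^\top Q e_1 = \alpha_{11}(e_1^\top Q e_1) = \alpha_{11} q_{11} = 1.$$
Therefore,
$$\alpha_{11} = \frac{1}{q_{11}} = \frac{1}{\Delta_1} = \frac{1}{4} \implies v_1 = \alpha_{11} e_1 = \begin{bmatrix} \frac{1}{4} \\ 0 \\ 0 \end{bmatrix}.$$
Case of $i = 2$. From $v_2^\top Q e_1 = 0$,
$$(\alpha_{21} e_1 + \alpha_{22} e_2)^\top Q e_1 = \alpha_{21}(e_1^\top Q e_1) + \alpha_{22}(e_2^\top Q e_1) = \alpha_{21} q_{11} + \alpha_{22} q_{21} = 0.$$
From $v_2^\top Q e_2 = 1$,
$$(\alpha_{21} e_1 + \alpha_{22} e_2)^\top Q e_2 = \alpha_{21}(e_1^\top Q e_2) + \alpha_{22}(e_2^\top Q e_2) = \alpha_{21} q_{12} + \alpha_{22} q_{22} = 1.$$
Therefore,
$$\begin{bmatrix} q_{11} & q_{21} \\ q_{12} & q_{22} \end{bmatrix}\begin{bmatrix} \alpha_{21} \\ \alpha_{22} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$
But, since $\Delta_2 = 0$, this system of equations is inconsistent. Hence, in this problem $v_2^\top Q e_2 = 0$ should be satisfied instead of $v_2^\top Q e_2 = 1$ so that the system can have a solution. In this case, the diagonal matrix becomes
$$\tilde{Q} = \begin{bmatrix} \alpha_{11} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \alpha_{33} \end{bmatrix},$$
and the system of equations becomes
$$\begin{bmatrix} q_{11} & q_{21} \\ q_{12} & q_{22} \end{bmatrix}\begin{bmatrix} \alpha_{21} \\ \alpha_{22} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \implies \begin{bmatrix} \alpha_{21} \\ \alpha_{22} \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ 1 \end{bmatrix}\alpha_{22},$$
where $\alpha_{22}$ is an arbitrary real number. Thus,
$$v_2 = \alpha_{21} e_1 + \alpha_{22} e_2 = \begin{bmatrix} \frac{1}{2} \\ 1 \\ 0 \end{bmatrix} a,$$
where $a$ is an arbitrary real number.
Case of $i = 3$. Since in this case $\Delta_3 = \det(Q) = 0$, we will have to apply the same reasoning as in the previous case and use the condition $v_3^\top Q e_3 = 0$ instead of $v_3^\top Q e_3 = 1$. In this way the diagonal matrix becomes
$$\tilde{Q} = \begin{bmatrix} \alpha_{11} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Thus, from $v_3^\top Q e_1 = 0$, $v_3^\top Q e_2 = 0$ and $v_3^\top Q e_3 = 0$,
$$\begin{bmatrix} q_{11} & q_{21} & q_{31} \\ q_{12} & q_{22} & q_{32} \\ q_{13} & q_{23} & q_{33} \end{bmatrix}\begin{bmatrix} \alpha_{31} \\ \alpha_{32} \\ \alpha_{33} \end{bmatrix} = Q^\top \begin{bmatrix} \alpha_{31} \\ \alpha_{32} \\ \alpha_{33} \end{bmatrix} = Q\begin{bmatrix} \alpha_{31} \\ \alpha_{32} \\ \alpha_{33} \end{bmatrix} = \begin{bmatrix} 4 & -2 & 6 \\ -2 & 1 & -3 \\ 6 & -3 & 9 \end{bmatrix}\begin{bmatrix} \alpha_{31} \\ \alpha_{32} \\ \alpha_{33} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$
Therefore,
$$\begin{bmatrix} \alpha_{31} \\ \alpha_{32} \\ \alpha_{33} \end{bmatrix} = \begin{bmatrix} \alpha_{31} \\ 2\alpha_{31} + 3\alpha_{33} \\ \alpha_{33} \end{bmatrix},$$
where $\alpha_{31}$ and $\alpha_{33}$ are arbitrary real numbers. Thus,
$$v_3 = \alpha_{31} e_1 + \alpha_{32} e_2 + \alpha_{33} e_3 = \begin{bmatrix} b \\ 2b + 3c \\ c \end{bmatrix},$$
where $b$ and $c$ are arbitrary real numbers. Finally,
$$V = [v_1, v_2, v_3] = \begin{bmatrix} \frac{1}{4} & \frac{a}{2} & b \\ 0 & a & 2b + 3c \\ 0 & 0 & c \end{bmatrix},$$
where $a$, $b$, and $c$ are arbitrary real numbers.
3.20
We represent this quadratic form as $f(x) = x^\top Q x$, where
$$Q = \begin{bmatrix} 1 & \xi & -1 \\ \xi & 1 & 2 \\ -1 & 2 & 5 \end{bmatrix}.$$
The leading principal minors of $Q$ are $\Delta_1 = 1$, $\Delta_2 = 1 - \xi^2$, $\Delta_3 = -5\xi^2 - 4\xi$. For the quadratic form to be positive definite, all the leading principal minors of $Q$ must be positive. This is the case if and only if $\xi \in (-4/5, 0)$.
3.21
The matrix $Q = Q^\top > 0$ can be represented as $Q = Q^{1/2}Q^{1/2}$, where $Q^{1/2} = (Q^{1/2})^\top > 0$.
1. Now, $\langle x, x \rangle_Q = (Q^{1/2}x)^\top (Q^{1/2}x) = \|Q^{1/2}x\|^2 \geq 0$, and
$$\langle x, x \rangle_Q = 0 \iff \|Q^{1/2}x\|^2 = 0 \iff Q^{1/2}x = 0 \iff x = 0$$
since $Q^{1/2}$ is nonsingular.
2. $\langle x, y \rangle_Q = x^\top Q y = y^\top Q^\top x = y^\top Q x = \langle y, x \rangle_Q$.
3. We have
$$\langle x + y, z \rangle_Q = (x + y)^\top Q z = x^\top Q z + y^\top Q z = \langle x, z \rangle_Q + \langle y, z \rangle_Q.$$
4. $\langle rx, y \rangle_Q = (rx)^\top Q y = r x^\top Q y = r\langle x, y \rangle_Q$.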
3.22
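We have
$$\|A\|_\infty = \max\{\|Ax\|_\infty : \|x\|_\infty = 1\}.$$
We first show that $\|A\|_\infty \leq \max_i \sum_{k=1}^n |a_{ik}|$. For this, note that for each $x$ such that $\|x\|_\infty = 1$, we have
$$\|Ax\|_\infty = \max_i \left| \sum_{k=1}^n a_{ik} x_k \right| \leq \max_i \sum_{k=1}^n |a_{ik}||x_k| \leq \max_i \sum_{k=1}^n |a_{ik}|,$$
since $|x_k| \leq \max_k |x_k| = \|x\|_\infty = 1$. Therefore,
$$\|A\|_\infty \leq \max_i \sum_{k=1}^n |a_{ik}|.$$
To show that $\|A\|_\infty = \max_i \sum_{k=1}^n |a_{ik}|$, it remains to find $\tilde{x} \in \mathbb{R}^n$, $\|\tilde{x}\|_\infty = 1$, such that $\|A\tilde{x}\|_\infty = \max_i \sum_{k=1}^n |a_{ik}|$. So, let $j$ be such that
$$\sum_{k=1}^n |a_{jk}| = \max_i \sum_{k=1}^n |a_{ik}|.$$
Define $\tilde{x}$ by
$$\tilde{x}_k = \begin{cases} |a_{jk}|/a_{jk} & \text{if } a_{jk} \neq 0 \\ 1 & \text{otherwise.} \end{cases}$$
Clearly $\|\tilde{x}\|_\infty = 1$. Furthermore, for $i \neq j$,
$$\left| \sum_{k=1}^n a_{ik}\tilde{x}_k \right| \leq \sum_{k=1}^n |a_{ik}| \leq \max_i \sum_{k=1}^n |a_{ik}| = \sum_{k=1}^n |a_{jk}|$$
and
$$\left| \sum_{k=1}^n a_{jk}\tilde{x}_k \right| = \sum_{k=1}^n |a_{jk}|.$$
Therefore,
$$\|A\tilde{x}\|_\infty = \max_i \left| \sum_{k=1}^n a_{ik}\tilde{x}_k \right| = \sum_{k=1}^n |a_{jk}| = \max_i \sum_{k=1}^n |a_{ik}|.$$
3.23
We have
$$\|A\|_1 = \max\{\|Ax\|_1 : \|x\|_1 = 1\}.$$
We first show that $\|A\|_1 \leq \max_k \sum_{i=1}^m |a_{ik}|$. For this, note that for each $x$ such that $\|x\|_1 = 1$, we have
$$\|Ax\|_1 = \sum_{i=1}^m \left| \sum_{k=1}^n a_{ik} x_k \right| \leq \sum_{i=1}^m \sum_{k=1}^n |a_{ik}||x_k| = \sum_{k=1}^n |x_k| \left( \sum_{i=1}^m |a_{ik}| \right) \leq \left( \max_k \sum_{i=1}^m |a_{ik}| \right) \sum_{k=1}^n |x_k| = \max_k \sum_{i=1}^m |a_{ik}|,$$
since $\sum_{k=1}^n |x_k| = \|x\|_1 = 1$. Therefore,
$$\|A\|_1 \leq \max_k \sum_{i=1}^m |a_{ik}|.$$
To show that $\|A\|_1 = \max_k \sum_{i=1}^m |a_{ik}|$, it remains to find $\tilde{x} \in \mathbb{R}^n$, $\|\tilde{x}\|_1 = 1$, such that $\|A\tilde{x}\|_1 = \max_k \sum_{i=1}^m |a_{ik}|$. So, let $j$ be such that
$$\sum_{i=1}^m |a_{ij}| = \max_k \sum_{i=1}^m |a_{ik}|.$$
Define $\tilde{x}$ by
$$\tilde{x}_k = \begin{cases} 1 & \text{if } k = j \\ 0 & \text{otherwise.} \end{cases}$$
Clearly $\|\tilde{x}\|_1 = 1$. Furthermore,
$$\|A\tilde{x}\|_1 = \sum_{i=1}^m \left| \sum_{k=1}^n a_{ik}\tilde{x}_k \right| = \sum_{i=1}^m |a_{ij}| = \max_k \sum_{i=1}^m |a_{ik}|.$$

4. Concepts from Geometry
4.1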
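$\Rightarrow$: Let $S = \{x : Ax = b\}$ be a linear variety. Let $x, y \in S$ and $\alpha \in \mathbb{R}$. Then,
$$A(\alpha x + (1 - \alpha)y) = \alpha Ax + (1 - \alpha)Ay = \alpha b + (1 - \alpha)b = b.$$
Therefore, $\alpha x + (1 - \alpha)y \in S$.
$\Leftarrow$: If $S$ is empty, we are done. So, suppose $x_0 \in S$. Consider the set $S_0 = S - x_0 = \{x - x_0 : x \in S\}$. Clearly, for all $x, y \in S_0$ and $\alpha \in \mathbb{R}$, we have $\alpha x + (1 - \alpha)y \in S_0$. Note that $0 \in S_0$. We claim that $S_0$ is a subspace. To see this, let $x, y \in S_0$, and $\alpha \in \mathbb{R}$. Then, $\alpha x = \alpha x + (1 - \alpha)0 \in S_0$. Furthermore, $\frac{1}{2}x + \frac{1}{2}y \in S_0$, and therefore $x + y \in S_0$ by the previous argument. Hence, $S_0$ is a subspace. Therefore, by Exercise 3.13, there exists $A$ such that $S_0 = N(A) = \{x : Ax = 0\}$. Define $b = Ax_0$. Then,
$$\begin{aligned} S = S_0 + x_0 &= \{y + x_0 : y \in N(A)\} \\ &= \{y + x_0 : Ay = 0\} \\ &= \{y + x_0 : A(y + x_0) = b\} \\ &= \{x : Ax = b\}. \end{aligned}$$
4.2
Let $u, v \in \Theta = \{x \in \mathbb{R}^n : \|x\| \leq r\}$, and $\alpha \in [0, 1]$. Suppose $z = \alpha u + (1 - \alpha)v$. To show that $\Theta$ is convex, we need to show that $z \in \Theta$, i.e., $\|z\| \leq r$. To this end,
$$\|z\|^2 = (\alpha u + (1 - \alpha)v)^\top (\alpha u + (1 - \alpha)v) = \alpha^2\|u\|^2 + 2\alpha(1 - \alpha)u^\top v + (1 - \alpha)^2\|v\|^2.$$
Since $u, v \in \Theta$, then $\|u\|^2 \leq r^2$ and $\|v\|^2 \leq r^2$. Furthermore, by the Cauchy-Schwarz Inequality, we have $u^\top v \leq \|u\|\|v\| \leq r^2$. Therefore,
$$\|z\|^2 \leq \alpha^2 r^2 + 2\alpha(1 - \alpha)r^2 + (1 - \alpha)^2 r^2 = r^2.$$
Hence, $z \in \Theta$, which implies that $\Theta$ is a convex set, i.e., any point on the line segment joining $u$ and $v$ is also in $\Theta$.
4.3
Let $u, v \in \Theta = \{x \in \mathbb{R}^n : Ax = b\}$, and $\alpha \in [0, 1]$. Suppose $z = \alpha u + (1 - \alpha)v$. To show that $\Theta$ is convex, we need to show that $z \in \Theta$, i.e., $Az = b$. To this end,
$$Az = A(\alpha u + (1 - \alpha)v) = \alpha Au + (1 - \alpha)Av.$$
Since $u, v \in \Theta$, then $Au = b$ and $Av = b$. Therefore,
$$Az = \alpha b + (1 - \alpha)b = b,$$
and hence $z \in \Theta$.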
4.4
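Let $u, v \in \Theta = \{x \in \mathbb{R}^n : x \geq 0\}$, and $\alpha \in [0, 1]$. Suppose $z = \alpha u + (1 - \alpha)v$. To show that $\Theta$ is convex, we need to show that $z \in \Theta$, i.e., $z \geq 0$. To this end, write $u = [u_1, \ldots, u_n]^\top$, $v = [v_1, \ldots, v_n]^\top$, and $z = [z_1, \ldots, z_n]^\top$. Then, $z_i = \alpha u_i + (1 - \alpha)v_i$, $i = 1, \ldots, n$. Since $u_i, v_i \geq 0$, and $\alpha, 1 - \alpha \geq 0$, we have $z_i \geq 0$. Therefore, $z \geq 0$, and hence $z \in \Theta$.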
5. Elements of Calculus
5.1
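Observe that
$$\|A^k\| \leq \|A^{k-1}\|\|A\| \leq \|A^{k-2}\|\|A\|^2 \leq \cdots \leq \|A\|^k.$$
Therefore, if $\|A\| < 1$, then $\lim_{k \to \infty} \|A^k\| = 0$, which implies that $\lim_{k \to \infty} A^k = O$.
5.2
For the case when $A$ has all real eigenvalues, the proof is simple. Let $\lambda$ be the eigenvalue of $A$ with largest absolute value, and $x$ the corresponding (normalized) eigenvector, i.e., $Ax = \lambda x$ and $\|x\| = 1$. Then,
$$\|A\| \geq \|Ax\| = \|\lambda x\| = |\lambda|\|x\| = |\lambda|,$$
which completes the proof for this case.
In general, the eigenvalues of $A$ and the corresponding eigenvectors may be complex. In this case, we proceed as follows (see [41]). Consider the matrix
$$B = \frac{A}{\|A\| + \epsilon},$$
where $\epsilon$ is a positive real number. We have
$$\|B\| = \frac{\|A\|}{\|A\| + \epsilon} < 1.$$
By Exercise 5.1, $B^k \to O$ as $k \to \infty$, and thus by Lemma 5.1, $|\lambda_i(B)| < 1$, $i = 1, \ldots, n$. On the other hand, for each $i = 1, \ldots, n$,
$$\lambda_i(B) = \frac{\lambda_i(A)}{\|A\| + \epsilon},$$
and thus
$$|\lambda_i(B)| = \frac{|\lambda_i(A)|}{\|A\| + \epsilon} < 1,$$
which gives
$$|\lambda_i(A)| < \|A\| + \epsilon.$$
Since the above arguments hold for any $\epsilon > 0$, we have $|\lambda_i(A)| \leq \|A\|$.
5.3
a. $\nabla f(x) = (ab^\top + ba^\top)x$.
b. $F(x) = ab^\top + ba^\top$.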
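The gradient formula in Exercise 5.3 can be compared against a finite-difference approximation. This is a sketch under an assumption: the exercise statement is not reproduced in this solution, so the function is taken to be $f(x) = (a^\top x)(b^\top x)$, which is consistent with the stated gradient and Hessian. The vectors `a`, `b`, `x` are arbitrary, and `grad_formula` and `grad_numeric` are hypothetical helper names.

```python
def dot(u, v):
    """Euclidean inner product."""
    return sum(p * q for p, q in zip(u, v))


def f(x, a, b):
    """Assumed objective: f(x) = (a^T x)(b^T x)."""
    return dot(a, x) * dot(b, x)


def grad_formula(x, a, b):
    """(a b^T + b a^T) x, expanded as a (b^T x) + b (a^T x)."""
    return [a[i] * dot(b, x) + b[i] * dot(a, x) for i in range(len(x))]


def grad_numeric(x, a, b, h=1e-6):
    """Central-difference approximation of the gradient."""
    g = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp, a, b) - f(xm, a, b)) / (2 * h))
    return g


# Arbitrary illustrative data.
a = [1.0, 2.0, -1.0]
b = [0.5, -3.0, 2.0]
x = [0.3, -1.2, 2.5]

max_error = max(abs(p - q)
                for p, q in zip(grad_formula(x, a, b), grad_numeric(x, a, b)))
print(max_error)
```

Because $f$ is quadratic, the central difference is exact up to rounding, so the two gradients should agree to many digits.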

5.4
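We have
$$Df(x) = [x_1/3, \ x_2/2],$$
and
$$\frac{dg}{dt}(t) = \begin{bmatrix} 3 \\ 2 \end{bmatrix}.$$
By the chain rule,
$$\frac{d}{dt}F(t) = Df(g(t))\frac{dg}{dt}(t) = [(3t + 5)/3, \ (2t - 6)/2]\begin{bmatrix} 3 \\ 2 \end{bmatrix} = 5t - 1.$$
5.5
We have
$$Df(x) = [x_2/2, \ x_1/2],$$
and
$$\frac{\partial g}{\partial s}(s, t) = \begin{bmatrix} 4 \\ 2 \end{bmatrix}, \quad \frac{\partial g}{\partial t}(s, t) = \begin{bmatrix} 3 \\ 1 \end{bmatrix}.$$
By the chain rule,
$$\frac{\partial}{\partial s}f(g(s, t)) = Df(g(s, t))\frac{\partial g}{\partial s}(s, t) = \frac{1}{2}[2s + t, \ 4s + 3t]\begin{bmatrix} 4 \\ 2 \end{bmatrix} = 8s + 5t,$$
and
$$\frac{\partial}{\partial t}f(g(s, t)) = Df(g(s, t))\frac{\partial g}{\partial t}(s, t) = \frac{1}{2}[2s + t, \ 4s + 3t]\begin{bmatrix} 3 \\ 1 \end{bmatrix} = 5s + 3t.$$
5.6
We have
$$Df(x) = [3x_1^2 x_2 x_3^2 + x_2, \ x_1^3 x_3^2 + x_1, \ 2x_1^3 x_2 x_3 + 1]$$
and
$$\frac{dx}{dt}(t) = \begin{bmatrix} e^t + 3t^2 \\ 2t \\ 1 \end{bmatrix}.$$
By the chain rule,
$$\begin{aligned} \frac{d}{dt}f(x(t)) &= Df(x(t))\frac{dx}{dt}(t) \\ &= [3x_1(t)^2 x_2(t) x_3(t)^2 + x_2(t), \ x_1(t)^3 x_3(t)^2 + x_1(t), \ 2x_1(t)^3 x_2(t) x_3(t) + 1]\begin{bmatrix} e^t + 3t^2 \\ 2t \\ 1 \end{bmatrix} \\ &= 12t(e^t + 3t^2)^3 + 2te^t + 6t^2 + 2t + 1. \end{aligned}$$
5.7
Let $\epsilon > 0$ be given. Since $f(x) = o(g(x))$, then
$$\lim_{x \to 0} \frac{\|f(x)\|}{g(x)} = 0.$$
Hence, there exists $\delta > 0$ such that if $\|x\| < \delta$, $x \neq 0$, then
$$\frac{\|f(x)\|}{g(x)} < \epsilon,$$
which can be rewritten as
$$\|f(x)\| \leq \epsilon g(x).$$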
5.8
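By Exercise 5.7, there exists $\delta > 0$ such that if $\|x\| < \delta$, then $|o(g(x))| < g(x)/2$. Hence, if $\|x\| < \delta$, $x \neq 0$, then
$$f(x) \leq -g(x) + |o(g(x))| < -g(x) + g(x)/2 = -\frac{1}{2}g(x) < 0.$$
5.9
We have that
$$\{x : f_1(x) = 12\} = \{x : x_1^2 - x_2^2 = 12\},$$
and
$$\{x : f_2(x) = 16\} = \{x : x_2 = 8/x_1\}.$$
To find the intersection points, we substitute $x_2 = 8/x_1$ into $x_1^2 - x_2^2 = 12$ to get $x_1^4 - 12x_1^2 - 64 = 0$. Solving gives $x_1^2 = 16, -4$. Clearly, the only two possibilities for $x_1$ are $x_1 = +4, -4$, from which we obtain $x_2 = +2, -2$. Hence, the intersection points are located at $[4, 2]^\top$ and $[-4, -2]^\top$.
The level sets associated with $f_1(x_1, x_2) = 12$ and $f_2(x_1, x_2) = 16$ are shown as follows.
[Figure: level sets $f_1(x_1, x_2) = 12$ and $f_2(x_1, x_2) = 16$ in the $(x_1, x_2)$-plane, intersecting at $(4, 2)$ and $(-4, -2)$.]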
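The intersection points in Exercise 5.9 can be recomputed directly. This is a minimal sketch, assuming $f_1(x) = x_1^2 - x_2^2$ and $f_2(x) = 2x_1 x_2$, which is what the two level-set equations above imply; the variable names are illustrative.

```python
import math

# Substituting x2 = 8/x1 into x1^2 - x2^2 = 12 gives x1^4 - 12 x1^2 - 64 = 0,
# a quadratic in u = x1^2: u^2 - 12 u - 64 = 0.
disc = 12 ** 2 + 4 * 64
u_roots = [(12 + math.sqrt(disc)) / 2, (12 - math.sqrt(disc)) / 2]

points = []
for u in u_roots:
    if u > 0:  # only a positive u gives real values of x1
        for x1 in (math.sqrt(u), -math.sqrt(u)):
            points.append((x1, 8 / x1))


def f1(x1, x2):
    """Assumed first function, consistent with the level set x1^2 - x2^2 = 12."""
    return x1 ** 2 - x2 ** 2


def f2(x1, x2):
    """Assumed second function, consistent with the level set x2 = 8/x1."""
    return 2 * x1 * x2


print(u_roots, points)
```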
5.10
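a. We have
$$f(x) = f(x_o) + Df(x_o)(x - x_o) + \frac{1}{2}(x - x_o)^\top D^2 f(x_o)(x - x_o) + \cdots.$$
We compute
$$Df(x) = [e^{-x_2}, \ -x_1 e^{-x_2} + 1],$$
$$D^2 f(x) = \begin{bmatrix} 0 & -e^{-x_2} \\ -e^{-x_2} & x_1 e^{-x_2} \end{bmatrix}.$$
Hence,
$$\begin{aligned} f(x) &= 2 + [1, 0]\begin{bmatrix} x_1 - 1 \\ x_2 \end{bmatrix} + \frac{1}{2}[x_1 - 1, \ x_2]\begin{bmatrix} 0 & -1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} x_1 - 1 \\ x_2 \end{bmatrix} + \cdots \\ &= 1 + x_1 + x_2 - x_1 x_2 + \frac{1}{2}x_2^2 + \cdots. \end{aligned}$$
b. We compute
$$Df(x) = [4x_1^3 + 4x_1 x_2^2, \ 4x_1^2 x_2 + 4x_2^3],$$
$$D^2 f(x) = \begin{bmatrix} 12x_1^2 + 4x_2^2 & 8x_1 x_2 \\ 8x_1 x_2 & 4x_1^2 + 12x_2^2 \end{bmatrix}.$$
Expanding $f$ about the point $x_o$ yields
$$\begin{aligned} f(x) &= 4 + [8, 8]\begin{bmatrix} x_1 - 1 \\ x_2 - 1 \end{bmatrix} + \frac{1}{2}[x_1 - 1, \ x_2 - 1]\begin{bmatrix} 16 & 8 \\ 8 & 16 \end{bmatrix}\begin{bmatrix} x_1 - 1 \\ x_2 - 1 \end{bmatrix} + \cdots \\ &= 8x_1^2 + 8x_2^2 - 16x_1 - 16x_2 + 8x_1 x_2 + 12 + \cdots. \end{aligned}$$
c. We compute
$$Df(x) = [e^{x_1 - x_2} + e^{x_1 + x_2} + 1, \ -e^{x_1 - x_2} + e^{x_1 + x_2} + 1],$$
$$D^2 f(x) = \begin{bmatrix} e^{x_1 - x_2} + e^{x_1 + x_2} & -e^{x_1 - x_2} + e^{x_1 + x_2} \\ -e^{x_1 - x_2} + e^{x_1 + x_2} & e^{x_1 - x_2} + e^{x_1 + x_2} \end{bmatrix}.$$
Expanding $f$ about the point $x_o$ yields
$$\begin{aligned} f(x) &= 2 + 2e + [2e + 1, \ 1]\begin{bmatrix} x_1 - 1 \\ x_2 \end{bmatrix} + \frac{1}{2}[x_1 - 1, \ x_2]\begin{bmatrix} 2e & 0 \\ 0 & 2e \end{bmatrix}\begin{bmatrix} x_1 - 1 \\ x_2 \end{bmatrix} + \cdots \\ &= 1 + x_1 + x_2 + e(1 + x_1^2 + x_2^2) + \cdots. \end{aligned}$$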
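The second-order expansion in part a of Exercise 5.10 can be compared with the function near the expansion point. The sketch below rests on an assumption: the function is taken to be $f(x) = x_1 e^{-x_2} + x_2 + 1$ with $x_o = [1, 0]^\top$, which is consistent with the derivative, Hessian, and value $f(x_o) = 2$ stated above but is not quoted from the exercise. `taylor2` is a hypothetical helper name.

```python
import math


def f(x1, x2):
    """Assumed function for part a: x1 * exp(-x2) + x2 + 1."""
    return x1 * math.exp(-x2) + x2 + 1


def taylor2(x1, x2):
    """Second-order expansion about (1, 0) as derived in part a."""
    return 1 + x1 + x2 - x1 * x2 + 0.5 * x2 ** 2


# The approximation error should shrink roughly like the cube of the step.
errors = []
for h in (1e-1, 1e-2, 1e-3):
    errors.append(abs(f(1 + h, h) - taylor2(1 + h, h)))

print(errors)
```

The error at the smaller steps should be orders of magnitude below the step size, as expected for a remainder of third order.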

6. Basics of Unconstrained Optimization
6.1
a. In this case, $x^*$ is definitely not a local minimizer. To see this, note that $d = [1, -2]^\top$ is a feasible direction at $x^*$. However, $d^\top \nabla f(x^*) = -1$, which violates the FONC.
b. In this case, $x^*$ satisfies the FONC, and thus is possibly a local minimizer, but it is impossible to be definite based on the given information.
c. In this case, $x^*$ satisfies the SOSC, and thus is definitely a (strict) local minimizer.
d. In this case, $x^*$ is definitely not a local minimizer. To see this, note that $d = [0, 1]^\top$ is a feasible direction at $x^*$, and $d^\top \nabla f(x^*) = 0$. However, $d^\top F(x^*) d = -1$, which violates the SONC.
6.2
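Because there are no constraints on $x_1$ or $x_2$, we can utilize conditions for unconstrained optimization. To proceed, we first compute the function gradient and find the critical points, that is, the points that satisfy the FONC,
$$\nabla f(x_1, x_2) = 0.$$
The components of the gradient $\nabla f(x_1, x_2)$ are
$$\frac{\partial f}{\partial x_1} = x_1^2 - 4 \quad \text{and} \quad \frac{\partial f}{\partial x_2} = x_2^2 - 16.$$
Thus there are four critical points:
$$x^{(1)} = \begin{bmatrix} 2 \\ 4 \end{bmatrix}, \quad x^{(2)} = \begin{bmatrix} 2 \\ -4 \end{bmatrix}, \quad x^{(3)} = \begin{bmatrix} -2 \\ 4 \end{bmatrix}, \quad \text{and} \quad x^{(4)} = \begin{bmatrix} -2 \\ -4 \end{bmatrix}.$$
We next compute the Hessian matrix of the function $f$:
$$F(x) = \begin{bmatrix} 2x_1 & 0 \\ 0 & 2x_2 \end{bmatrix}.$$
Note that $F(x^{(1)}) > 0$ and therefore, $x^{(1)}$ is a strict local minimizer. Next, $F(x^{(4)}) < 0$ and therefore, $x^{(4)}$ is a strict local maximizer. The Hessian is indefinite at $x^{(2)}$ and $x^{(3)}$ and so these points are neither maximizers nor minimizers.
6.3
Suppose $x^*$ is a global minimizer of $f$ over $\Omega$, and $x^* \in \Omega' \subset \Omega$. Let $x \in \Omega'$. Then, $x \in \Omega$ and therefore $f(x^*) \leq f(x)$. Hence, $x^*$ is a global minimizer of $f$ over $\Omega'$.
6.4
Suppose $x^*$ is an interior point of $\Omega$. Therefore, there exists $\epsilon > 0$ such that $\{y : \|y - x^*\| < \epsilon\} \subset \Omega$. Since $x^*$ is a local minimizer of $f$ over $\Omega$, there exists $\epsilon' > 0$ such that $f(x^*) \leq f(x)$ for all $x \in \{y : \|y - x^*\| < \epsilon'\} \cap \Omega$. Take $\epsilon'' = \min(\epsilon, \epsilon')$. Then, $\{y : \|y - x^*\| < \epsilon''\} \subset \Omega \subset \Omega'$, and $f(x^*) \leq f(x)$ for all $x \in \{y : \|y - x^*\| < \epsilon''\}$. Thus, $x^*$ is a local minimizer of $f$ over $\Omega'$.
To show that we cannot make the same conclusion if $x^*$ is not an interior point, let $\Omega = \{0\}$, $\Omega' = [-1, 1]$, and $f(x) = x$. Clearly, $0 \in \Omega$ is a local minimizer of $f$ over $\Omega$. However, $0 \in \Omega'$ is not a local minimizer of $f$ over $\Omega'$.
6.5
a. The TONC is: if $f''(0) = 0$, then $f'''(0) = 0$. To prove this, suppose $f''(0) = 0$. Now, by the FONC, we also have $f'(0) = 0$. Hence, by Taylor's theorem,
$$f(x) = f(0) + \frac{f'''(0)}{3!}x^3 + o(x^3).$$
Since $0$ is a local minimizer, $f(x) \geq f(0)$ for all $x$ sufficiently close to $0$. Hence, for all such $x$,
$$\frac{f'''(0)}{3!}x^3 \geq -o(x^3).$$
Now, if $x > 0$, then
$$f'''(0) \geq -3!\frac{o(x^3)}{x^3},$$
which implies that $f'''(0) \geq 0$. On the other hand, if $x < 0$, then
$$f'''(0) \leq -3!\frac{o(x^3)}{x^3},$$
which implies that $f'''(0) \leq 0$. This implies that $f'''(0) = 0$, as required.
b. Let $f(x) = -x^4$. Then, $f'(0) = 0$, $f''(0) = 0$, and $f'''(0) = 0$, which means that the FONC, SONC, and TONC are all satisfied. However, $0$ is not a local minimizer: $f(x) < 0$ for all $x \neq 0$.
c. The answer is yes. To see this, we first write
$$f(x) = f(0) + f'(0)x + \frac{f''(0)}{2}x^2 + \frac{f'''(0)}{3!}x^3.$$
Now, if the FONC is satisfied, then
$$f(x) = f(0) + \frac{f''(0)}{2}x^2 + \frac{f'''(0)}{3!}x^3.$$
Moreover, if the SONC is satisfied, then either (i) $f''(0) > 0$ or (ii) $f''(0) = 0$. In the case (i), it is clear from the above equation that $f(x) \geq f(0)$ for all $x$ sufficiently close to $0$ (because the third term on the right-hand side is $o(x^2)$). In the case (ii), the TONC implies that $f(x) = f(0)$ for all $x$. In either case, $f(x) \geq f(0)$ for all $x$ sufficiently close to $0$. This shows that $0$ is a local minimizer.
6.6
a. The TONC is: if $f'(0) = 0$ and $f''(0) = 0$, then $f'''(0) \geq 0$. To prove this, suppose $f'(0) = 0$ and $f''(0) = 0$. By Taylor's theorem, for $x \geq 0$,
$$f(x) = f(0) + \frac{f'''(0)}{3!}x^3 + o(x^3).$$
Since $0$ is a local minimizer, $f(x) \geq f(0)$ for sufficiently small $x \geq 0$. Hence, for all $x > 0$ sufficiently small,
$$f'''(0) \geq -3!\frac{o(x^3)}{x^3}.$$
This implies that $f'''(0) \geq 0$, as required.
b. Let $f(x) = -x^4$. Then, $f'(0) = 0$, $f''(0) = 0$, and $f'''(0) = 0$, which means that the FONC, SONC, and TONC are all satisfied. However, $0$ is not a local minimizer: $f(x) < 0$ for all $x > 0$.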
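The counterexample $f(x) = -x^4$ used in part b can be checked mechanically. This is a small illustrative sketch: the polynomial is stored as a coefficient list, and `derivative` and `evaluate` are hypothetical helper names introduced here.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as coeffs[k] * x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


f = [0, 0, 0, 0, -1]   # f(x) = -x^4
d1 = derivative(f)     # f'(x)
d2 = derivative(d1)    # f''(x)
d3 = derivative(d2)    # f'''(x)

# First three derivatives at 0: the FONC, SONC, and TONC all hold.
at_zero = [evaluate(d, 0) for d in (d1, d2, d3)]

# Yet f is strictly negative at nearby points, so 0 is not a local minimizer.
nearby = [evaluate(f, x) for x in (-0.1, -0.01, 0.01, 0.1)]
print(at_zero, nearby)
```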
6.7
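For convenience, let $z_0 = x_0 + \arg\min_{x \in \Omega} f(x)$. Thus we want to show that $z_0 = \arg\min_{y \in \Omega'} f(y - x_0)$; i.e., for all $y \in \Omega'$, $f(y - x_0) \geq f(z_0 - x_0)$. So fix $y \in \Omega'$. Then, $y - x_0 \in \Omega$. Hence,
$$f(y - x_0) \geq \min_{x \in \Omega} f(x) = f\left(\arg\min_{x \in \Omega} f(x)\right) = f(z_0 - x_0),$$
which completes the proof.
6.8
a. The gradient and Hessian of $f$ are
$$\nabla f(x) = 2\begin{bmatrix} 1 & 3 \\ 3 & 7 \end{bmatrix}x + \begin{bmatrix} 3 \\ 5 \end{bmatrix},$$
$$F(x) = 2\begin{bmatrix} 1 & 3 \\ 3 & 7 \end{bmatrix}.$$
Hence, $\nabla f([1, 1]^\top) = [11, 25]^\top$, and $F([1, 1]^\top)$ is as shown above.
b. The direction of maximal rate of increase is the direction of the gradient. Hence, the directional derivative with respect to a unit vector in this direction is
$$\left( \frac{\nabla f(x)}{\|\nabla f(x)\|} \right)^\top \nabla f(x) = \frac{(\nabla f(x))^\top \nabla f(x)}{\|\nabla f(x)\|} = \|\nabla f(x)\|.$$
At $x = [1, 1]^\top$, we have $\|\nabla f([1, 1]^\top)\| = \sqrt{11^2 + 25^2} = 27.31$.
c. The FONC in this case is $\nabla f(x) = 0$. Solving, we get
$$x = \begin{bmatrix} 3/2 \\ -1 \end{bmatrix}.$$
The point above does not satisfy the SONC because the Hessian is not positive semidefinite (its determinant is negative).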
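The numbers in Exercise 6.8 can be reproduced from the gradient formula stated in part a. This is a minimal sketch using only that formula; the names `H`, `c`, `grad`, and `x_star` are illustrative, and the stationary point is obtained with Cramer's rule rather than by any method attributed to the original solution.

```python
import math

# Hessian and linear term from part a: grad f(x) = H x + c.
H = [[2, 6], [6, 14]]
c = [3, 5]


def grad(x):
    """Gradient H x + c."""
    return [H[0][0] * x[0] + H[0][1] * x[1] + c[0],
            H[1][0] * x[0] + H[1][1] * x[1] + c[1]]


g = grad([1, 1])          # gradient at [1, 1]
rate = math.hypot(*g)     # maximal rate of increase = norm of the gradient

# Stationary point: solve H x = -c with Cramer's rule.
det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]
x_star = [(-c[0] * H[1][1] + c[1] * H[0][1]) / det_H,
          (-H[0][0] * c[1] + H[1][0] * c[0]) / det_H]

print(g, round(rate, 2), det_H, x_star)
```

A negative determinant of a $2 \times 2$ symmetric matrix means its eigenvalues have opposite signs, which is why the stationary point fails the SONC.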
6.9
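a. A differentiable function $f$ decreases most rapidly in the direction of the negative gradient. In our problem,
$$\nabla f(x) = \begin{bmatrix} \frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2} \end{bmatrix}^\top = \begin{bmatrix} 2x_1 x_2 + x_2^3 & x_1^2 + 3x_1 x_2^2 \end{bmatrix}^\top.$$
Hence, the direction of most rapid decrease is
$$-\nabla f(x^{(0)}) = -\begin{bmatrix} 5 & 10 \end{bmatrix}^\top.$$
b. The rate of increase of $f$ at $x^{(0)}$ in the direction $-\nabla f(x^{(0)})$ is
$$\nabla f(x^{(0)})^\top \frac{-\nabla f(x^{(0)})}{\|\nabla f(x^{(0)})\|} = -\|\nabla f(x^{(0)})\| = -\sqrt{125} = -5\sqrt{5}.$$
c. The rate of increase of $f$ at $x^{(0)}$ in the direction $d$ is
$$\nabla f(x^{(0)})^\top \frac{d}{\|d\|} = \begin{bmatrix} 5 & 10 \end{bmatrix}\begin{bmatrix} 3 \\ 4 \end{bmatrix}\frac{1}{5} = 11.$$
6.10
a. We can rewrite $f$ as
$$f(x) = \frac{1}{2}x^\top \begin{bmatrix} 4 & 4 \\ 4 & 2 \end{bmatrix}x + x^\top \begin{bmatrix} 3 \\ 4 \end{bmatrix} + 7.$$
The gradient and Hessian of $f$ are
$$\nabla f(x) = \begin{bmatrix} 4 & 4 \\ 4 & 2 \end{bmatrix}x + \begin{bmatrix} 3 \\ 4 \end{bmatrix},$$
$$F(x) = \begin{bmatrix} 4 & 4 \\ 4 & 2 \end{bmatrix}.$$
Hence $\nabla f([0, 1]^\top) = [7, 6]^\top$. The directional derivative is
$$[1, 0]\nabla f([0, 1]^\top) = 7.$$
b. The FONC in this case is $\nabla f(x) = 0$. The only point satisfying the FONC is
$$x^* = \frac{1}{4}\begin{bmatrix} -5 \\ 2 \end{bmatrix}.$$
The point above does not satisfy the SONC because the Hessian is not positive semidefinite (its determinant is negative). Therefore, $f$ does not have a minimizer.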
6.11
a. Write the objective function as $f(x) = -x_2^2$. In this problem the only feasible directions at $0$ are of the form $d = [d_1, 0]^\top$. Hence, $d^\top \nabla f(0) = 0$ for all feasible directions $d$ at $0$.
b. The point $0$ is a local maximizer, because $f(0) = 0$, while any feasible point $x$ satisfies $f(x) \leq 0$.
The point $0$ is not a strict local maximizer because for any $x$ of the form $x = [x_1, 0]^\top$, we have $f(x) = 0 = f(0)$, and there are such points in any neighborhood of $0$.
The point $0$ is not a local minimizer because for any point $x$ of the form $x = [x_1, x_1^2]^\top$ with $x_1 > 0$, we have $f(x) = -x_1^4 < 0$, and there are such points in any neighborhood of $0$. Since $0$ is not a local minimizer, it is also not a strict local minimizer.
6.12
a. We have $\nabla f(x^*) = [0, 5]^\top$. The only feasible directions at $x^*$ are of the form $d = [d_1, d_2]^\top$ with $d_2 \geq 0$. Therefore, for such feasible directions, $d^\top \nabla f(x^*) = 5d_2 \geq 0$. Hence, $x^* = [0, 1]^\top$ satisfies the first-order necessary condition.
b. We have $F(x^*) = O$. Therefore, for any $d$, $d^\top F(x^*) d \geq 0$. Hence, $x^* = [0, 1]^\top$ satisfies the second-order necessary condition.
c. Consider points of the form $x = [x_1, -x_1^2 + 1]^\top$, $x_1 \in \mathbb{R}$. Such points are in $\Omega$, and are arbitrarily close to $x^*$. However, for such points $x \neq x^*$,
$$f(x) = 5(-x_1^2 + 1) = 5 - 5x_1^2 < 5 = f(x^*).$$
Hence, $x^*$ is not a local minimizer.
6.13
a. We have $\nabla f(x^*) = -[3, 0]^\top$. The only feasible directions at $x^*$ are of the form $d = [d_1, d_2]^\top$ with $d_1 \leq 0$. Therefore, for such feasible directions, $d^\top \nabla f(x^*) = -3d_1 \geq 0$. Hence, $x^* = [2, 0]^\top$ satisfies the first-order necessary condition.
b. We have $F(x^*) = O$. Therefore, for any $d$, $d^\top F(x^*) d \geq 0$. Hence, $x^* = [2, 0]^\top$ satisfies the second-order necessary condition.
c. Yes, $x^*$ is a local minimizer. To see this, notice that any feasible point $x = [x_1, x_2]^\top \neq x^*$ is such that $x_1 < 2$. Hence, for such points $x \neq x^*$,
$$f(x) = -3x_1 > -6 = f(x^*).$$
In fact, $x^*$ is a strict local minimizer.
6.14
a. We have $\nabla f(x) = [0, 1]^\top$, which is nonzero everywhere. Hence, no interior point satisfies the FONC. Moreover, any boundary point with a feasible direction $d$ such that $d_2 < 0$ cannot satisfy the FONC, because for such a $d$, $d^\top \nabla f(x) = d_2 < 0$. By drawing a picture, it is easy to see that the only boundary point remaining is $x^* = [0, 1]^\top$. For this point, any feasible direction satisfies $d_2 \geq 0$. Hence, for any feasible direction, $d^\top \nabla f(x^*) = d_2 \geq 0$. Hence, $x^* = [0, 1]^\top$ satisfies the FONC, and is the only such point.
b. We have $F(x) = O$. So any point (and in particular $x^* = [0, 1]^\top$) satisfies the SONC.
c. The point $x^* = [0, 1]^\top$ is not a local minimizer. To see this, consider points of the form $x = [\sqrt{1 - x_2^2}, x_2]^\top$ where $x_2 \in [1/2, 1)$. It is clear that such points are feasible, and are arbitrarily close to $x^* = [0, 1]^\top$. However, for such points, $f(x) = x_2 < 1 = f(x^*)$.
6.15
a. We have $\nabla f(x^*) = [3, 0]^\top$. The only feasible directions at $x^*$ are of the form $d = [d_1, d_2]^\top$ with $d_1 \geq 0$. Therefore, for such feasible directions, $d^\top \nabla f(x^*) = 3d_1 \geq 0$. Hence, $x^* = [2, 0]^\top$ satisfies the first-order necessary condition.
b. We have $F(x^*) = O$. Therefore, for any $d$, $d^\top F(x^*) d \geq 0$. Hence, $x^* = [2, 0]^\top$ satisfies the second-order necessary condition.
c. Consider points of the form $x = [-x_2^2 + 2, x_2]^\top$, $x_2 \in \mathbb{R}$. Such points are in $\Omega$, and could be arbitrarily close to $x^*$. However, for such points $x \neq x^*$,
$$f(x) = 3(-x_2^2 + 2) = 6 - 3x_2^2 < 6 = f(x^*).$$
Hence, $x^*$ is not a local minimizer.
6.16
a. We have $\nabla f(x^*) = 0$. Therefore, for any feasible direction $d$ at $x^*$, we have $d^\top \nabla f(x^*) = 0$. Hence, $x^*$ satisfies the first-order necessary condition.
b. We have
$$F(x^*) = \begin{bmatrix} 8 & 0 \\ 0 & -2 \end{bmatrix}.$$
Any feasible direction $d$ at $x^*$ has the form $d = [d_1, d_2]^\top$ where $d_2 \leq 2d_1$, $d_1, d_2 \geq 0$. Therefore, for any feasible direction $d$ at $x^*$, we have
$$d^\top F(x^*) d = 8d_1^2 - 2d_2^2 \geq 8d_1^2 - 2(2d_1)^2 = 0.$$
Hence, $x^*$ satisfies the second-order necessary condition.
c. We have $f(x^*) = 0$. Any point of the form $x = [x_1, x_1^2 + 2x_1]^\top$, $x_1 > 0$, is feasible and has objective function value given by
$$f(x) = 4x_1^2 - (x_1^2 + 2x_1)^2 = -(x_1^4 + 4x_1^3) < 0 = f(x^*).$$
Moreover, there are such points in any neighborhood of $x^*$. Therefore, the point $x^*$ is not a local minimizer.
6.17
a. We have $\nabla f(x^*) = [1/x_1^*, 1/x_2^*]^\top$. If $x^*$ were an interior point, then $\nabla f(x^*) = 0$. But this is clearly impossible. Therefore, $x^*$ cannot possibly be an interior point.
b. We have $F(x) = \operatorname{diag}[-1/x_1^2, -1/x_2^2]$, which is negative definite everywhere. Therefore, the second-order necessary condition is satisfied everywhere. (Note that because we have a maximization problem, negative definiteness is the relevant condition.)
6.18
Given $x \in \mathbb{R}$, let
$$f(x) = \sum_{i=1}^n (x - x_i)^2,$$
so that $\bar{x}$ is the minimizer of $f$. By the FONC,
$$f'(\bar{x}) = 0,$$
and hence
$$\sum_{i=1}^n 2(\bar{x} - x_i) = 0,$$
which on solving gives
$$\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i.$$
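The conclusion of Exercise 6.18, that the sample mean minimizes the sum of squared deviations, can be illustrated numerically. This is a minimal sketch with made-up data; `data`, `mean`, and `neighbors` are illustrative names, and checking a few nearby points illustrates the result rather than proving it.

```python
# Arbitrary sample values (assumed for illustration).
data = [2.0, 3.5, 7.0, 1.5]


def f(x):
    """Sum of squared deviations from x."""
    return sum((x - xi) ** 2 for xi in data)


mean = sum(data) / len(data)

# The derivative sum 2 (mean - x_i) vanishes at the mean (the FONC).
fonc_residual = sum(2 * (mean - xi) for xi in data)

# Objective values at points on either side of the mean.
neighbors = [f(mean - 0.5), f(mean - 0.1), f(mean + 0.1), f(mean + 0.5)]
print(mean, fonc_residual, f(mean), neighbors)
```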

6.19
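Let $\theta_1$ be the angle from the horizontal to the bottom of the picture, and $\theta_2$ the angle from the horizontal to the top of the picture. Then, $\tan\theta = (\tan\theta_2 - \tan\theta_1)/(1 + \tan\theta_2 \tan\theta_1)$. Now, $\tan\theta_1 = b/x$ and $\tan\theta_2 = (a + b)/x$. Hence, the objective function that we wish to maximize is
$$f(x) = \frac{(a + b)/x - b/x}{1 + b(a + b)/x^2} = \frac{a}{x + b(a + b)/x}.$$
We have
$$f'(x) = -\frac{a}{(x + b(a + b)/x)^2}\left( 1 - \frac{b(a + b)}{x^2} \right).$$
Let $x^*$ be the optimal distance. Then, by the FONC, we have $f'(x^*) = 0$, which gives
$$1 - \frac{b(a + b)}{(x^*)^2} = 0 \implies x^* = \sqrt{b(a + b)}.$$
6.20
The squared distance from the sensor to the baby's heart is $1 + x^2$, while the squared distance from the sensor to the mother's heart is $1 + (2 - x)^2$. Therefore, the signal to noise ratio is
$$f(x) = \frac{1 + (2 - x)^2}{1 + x^2}.$$
We have
$$f'(x) = \frac{-2(2 - x)(1 + x^2) - 2x(1 + (2 - x)^2)}{(1 + x^2)^2} = \frac{4(x^2 - 2x - 1)}{(1 + x^2)^2}.$$
By the FONC, at the optimal position $x^*$, we have $f'(x^*) = 0$. Hence, either $x^* = 1 - \sqrt{2}$ or $x^* = 1 + \sqrt{2}$. From the figure, it is easy to see that $x^* = 1 - \sqrt{2}$ is the optimal position.
6.21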
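a. Let $x$ be the decision variable. Write the total travel time as $f(x)$, which is given by
$$f(x) = \frac{\sqrt{1 + x^2}}{v_1} + \frac{\sqrt{1 + (d - x)^2}}{v_2}.$$
Differentiating the above expression, we get
$$f'(x) = \frac{x}{v_1\sqrt{1 + x^2}} - \frac{d - x}{v_2\sqrt{1 + (d - x)^2}}.$$
By the first-order necessary condition, the optimal path satisfies $f'(x^*) = 0$, which corresponds to
$$\frac{x^*}{v_1\sqrt{1 + (x^*)^2}} = \frac{d - x^*}{v_2\sqrt{1 + (d - x^*)^2}},$$
or $\sin\theta_1 / v_1 = \sin\theta_2 / v_2$. Upon rearranging, we obtain the desired equation.
b. The second derivative of $f$ is given by
$$f''(x) = \frac{1}{v_1 (1 + x^2)^{3/2}} + \frac{1}{v_2 (1 + (d - x)^2)^{3/2}}.$$
Hence, $f''(x^*) > 0$, which shows that the second-order sufficient condition holds.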
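The FONC in part a can be checked numerically. The following is a minimal MATLAB sketch, not part of the original solution; the values of `v1`, `v2` and `d` are hypothetical and chosen only for illustration. It minimizes the travel time with `fminbnd` and compares the two sides of the condition $\sin\theta_1/v_1 = \sin\theta_2/v_2$.

```matlab
% Numerical check of the FONC for the travel-time problem (hypothetical data).
v1 = 1.0;  v2 = 1.5;  d = 2.0;          % assumed speeds and horizontal offset
f  = @(x) sqrt(1 + x.^2)/v1 + sqrt(1 + (d - x).^2)/v2;   % total travel time
opts  = optimset('TolX', 1e-10);
xstar = fminbnd(f, 0, d, opts);         % minimizer of f on [0, d]
sin1  = xstar/sqrt(1 + xstar^2);                  % sine of the first angle
sin2  = (d - xstar)/sqrt(1 + (d - xstar)^2);      % sine of the second angle
fprintf('x* = %.6f, sin1/v1 = %.6f, sin2/v2 = %.6f\n', xstar, sin1/v1, sin2/v2);
```

The two printed ratios should agree up to the tolerance of the one-dimensional search.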

6.22
a. We have $f(x) = U_1(x_1) + U_2(x_2)$ and $\Omega = \{x : x_1, x_2 \ge 0,\ x_1 + x_2 \le 1\}$. A picture of $\Omega$ looks like:
[Figure: the set $\Omega$ in the $(x_1, x_2)$ plane, with both axes marked at 0 and 1.]
b. We have $\nabla f(x) = [a_1, a_2]^\top$. Because $\nabla f(x) \ne 0$ for all $x$, we conclude that no interior point satisfies the FONC. Next, consider any feasible point $x$ for which $x_2 > 0$. At such a point, the vector $d = [1, -1]^\top$ is a feasible direction. But then $d^\top \nabla f(x) = a_1 - a_2 > 0$, which means that the FONC is violated (recall that the problem is to maximize $f$). So clearly the remaining candidates are those $x$ for which $x_2 = 0$. Among these, if $x_1 < 1$, then $d = [0, 1]^\top$ is a feasible direction, in which case we have $d^\top \nabla f(x) = a_2 > 0$. This leaves the point $x = [1, 0]^\top$. At this point, any feasible direction $d$ satisfies $d_1 \le 0$ and $d_2 \le -d_1$. Hence, for any feasible direction $d$, we have
$$d^\top \nabla f(x) = d_1 a_1 + d_2 a_2 \le d_1 a_1 - d_1 a_2 = d_1(a_1 - a_2) \le 0.$$
So, the only feasible point that satisfies the FONC is $[1, 0]^\top$.
c. We have $F(x) = O \le 0$. Hence, any point satisfies the SONC (again, recall that the problem is to maximize $f$).
6.23
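We have
$$\nabla f(x) = \begin{bmatrix} 4(x_1 - x_2)^3 + 2x_1 - 2 \\ -4(x_1 - x_2)^3 - 2x_2 + 2 \end{bmatrix}.$$
Setting $\nabla f(x) = 0$ we get
$$4(x_1 - x_2)^3 + 2x_1 - 2 = 0, \qquad -4(x_1 - x_2)^3 - 2x_2 + 2 = 0.$$
Adding the two equations, we obtain $x_1 = x_2$, and substituting back yields $x_1 = x_2 = 1$.
Hence, the only point satisfying the FONC is $[1, 1]^\top$.
We have
$$F(x) = \begin{bmatrix} 12(x_1 - x_2)^2 + 2 & -12(x_1 - x_2)^2 \\ -12(x_1 - x_2)^2 & 12(x_1 - x_2)^2 - 2 \end{bmatrix}.$$
Hence
$$F([1, 1]^\top) = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}.$$
Since $F([1, 1]^\top)$ is not positive semidefinite, the point $[1, 1]^\top$ does not satisfy the SONC.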
6.24
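Suppose $d$ is a feasible direction at $x$. Then, there exists $\alpha_0 > 0$ such that $x + \alpha d \in \Omega$ for all $\alpha \in [0, \alpha_0]$. Let $\beta > 0$ be given. Then, $x + \alpha(\beta d) \in \Omega$ for all $\alpha \in [0, \alpha_0/\beta]$. Since $\alpha_0/\beta > 0$, by definition $\beta d$ is also a feasible direction at $x$.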
6.25
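$\Rightarrow$: Suppose $d$ is feasible at $x \in \Omega$. Then, there exists $\alpha > 0$ such that $x + \alpha d \in \Omega$, that is, $A(x + \alpha d) = b$. Since $Ax = b$ and $\alpha \ne 0$, we conclude that $Ad = 0$.
$\Leftarrow$: Suppose $Ad = 0$. Then, for any $\alpha \in [0, 1]$, we have $\alpha A d = 0$. Adding this equation to $Ax = b$, we obtain $A(x + \alpha d) = b$, that is, $x + \alpha d \in \Omega$ for all $\alpha \in [0, 1]$. Therefore, $d$ is a feasible direction at $x$.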
6.26
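The vector $d = [1, 1]^\top$ is a feasible direction at $0$. Now,
$$d^\top \nabla f(0) = \frac{\partial f}{\partial x_1}(0) + \frac{\partial f}{\partial x_2}(0).$$
Since $\nabla f(0) \le 0$ and $\nabla f(0) \ne 0$, then
$$d^\top \nabla f(0) < 0.$$
Hence, by the FONC, $0$ is not a local minimizer.
6.27
We have $\nabla f(x) = c \ne 0$. Therefore, for any interior point $x$ of $\Omega$, we have $\nabla f(x) \ne 0$. Hence, by Corollary 6.1, an interior point of $\Omega$ cannot be a local minimizer, and therefore it cannot be a solution.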
6.28
The objective function is fx c1x1 c2x2. Therefore, rfx c1,c2 6 0 for all x. Thus, by FONC, the optimal solution x cannot lie in the interior of the feasible set. Next, for all x 2 L1 L2, d 1,1 is a feasible direction. Therefore, drfx c1 c2 0. Hence, by FONC, the optimal solution x cannot lie in L1 L2. Lastly, for all x 2 L3, d 1,1 is a feasible direction. Therefore, drfx c2 c1 0. Hence, by FONC, the optimal solution x cannot lie in L3. Therefore, by elimination, the unique optimal feasible solution must be 1,0.
6.29
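a. Write $\overline{X} = \frac{1}{n}\sum_{i=1}^n x_i$, $\overline{Y} = \frac{1}{n}\sum_{i=1}^n y_i$, $\overline{X^2} = \frac{1}{n}\sum_{i=1}^n x_i^2$, $\overline{Y^2} = \frac{1}{n}\sum_{i=1}^n y_i^2$, and $\overline{XY} = \frac{1}{n}\sum_{i=1}^n x_i y_i$. We write
$$f(a, b) = \frac{1}{n}\sum_{i=1}^n \left(a^2 x_i^2 + b^2 + y_i^2 + 2x_i ab - 2x_i y_i a - 2y_i b\right)$$
$$= a^2\,\overline{X^2} + b^2 + 2\,\overline{X}\,ab - 2\,\overline{XY}\,a - 2\,\overline{Y}\,b + \overline{Y^2}$$
$$= \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} \overline{X^2} & \overline{X} \\ \overline{X} & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} - 2 \begin{bmatrix} \overline{XY} & \overline{Y} \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} + \overline{Y^2}$$
$$= z^\top Q z - 2c^\top z + d,$$
where $z$, $Q$, $c$ and $d$ are defined in the obvious way.
b. If the point $z^* = [a^*, b^*]^\top$ is a solution, then by the FONC, we have $\nabla f(z^*) = 2Qz^* - 2c = 0$, which means $Qz^* = c$. Now, since $\overline{X^2} - (\overline{X})^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \overline{X})^2$, and the $x_i$ are not all equal, then
$$\det Q = \overline{X^2} - (\overline{X})^2 \ne 0.$$
Hence, $Q$ is nonsingular, and hence
$$z^* = Q^{-1}c = \frac{1}{\overline{X^2} - (\overline{X})^2} \begin{bmatrix} 1 & -\overline{X} \\ -\overline{X} & \overline{X^2} \end{bmatrix} \begin{bmatrix} \overline{XY} \\ \overline{Y} \end{bmatrix} = \begin{bmatrix} \dfrac{\overline{XY} - \overline{X}\,\overline{Y}}{\overline{X^2} - (\overline{X})^2} \\[2ex] \dfrac{\overline{X^2}\,\overline{Y} - \overline{X}\,\overline{XY}}{\overline{X^2} - (\overline{X})^2} \end{bmatrix}.$$
Since $Q > 0$, then by the SOSC, the point $z^*$ is a strict local minimizer. Since $z^*$ is the only point satisfying the FONC, then $z^*$ is the only local minimizer.
c. We have
$$a^*\overline{X} + b^* = \frac{\overline{XY} - \overline{X}\,\overline{Y}}{\overline{X^2} - (\overline{X})^2}\,\overline{X} + \frac{\overline{X^2}\,\overline{Y} - \overline{X}\,\overline{XY}}{\overline{X^2} - (\overline{X})^2} = \overline{Y}.$$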
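As a sanity check on the closed-form solution in part b, the sketch below evaluates the two formulas on a small made-up data set and compares them with MATLAB's built-in `polyfit`. The data values and variable names are hypothetical; only the formulas come from the solution above.

```matlab
% Least-squares line y = a*x + b from the averaged quantities (hypothetical data).
x = [0 1 2 3 4];
y = [1.1 2.9 5.2 7.1 8.8];
Xb  = mean(x);      Yb  = mean(y);        % averages of x_i and y_i
X2b = mean(x.^2);   XYb = mean(x.*y);     % averages of x_i^2 and x_i*y_i
a = (XYb - Xb*Yb)/(X2b - Xb^2);           % slope a*
b = (X2b*Yb - Xb*XYb)/(X2b - Xb^2);       % intercept b*
p = polyfit(x, y, 1);                     % reference fit: p(1) slope, p(2) intercept
fprintf('formula: a = %.6f, b = %.6f\n', a, b);
fprintf('polyfit: a = %.6f, b = %.6f\n', p(1), p(2));
```

The fitted line also passes through the point of averages, which is the identity shown in part c.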

6.30
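Given $x \in \mathbb{R}^n$, let
$$f(x) = \frac{1}{p}\sum_{i=1}^p \|x - x^{(i)}\|^2$$
be the average squared error between $x$ and $x^{(1)}, \ldots, x^{(p)}$. We can rewrite $f$ as
$$f(x) = \frac{1}{p}\sum_{i=1}^p (x - x^{(i)})^\top (x - x^{(i)}) = x^\top x - 2\left(\frac{1}{p}\sum_{i=1}^p x^{(i)}\right)^{\!\top} x + \frac{1}{p}\sum_{i=1}^p \|x^{(i)}\|^2.$$
So $f$ is a quadratic function. Since $\bar{x}$ is the minimizer of $f$, then by the FONC, $\nabla f(\bar{x}) = 0$, i.e.,
$$2\bar{x} - 2\,\frac{1}{p}\sum_{i=1}^p x^{(i)} = 0.$$
Hence, we get
$$\bar{x} = \frac{1}{p}\sum_{i=1}^p x^{(i)},$$
i.e., $\bar{x}$ is just the average, or centroid, or center of gravity, of $x^{(1)}, \ldots, x^{(p)}$.
The Hessian of $f$ at $\bar{x}$ is
$$F(\bar{x}) = 2I_n,$$
which is positive definite. Hence, by the SOSC, $\bar{x}$ is a strict local minimizer of $f$ (in fact, it is a strict global minimizer because $f$ is a convex quadratic function).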
6.31
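Fix any $x \in \Omega$. The vector $d = x - x^*$ is feasible at $x^*$ by convexity of $\Omega$. By Taylor's formula, we have
$$f(x) = f(x^*) + d^\top \nabla f(x^*) + o(\|d\|) \ge f(x^*) + c\|d\| + o(\|d\|).$$
Therefore, for all $x$ sufficiently close to $x^*$, we have $f(x) > f(x^*)$. Hence, $x^*$ is a strict local minimizer.
6.32
Since $f \in C^2$, $F(x^*) = F(x^*)^\top$. Let $d \ne 0$ be a feasible direction at $x^*$. By Taylor's theorem,
$$f(x^* + d) = f(x^*) + d^\top \nabla f(x^*) + \frac{1}{2} d^\top F(x^*) d + o(\|d\|^2).$$
Using conditions a and b, we get
$$f(x^* + d) \ge f(x^*) + \frac{c}{2}\|d\|^2 + o(\|d\|^2).$$
Therefore, for all $d$ such that $\|d\|$ is sufficiently small,
$$f(x^* + d) > f(x^*),$$
and the proof is completed.
6.33
Necessity follows from the FONC. To prove sufficiency, we write $f$ as
$$f(x) = \frac{1}{2}(x - x^*)^\top Q (x - x^*) - \frac{1}{2} x^{*\top} Q x^*,$$
where $x^* = Q^{-1}b$ is the unique vector satisfying the FONC. Clearly, since $\frac{1}{2} x^{*\top} Q x^*$ is a constant, and $Q > 0$, then
$$f(x) \ge f(x^*) = -\frac{1}{2} x^{*\top} Q x^*,$$
and $f(x) = f(x^*)$ if and only if $x = x^*$.
6.34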
Write $u = [u_1, \ldots, u_n]^\top$. We have
$$x_n = a x_{n-1} + b u_n$$
$$= a(a x_{n-2} + b u_{n-1}) + b u_n$$
$$= a^2 x_{n-2} + ab u_{n-1} + b u_n$$
$$\vdots$$
$$= a^n x_0 + a^{n-1} b u_1 + \cdots + ab u_{n-1} + b u_n$$
$$= c^\top u,$$
where $c = [a^{n-1}b, \ldots, ab, b]^\top$. Therefore, the problem can be written as
$$\text{minimize} \quad r u^\top u - q c^\top u,$$
which is a positive definite quadratic in $u$. The solution is therefore
$$u^* = \frac{q}{2r}\, c,$$
or, equivalently, $u_i^* = q a^{n-i} b/(2r)$, $i = 1, \ldots, n$.
7. One Dimensional Search Methods
7.1
The range reduction factor for 3 iterations of the Golden Section method is $((\sqrt{5} - 1)/2)^3 = 0.236$, while that of the Fibonacci method (with $\varepsilon = 0$) is $1/F_{3+1} = 0.2$. Hence, if the desired range reduction factor is anywhere between 0.2 and 0.236 (e.g., 0.21), then the Golden Section method requires at least 4 iterations, while the Fibonacci method requires only 3. So, an example of a desired final uncertainty range is $0.21 \times (8 - 5) = 0.63$.
7.2
a. The plot of $f(x)$ versus $x$ is as below:
[Figure: plot of $f(x)$ versus $x$ for $1 \le x \le 2$.]
b. The number of steps needed for the Golden Section method is computed from the inequality:
$$0.61803^N \le \frac{0.2}{2 - 1} \quad\Rightarrow\quad N \ge 3.34.$$
Therefore, the fewest possible number of steps is 4. Applying 4 steps of the Golden Section method, we end up with an uncertainty interval of $[a_4, b_0] = [1.8541, 2.000]$. The table with the results of the intermediate steps is displayed below:

Iteration k   a_k      b_k      f(a_k)   f(b_k)   New uncertainty interval
1             1.3820   1.6180   2.6607   2.4292   [1.3820, 2]
2             1.6180   1.7639   2.4292   2.3437   [1.6180, 2]
3             1.7639   1.8541   2.3437   2.3196   [1.7639, 2]
4             1.8541   1.9098   2.3196   2.3171   [1.8541, 2]

c. The number of steps needed for the Fibonacci method is computed from the inequality:
$$\frac{1 + 2\varepsilon}{F_{N+1}} \le \frac{0.2}{2 - 1} \quad\Rightarrow\quad N \ge 4.$$
Therefore, the fewest possible number of steps is 4. Applying 4 steps of the Fibonacci method, we end up with an uncertainty interval of $[a_4, b_0] = [1.8750, 2.000]$. The table with the results of the intermediate steps is displayed below:

Iteration k   rho_k    a_k      b_k      f(a_k)   f(b_k)   New unc. int.
1             0.3750   1.3750   1.6250   2.6688   2.4239   [1.3750, 2]
2             0.4      1.6250   1.7500   2.4239   2.3495   [1.6250, 2]
3             0.3333   1.7500   1.8750   2.3495   2.3175   [1.7500, 2]
4             0.45     1.8750   1.8875   2.3175   2.3169   [1.8750, 2]

d. We have $f'(x) = 2x - 4\sin x$, $f''(x) = 2 - 4\cos x$. Hence, Newton's algorithm takes the form:
$$x_{k+1} = x_k - \frac{x_k - 2\sin x_k}{1 - 2\cos x_k}.$$
Applying 4 iterations with $x_0 = 1$, we get $x_1 = -7.4727$, $x_2 = 14.4785$, $x_3 = 6.9351$, $x_4 = 16.6354$.
Apparently, Newton's method is not effective in this case.
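The four Newton iterates quoted in part d can be reproduced with a few lines of MATLAB. This is only an illustrative sketch of the recursion above; the array name `x` is an arbitrary choice.

```matlab
% Newton's method for f(x) = x^2 + 4*cos(x), started at x0 = 1.
x = zeros(1, 5);
x(1) = 1;                                   % x(1) stores the initial point x0
for k = 1:4
    % x_{k+1} = x_k - f'(x_k)/f''(x_k), with the common factor 2 cancelled
    x(k+1) = x(k) - (x(k) - 2*sin(x(k)))/(1 - 2*cos(x(k)));
end
fprintf('x%d = %.4f\n', [1:4; x(2:5)]);
```

The iterates jump back and forth because $f''(x_0) = 2 - 4\cos 1 < 0$, so the first step is not a descent step.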
7.3
a. We first create the M-file f.m as follows:
% f.m
function y=f(x)
y=8*exp(1-x)+7*log(x);
The MATLAB commands to plot the function are:
fplot('f',[1 2]);
xlabel('x');
ylabel('f(x)');
The resulting plot is as follows:
[Figure: plot of $f(x)$ versus $x$ for $1 \le x \le 2$.]
b. The MATLAB routine for the Golden Section method is:
%Matlab routine for Golden Section Search
left=1;
right=2;
uncert=0.23;
rho=(3-sqrt(5))/2;
N=ceil(log(uncert/(right-left))/log(1-rho)) %print N
lower='a';
a=left+(1-rho)*(right-left);
fa=f(a);
for i=1:N,
  if lower=='a'
    b=a
    fb=fa
    a=left+rho*(right-left)
    fa=f(a)
  else
    a=b
    fa=fb
    b=left+(1-rho)*(right-left)
    fb=f(b)
  end %if
  if fa<fb
    right=b;
    lower='a'
  else
    left=a;
    lower='b'
  end %if
  New_Interval = [left,right]
end %for i

Using the above routine, we obtain $N = 4$ and a final interval of $[1.528, 1.674]$. The table with the results of the intermediate steps is displayed below:

Iteration k   a_k     b_k     f(a_k)   f(b_k)   New uncertainty interval
1             1.382   1.618   7.7247   7.6805   [1.382, 2]
2             1.618   1.764   7.6805   7.6995   [1.382, 1.764]
3             1.528   1.618   7.6860   7.6805   [1.528, 1.764]
4             1.618   1.674   7.6805   7.6838   [1.528, 1.674]
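For readers who want to rerun part b without creating the M-file f.m, the following self-contained MATLAB sketch uses an anonymous function instead. It is a compact variant written for illustration, not the routine listed above: both interior points are kept explicitly rather than tracked through the `lower` flag, and the variable names are assumptions.

```matlab
% Golden Section search on f(x) = 8*exp(1-x) + 7*log(x) over [1, 2].
f = @(x) 8*exp(1 - x) + 7*log(x);
left = 1;  right = 2;  uncert = 0.23;
rho = (3 - sqrt(5))/2;
N = ceil(log(uncert/(right - left))/log(1 - rho));   % number of iterations
a = left + rho*(right - left);        fa = f(a);     % lower interior point
b = left + (1 - rho)*(right - left);  fb = f(b);     % upper interior point
for k = 1:N
    if fa < fb                 % minimizer lies in [left, b]
        right = b;
        b = a;  fb = fa;       % reuse the old lower point as the new upper point
        a = left + rho*(right - left);  fa = f(a);
    else                       % minimizer lies in [a, right]
        left = a;
        a = b;  fa = fb;       % reuse the old upper point as the new lower point
        b = left + (1 - rho)*(right - left);  fb = f(b);
    end
end
fprintf('N = %d, final interval = [%.4f, %.4f]\n', N, left, right);
```

Each pass reuses one previous function value, so only one new evaluation of `f` is needed per iteration.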
c. The MATLAB routine for the Fibonacci method is:
%Matlab routine for Fibonacci Search technique
left=1;
right=2;
uncert=0.23;
epsilon=0.05;
F(1)=1;
F(2)=1;
N=0;
while F(N+2) < (1+2*epsilon)*(right-left)/uncert
  F(N+3)=F(N+2)+F(N+1);
  N=N+1;
end %while
N %print N
lower='a';
a=left+(F(N+1)/F(N+2))*(right-left);
fa=f(a);
for i=1:N,
  if i~=N
    rho=1-F(N+2-i)/F(N+3-i)
  else
    rho=0.5-epsilon
  end %if
  if lower=='a'
    b=a
    fb=fa
    a=left+rho*(right-left)
    fa=f(a)
  else
    a=b
    fa=fb
    b=left+(1-rho)*(right-left)
    fb=f(b)
  end %if
  if fa<fb
    right=b;
    lower='a'
  else
    left=a;
    lower='b'
  end %if
  New_Interval = [left,right]
end %for i

Using the above routine, we obtain $N = 3$ and a final interval of $[1.58, 1.8]$. The table with the results of the intermediate steps is displayed below:

Iteration k   rho_k   a_k    b_k   f(a_k)   f(b_k)   New uncertainty interval
1             0.4     1.4    1.6   7.7179   7.6805   [1.4, 2]
2             0.333   1.6    1.8   7.6805   7.7091   [1.4, 1.8]
3             0.45    1.58   1.6   7.6812   7.6805   [1.58, 1.8]
7.4
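Now, $\rho_k = 1 - \dfrac{F_{N-k+1}}{F_{N-k+2}}$. Hence,
$$1 - \frac{\rho_k}{1 - \rho_k} = 1 - \frac{1 - F_{N-k+1}/F_{N-k+2}}{F_{N-k+1}/F_{N-k+2}} = 1 - \frac{F_{N-k+2} - F_{N-k+1}}{F_{N-k+1}} = 1 - \frac{F_{N-k}}{F_{N-k+1}} = \rho_{k+1}.$$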
To show that 0 k 12, we proceed by induction. Clearly 1 12 satisfies 0 1 12. Suppose 0k 12,wherek21,…,N1. Then,
and hence Therefore,
Sincek1 1 k ,then 1k
as required.
7.5
1 1 k 1 2
1 1 2. 1k
1 k 1. 2 1k
0 k1 1 2
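We proceed by induction. For $k = 2$, we have $F_0 F_3 - F_1 F_2 = (1)(3) - (1)(2) = 1 = (-1)^2$. Suppose $F_{k-2}F_{k+1} - F_{k-1}F_k = (-1)^k$. Then,
$$F_{k-1}F_{k+2} - F_k F_{k+1} = F_{k-1}(F_{k+1} + F_k) - (F_{k-1} + F_{k-2})F_{k+1} = F_{k-1}F_k - F_{k-2}F_{k+1} = -(-1)^k = (-1)^{k+1}.$$
7.6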
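Define $y_k = F_k$ and $z_k = F_{k-1}$. Then, we have
$$\begin{bmatrix} y_{k+1} \\ z_{k+1} \end{bmatrix} = A \begin{bmatrix} y_k \\ z_k \end{bmatrix}, \qquad \text{where} \quad A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix},$$
with initial condition
$$\begin{bmatrix} y_0 \\ z_0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$
We can write
$$F_n = y_n = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} y_n \\ z_n \end{bmatrix} = \begin{bmatrix} 1 & 0 \end{bmatrix} A^n \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$
Since $A$ is symmetric, it can be diagonalized as
$$A = \begin{bmatrix} u & v \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} u & v \end{bmatrix}^\top,$$
where
$$\lambda_1 = \frac{1 - \sqrt{5}}{2}, \qquad \lambda_2 = \frac{1 + \sqrt{5}}{2},$$
and
$$u = \frac{1}{5^{1/4}} \begin{bmatrix} -\sqrt{\dfrac{2}{\sqrt{5} + 1}} \\[2ex] \sqrt{\dfrac{\sqrt{5} + 1}{2}} \end{bmatrix}, \qquad v = \frac{1}{5^{1/4}} \begin{bmatrix} \sqrt{\dfrac{\sqrt{5} + 1}{2}} \\[2ex] \sqrt{\dfrac{2}{\sqrt{5} + 1}} \end{bmatrix}$$
(the overall sign of each eigenvector is immaterial). Therefore, we have
$$F_n = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} u & v \end{bmatrix} \begin{bmatrix} \lambda_1^n & 0 \\ 0 & \lambda_2^n \end{bmatrix} \begin{bmatrix} u & v \end{bmatrix}^\top \begin{bmatrix} 1 \\ 0 \end{bmatrix} = u_1^2 \lambda_1^n + v_1^2 \lambda_2^n = \frac{1}{\sqrt{5}} \left( \left(\frac{1 + \sqrt{5}}{2}\right)^{n+1} - \left(\frac{1 - \sqrt{5}}{2}\right)^{n+1} \right).$$
7.7
The number $\log 2$ is the root of the equation $g(x) = 0$, where $g(x) = \exp(x) - 2$. The derivative of $g$ is $g'(x) = \exp(x)$. Newton's method applied to this root finding problem is
$$x_{k+1} = x_k - \frac{\exp(x_k) - 2}{\exp(x_k)} = x_k - 1 + 2\exp(-x_k).$$
Performing two iterations, we get $x_1 = 0.7358$ and $x_2 = 0.6940$.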
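The two iterates can be reproduced with the short MATLAB sketch below. The starting point $x_0 = 1$ is an assumption made here (it is not stated in the solution text above), chosen because it yields exactly the quoted values of $x_1$ and $x_2$.

```matlab
% Newton's method for g(x) = exp(x) - 2, whose root is log(2).
x  = 1;                         % assumed starting point x0
xs = zeros(1, 2);               % stores x1 and x2
for k = 1:2
    x = x - 1 + 2*exp(-x);      % x_{k+1} = x_k - g(x_k)/g'(x_k)
    xs(k) = x;
    fprintf('x%d = %.4f\n', k, x);
end
fprintf('log(2) = %.4f\n', log(2));
```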
7.8
a. We compute $g'(x) = 2e^x/(e^x + 1)^2$. Therefore Newton's method of tangents for this problem takes the form
$$x_{k+1} = x_k - \frac{(e^{x_k} - 1)/(e^{x_k} + 1)}{2e^{x_k}/(e^{x_k} + 1)^2} = x_k - \frac{e^{2x_k} - 1}{2e^{x_k}} = x_k - \sinh x_k.$$
b. By symmetry, we need $x_1 = -x_0$ for cycling. Therefore, $x_0$ must satisfy $-x_0 = x_0 - \sinh x_0$.
The algorithm cycles if $x_0 = \pm c$, where $c > 0$ is the solution to $2c = \sinh c$.
c. The algorithm converges to 0 if and only if $|x_0| < c$, where $c$ is from part b.
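The threshold $c$ from part b has no closed form, but it is easy to compute numerically. The sketch below is an illustration only: it uses `fzero` on the bracket $[1, 3]$ (an assumed bracket on which $\sinh t - 2t$ changes sign) and then confirms that starting at $x_0 = c$ produces the two-cycle $c, -c, c, \ldots$.

```matlab
% Find c > 0 with 2c = sinh(c) and check that Newton's method cycles from x0 = c.
c  = fzero(@(t) sinh(t) - 2*t, [1 3]);   % sinh(1)-2 < 0 and sinh(3)-6 > 0
x0 = c;
x1 = x0 - sinh(x0);                      % one Newton step: should equal -c
x2 = x1 - sinh(x1);                      % second step: should return to c
fprintf('c = %.4f, x1 = %.4f, x2 = %.4f\n', c, x1, x2);
```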
7.9
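The quadratic function that matches the given data $x_k$, $x_{k-1}$, $x_{k-2}$, $f(x_k)$, $f(x_{k-1})$, and $f(x_{k-2})$ can be computed by solving the following three linear equations for the parameters $a$, $b$, and $c$:
$$a x_{k-i}^2 + b x_{k-i} + c = f(x_{k-i}), \qquad i = 0, 1, 2.$$
Then, the algorithm is given by $x_{k+1} = -b/(2a)$ (so, in fact, we only need to find the ratio of $a$ and $b$). With some elementary algebra (e.g., using Cramer's rule without needing to calculate the determinant in the denominator), the algorithm can be written as:
$$x_{k+1} = \frac{\sigma_{12} f(x_k) + \sigma_{20} f(x_{k-1}) + \sigma_{01} f(x_{k-2})}{2\left(\delta_{12} f(x_k) + \delta_{20} f(x_{k-1}) + \delta_{01} f(x_{k-2})\right)},$$
where $\sigma_{ij} = x_{k-i}^2 - x_{k-j}^2$ and $\delta_{ij} = x_{k-i} - x_{k-j}$.
7.10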
a. A MATLAB routine for implementing the secant method is as follows.
function [x,v]=secant(g,xcurr,xnew,uncert);
%Matlab routine for finding root of g(x) using secant method
%
% secant;
% secant(g);
% secant(g,xcurr,xnew);
% secant(g,xcurr,xnew,uncert);
%
% x=secant;
% x=secant(g);
% x=secant(g,xcurr,xnew);
% x=secant(g,xcurr,xnew,uncert);
%
% [x,v]=secant;
% [x,v]=secant(g);
% [x,v]=secant(g,xcurr,xnew);
% [x,v]=secant(g,xcurr,xnew,uncert);
%
%The first variant finds the root of g(x) in the M-file g.m, with
%initial conditions 0 and 1, and uncertainty 10^(-5).
%The second variant finds the root of the function in the M-file specified
%by the string g, with initial conditions 0 and 1, and uncertainty 10^(-5).
%The third variant finds the root of the function in the M-file specified
%by the string g, with initial conditions specified by xcurr and xnew, and
%uncertainty 10^(-5).
%The fourth variant finds the root of the function in the M-file specified
%by the string g, with initial conditions specified by xcurr and xnew, and
%uncertainty specified by uncert.
%
%The next four variants return the final value of the root as x.
%The last four variants return the final value of the root as x, and
%the value of the function at the final value as v.
if nargin < 4
  uncert=10^(-5);
  if nargin < 3
    if nargin == 1
      xcurr=0;
      xnew=1;
    elseif nargin == 0
      g='g';
      xcurr=0;  %default initial conditions are also needed in this case
      xnew=1;
    else
      disp('Cannot have 2 arguments.');
      return;
    end
  end
end
g_curr=feval(g,xcurr);
while abs(xnew-xcurr)>xcurr*uncert,
  xold=xcurr;
  xcurr=xnew;
  g_old=g_curr;
  g_curr=feval(g,xcurr);
  xnew=(g_curr*xold-g_old*xcurr)/(g_curr-g_old);
end %while
%print out solution and value of g(x)
if nargout >= 1
  x=xnew;
  if nargout == 2
    v=feval(g,xnew);
  end
else
  final_point=xnew
  value=feval(g,xnew)
end %if

b. We get a solution of x 0.0039671, with corresponding value gx 9.908 108. 7.11
function alpha=linesearch_secant(grad,x,d)
%Line search using secant method
epsilon=10^(-4); %line search tolerance
max_iter = 100; %maximum number of iterations
alpha_curr=0;
alpha=0.001;
dphi_zero=feval(grad,x)'*d;
dphi_curr=dphi_zero;
i=0;
while abs(dphi_curr)>epsilon*abs(dphi_zero),
  alpha_old=alpha_curr;
  alpha_curr=alpha;
  dphi_old=dphi_curr;
  dphi_curr=feval(grad,x+alpha_curr*d)'*d;
  alpha=(dphi_curr*alpha_old-dphi_old*alpha_curr)/(dphi_curr-dphi_old);
  i=i+1;
  if (i >= max_iter) & (abs(dphi_curr)>epsilon*abs(dphi_zero)),
    disp('Line search terminating with number of iterations:');
    disp(i);
    break;
  end
end %while

7.12
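a. We could carry out the bracketing using the one-dimensional function $\phi_0(\alpha) = f(x^{(0)} + \alpha d^{(0)})$, where $d^{(0)}$ is the negative gradient at $x^{(0)}$, as described in Section 7.8. The decision variable would be $\alpha$. However, here we will directly represent the points in $\mathbb{R}^2$ (which is equivalent, though unnecessary in general).
The uncertainty interval is calculated by the following procedure:
$$f(x) = \frac{1}{2} x^\top \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} x, \qquad \nabla f(x) = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} x.$$
Therefore,
$$d = -\nabla f(x^{(0)}) = -\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 0.8 \\ -0.25 \end{bmatrix} = \begin{bmatrix} -1.35 \\ -0.3 \end{bmatrix}.$$
As the problem requires, we use $\varepsilon = 0.075$. First, we begin calculating $f(x^{(0)})$ and $x^{(1)}$:
$$f(x^{(0)}) = f([0.8, -0.25]^\top) = 0.5025,$$
$$x^{(1)} = x^{(0)} + \varepsilon d = \begin{bmatrix} 0.8 \\ -0.25 \end{bmatrix} + 0.075 \begin{bmatrix} -1.35 \\ -0.3 \end{bmatrix} = \begin{bmatrix} 0.6987 \\ -0.2725 \end{bmatrix}.$$
Then, we proceed as follows to find the uncertainty interval:
$$f(x^{(1)}) = f([0.6987, -0.2725]^\top) = 0.3721,$$
$$x^{(2)} = x^{(0)} + 2\varepsilon d = \begin{bmatrix} 0.8 \\ -0.25 \end{bmatrix} + 0.15 \begin{bmatrix} -1.35 \\ -0.3 \end{bmatrix} = \begin{bmatrix} 0.5975 \\ -0.2950 \end{bmatrix}, \qquad f(x^{(2)}) = 0.2678,$$
$$x^{(3)} = x^{(0)} + 4\varepsilon d = \begin{bmatrix} 0.8 \\ -0.25 \end{bmatrix} + 0.3 \begin{bmatrix} -1.35 \\ -0.3 \end{bmatrix} = \begin{bmatrix} 0.3950 \\ -0.3400 \end{bmatrix}, \qquad f(x^{(3)}) = 0.1373,$$
$$x^{(4)} = x^{(0)} + 8\varepsilon d = \begin{bmatrix} 0.8 \\ -0.25 \end{bmatrix} + 0.6 \begin{bmatrix} -1.35 \\ -0.3 \end{bmatrix} = \begin{bmatrix} -0.0100 \\ -0.4300 \end{bmatrix}, \qquad f(x^{(4)}) = 0.1893.$$
Between $f(x^{(3)})$ and $f(x^{(4)})$ the function increases, which means that the minimizer must occur on the interval
$$[x^{(2)}, x^{(4)}] = \left[ \begin{bmatrix} 0.5975 \\ -0.2950 \end{bmatrix}, \begin{bmatrix} -0.0100 \\ -0.4300 \end{bmatrix} \right], \qquad \text{with } d = \begin{bmatrix} -1.35 \\ -0.3 \end{bmatrix}.$$
MATLAB code to solve the problem is listed next.
% Coded by David Schvartzman
% In our case we have:
Q = [2 1; 1 2];
x0 = [0.8; -.25];
e = 0.075;
f = zeros(1,10);
X = zeros(2,10);
x1 = x0;
X(:,1) = x0; % store the starting point so that X(:,i-2) is valid when i = 3
d = -Q*x1;
f(1) = 0.5*x1'*Q*x1;
for i=2:10
  X(:,i) = x1+e*d;
  f(i) = 0.5*X(:,i)'*Q*X(:,i);
  e = 2*e;
  if(f(i) > f(i-1))
    break;
  end
end
% The interval is defined by:
a = X(:,i-2);
b = X(:,i);
str = sprintf(['The minimizer is located in: [a, b], where a = [%.4f; %.4f]' ...
               ' and b = [%.4f; %.4f]'], a(1,1), a(2,1), b(1,1), b(2,1));
disp(str);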
b. First, we determine the number of necessary iterations:
The initial uncertainty interval width is 0.6223. This width will be $0.6223\,(0.618)^N$ after $N$ stages. We choose $N$ so that
$$(0.618)^N \le \frac{0.01}{0.6223} \quad\Rightarrow\quad N = 9.$$
We show the first iteration of the algorithm; the rest are analogous and shown in the following table.
From part a, we know that $[a_0, b_0] = [x^{(2)}, x^{(4)}]$, then:
$$[a_0, b_0] = \left[ \begin{bmatrix} 0.5975 \\ -0.2950 \end{bmatrix}, \begin{bmatrix} -0.0100 \\ -0.4300 \end{bmatrix} \right], \qquad \text{with } f(a_0) = 0.2678,\ f(b_0) = 0.1893.$$
$$a_1 = a_0 + \rho(b_0 - a_0) = [0.3655, -0.3466]^\top$$
$$b_1 = a_0 + (1 - \rho)(b_0 - a_0) = [0.2220, -0.3784]^\top$$
$$f(a_1) = 0.1270, \qquad f(b_1) = 0.1085$$
We can see $f(a_1) > f(b_1)$, hence the uncertainty interval is reduced to:
$$[a_1, b_0] = \left[ \begin{bmatrix} 0.3655 \\ -0.3466 \end{bmatrix}, \begin{bmatrix} -0.0100 \\ -0.4300 \end{bmatrix} \right].$$
So, calculating the norm of $(b_0 - a_1)$, we see that the uncertainty region width is now 0.38461.

Iteration   a_k                 b_k                 f(a_k)   f(b_k)   New uncertainty interval
1           [0.3655, -0.3466]   [0.2220, -0.3784]   0.1270   0.1085   [[0.3655, -0.3466], [-0.0100, -0.4300]]
2           [0.2220, -0.3784]   [0.1334, -0.3981]   0.1085   0.1232   [[0.3655, -0.3466], [0.1334, -0.3981]]
3           [0.2768, -0.3663]   [0.2220, -0.3784]   0.1094   0.1085   [[0.2768, -0.3663], [0.1334, -0.3981]]
4           [0.2220, -0.3784]   [0.1882, -0.3860]   0.1085   0.1117   [[0.2768, -0.3663], [0.1882, -0.3860]]
5           [0.2430, -0.3738]   [0.2220, -0.3784]   0.1079   0.1085   [[0.2768, -0.3663], [0.2220, -0.3784]]
6           [0.2559, -0.3709]   [0.2430, -0.3738]   0.1081   0.1079   [[0.2559, -0.3709], [0.2220, -0.3784]]
7           [0.2430, -0.3738]   [0.2350, -0.3756]   0.1079   0.1080   [[0.2559, -0.3709], [0.2350, -0.3756]]
8           [0.2479, -0.3727]   [0.2430, -0.3738]   0.1080   0.1079   [[0.2479, -0.3727], [0.2350, -0.3756]]
9           [0.2430, -0.3738]   [0.2399, -0.3745]   0.1079   0.1079   [[0.2479, -0.3727], [0.2399, -0.3745]]

We can now see that the minimizer is located within $\left[ [0.2479, -0.3727]^\top, [0.2399, -0.3745]^\top \right]$, and its uncertainty interval width is 0.00819.
MATLAB code used to perform calculations is listed next.
% Coded by David Schvartzman
% To successfully run this program, run
% the previous script to obtain a and b.
e = 0.01;
Q = [2 1; 1 2];
ro = 0.5*(3-sqrt(5));
% First we determine the number of necessary iterations.
d = norm(a-b);
N = ceil( log(e/d)/log(1-ro) );
fa = 0.5*a'*Q*a;
fb = 0.5*b'*Q*b;
str1 = sprintf('Initial values:');
str2 = sprintf('a0 = [%.4f,%.4f].', a(1), a(2));
str3 = sprintf('b0 = [%.4f,%.4f].', b(1), b(2));
str4 = sprintf('f(a0) = %.4f.', fa);
str5 = sprintf('f(b0) = %.4f.', fb);
str_n = sprintf('\n');
disp(str_n);
disp(str1);disp(str2);
disp(str3);disp(str4);
disp(str5);
s = a + ro*(b-a);
t = a + (1-ro)*(b-a);
fs = 0.5*s'*Q*s;
ft = 0.5*t'*Q*t;
for i=1:N
  str1 = sprintf('Iteration number: %d', i);
  str2 = sprintf('a%d = [%.4f,%.4f].', i, s(1), s(2));
  str3 = sprintf('b%d = [%.4f,%.4f].', i, t(1), t(2));
  str4 = sprintf('f(a%d) = %.4f.', i, fs);
  str5 = sprintf('f(b%d) = %.4f.', i, ft);
  if (ft>fs)
    b = t;
    fb = ft;
    t = s;
    ft = fs;
    s = a + ro*(b-a);
    fs = 0.5*s'*Q*s;
  else
    a = s;
    fa = fs;
    s = t;
    fs = ft;
    t = a + (1-ro)*(b-a);
    ft = 0.5*t'*Q*t;
  end
  str6 = sprintf(['New uncertainty interval: a%d = [%.4f,%.4f], ' ...
                  'b%d = [%.4f,%.4f].'], i, a(1), a(2), i, b(1), b(2));
  disp(str_n);
  disp(str1)
  disp(str2)
  disp(str3)
  disp(str4)
  disp(str5)
  disp(str6)
end
% The interval where the minimizer is boxed in is given by:
an = a;
bn = b;
% We can return (an+bn)/2 as the minimizer.
xmin = (an+bn)/2;
disp(str_n);
str = sprintf('The minimizer x* is: [%.4f; %.4f]', xmin(1,1), xmin(2,1));
disp(str);
c. We need to determine the number of necessary iterations:
The initial uncertainty interval width is 0.6223. This width will be $0.6223\,\dfrac{1 + 2\varepsilon}{F_{N+1}}$, where $F_k$ is the $k$th element of the Fibonacci sequence. We choose $N$ so that
$$\frac{1 + 2\varepsilon}{F_{N+1}} \le \frac{0.01}{0.6223} = 0.0161 \quad\Rightarrow\quad F_{N+1} \ge \frac{1 + 2\varepsilon}{0.0161}.$$
For $\varepsilon = 0.05$, we require $F_{N+1} \ge 68.32$, thus $F_{10} = 89$ is enough, and we have $N = 10 - 1 = 9$ iterations. We show the first iteration of the algorithm; the rest are analogous and shown in the following table.
From part a, we know that $[a_0, b_0] = [x^{(2)}, x^{(4)}]$, then:
$$[a_0, b_0] = \left[ \begin{bmatrix} 0.5975 \\ -0.2950 \end{bmatrix}, \begin{bmatrix} -0.0100 \\ -0.4300 \end{bmatrix} \right], \qquad \text{with } f(a_0) = 0.2678,\ f(b_0) = 0.1893.$$
Recall that in the Fibonacci method,
$$\rho_1 = 1 - \frac{F_N}{F_{N+1}} = 1 - \frac{55}{89} = 0.3820.$$
$$a_1 = a_0 + \rho_1(b_0 - a_0) = [0.3654, -0.3466]^\top$$
$$b_1 = a_0 + (1 - \rho_1)(b_0 - a_0) = [0.2221, -0.3784]^\top$$
$$f(a_1) = 0.1270, \qquad f(b_1) = 0.1085$$
We can see $f(a_1) > f(b_1)$, hence the uncertainty interval is reduced to:
$$[a_1, b_0] = \left[ \begin{bmatrix} 0.3654 \\ -0.3466 \end{bmatrix}, \begin{bmatrix} -0.0100 \\ -0.4300 \end{bmatrix} \right].$$
So, calculating the norm of $(b_0 - a_1)$, we see that the uncertainty region width is now 0.38458.

k   rho_k    a_k                 b_k                 f(a_k)   f(b_k)   New uncertainty interval
1   0.3820   [0.3654, -0.3466]   [0.2221, -0.3784]   0.1270   0.1085   [[0.3654, -0.3466], [-0.0100, -0.4300]]
2   0.3818   [0.2221, -0.3784]   [0.1333, -0.3981]   0.1085   0.1232   [[0.3654, -0.3466], [0.1333, -0.3981]]
3   0.3824   [0.2767, -0.3663]   [0.2221, -0.3784]   0.1094   0.1085   [[0.2767, -0.3663], [0.1333, -0.3981]]
4   0.3810   [0.2221, -0.3784]   [0.1879, -0.3860]   0.1085   0.1118   [[0.2767, -0.3663], [0.1879, -0.3860]]
5   0.3846   [0.2426, -0.3739]   [0.2221, -0.3784]   0.1079   0.1085   [[0.2767, -0.3663], [0.2221, -0.3784]]
6   0.3750   [0.2562, -0.3708]   [0.2426, -0.3739]   0.1082   0.1079   [[0.2562, -0.3708], [0.2221, -0.3784]]
7   0.4000   [0.2426, -0.3739]   [0.2357, -0.3754]   0.1079   0.1080   [[0.2562, -0.3708], [0.2357, -0.3754]]
8   0.3333   [0.2494, -0.3724]   [0.2426, -0.3739]   0.1080   0.1079   [[0.2494, -0.3724], [0.2357, -0.3754]]
9   0.4500   [0.2426, -0.3739]   [0.2419, -0.3740]   0.1079   0.1079   [[0.2494, -0.3724], [0.2419, -0.3740]]

We can now see that the minimizer is located within $\left[ [0.2494, -0.3724]^\top, [0.2419, -0.3740]^\top \right]$, and its uncertainty interval width is 0.00769.
MATLAB code used to perform calculations is listed next.
% Coded by David Schvartzman
% To successfully run this program, run the first of the above scripts
% to obtain a and b.
% We take
e = 0.01;
Q = [2 1; 1 2];
% First determine the number of necessary iterations.
d = norm(a-b);
F_N1 = (2*0.05+1)/(e/d);
F = zeros(1,20);
F(1) = 0;
F(2) = 1;
for i=1:20
  F(i+2) = F(i+1)+F(i);
  if(F(i+2) >= F_N1)
    break;
  end
end
N = i-1;
ro = zeros(1, N+1);
for i = 1:N
  ro(i) = 1 - F(N+3-i)/F(N+4-i);
end
ro(N) = ro(N) - 0.05;
fa = 0.5*a'*Q*a;
fb = 0.5*b'*Q*b;
str1 = sprintf('Initial values:');
str2 = sprintf('a0 = [%.4f,%.4f].', a(1), a(2));
str3 = sprintf('b0 = [%.4f,%.4f].', b(1), b(2));
str4 = sprintf('f(a0) = %.4f.', fa);
str5 = sprintf('f(b0) = %.4f.', fb);
str_n = sprintf('\n');
disp(str_n);
disp(str1);
disp(str2);
disp(str3);
disp(str4);
disp(str5);
s = a + ro(1)*(b-a);
t = a + (1-ro(1))*(b-a);
fs = 0.5*s'*Q*s;
ft = 0.5*t'*Q*t;
for i=1:N
  str1 = sprintf('Iteration number: %d', i);
  str2 = sprintf('a%d = [%.4f,%.4f].', i, s(1), s(2));
  str3 = sprintf('b%d = [%.4f,%.4f].', i, t(1), t(2));
  str4 = sprintf('f(a%d) = %.4f.', i, fs);
  str5 = sprintf('f(b%d) = %.4f.', i, ft);
  if (ft>fs)
    b = t;
    t = s;
    ft = fs;
    s = a + ro(i+1)*(b-a);
    fs = 0.5*s'*Q*s;
  else
    a = s;
    s = t;
    fs = ft;
    t = a + (1-ro(i+1))*(b-a);
    ft = 0.5*t'*Q*t;
  end
  str6 = sprintf(['New uncertainty interval: a%d = [%.4f,%.4f], ' ...
                  'b%d = [%.4f,%.4f].'], i, a(1), a(2), i, b(1), b(2));
  str7 = sprintf('Uncertainty interval width: %.5f', norm(a-b));
  disp(str_n);
  disp(str1)
  disp(str2)
  disp(str3)
  disp(str4)
  disp(str5)
  disp(str6)
  disp(str7)
end
% The minimizer is boxed in the interval:
an = a;
bn = b;
% We can return (an+bn)/2 as the minimizer.
xmin = (an+bn)/2;
disp(str_n);
str = sprintf('The minimizer x* is: [%.4f; %.4f]', xmin(1,1), xmin(2,1));
disp(str);
8. Gradient Methods
8.1
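The function $f$ is a quadratic and so we can represent it in standard form as
$$f(x) = \frac{1}{2} x^\top \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} x - x^\top \begin{bmatrix} -1 \\ -1/2 \end{bmatrix} + 3 = \frac{1}{2} x^\top Q x - x^\top b + c.$$
The first iteration is
$$x^{(1)} = x^{(0)} - \alpha_0 \nabla f(x^{(0)}).$$
To find $x^{(1)}$, we need to compute $\nabla f(x^{(0)}) = g^{(0)}$. We have
$$g^{(0)} = Q x^{(0)} - b = [1, 1/2]^\top.$$
The step size, $\alpha_0$, can be computed as
$$\alpha_0 = \frac{g^{(0)\top} g^{(0)}}{g^{(0)\top} Q g^{(0)}} = \frac{5}{6}.$$
Hence,
$$x^{(1)} = -\alpha_0 g^{(0)} = -\frac{5}{6} \begin{bmatrix} 1 \\ 1/2 \end{bmatrix} = \begin{bmatrix} -5/6 \\ -5/12 \end{bmatrix}.$$
The second iteration is
$$x^{(2)} = x^{(1)} - \alpha_1 \nabla f(x^{(1)}),$$
where
$$\nabla f(x^{(1)}) = g^{(1)} = Q x^{(1)} - b = \begin{bmatrix} 1/6 \\ -1/3 \end{bmatrix},$$
and
$$\alpha_1 = \frac{g^{(1)\top} g^{(1)}}{g^{(1)\top} Q g^{(1)}} = \frac{5}{9}.$$
Hence,
$$x^{(2)} = x^{(1)} - \alpha_1 g^{(1)} = \begin{bmatrix} -5/6 \\ -5/12 \end{bmatrix} - \frac{5}{9} \begin{bmatrix} 1/6 \\ -1/3 \end{bmatrix} = \begin{bmatrix} -25/27 \\ -25/108 \end{bmatrix}.$$
The optimal solution is $x^* = [-1, -1/4]^\top$, obtained by solving the equation $Qx = b$.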
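The two steepest descent iterations can be replayed numerically. The sketch below assumes the quadratic $f(x) = \frac{1}{2}x^\top Q x - x^\top b + 3$ with $Q = \mathrm{diag}(1, 2)$, $b = [-1, -1/2]^\top$ and starting point $x^{(0)} = 0$; these signs are an assumption of this sketch, since the signs are not legible in the extracted text.

```matlab
% Two steepest descent iterations on f(x) = 0.5*x'*Q*x - x'*b + 3 (assumed data).
Q = [1 0; 0 2];
b = [-1; -1/2];
x = [0; 0];                          % starting point x0
for k = 1:2
    g = Q*x - b;                     % gradient at the current point
    alpha = (g'*g)/(g'*Q*g);         % exact line search step for a quadratic
    x = x - alpha*g;
    fprintf('alpha%d = %.4f, x%d = [%.4f; %.4f]\n', k-1, alpha, k, x(1), x(2));
end
xstar = Q\b;                         % analytic minimizer, solves Q*x = b
fprintf('x* = [%.4f; %.4f]\n', xstar(1), xstar(2));
```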
8.2
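Let $s$ be the order of convergence of $\{x^{(k)}\}$. Suppose there exists $c > 0$ such that for all $k$ sufficiently large,
$$\|x^{(k+1)} - x^*\| \ge c\|x^{(k)} - x^*\|^p.$$
Hence, for all $k$ sufficiently large,
$$\frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|^s} = \frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|^p} \cdot \frac{1}{\|x^{(k)} - x^*\|^{s-p}} \ge \frac{c}{\|x^{(k)} - x^*\|^{s-p}}.$$
Taking limits yields
$$\lim_{k\to\infty} \frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|^s} \ge \frac{c}{\lim_{k\to\infty} \|x^{(k)} - x^*\|^{s-p}}.$$
Since by definition $s$ is the order of convergence,
$$\lim_{k\to\infty} \frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|^s} < \infty.$$
Combining the above two inequalities, we get
$$\frac{c}{\lim_{k\to\infty} \|x^{(k)} - x^*\|^{s-p}} < \infty.$$
Therefore, since $\lim_{k\to\infty} \|x^{(k)} - x^*\| = 0$, we conclude that $s \le p$, i.e., the order of convergence is at most $p$.
8.3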
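We use contradiction. Suppose $x^{(k)} \to x^*$ and
$$\lim_{k\to\infty} \frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|^p} > 0$$
for some $p < 1$. We may assume that $x^{(k)} \ne x^*$ for an infinite number of $k$ (for otherwise, by convention, the ratio above is eventually 0). Fix $\varepsilon > 0$ strictly smaller than this limit. Then, there exists $K_1$ such that for all $k \ge K_1$,
$$\frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|^p} > \varepsilon.$$
Dividing both sides by $\|x^{(k)} - x^*\|^{1-p}$, we obtain
$$\frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|} > \frac{\varepsilon}{\|x^{(k)} - x^*\|^{1-p}}.$$
Because $x^{(k)} \to x^*$ and $p < 1$, we have $\|x^{(k)} - x^*\|^{1-p} \to 0$. Hence, there exists $K_2$ such that for all $k \ge K_2$, $\|x^{(k)} - x^*\|^{1-p} < \varepsilon$. Combining this inequality with the previous one yields
$$\frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|} > 1$$
for all $k \ge \max(K_1, K_2)$; i.e.,
$$\|x^{(k+1)} - x^*\| > \|x^{(k)} - x^*\|,$$
which contradicts the assumption that $x^{(k)} \to x^*$.
8.4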
a. The sequence converges to 0, because the exponent $-2^{k^2}$ grows unboundedly negative as $k \to \infty$.
b. The order of convergence of $\{x_k\}$ is $\infty$. To see this, we first write, for $p \ge 1$,
$$\frac{|x_{k+1}|}{|x_k|^p} = \frac{2^{-2^{(k+1)^2}}}{2^{-2^{k^2} p}} = 2^{-2^{k^2 + 2k + 1} + p\,2^{k^2}} = 2^{-2^{k^2}\left(2^{2k+1} - p\right)}.$$
But notice that the exponent $-2^{k^2}(2^{2k+1} - p)$ grows unboundedly negative as $k \to \infty$, regardless of the value of $p$. Therefore, for any $p$,
$$\lim_{k\to\infty} \frac{|x_{k+1}|}{|x_k|^p} = 0,$$
which means that the order of convergence is $\infty$.


8.5
a. We have
$$x_k = a x_{k-1} = a \cdot a x_{k-2} = a^2 x_{k-2} = \cdots = a^k x_0.$$
Because $0 < a < 1$, we have $a^k \to 0$, and hence $x_k \to 0$.
b. Similarly, we have
$$y_k = y_{k-1}^b = (y_{k-2}^b)^b = y_{k-2}^{b^2} = \cdots = y_0^{b^k}.$$
Because $|y_0| < 1$ and $b > 1$, we have $b^k \to \infty$ and hence $y_k \to 0$.
c. The order of convergence of $\{x_k\}$ is 1 because
$$\lim_{k\to\infty} \frac{|x_{k+1}|}{|x_k|} = \lim_{k\to\infty} a = a,$$
and $0 < a < \infty$. The order of convergence of $\{y_k\}$ is $b$ because
$$\lim_{k\to\infty} \frac{|y_{k+1}|}{|y_k|^b} = \lim_{k\to\infty} 1 = 1,$$
and $0 < 1 < \infty$.
d. Suppose $|x_k| \le c|x_0|$. Using part a, we have $a^k \le c$, which implies that $k \ge \log(c)/\log(a)$. So the smallest number of iterations $k$ such that $|x_k| \le c|x_0|$ is $\lceil \log(c)/\log(a) \rceil$ (the smallest integer not smaller than $\log(c)/\log(a)$).
e. Suppose $|y_k| \le c|y_0|$. Using part b, we have $|y_0|^{b^k} \le c|y_0|$. Taking logs (twice) and rearranging, we have
$$k \ge \frac{1}{\log(b)} \log\left(1 + \frac{\log(c)}{\log(|y_0|)}\right).$$
Denote the right-hand side by $z$. So the smallest number of iterations $k$ such that $|y_k| \le c|y_0|$ is $\lceil z \rceil$.
f. Comparing the answer in part e with that of part d, we can see that as $c \to 0$, the answer in part d grows like $\log(1/c)$, whereas the answer in part e is $O(\log\log(1/c))$. Hence, in the regime where $c$ is very small, the number of iterations in part d (linear convergence) is (at least) exponentially larger than that in part e (superlinear convergence).
8.6
We have $u_{k+1} = (1 - \rho)u_k$, and $u_k \to 0$. Therefore,
$$\lim_{k\to\infty} \frac{|u_{k+1}|}{|u_k|} = 1 - \rho > 0$$
and thus the order of convergence is 1.
47

8.7
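a. The value of $x^*$ (in terms of $a$, $b$, and $c$) that minimizes $f$ is $x^* = b/a$.
b. We have $f'(x) = ax - b$. Therefore, the recursive equation for the DDS algorithm is
$$x_{k+1} = x_k - \alpha(a x_k - b) = (1 - \alpha a)x_k + \alpha b.$$
c. Let $\bar{x} = \lim_{k\to\infty} x_k$. Taking limits of both sides of $x_{k+1} = x_k - \alpha(a x_k - b)$ (from part b), we get
$$\bar{x} = \bar{x} - \alpha(a\bar{x} - b).$$
Hence, we get $\bar{x} = b/a = x^*$.
d. To find the order of convergence, we compute
$$\frac{|x_{k+1} - b/a|}{|x_k - b/a|^p} = \frac{|(1 - \alpha a)x_k + \alpha b - b/a|}{|x_k - b/a|^p} = \frac{|(1 - \alpha a)x_k - (1 - \alpha a)b/a|}{|x_k - b/a|^p} = |1 - \alpha a|\,|x_k - b/a|^{1-p}.$$
Let $z_k = |1 - \alpha a|\,|x_k - b/a|^{1-p}$. Note that $z_k$ converges to a finite nonzero number if and only if $p = 1$ (if $p < 1$, then $z_k \to 0$, and if $p > 1$, then $z_k \to \infty$). Therefore, the order of convergence of $\{x_k\}$ is 1.
e. Let $y_k = |x_k - b/a|$. From part d, after some manipulation we obtain
$$y_{k+1} = |1 - \alpha a|\,y_k = |1 - \alpha a|^{k+1} y_0.$$
The sequence $\{x_k\}$ converges (to $b/a$) if and only if $y_k \to 0$. This holds if and only if $|1 - \alpha a| < 1$, which is equivalent to $0 < \alpha < 2/a$.
8.8
We rewrite $f$ as $f(x) = \frac{1}{2} x^\top Q x - b^\top x$, where
$$Q = \begin{bmatrix} 6 & 4 \\ 4 & 6 \end{bmatrix}.$$
The characteristic polynomial of $Q$ is $\lambda^2 - 12\lambda + 20$. Hence, the eigenvalues of $Q$ are 2 and 10. Therefore, the largest range of values of $\alpha$ for which the algorithm is globally convergent is $0 < \alpha < 2/10$.
8.9
a. We can write $h(x) = Qx - b$, where $b = [4, 1]^\top$ and
$$Q = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}$$
is positive definite. Hence, the solution is
$$Q^{-1}b = \frac{1}{5} \begin{bmatrix} 3 & -2 \\ -2 & 3 \end{bmatrix} \begin{bmatrix} 4 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \end{bmatrix}.$$
b. By part a, the algorithm is a fixed-step-size gradient algorithm for a problem with gradient $h$. The eigenvalues of $Q$ are 1 and 5. Hence, the largest range of values of $\alpha$ such that the algorithm is globally convergent to the solution is $0 < \alpha < 2/5$.
c. The eigenvectors of $Q$ corresponding to eigenvalue 5 have the form $c[1, 1]^\top$, where $c \in \mathbb{R}$. Hence, to violate the descent property, we pick
$$x^{(0)} = Q^{-1}b + c \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 0 \end{bmatrix},$$
where we choose $c = 1$ so that $x^{(0)}$ has the specified form.
8.10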
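a. We have
$$f(x) = \frac{1}{2} x^\top \begin{bmatrix} 3 & 1+a \\ 1+a & 3 \end{bmatrix} x - x^\top \begin{bmatrix} 1 \\ 1 \end{bmatrix} + b.$$
b. The unique global minimizer exists if and only if the Hessian is positive definite, which holds if and only if $(1 + a)^2 < 9$ (by Sylvester's criterion). Hence, the largest set of values of $a$ and $b$ such that the global minimizer of $f$ exists is given by $-4 < a < 2$ and $b \in \mathbb{R}$ (unrestricted).
The minimizer is given by
$$x^* = \frac{1}{9 - (1+a)^2} \begin{bmatrix} 3 & -(1+a) \\ -(1+a) & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \frac{3 - (1+a)}{9 - (1+a)^2} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \frac{1}{4 + a} \begin{bmatrix} 1 \\ 1 \end{bmatrix}.$$
c. The algorithm is a gradient algorithm with fixed step size 2/5. The eigenvalues of the Hessian are (after some calculations) $4 + a$ and $2 - a$. For global convergence, we need $2/5 < 2/\lambda_{\max}$, or $\lambda_{\max} < 5$, where $\lambda_{\max} = \max(4 + a, 2 - a)$. From this we deduce that $-3 < a < 1$. Hence, the largest set of values of $a$ and $b$ such that the algorithm is globally convergent is given by $-3 < a < 1$ and $b \in \mathbb{R}$ (unrestricted).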
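The eigenvalue claim and the convergence range in part c can be spot-checked numerically. The following MATLAB sketch uses one hypothetical value of $a$ inside $(-3, 1)$ and an arbitrary starting point; it is meant as an illustration of the statement, not as a proof.

```matlab
% Fixed step size gradient algorithm (step 2/5) for the quadratic in part a.
a = 0.5;                               % hypothetical value with -3 < a < 1
Q = [3 1+a; 1+a 3];                    % Hessian of f
lam = sort(eig(Q));                    % should equal sort([4+a; 2-a])
alpha = 2/5;
x = [10; -7];                          % arbitrary starting point
for k = 1:200
    x = x - alpha*(Q*x - [1; 1]);      % gradient of f is Q*x - [1; 1]
end
xstar = [1; 1]/(4 + a);                % minimizer from part b
fprintf('eigenvalues: %.4f %.4f\n', lam(1), lam(2));
fprintf('distance to minimizer after 200 steps: %.2e\n', norm(x - xstar));
```

Choosing `a` outside $(-3, 1)$, for example `a = 1.5`, makes $|1 - \alpha\lambda_{\max}| > 1$ and the iterates grow instead.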
8.11
a. We have
$$f(x_{k+1}) = \frac{1}{2}(x_{k+1} - c)^2 = \frac{1}{2}\left(x_k - \alpha_k(x_k - c) - c\right)^2 = (1 - \alpha_k)^2\,\frac{1}{2}(x_k - c)^2 = (1 - \alpha_k)^2 f(x_k).$$
b. We have $x_k \to c$ if and only if $f(x_k) \to 0$. Hence, the algorithm is globally convergent if and only if $f(x_k) \to 0$ for any $x_0$. From part a, we deduce that $f(x_k) \to 0$ for any $x_0$ if and only if
$$\prod_{k=0}^{\infty} (1 - \alpha_k)^2 = 0.$$
Because $0 < \alpha_k < 1$, this condition is equivalent to $\prod_{k=0}^{\infty}(1 - \alpha_k) = 0$, which holds if and only if
$$\sum_{k=0}^{\infty} \alpha_k = \infty.$$
8.12
The only local minimizer of $f$ is $x^* = 1/\sqrt{3}$. Indeed, we have $f'(x^*) = 0$ and $f''(x^*) = 2\sqrt{3}$. To find the largest range of values of $\alpha$ such that the algorithm is locally convergent, we use a linearization argument: The algorithm is locally convergent if and only if the linearized algorithm $x_{k+1} = x_k - \alpha f''(x^*)(x_k - x^*)$ is globally convergent. But the linearized algorithm is just a fixed step size algorithm applied to a quadratic with second derivative $f''(x^*)$. Therefore, the largest range of values of $\alpha$ such that the algorithm is locally convergent is $0 < \alpha < 2/f''(x^*) = 1/\sqrt{3}$.
8.13
We use the formula from Lemma 8.1:
$$f(x_{k+1}) = (1 - \gamma_k) f(x_k)$$
(we have $V = f$ in this case). Using the expression for $\gamma_k$, we get, assuming $x_k \ne 1$,
$$\gamma_k = 4\,(2^{-k})(1 - 2^{-k}).$$
Hence, $\gamma_k > 0$ for $k > 0$, which means that $f(x_{k+1}) < f(x_k)$ if $x_k \ne 1$ for $k > 0$. This implies that the algorithm has the descent property (for $k > 0$).
We also note that
$$\sum_{k=0}^{\infty} \gamma_k = 4\left(\sum_{k=0}^{\infty} 2^{-k} - \sum_{k=0}^{\infty} 4^{-k}\right) = 4\left(2 - \frac{4}{3}\right) < \infty.$$
Since $\gamma_k \ge 0$ for all $k \ge 0$, we can apply the theorem given in class to deduce that the algorithm is not globally convergent.
8.14
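We have
$$x_{k+1} - x^* = x_k - x^* - \frac{f'(x_k)}{f''(x^*)}.$$
By Taylor's Theorem,
$$f'(x_k) = f'(x^*) + f''(x^*)(x_k - x^*) + O(|x_k - x^*|^2).$$
Since $f'(x^*) = 0$ by the FONC, we get
$$x_k - x^* = \frac{f'(x_k)}{f''(x^*)} + O(|x_k - x^*|^2).$$
Combining the above with the first equation, we get
$$x_{k+1} - x^* = O(|x_k - x^*|^2),$$
which implies that the order of convergence is at least 2.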
8.15
a. The objective function is a quadratic that can be written as
$$f(x) = (ax - b)^\top(ax - b) = \|a\|^2 x^2 - 2a^\top b\,x + \|b\|^2.$$
Hence, the minimizer is $x^* = a^\top b/\|a\|^2$.
b. Note that $f''(x) = 2\|a\|^2$. Thus, by the result for fixed step size gradient algorithms, the required largest range for $\alpha$ is $(0, 1/\|a\|^2)$.
8.16
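a. We have
$$f(x) = \|Ax - b\|^2 = (Ax - b)^\top(Ax - b) = (x^\top A^\top - b^\top)(Ax - b) = x^\top(A^\top A)x - 2(A^\top b)^\top x + b^\top b,$$
which is a quadratic function. The gradient is given by $\nabla f(x) = 2(A^\top A)x - 2(A^\top b)$ and the Hessian is given by $F(x) = 2(A^\top A)$.
b. The fixed step size gradient algorithm for solving the above optimization problem is given by
$$x^{(k+1)} = x^{(k)} - \alpha\left(2(A^\top A)x^{(k)} - 2A^\top b\right) = x^{(k)} - 2\alpha A^\top(Ax^{(k)} - b).$$
c. The largest range of values for $\alpha$ such that the algorithm in part b converges to the solution of the problem is given by
$$0 < \alpha < \frac{2}{\lambda_{\max}(2A^\top A)} = \frac{1}{4}.$$
8.17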
a. We use contraposition. Suppose an eigenvalue of $A$ is negative: $Av = \lambda v$, where $\lambda < 0$ and $v$ is a corresponding eigenvector. Choose $x^{(0)} = v + x^*$. Then,
$$x^{(1)} = v + x^* - \alpha(Av + Ax^* - b) = v + x^* - \alpha\lambda v,$$
and hence
$$x^{(1)} - x^* = (1 - \alpha\lambda)(x^{(0)} - x^*).$$
Since $1 - \alpha\lambda > 1$, we conclude that the algorithm is not globally monotone.
b. Note that the algorithm is identical to a fixed step size gradient algorithm applied to a quadratic with Hessian $A$. The eigenvalues of $A$ are 1 and 5. Therefore, the largest range of values of $\alpha$ for which the algorithm is globally convergent is $0 < \alpha < 2/5$.
8.18
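The steepest descent algorithm applied to the quadratic function $f$ has the form
$$x^{(k+1)} = x^{(k)} - \alpha_k g^{(k)} = x^{(k)} - \frac{g^{(k)\top} g^{(k)}}{g^{(k)\top} Q g^{(k)}}\,g^{(k)}.$$
$\Rightarrow$: If $x^{(1)} = Q^{-1}b$, then
$$Q^{-1}b = x^{(0)} - \alpha_0 g^{(0)}.$$
Rearranging the above yields
$$Qx^{(0)} - b = \alpha_0 Q g^{(0)}.$$
Since $g^{(0)} = Qx^{(0)} - b \ne 0$, we have
$$Q g^{(0)} = \frac{1}{\alpha_0}\,g^{(0)},$$
which means that $g^{(0)}$ is an eigenvector of $Q$ (with corresponding eigenvalue $1/\alpha_0$).
$\Leftarrow$: By assumption, $Qg^{(0)} = \lambda g^{(0)}$, where $\lambda \in \mathbb{R}$. We want to show that $Qx^{(1)} = b$. We have
$$Qx^{(1)} = Qx^{(0)} - \frac{g^{(0)\top} g^{(0)}}{g^{(0)\top} Q g^{(0)}}\,Qg^{(0)} = Qx^{(0)} - \frac{1}{\lambda}\,\frac{g^{(0)\top} g^{(0)}}{g^{(0)\top} g^{(0)}}\,\lambda g^{(0)} = Qx^{(0)} - g^{(0)} = b.$$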
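The "if" direction is easy to observe numerically: when the initial gradient is an eigenvector of $Q$, a single steepest descent step lands on the minimizer. The sketch below uses a hypothetical positive definite $Q$ and $b$ chosen only for illustration.

```matlab
% One steepest descent step when g0 is an eigenvector of Q (hypothetical data).
Q = [3 2; 2 3];   b = [4; 1];
xstar = Q\b;                          % minimizer of 0.5*x'*Q*x - b'*x
v  = [1; 1];                          % eigenvector of Q with eigenvalue 5
x0 = xstar + 0.7*v;                   % then g0 = Q*x0 - b = 0.7*5*v
g0 = Q*x0 - b;
alpha0 = (g0'*g0)/(g0'*Q*g0);         % equals 1/5, the reciprocal eigenvalue
x1 = x0 - alpha0*g0;
fprintf('alpha0 = %.4f, ||x1 - x*|| = %.2e\n', alpha0, norm(x1 - xstar));
```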
8.19
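a. Possible. Pick $f$ such that $\lambda_{\max} \ge 2\lambda_{\min}$ and $x^{(0)}$ such that $g^{(0)}$ is an eigenvector of $Q$ with eigenvalue $\lambda_{\min}$. Then,
$$\alpha_0 = \frac{g^{(0)\top} g^{(0)}}{g^{(0)\top} Q g^{(0)}} = \frac{1}{\lambda_{\min}} \ge \frac{2}{\lambda_{\max}}.$$
b. Not possible. Indeed, using Rayleigh's inequality,
$$\alpha_0 = \frac{g^{(0)\top} g^{(0)}}{g^{(0)\top} Q g^{(0)}} \le \frac{g^{(0)\top} g^{(0)}}{\lambda_{\min}\,g^{(0)\top} g^{(0)}} = \frac{1}{\lambda_{\min}}.$$
8.20
a. We rewrite $f$ as
$$f(x) = \frac{1}{2} x^\top Q x - b^\top x - 22,$$
where
$$Q = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}, \qquad b = \begin{bmatrix} -3 \\ 1 \end{bmatrix}.$$
The eigenvalues of $Q$ are 1 and 5. Therefore, the range of values of the step size for which the algorithm converges to the minimizer is $0 < \alpha < 2/5$.
b. An eigenvector of $Q$ corresponding to the eigenvalue 5 is $v = [1, 1]^\top/5$. We have $x^* = Q^{-1}b = [-11, 9]^\top/5$. Hence, an initial condition that results in the algorithm diverging is
$$x^{(0)} = x^* + v = \begin{bmatrix} -2 \\ 2 \end{bmatrix}.$$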
8.21
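In both cases, we compute the Hessian $Q$ of $f$, and find its largest eigenvalue $\lambda_{\max}$. Then the range we seek is $0 < \alpha < 2/\lambda_{\max}$.
a. In this case,
$$Q = \begin{bmatrix} 6 & 4 \\ 4 & 6 \end{bmatrix},$$
with eigenvalues 2 and 10. Hence, the answer is $0 < \alpha < 1/5$.
b. In this case, again we have
$$Q = \begin{bmatrix} 6 & 4 \\ 4 & 6 \end{bmatrix},$$
with eigenvalues 2 and 10. Hence, the answer is $0 < \alpha < 1/5$.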
8.22
For the given algorithm we have
k 2 gkgk2
gkQgkgkQ1gk If 0 2, then 2 0, and by Lemma 8.2,
k 2minQ0 max Q
which implies that P1k0 k 1. Hence, by Theorem 8.1, xk ! x for any x0. If 0 or 2, then 2 0, and by Lemma 8.2,
k 2maxQ0. min Q
By Lemma 8.1, Vxk Vx0. Hence, if x0 6 x, then Vxk does not converge to 0, and consequently xk does not converge to x.
8.23
By Lemma 8.1, $V(x^{(k+1)}) = (1 - \gamma_k)V(x^{(k)})$ for all $k$. Note that the algorithm has a descent property if and only if $V(x^{(k+1)}) < V(x^{(k)})$ whenever $g^{(k)} \ne 0$. Clearly, whenever $g^{(k)} \ne 0$, $V(x^{(k+1)}) < V(x^{(k)})$ if and only if $1 - \gamma_k < 1$. The desired result follows immediately.
8.24
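We have
$$x^{(k+1)} = x^{(k)} + \alpha_k d^{(k)},$$
and hence
$$\langle x^{(k+1)} - x^{(k)}, \nabla f(x^{(k+1)}) \rangle = \alpha_k \langle d^{(k)}, \nabla f(x^{(k+1)}) \rangle.$$
Now, let $\phi_k(\alpha) = f(x^{(k)} + \alpha d^{(k)})$. Since $\alpha_k$ minimizes $\phi_k$, then by the FONC, $\phi_k'(\alpha_k) = 0$. By the chain rule, $\phi_k'(\alpha) = d^{(k)\top} \nabla f(x^{(k)} + \alpha d^{(k)})$. Hence,
$$0 = \phi_k'(\alpha_k) = d^{(k)\top} \nabla f(x^{(k)} + \alpha_k d^{(k)}) = \langle d^{(k)}, \nabla f(x^{(k+1)}) \rangle,$$
and so
$$\langle x^{(k+1)} - x^{(k)}, \nabla f(x^{(k+1)}) \rangle = 0.$$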
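The orthogonality relation can be observed on a simple quadratic, for which the exact line search step has a closed form. The sketch below is illustrative only: the matrix `Q`, the starting point and the search directions are hypothetical choices.

```matlab
% Exact line search on f(x) = 0.5*x'*Q*x: the step is orthogonal to the new gradient.
Q = [2 1; 1 2];                           % hypothetical positive definite Hessian
x = [0.8; -0.25];
ip = zeros(1, 3);                         % inner products <x_{k+1}-x_k, grad f(x_{k+1})>
for k = 1:3
    d = [cos(k); sin(k)];                 % arbitrary search direction
    alpha = -(d'*(Q*x))/(d'*Q*d);         % minimizer of phi(alpha) = f(x + alpha*d)
    xnext = x + alpha*d;
    ip(k) = (xnext - x)'*(Q*xnext);       % gradient of f at xnext is Q*xnext
    x = xnext;
end
disp(ip);
```

The displayed inner products are zero up to rounding error, regardless of the directions used.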
A simple MATLAB routine for implementing the steepest descent method is as follows.
function [x,N]=steep_desc(grad,xnew,options);
% STEEP_DESC('grad',x0);
% STEEP_DESC('grad',x0,OPTIONS);
%
% x = STEEP_DESC('grad',x0);
% x = STEEP_DESC('grad',x0,OPTIONS);
%
% [x,N] = STEEP_DESC('grad',x0);
% [x,N] = STEEP_DESC('grad',x0,OPTIONS);
%
%The first variant finds the minimizer of a function whose gradient
%is described in grad (usually an M-file: grad.m), using a gradient
%descent algorithm with initial point x0. The line search used is the
%secant method.
%The second variant allows a vector of optional parameters to be
%defined. OPTIONS(1) controls how much display output is given; set
%to 1 for a tabular display of results, (default is no display: 0).
%OPTIONS(2) is a measure of the precision required for the final point.
%OPTIONS(3) is a measure of the precision required of the gradient.
%OPTIONS(14) is the maximum number of iterations.
%For more information type HELP FOPTIONS.
%
%NOTE: the option handling and the loop header are missing from this
%copy of the listing; the lines down to "for k" are an assumed
%reconstruction based on the help text and on the variables used below.
if nargin < 3
  options = [];
end
options(end+1:14) = 0; %pad so that OPTIONS(14) exists
print = options(1);
epsilon_x = options(2);
epsilon_g = options(3);
max_iter = options(14);
if max_iter == 0
  max_iter = 1000*length(xnew); %assumed default
end
for k = 1:max_iter,
  xcurr=xnew;
  g_curr=feval(grad,xcurr);
  if norm(g_curr) <= epsilon_g
    disp('Terminating: Norm of gradient less than');
    disp(epsilon_g);
    k=k-1;
    break;
  end %if
  alpha=linesearch_secant(grad,xcurr,-g_curr);
  xnew = xcurr-alpha*g_curr;
  if print,
    disp('Iteration number k =')
    disp(k); %print iteration index k
    disp('alpha =');
    disp(alpha); %print alpha
    disp('Gradient = ');
    disp(g_curr'); %print gradient
    disp('New point =');
    disp(xnew'); %print new point
  end %if
  if norm(xnew-xcurr) <= epsilon_x*norm(xcurr)
    disp('Terminating: Norm of difference between iterates less than');
    disp(epsilon_x);
    break;
  end %if
  if k == max_iter
    disp('Terminating with maximum number of iterations');
  end %if
end %for
if nargout >= 1
  x=xnew;
  if nargout == 2
    N=k;
  end
else
  disp('Final point =');
  disp(xnew');
  disp('Number of iterations =');
  disp(k);
end %if
To apply the above MATLAB routine to the function in Example 8.1, we need the following M-file to specify the gradient.
function y=g(x)
y=[4*(x(1)-4).^3; 2*(x(2)-3); 16*(x(3)+5).^3];
We applied the algorithm as follows:
options(2) = 10^(-6);
options(3) = 10^(-6);
steepdesc('g',[-4;5;1],options)
Terminating: Norm of gradient less than
1.0000e-06
Final point =
4.0022e+00 3.0000e+00 -4.9962e+00
Number of iterations =
25
As we can see above, we obtained the final point $[4.002, 3.000, -4.996]^\top$ after 25 iterations. The value of the objective function at the final point is $7.2 \times 10^{-10}$.
8.26
The algorithm terminated after 9127 iterations. The final point was $[0.99992, 0.99983]^\top$.
9. Newton's Method
9.1
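a. We have $f'(x) = 4(x - x_0)^3$ and $f''(x) = 12(x - x_0)^2$. Hence, Newton's method is represented as
$$x^{(k+1)} = x^{(k)} - \frac{x^{(k)} - x_0}{3},$$
which upon rewriting becomes
$$x^{(k+1)} - x_0 = \frac{2}{3}\left(x^{(k)} - x_0\right).$$
b. From part a, $y^{(k)} = |x^{(k)} - x_0| = (2/3)|x^{(k-1)} - x_0| = (2/3)y^{(k-1)}$.
c. From part b, we see that $y^{(k)} = (2/3)^k y^{(0)}$ and therefore $y^{(k)} \to 0$. Hence $x^{(k)} \to x_0$ for any $x^{(0)}$.
d. From part b, we have
$$\lim_{k\to\infty} \frac{|x^{(k+1)} - x_0|}{|x^{(k)} - x_0|} = \lim_{k\to\infty} \frac{2}{3} = \frac{2}{3} > 0$$
and hence the order of convergence is 1.
e. The theorem assumes that $f''(x^*) \ne 0$. However, in this problem, $x^* = x_0$, and $f''(x^*) = 0$.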
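The linear rate in Exercise 9.1 is easy to observe numerically. The following is a minimal sketch, assuming the objective $f(x) = (x - c)^4$ with the arbitrary illustrative values $c = 1$ and starting point 4; the function name `newton_quartic` is hypothetical.

```python
def newton_quartic(c, x_init, steps):
    """Newton's method for f(x) = (x - c)**4; returns the list of iterates."""
    xs = [x_init]
    for _ in range(steps):
        x = xs[-1]
        first = 4.0 * (x - c) ** 3    # f'(x)
        second = 12.0 * (x - c) ** 2  # f''(x)
        if second == 0.0:             # already at the minimizer
            break
        xs.append(x - first / second)
    return xs

iterates = newton_quartic(c=1.0, x_init=4.0, steps=10)
errors = [abs(x - 1.0) for x in iterates]
# Ratio of successive errors; each one should be close to 2/3.
ratios = [errors[i + 1] / errors[i] for i in range(len(errors) - 1)]
print(ratios)
```

Every ratio stays at about 2/3, which is the behavior of a method with order of convergence 1 rather than the quadratic rate Newton's method attains when $f''(x^*) \ne 0$.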
9.2
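a. We have
$$x^{(k+1)} - x^* = x^{(k)} - x^* - \alpha_k f'(x^{(k)}).$$
By Taylor's theorem applied to $f'$,
$$f'(x^{(k)}) = f'(x^*) + f''(x^*)(x^{(k)} - x^*) + o(|x^{(k)} - x^*|).$$
Since $f'(x^*) = 0$ by the FONC, we get
$$x^{(k)} - x^* - \alpha_k f'(x^{(k)}) = (1 - \alpha_k f''(x^*))(x^{(k)} - x^*) + \alpha_k\, o(|x^{(k)} - x^*|) = o(|x^{(k)} - x^*|) + \alpha_k\, o(|x^{(k)} - x^*|) = (1 + \alpha_k)\, o(|x^{(k)} - x^*|).$$
Because $\{\alpha_k\}$ converges, it is bounded, and so $(1 + \alpha_k)\, o(|x^{(k)} - x^*|) = o(|x^{(k)} - x^*|)$. Combining the above with the first equation, we get
$$x^{(k+1)} - x^* = o(|x^{(k)} - x^*|),$$
which implies that the order of convergence is superlinear.
b. In the secant algorithm, if $x^{(k)} \to x^*$, then $(f'(x^{(k)}) - f'(x^{(k-1)}))/(x^{(k)} - x^{(k-1)}) \to f''(x^*)$. Since the secant algorithm has the form $x^{(k+1)} = x^{(k)} - \alpha_k f'(x^{(k)})$ with $\alpha_k = (x^{(k)} - x^{(k-1)})/(f'(x^{(k)}) - f'(x^{(k-1)}))$, we deduce that $\alpha_k \to 1/f''(x^*)$. Hence, if we apply the secant algorithm to a function $f \in C^2$, and it converges to a local minimizer $x^*$ such that $f''(x^*) \ne 0$, then the order of convergence is superlinear.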
9.3
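a. We compute $f'(x) = (4/3)x^{1/3}$ and $f''(x) = (4/9)x^{-2/3}$. Therefore Newton's algorithm for this problem takes the form
$$x^{(k+1)} = x^{(k)} - \frac{(4/3)(x^{(k)})^{1/3}}{(4/9)(x^{(k)})^{-2/3}} = -2x^{(k)}.$$
b. From part a, we have $x^{(k)} = (-2)^k x^{(0)}$. Therefore, as long as $x^{(0)} \ne 0$, the sequence $\{x^{(k)}\}$ does not converge to 0.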
9.4
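a. Clearly $f(x) \ge 0$ for all $x$. We have
$$f(x) = 0 \iff x_2 - x_1^2 = 0 \text{ and } 1 - x_1 = 0 \iff x = [1, 1]^\top.$$
Hence, $f(x) > f([1,1]^\top)$ for all $x \ne [1,1]^\top$, and therefore $[1,1]^\top$ is the unique global minimizer.
b. We compute
$$\nabla f(x) = \begin{bmatrix} 400x_1^3 - 400x_1x_2 + 2x_1 - 2 \\ 200(x_2 - x_1^2) \end{bmatrix}, \qquad F(x) = \begin{bmatrix} 1200x_1^2 - 400x_2 + 2 & -400x_1 \\ -400x_1 & 200 \end{bmatrix}.$$
To apply Newton's method we use the inverse of the Hessian, which is
$$F(x)^{-1} = \frac{1}{80000(x_1^2 - x_2) + 400}\begin{bmatrix} 200 & 400x_1 \\ 400x_1 & 1200x_1^2 - 400x_2 + 2 \end{bmatrix}.$$
Applying two iterations of Newton's method, we have $x^{(1)} = [1, 0]^\top$, $x^{(2)} = [1, 1]^\top$. Therefore, in this particular case, the method converges in two steps! We emphasize, however, that this fortuitous situation is by no means typical, and is highly dependent on the initial condition.
c. Applying the gradient algorithm $x^{(k+1)} = x^{(k)} - \alpha_k \nabla f(x^{(k)})$ with a fixed step size of $\alpha_k = 0.05$, we obtain $x^{(1)} = [0.1, 0]^\top$, $x^{(2)} = [0.17, 0.1]^\top$.
9.5
If $x^{(0)} = x^*$, we are done. So, assume $x^{(0)} \ne x^*$. Since the standard Newton's method reaches the point $x^*$ in one step, we have
$$f(x^*) = f(x^{(0)} - Q^{-1}g^{(0)}) = \min_x f(x) \le f(x^{(0)} - \alpha Q^{-1}g^{(0)})$$
for any $\alpha \ge 0$. Hence,
$$\alpha_0 = \arg\min_{\alpha \ge 0} f(x^{(0)} - \alpha Q^{-1}g^{(0)}) = 1.$$
Hence, in this case, the modified Newton's algorithm is equivalent to the standard Newton's algorithm, and thus $x^{(1)} = x^*$.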
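The iterates reported in Exercise 9.4 can be replayed with a few lines of code. This is a sketch under two assumptions: the objective is Rosenbrock's function $f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2$, and both methods start from $[0, 0]^\top$. The helper names are hypothetical.

```python
def grad(x1, x2):
    """Gradient of f(x) = 100*(x2 - x1**2)**2 + (1 - x1)**2."""
    return (400.0 * x1**3 - 400.0 * x1 * x2 + 2.0 * x1 - 2.0,
            200.0 * (x2 - x1**2))

def hessian(x1, x2):
    """Hessian entries (f11, f12, f22); the matrix is symmetric."""
    return (1200.0 * x1**2 - 400.0 * x2 + 2.0, -400.0 * x1, 200.0)

def newton_step(x1, x2):
    g1, g2 = grad(x1, x2)
    f11, f12, f22 = hessian(x1, x2)
    det = f11 * f22 - f12 * f12
    # d = F(x)^{-1} g via the explicit 2x2 inverse, then move to x - d.
    d1 = (f22 * g1 - f12 * g2) / det
    d2 = (-f12 * g1 + f11 * g2) / det
    return x1 - d1, x2 - d2

def gradient_step(x1, x2, alpha):
    g1, g2 = grad(x1, x2)
    return x1 - alpha * g1, x2 - alpha * g2

newton_1 = newton_step(0.0, 0.0)
newton_2 = newton_step(*newton_1)
grad_1 = gradient_step(0.0, 0.0, 0.05)
grad_2 = gradient_step(*grad_1, 0.05)
print(newton_1, newton_2, grad_1, grad_2)
```

Newton's method lands on the minimizer after the second step, while two fixed-step gradient iterations are still far from it.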
10. Conjugate Direction Methods

10.1
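We proceed by induction to show that for $k = 0, \ldots, n-1$, the set $\{d^{(0)}, \ldots, d^{(k)}\}$ is $Q$-conjugate. We assume that $d^{(i)} \ne 0$, $i = 1, \ldots, k$, so that $d^{(i)\top} Q d^{(i)} \ne 0$ and the algorithm is well defined.
For $k = 0$, the statement trivially holds. So, assume that the statement is true for $k < n-1$, i.e., $\{d^{(0)}, \ldots, d^{(k)}\}$ is $Q$-conjugate. We now show that $\{d^{(0)}, \ldots, d^{(k+1)}\}$ is $Q$-conjugate. For this, we need only to show that for each $j = 0, \ldots, k$, we have $d^{(k+1)\top} Q d^{(j)} = 0$. To this end,
$$d^{(k+1)\top} Q d^{(j)} = \left(p^{(k+1)} - \sum_{i=0}^{k} \frac{p^{(k+1)\top} Q d^{(i)}}{d^{(i)\top} Q d^{(i)}}\, d^{(i)}\right)^{\!\top} Q d^{(j)} = p^{(k+1)\top} Q d^{(j)} - \sum_{i=0}^{k} \frac{p^{(k+1)\top} Q d^{(i)}}{d^{(i)\top} Q d^{(i)}}\, d^{(i)\top} Q d^{(j)}.$$
By the induction hypothesis, $d^{(i)\top} Q d^{(j)} = 0$ for $i \ne j$. Therefore
$$d^{(k+1)\top} Q d^{(j)} = p^{(k+1)\top} Q d^{(j)} - \frac{p^{(k+1)\top} Q d^{(j)}}{d^{(j)\top} Q d^{(j)}}\, d^{(j)\top} Q d^{(j)} = 0.$$
In the above, we have assumed that the vectors $d^{(k)}$ are nonzero so that $d^{(k)\top} Q d^{(k)} \ne 0$ and the algorithm is well defined. To prove that this assumption holds, we use induction to show that $d^{(k)}$ is a (nonzero) linear combination of $p^{(0)}, \ldots, p^{(k)}$ (which immediately implies that $d^{(k)}$ is nonzero because of the linear independence of $p^{(0)}, \ldots, p^{(k)}$).
For $k = 0$, we have $d^{(0)} = p^{(0)}$ by definition. Assume that the result holds for all indices up to $k < n-1$; i.e., $d^{(i)} = \sum_{j=0}^{i} \alpha_j^{(i)} p^{(j)}$ for $i \le k$, where the coefficients $\alpha_j^{(i)}$ are not all zero. Consider $d^{(k+1)}$:
$$d^{(k+1)} = p^{(k+1)} - \sum_{i=0}^{k} \beta_i d^{(i)} = p^{(k+1)} - \sum_{i=0}^{k} \beta_i \sum_{j=0}^{i} \alpha_j^{(i)} p^{(j)} = p^{(k+1)} - \sum_{j=0}^{k} \left(\sum_{i=j}^{k} \beta_i \alpha_j^{(i)}\right) p^{(j)},$$
where $\beta_i = p^{(k+1)\top} Q d^{(i)} / (d^{(i)\top} Q d^{(i)})$. So, clearly $d^{(k+1)}$ is a nonzero linear combination of $p^{(0)}, \ldots, p^{(k+1)}$.
10.2
Let $k \in \{0, \ldots, n-1\}$ and $\phi_k(\alpha) = f(x^{(k)} + \alpha d^{(k)})$. By the chain rule, we have
$$\phi_k'(\alpha_k) = \nabla f(x^{(k)} + \alpha_k d^{(k)})^\top d^{(k)} = g^{(k+1)\top} d^{(k)}.$$
Since $g^{(k+1)\top} d^{(k)} = 0$, we have $\phi_k'(\alpha_k) = 0$. Note that
$$\phi_k(\alpha) = \frac{1}{2}\left(d^{(k)\top} Q d^{(k)}\right)\alpha^2 + \left(g^{(k)\top} d^{(k)}\right)\alpha + \text{constant}.$$
As $\phi_k$ is a quadratic function of $\alpha$ (with positive coefficient in the quadratic term), we conclude that $\alpha_k = \arg\min_\alpha f(x^{(k)} + \alpha d^{(k)})$.
Note that since $g^{(k)\top} d^{(k)} \ne 0$ is the coefficient of the linear term in $\phi_k$, we have $\alpha_k \ne 0$. For $i \in \{0, \ldots, k-1\}$, we have
$$d^{(k)\top} Q d^{(i)} = \frac{1}{\alpha_k}(x^{(k+1)} - x^{(k)})^\top Q d^{(i)} = \frac{1}{\alpha_k}(g^{(k+1)} - g^{(k)})^\top d^{(i)} = \frac{1}{\alpha_k}\left(g^{(k+1)\top} d^{(i)} - g^{(k)\top} d^{(i)}\right) = 0$$
by assumption, which completes the proof.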
10.3
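From the conjugate gradient algorithm we have
$$d^{(k)} = -g^{(k)} + \frac{g^{(k)\top} Q d^{(k-1)}}{d^{(k-1)\top} Q d^{(k-1)}}\, d^{(k-1)}.$$
Premultiplying the above by $d^{(k)\top} Q$ and using the fact that $d^{(k)}$ and $d^{(k-1)}$ are $Q$-conjugate, yields
$$d^{(k)\top} Q d^{(k)} = -d^{(k)\top} Q g^{(k)} + \frac{g^{(k)\top} Q d^{(k-1)}}{d^{(k-1)\top} Q d^{(k-1)}}\, d^{(k)\top} Q d^{(k-1)} = -d^{(k)\top} Q g^{(k)}.$$
10.4
a. Since $Q$ is symmetric, then there exists a set of vectors $\{d^{(1)}, \ldots, d^{(n)}\}$ such that $Q d^{(i)} = \lambda_i d^{(i)}$, $i = 1, \ldots, n$, and $d^{(i)\top} d^{(j)} = 0$, $j \ne i$, where the $\lambda_i$ are (real) eigenvalues of $Q$. Therefore, if $i \ne j$, we have $d^{(i)\top} Q d^{(j)} = d^{(i)\top}(\lambda_j d^{(j)}) = \lambda_j d^{(i)\top} d^{(j)} = 0$. Hence the set $\{d^{(1)}, \ldots, d^{(n)}\}$ is $Q$-conjugate.
b. Define $\lambda_i = (d^{(i)\top} Q d^{(i)})/(d^{(i)\top} d^{(i)})$. Let
$$D = \begin{bmatrix} d^{(1)\top} \\ \vdots \\ d^{(n)\top} \end{bmatrix}.$$
Since $Q$ is positive definite and the set $\{d^{(1)}, \ldots, d^{(n)}\}$ is $Q$-conjugate, then by Lemma 10.1, the set is also linearly independent. Hence, $D$ is nonsingular. By $Q$-conjugacy, we have that for all $i \ne j$, $d^{(i)\top} Q d^{(j)} = 0$. By assumption, we have $d^{(i)\top}(\lambda_j d^{(j)}) = \lambda_j d^{(i)\top} d^{(j)} = 0$. Hence, $d^{(i)\top} Q d^{(j)} = \lambda_j d^{(i)\top} d^{(j)}$. Moreover, for each $i = 1, \ldots, n$, we have $d^{(i)\top} Q d^{(i)} = d^{(i)\top}(\lambda_i d^{(i)}) = \lambda_i d^{(i)\top} d^{(i)}$. We can write the above conditions in matrix form:
$$D Q d^{(i)} = D(\lambda_i d^{(i)}).$$
Since $D$ is nonsingular, then we have
$$Q d^{(i)} = \lambda_i d^{(i)},$$
which completes the proof.
10.5
We have
$$d^{(k)\top} Q d^{(k+1)} = \gamma_k d^{(k)\top} Q g^{(k+1)} + d^{(k)\top} Q d^{(k)}.$$
Hence, in order to have $d^{(k)\top} Q d^{(k+1)} = 0$, we need
$$\gamma_k = -\frac{d^{(k)\top} Q d^{(k)}}{d^{(k)\top} Q g^{(k+1)}}.$$
10.6
a. We use induction. For $k = 0$, we have
$$d^{(0)} = a_0 g^{(0)} = -a_0 b \in V_1.$$
Moreover, $x^{(0)} = 0 \in V_0$. Hence, the proposition is true at $k = 0$. Assume it is true at $k$. To show that it is also true at $k+1$, note first that
$$x^{(k+1)} = x^{(k)} + \alpha_k d^{(k)}.$$
Because $x^{(k)} \in V_k \subset V_{k+1}$ and $d^{(k)} \in V_{k+1}$ by the induction hypothesis, we deduce that $x^{(k+1)} \in V_{k+1}$. Moreover,
$$d^{(k+1)} = a_k g^{(k+1)} + b_k d^{(k)} = a_k(Qx^{(k+1)} - b) + b_k d^{(k)}.$$
But because $x^{(k+1)} \in V_{k+1}$, $Qx^{(k+1)} - b \in V_{k+2}$. Moreover, $d^{(k)} \in V_{k+1} \subset V_{k+2}$. Hence, $d^{(k+1)} \in V_{k+2}$. This completes the induction proof.
b. The conjugate gradient algorithm is an instance of the algorithm given in the question. By the expanding subspace theorem, we can say that in the conjugate gradient algorithm (with $x^{(0)} = 0$), at each $k$, $x^{(k)}$ is the global minimizer of $f$ on the Krylov subspace $V_k$. Note that for all $k \ge n$, $V_{k+1} = V_k$, because of the Cayley-Hamilton theorem, which allows us to express $Q^n$ as a linear combination of $I, Q, \ldots, Q^{n-1}$.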
10.7
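Expanding $\phi(a)$ yields
$$\phi(a) = \frac{1}{2}(x^{(0)} + Da)^\top Q (x^{(0)} + Da) - (x^{(0)} + Da)^\top b = \frac{1}{2} a^\top (D^\top Q D) a + a^\top (D^\top Q x^{(0)} - D^\top b) + \left(\frac{1}{2} x^{(0)\top} Q x^{(0)} - x^{(0)\top} b\right).$$
Clearly $\phi$ is a quadratic function on $\mathbb{R}^r$. It remains to show that the matrix in the quadratic term, $D^\top Q D$, is positive definite. Since $Q > 0$, for any $a \in \mathbb{R}^r$, we have
$$a^\top (D^\top Q D) a = (Da)^\top Q (Da) \ge 0$$
and
$$a^\top (D^\top Q D) a = (Da)^\top Q (Da) = 0$$
if and only if $Da = 0$. Since $\operatorname{rank} D = r$, $Da = 0$ if and only if $a = 0$. Hence, the matrix $D^\top Q D$ is positive definite.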
10.8
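a. Let $0 \le k \le n-1$ and $0 \le i \le k$. Then,
$$g^{(k+1)\top} g^{(i)} = g^{(k+1)\top}\left(\beta_{i-1} d^{(i-1)} - d^{(i)}\right) = \beta_{i-1}\, g^{(k+1)\top} d^{(i-1)} - g^{(k+1)\top} d^{(i)} = 0$$
by Lemma 10.2.
b. Let $0 \le k \le n-1$ and $0 \le i \le k-1$. Then,
$$g^{(k+1)\top} Q g^{(i)} = \left(\beta_k d^{(k)} - d^{(k+1)}\right)^\top Q \left(\beta_{i-1} d^{(i-1)} - d^{(i)}\right) = \beta_k\beta_{i-1}\, d^{(k)\top} Q d^{(i-1)} - \beta_k\, d^{(k)\top} Q d^{(i)} - \beta_{i-1}\, d^{(k+1)\top} Q d^{(i-1)} + d^{(k+1)\top} Q d^{(i)} = 0$$
by $Q$-conjugacy of $d^{(k+1)}$, $d^{(k)}$, $d^{(i)}$ and $d^{(i-1)}$ (note that the iteration indices here are all distinct).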
10.9
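We represent $f$ as
$$f(x) = \frac{1}{2} x^\top \begin{bmatrix} 5 & -3 \\ -3 & 2 \end{bmatrix} x - x^\top \begin{bmatrix} 0 \\ 1 \end{bmatrix} - 7.$$
The conjugate gradient algorithm is based on the following formulas:
$$x^{(k+1)} = x^{(k)} + \alpha_k d^{(k)}, \qquad \alpha_k = -\frac{g^{(k)\top} d^{(k)}}{d^{(k)\top} Q d^{(k)}},$$
$$d^{(k+1)} = -g^{(k+1)} + \beta_k d^{(k)}, \qquad \beta_k = \frac{g^{(k+1)\top} Q d^{(k)}}{d^{(k)\top} Q d^{(k)}}.$$
We have,
$$d^{(0)} = -g^{(0)} = -(Qx^{(0)} - b) = b = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$
We then proceed to compute
$$\alpha_0 = -\frac{g^{(0)\top} d^{(0)}}{d^{(0)\top} Q d^{(0)}} = -\frac{\begin{bmatrix} 0 & -1 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}}{\begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} 5 & -3 \\ -3 & 2 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}} = \frac{1}{2}.$$
Hence,
$$x^{(1)} = x^{(0)} + \alpha_0 d^{(0)} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1/2 \end{bmatrix}.$$
We next proceed by evaluating the gradient of the objective function at $x^{(1)}$,
$$g^{(1)} = Qx^{(1)} - b = \begin{bmatrix} 5 & -3 \\ -3 & 2 \end{bmatrix}\begin{bmatrix} 0 \\ 1/2 \end{bmatrix} - \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -3/2 \\ 0 \end{bmatrix}.$$
Because the gradient is nonzero, we can proceed with the next step where we compute
$$\beta_0 = \frac{g^{(1)\top} Q d^{(0)}}{d^{(0)\top} Q d^{(0)}} = \frac{\begin{bmatrix} -3/2 & 0 \end{bmatrix}\begin{bmatrix} 5 & -3 \\ -3 & 2 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}}{\begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} 5 & -3 \\ -3 & 2 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}} = \frac{9}{4}.$$
Hence, the direction $d^{(1)}$ is
$$d^{(1)} = -g^{(1)} + \beta_0 d^{(0)} = -\begin{bmatrix} -3/2 \\ 0 \end{bmatrix} + \frac{9}{4}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 3/2 \\ 9/4 \end{bmatrix}.$$
It is easy to verify that the directions $d^{(0)}$ and $d^{(1)}$ are $Q$-conjugate. Indeed,
$$d^{(0)\top} Q d^{(1)} = \begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} 5 & -3 \\ -3 & 2 \end{bmatrix}\begin{bmatrix} 3/2 \\ 9/4 \end{bmatrix} = 0.$$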
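The hand computation in Exercise 10.9 can be replayed in exact rational arithmetic. The sketch below assumes $Q = \begin{bmatrix} 5 & -3 \\ -3 & 2 \end{bmatrix}$, $b = [0, 1]^\top$ and $x^{(0)} = 0$; the function `conjugate_gradient` and its helpers are illustrative names, not part of the MATLAB routines in this manual.

```python
from fractions import Fraction as Fr

def matvec(Q, v):
    return [sum(q * x for q, x in zip(row, v)) for row in Q]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(Q, b, x):
    """Minimize 0.5*x'Qx - x'b for symmetric positive definite Q; returns the iterates."""
    g = [qi - bi for qi, bi in zip(matvec(Q, x), b)]
    d = [-gi for gi in g]
    iterates = [x]
    for _ in range(len(b)):
        if all(gi == 0 for gi in g):
            break
        Qd = matvec(Q, d)
        alpha = -dot(g, d) / dot(d, Qd)           # exact line search on a quadratic
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g = [qi - bi for qi, bi in zip(matvec(Q, x), b)]
        beta = dot(g, Qd) / dot(d, Qd)            # keeps the new direction Q-conjugate
        d = [-gi + beta * di for gi, di in zip(g, d)]
        iterates.append(x)
    return iterates

Q = [[Fr(5), Fr(-3)], [Fr(-3), Fr(2)]]
b = [Fr(0), Fr(1)]
iterates = conjugate_gradient(Q, b, [Fr(0), Fr(0)])
print(iterates)
```

The first step reproduces $x^{(1)} = [0, 1/2]^\top$, and the second step reaches $Q^{-1}b = [3, 5]^\top$, as expected for a quadratic in two variables.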
10.10
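a. We have $f(x) = \frac{1}{2} x^\top Q x - b^\top x$ where
$$Q = \begin{bmatrix} 5 & 2 \\ 2 & 1 \end{bmatrix}, \qquad b = \begin{bmatrix} 3 \\ 1 \end{bmatrix}.$$
b. Since $f$ is a quadratic function on $\mathbb{R}^2$, we need to perform only two iterations. For the first iteration we compute
$$d^{(0)} = -g^{(0)} = [3, 1]^\top, \quad \alpha_0 = \frac{5}{29}, \quad x^{(1)} = [0.51724, 0.17241]^\top, \quad g^{(1)} = [-0.06897, 0.20690]^\top.$$
For the second iteration we compute
$$\beta_0 = 0.0047562, \quad d^{(1)} = [0.08323, -0.20214]^\top, \quad \alpha_1 = 5.8000, \quad x^{(2)} = [1.000, -1.000]^\top.$$
c. The minimizer is given by $x^* = Q^{-1} b = [1, -1]^\top$, which agrees with part b.
10.11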
A MATLAB routine for the conjugate gradient algorithm with options for different formulas of $\beta_k$ is:
function [x,N]=conjgrad(grad,xnew,options);
% CONJGRAD('grad',x0);
% CONJGRAD('grad',x0,OPTIONS);
%
% x = CONJGRAD('grad',x0);
% x = CONJGRAD('grad',x0,OPTIONS);
%
% [x,N] = CONJGRAD('grad',x0);
% [x,N] = CONJGRAD('grad',x0,OPTIONS);
%
% The first variant finds the minimizer of a function whose gradient
% is described in grad (usually an M-file: grad.m), using initial point
% x0.
% The second variant allows a vector of optional parameters to be
% defined:
% OPTIONS(1) controls how much display output is given; set
% to 1 for a tabular display of results, (default is no display: 0).
% OPTIONS(2) is a measure of the precision required for the final point.
% OPTIONS(3) is a measure of the precision required of the gradient.
% OPTIONS(5) specifies the formula for beta:
% 0=Powell;
% 1=Fletcher-Reeves;
% 2=Polak-Ribiere;
% 3=Hestenes-Stiefel.
% OPTIONS(14) is the maximum number of iterations.
% For more information type HELP FOPTIONS.
%
% The next two variants return the value of the final point.
% The last two variants return a vector of the final point and the
% number of iterations.
if nargin ~= 3
  options = [];
  if nargin ~= 2
    disp('Wrong number of arguments.');
    return;
  end
end
numvars = length(xnew);
if length(options) >= 14
  if options(14)==0
    options(14)=1000*numvars;
  end
else
  options(14)=1000*numvars;
end
clc;
format compact;
format short e;
options = foptions(options);
print = options(1);
epsilonx = options(2);
epsilong = options(3);
maxiter=options(14);
gcurr=feval(grad,xnew);
if norm(gcurr) <= epsilong
  disp('Terminating: Norm of initial gradient less than');
  disp(epsilong);
  x=xnew; N=0; %assign outputs before the early return
  return;
end %if
d=-gcurr;
resetcnt = 0;
for k = 1:maxiter,
  xcurr=xnew;
  alpha=linesearchsecant(grad,xcurr,d);
  %alpha=-(d'*gcurr)/(d'*Q*d);
  xnew = xcurr+alpha*d;
  if print,
    disp('Iteration number k =')
    disp(k); %print iteration index k
    disp('alpha =');
    disp(alpha); %print alpha
    disp('Gradient = ');
    disp(gcurr'); %print gradient
    disp('New point =');
    disp(xnew'); %print new point
  end %if
  if norm(xnew-xcurr) <= epsilonx*norm(xcurr)
    disp('Terminating: Norm of difference between iterates less than');
    disp(epsilonx);
    break;
  end %if
  gold=gcurr;
  gcurr=feval(grad,xnew);
  if norm(gcurr) <= epsilong
    disp('Terminating: Norm of gradient less than');
    disp(epsilong);
    break;
  end %if
  resetcnt = resetcnt+1;
  if resetcnt == 3*numvars
    %restart with the steepest descent direction
    d=-gcurr;
    resetcnt = 0;
  else
    if options(5)==0 %Powell
      beta = max(0,(gcurr'*(gcurr-gold))/(gold'*gold));
    elseif options(5)==1 %Fletcher-Reeves
      beta = (gcurr'*gcurr)/(gold'*gold);
    elseif options(5)==2 %Polak-Ribiere
      beta = (gcurr'*(gcurr-gold))/(gold'*gold);
    else %Hestenes-Stiefel
      beta = (gcurr'*(gcurr-gold))/(d'*(gcurr-gold));
    end %if
    d=-gcurr+beta*d;
  end
  if print,
    disp('New beta =');
    disp(beta);
    disp('New d =');
    disp(d);
  end
  if k == maxiter
    disp('Terminating with maximum number of iterations');
  end %if
end %for
if nargout >= 1
  x=xnew;
  if nargout == 2
    N=k;
  end
else
  disp('Final point =');
  disp(xnew');
  disp('Number of iterations =');
  disp(k);
end %if

We created the following M-file, g.m, for the gradient of Rosenbrock's function:
function y=g(x)
y=[-400*(x(2)-x(1).^2)*x(1)-2*(1-x(1)), 200*(x(2)-x(1).^2)]';
We tested the above routine as follows:
options(2)=10^(-7);
options(3)=10^(-7);
options(14)=100;
options(5)=0;
conjgrad('g',[-2;2],options);
Terminating: Norm of difference between iterates less than
1.0000e-07
Final point =
1.0000e+00 1.0000e+00
Number of iterations =
8
options(5)=1;
conjgrad('g',[-2;2],options);
Terminating: Norm of difference between iterates less than
1.0000e-07
Final point =
1.0000e+00 1.0000e+00
Number of iterations =
10
options(5)=2;
conjgrad('g',[-2;2],options);
Terminating: Norm of difference between iterates less than
1.0000e-07
Final point =
1.0000e+00 1.0000e+00
Number of iterations =
8
options(5)=3;
conjgrad('g',[-2;2],options);
Terminating: Norm of difference between iterates less than
1.0000e-07
Final point =
1.0000e+00 1.0000e+00
Number of iterations =
8
The reader is cautioned not to draw any conclusions about the superiority or inferiority of any of the formulas for $\beta_k$ based only on the above single numerical experiment.
11. Quasi-Newton Methods
11.1
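a. Let
$$\phi(\alpha) = f(x^{(k)} + \alpha d^{(k)}).$$
Then, using the chain rule, we obtain
$$\phi'(\alpha) = d^{(k)\top} \nabla f(x^{(k)} + \alpha d^{(k)}).$$
Hence
$$\phi'(0) = d^{(k)\top} g^{(k)}.$$
Since $\phi'$ is continuous, then, if $d^{(k)\top} g^{(k)} < 0$, there exists $\bar{\alpha} > 0$ such that for all $\alpha \in (0, \bar{\alpha}]$, $\phi(\alpha) < \phi(0)$, i.e., $f(x^{(k)} + \alpha d^{(k)}) < f(x^{(k)})$.
b. By part a, $\phi(\alpha) < \phi(0)$ for all $\alpha \in (0, \bar{\alpha}]$. Hence,
$$\alpha_k = \arg\min_{\alpha \ge 0} \phi(\alpha) \ne 0,$$
which implies that $\alpha_k > 0$.
c. Now,
$$d^{(k)\top} g^{(k+1)} = d^{(k)\top} \nabla f(x^{(k)} + \alpha_k d^{(k)}) = \phi_k'(\alpha_k).$$
Since $\alpha_k = \arg\min_{\alpha \ge 0} f(x^{(k)} + \alpha d^{(k)}) > 0$, we have $\phi_k'(\alpha_k) = 0$. Hence, $g^{(k+1)\top} d^{(k)} = 0$.
d.
i. We have $d^{(k)} = -g^{(k)}$. Hence, $d^{(k)\top} g^{(k)} = -\|g^{(k)}\|^2$. If $g^{(k)} \ne 0$, then $\|g^{(k)}\|^2 > 0$, and hence $d^{(k)\top} g^{(k)} < 0$.
ii. We have $d^{(k)} = -F(x^{(k)})^{-1} g^{(k)}$. Since $F(x^{(k)}) > 0$, we also have $F(x^{(k)})^{-1} > 0$. Therefore, $d^{(k)\top} g^{(k)} = -g^{(k)\top} F(x^{(k)})^{-1} g^{(k)} < 0$ if $g^{(k)} \ne 0$.
iii. We have
$$d^{(k)} = -g^{(k)} + \beta_{k-1} d^{(k-1)}.$$
Hence,
$$d^{(k)\top} g^{(k)} = -\|g^{(k)}\|^2 + \beta_{k-1} d^{(k-1)\top} g^{(k)}.$$
By part c, $d^{(k-1)\top} g^{(k)} = 0$. Hence, if $g^{(k)} \ne 0$, then $\|g^{(k)}\|^2 > 0$, and $d^{(k)\top} g^{(k)} = -\|g^{(k)}\|^2 < 0$.
iv. We have $d^{(k)} = -H_k g^{(k)}$. Therefore, if $H_k > 0$ and $g^{(k)} \ne 0$, then $d^{(k)\top} g^{(k)} = -g^{(k)\top} H_k g^{(k)} < 0$.
e. Using the equation $\nabla f(x) = Qx - b$, we get
$$d^{(k)\top} g^{(k+1)} = d^{(k)\top}(Qx^{(k+1)} - b) = d^{(k)\top}(Q(x^{(k)} + \alpha_k d^{(k)}) - b) = \alpha_k d^{(k)\top} Q d^{(k)} + d^{(k)\top}(Qx^{(k)} - b) = \alpha_k d^{(k)\top} Q d^{(k)} + d^{(k)\top} g^{(k)}.$$
By part c, $d^{(k)\top} g^{(k+1)} = 0$, which implies
$$\alpha_k = -\frac{d^{(k)\top} g^{(k)}}{d^{(k)\top} Q d^{(k)}}.$$
11.2
Yes, because:
1. The search direction is of the form $d^{(k)} = -H_k \nabla f(x^{(k)})$ for matrix $H_k = F(x^{(k)})^{-1}$;
2. The matrix $H_k = F(x^{(k)})^{-1}$ is symmetric for $f \in C^2$;
3. If $f$ is quadratic, then the quasi-Newton condition is satisfied: $H_{k+1} \Delta g^{(i)} = \Delta x^{(i)}$, $0 \le i \le k$. To see this, note that if the Hessian is $Q$, then $Q \Delta x^{(i)} = \Delta g^{(i)}$. Multiplying both sides by $H_k = Q^{-1}$, we obtain the desired result.
11.3
a. We have
$$f(x^{(k)} + \alpha d^{(k)}) = \frac{1}{2}(x^{(k)} + \alpha d^{(k)})^\top Q (x^{(k)} + \alpha d^{(k)}) - (x^{(k)} + \alpha d^{(k)})^\top b + c.$$
Using the chain rule, we obtain
$$\frac{d}{d\alpha} f(x^{(k)} + \alpha d^{(k)}) = (x^{(k)} + \alpha d^{(k)})^\top Q d^{(k)} - d^{(k)\top} b.$$
Equating the above to zero and solving for $\alpha$ gives
$$\alpha = -\frac{(x^{(k)\top} Q - b^\top) d^{(k)}}{d^{(k)\top} Q d^{(k)}}.$$
Taking into account that $g^{(k)\top} = x^{(k)\top} Q - b^\top$ and that $d^{(k)\top} Q d^{(k)} > 0$ for $g^{(k)} \ne 0$, we obtain
$$\alpha_k = -\frac{g^{(k)\top} d^{(k)}}{d^{(k)\top} Q d^{(k)}} = \frac{g^{(k)\top} H_k g^{(k)}}{d^{(k)\top} Q d^{(k)}}.$$
b. The matrix $Q$ is symmetric and positive definite; hence $\alpha_k > 0$ if $H_k = H_k^\top > 0$.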
11.4
a. The appropriate choice is $H = F(x^*)^{-1}$. To show this, we can apply the same argument as in the proof of the theorem on the convergence of Newton's method. (We won't repeat it here.)
b. Yes (provided we incorporate the usual step size). Indeed, if we apply the algorithm with the choice of $H$ in part a, then when applied to a quadratic with Hessian $Q$, the algorithm uses $H = Q^{-1}$, which definitely satisfies the quasi-Newton condition. In fact, the algorithm then behaves just like Newton's algorithm.
11.5
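Our objective is to minimize the quadratic
$$f(x) = \frac{1}{2} x^\top Q x - x^\top b + c.$$
We first compute the gradient $\nabla f$ and evaluate it at $x^{(0)}$,
$$\nabla f(x^{(0)}) = g^{(0)} = Qx^{(0)} - b = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.$$
It is a nonzero vector, so we proceed with the first iteration. Let $H_0 = I_2$. Then,
$$d^{(0)} = -H_0 g^{(0)} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}.$$
The step size $\alpha_0$ is
$$\alpha_0 = -\frac{g^{(0)\top} d^{(0)}}{d^{(0)\top} Q d^{(0)}} = -\frac{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ -1 \end{bmatrix}}{\begin{bmatrix} 1 & -1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} 1 \\ -1 \end{bmatrix}} = \frac{2}{3}.$$
Hence,
$$x^{(1)} = x^{(0)} + \alpha_0 d^{(0)} = \begin{bmatrix} 2/3 \\ -2/3 \end{bmatrix}.$$
We evaluate the gradient $\nabla f$ at $x^{(1)}$ to obtain
$$\nabla f(x^{(1)}) = g^{(1)} = Qx^{(1)} - b = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} 2/3 \\ -2/3 \end{bmatrix} - \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} -1/3 \\ -1/3 \end{bmatrix}.$$
It is a nonzero vector, so we proceed with the second iteration. We compute $H_1$, where
$$H_1 = H_0 + \frac{(\Delta x^{(0)} - H_0 \Delta g^{(0)})(\Delta x^{(0)} - H_0 \Delta g^{(0)})^\top}{\Delta g^{(0)\top}(\Delta x^{(0)} - H_0 \Delta g^{(0)})}.$$
To find $H_1$ we need to compute,
$$\Delta x^{(0)} = x^{(1)} - x^{(0)} = \begin{bmatrix} 2/3 \\ -2/3 \end{bmatrix} \quad \text{and} \quad \Delta g^{(0)} = g^{(1)} - g^{(0)} = \begin{bmatrix} 2/3 \\ -4/3 \end{bmatrix}.$$
Using the above, we determine,
$$\Delta x^{(0)} - H_0 \Delta g^{(0)} = \begin{bmatrix} 0 \\ 2/3 \end{bmatrix}$$
and
$$\Delta g^{(0)\top}(\Delta x^{(0)} - H_0 \Delta g^{(0)}) = \begin{bmatrix} 2/3 & -4/3 \end{bmatrix}\begin{bmatrix} 0 \\ 2/3 \end{bmatrix} = -\frac{8}{9}.$$
Then, we obtain
$$H_1 = H_0 + \frac{(\Delta x^{(0)} - H_0 \Delta g^{(0)})(\Delta x^{(0)} - H_0 \Delta g^{(0)})^\top}{\Delta g^{(0)\top}(\Delta x^{(0)} - H_0 \Delta g^{(0)})} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \frac{1}{-8/9}\begin{bmatrix} 0 & 0 \\ 0 & 4/9 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1/2 \end{bmatrix}$$
and
$$d^{(1)} = -H_1 g^{(1)} = \begin{bmatrix} 1/3 \\ 1/6 \end{bmatrix}.$$
We next compute
$$\alpha_1 = -\frac{g^{(1)\top} d^{(1)}}{d^{(1)\top} Q d^{(1)}} = 1.$$
Therefore,
$$x^{(2)} = x^* = x^{(1)} + \alpha_1 d^{(1)} = \begin{bmatrix} 1 \\ -1/2 \end{bmatrix}.$$
Note that $g^{(2)} = Qx^{(2)} - b = 0$ as expected.
11.6
We are guaranteed that the step size satisfies $\alpha_k > 0$ if the search direction is in the descent direction, i.e., the search direction $d^{(k)} = -M_k \nabla f(x^{(k)})$ has strictly positive inner product with $-\nabla f(x^{(k)})$ (see Exercise 11.1). Thus, the condition on $M_k$ that guarantees $\alpha_k > 0$ is $\nabla f(x^{(k)})^\top M_k \nabla f(x^{(k)}) > 0$, which corresponds to $1 + a > 0$, or $a > -1$. (Note that if $a \le -1$, the search direction is not in the descent direction, and thus we cannot guarantee that $\alpha_k > 0$.)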
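The rank one iteration worked out in Exercise 11.5 can be replayed in exact rational arithmetic. The sketch below assumes $Q = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$, $b = [1, -1]^\top$, $x^{(0)} = 0$ and $H_0 = I_2$; the function name `rank_one_quasi_newton` is hypothetical.

```python
from fractions import Fraction as Fr

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rank_one_quasi_newton(Q, b, x, H, steps):
    """Rank one quasi-Newton iteration on f(x) = 0.5*x'Qx - x'b with exact line search."""
    g = [qi - bi for qi, bi in zip(matvec(Q, x), b)]
    history = []
    for _ in range(steps):
        if all(gi == 0 for gi in g):
            break
        d = [-hi for hi in matvec(H, g)]
        alpha = -dot(g, d) / dot(d, matvec(Q, d))
        x_new = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [qi - bi for qi, bi in zip(matvec(Q, x_new), b)]
        dx = [a - c for a, c in zip(x_new, x)]
        dg = [a - c for a, c in zip(g_new, g)]
        w = [a - c for a, c in zip(dx, matvec(H, dg))]  # dx - H*dg
        denom = dot(dg, w)
        if denom != 0:  # skip the update when the correction vanishes
            H = [[H[i][j] + w[i] * w[j] / denom for j in range(len(w))]
                 for i in range(len(w))]
        x, g = x_new, g_new
        history.append((x, H))
    return history

Q = [[Fr(1), Fr(0)], [Fr(0), Fr(2)]]
b = [Fr(1), Fr(-1)]
H0 = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
history = rank_one_quasi_newton(Q, b, [Fr(0), Fr(0)], H0, steps=2)
print(history)
```

After the first step the approximation is $H_1 = \operatorname{diag}(1, 1/2)$, and the second step arrives at $[1, -1/2]^\top$, matching the hand computation.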
11.7
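Let $x \in \mathbb{R}^n$. Then
$$x^\top H_{k+1} x = x^\top H_k x + x^\top\left(\frac{(\Delta x^{(k)} - H_k \Delta g^{(k)})(\Delta x^{(k)} - H_k \Delta g^{(k)})^\top}{\Delta g^{(k)\top}(\Delta x^{(k)} - H_k \Delta g^{(k)})}\right) x = x^\top H_k x + \frac{\left(x^\top(\Delta x^{(k)} - H_k \Delta g^{(k)})\right)^2}{\Delta g^{(k)\top}(\Delta x^{(k)} - H_k \Delta g^{(k)})}.$$
Note that since $H_k > 0$, we have $x^\top H_k x > 0$. Hence, if $\Delta g^{(k)\top}(\Delta x^{(k)} - H_k \Delta g^{(k)}) > 0$, then $x^\top H_{k+1} x > 0$.
11.8
The complement of the Rank One update equation is
$$B_{k+1} = B_k + \frac{(\Delta g^{(k)} - B_k \Delta x^{(k)})(\Delta g^{(k)} - B_k \Delta x^{(k)})^\top}{\Delta x^{(k)\top}(\Delta g^{(k)} - B_k \Delta x^{(k)})}.$$
Using the matrix inverse formula, we get
$$B_{k+1}^{-1} = B_k^{-1} - \frac{B_k^{-1}(\Delta g^{(k)} - B_k \Delta x^{(k)})(\Delta g^{(k)} - B_k \Delta x^{(k)})^\top B_k^{-1}}{\Delta x^{(k)\top}(\Delta g^{(k)} - B_k \Delta x^{(k)}) + (\Delta g^{(k)} - B_k \Delta x^{(k)})^\top B_k^{-1}(\Delta g^{(k)} - B_k \Delta x^{(k)})} = B_k^{-1} + \frac{(\Delta x^{(k)} - B_k^{-1} \Delta g^{(k)})(\Delta x^{(k)} - B_k^{-1} \Delta g^{(k)})^\top}{\Delta g^{(k)\top}(\Delta x^{(k)} - B_k^{-1} \Delta g^{(k)})}.$$
Substituting $H_k$ for $B_k^{-1}$, we get a formula identical to the Rank One update equation. This should not be surprising, since there is only one update equation involving a rank one correction that satisfies the quasi-Newton condition.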
11.9
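We first compute the gradient $\nabla f$ and evaluate it at $x^{(0)}$,
$$\nabla f(x^{(0)}) = g^{(0)} = Qx^{(0)} - b = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.$$
It is a nonzero vector, so we proceed with the first iteration. Let $H_0 = I_2$. Then,
$$d^{(0)} = -H_0 g^{(0)} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}.$$
The step size $\alpha_0$ is
$$\alpha_0 = -\frac{g^{(0)\top} d^{(0)}}{d^{(0)\top} Q d^{(0)}} = -\frac{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ -1 \end{bmatrix}}{\begin{bmatrix} 1 & -1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} 1 \\ -1 \end{bmatrix}} = \frac{2}{3}.$$
Hence,
$$x^{(1)} = x^{(0)} + \alpha_0 d^{(0)} = \begin{bmatrix} 2/3 \\ -2/3 \end{bmatrix}.$$
We evaluate the gradient $\nabla f$ at $x^{(1)}$ to obtain
$$\nabla f(x^{(1)}) = g^{(1)} = Qx^{(1)} - b = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} 2/3 \\ -2/3 \end{bmatrix} - \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} -1/3 \\ -1/3 \end{bmatrix}.$$
It is a nonzero vector, so we proceed with the second iteration. We compute $H_1$, where
$$H_1 = H_0 + \frac{(\Delta x^{(0)} - H_0 \Delta g^{(0)})(\Delta x^{(0)} - H_0 \Delta g^{(0)})^\top}{\Delta g^{(0)\top}(\Delta x^{(0)} - H_0 \Delta g^{(0)})}.$$
To find $H_1$ we need
$$\Delta x^{(0)} = x^{(1)} - x^{(0)} = \begin{bmatrix} 2/3 \\ -2/3 \end{bmatrix} \quad \text{and} \quad \Delta g^{(0)} = g^{(1)} - g^{(0)} = \begin{bmatrix} 2/3 \\ -4/3 \end{bmatrix}.$$
Using the above, we determine,
$$\Delta x^{(0)} - H_0 \Delta g^{(0)} = \begin{bmatrix} 0 \\ 2/3 \end{bmatrix}$$
and
$$\Delta g^{(0)\top}(\Delta x^{(0)} - H_0 \Delta g^{(0)}) = \begin{bmatrix} 2/3 & -4/3 \end{bmatrix}\begin{bmatrix} 0 \\ 2/3 \end{bmatrix} = -\frac{8}{9}.$$
Then, we obtain
$$H_1 = H_0 + \frac{(\Delta x^{(0)} - H_0 \Delta g^{(0)})(\Delta x^{(0)} - H_0 \Delta g^{(0)})^\top}{\Delta g^{(0)\top}(\Delta x^{(0)} - H_0 \Delta g^{(0)})} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \frac{1}{-8/9}\begin{bmatrix} 0 & 0 \\ 0 & 4/9 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1/2 \end{bmatrix}$$
and
$$d^{(1)} = -H_1 g^{(1)} = \begin{bmatrix} 1/3 \\ 1/6 \end{bmatrix}.$$
Note that $d^{(0)\top} Q d^{(1)} = 0$, that is, $d^{(0)}$ and $d^{(1)}$ are $Q$-conjugate.
11.10
The calculations are similar until we get to the second step:
$$H_1 = \begin{bmatrix} 1/2 & -1/2 \\ -1/2 & 1/2 \end{bmatrix}, \qquad d^{(1)} = 0.$$
So the algorithm gets stuck at this point, which illustrates that it doesn't work.
11.11
a. Since $f$ is quadratic, and $\alpha_k = \arg\min_{\alpha \ge 0} f(x^{(k)} + \alpha d^{(k)})$, then
$$\alpha_k = -\frac{g^{(k)\top} d^{(k)}}{d^{(k)\top} Q d^{(k)}}.$$
b. Now, $d^{(k)} = -H_k g^{(k)}$, where $H_k = H_k^\top > 0$. Substituting this into the formula for $\alpha_k$ in part a, yields
$$\alpha_k = \frac{g^{(k)\top} H_k g^{(k)}}{d^{(k)\top} Q d^{(k)}} > 0.$$
11.12
Our solution to this problem is based on a solution that was furnished to us by Michael Mera, a student in ECE 580 at Purdue in Spring 2005. To proceed, we recall the formula of Lemma 11.1,
$$(A + uv^\top)^{-1} = A^{-1} - \frac{(A^{-1}u)(v^\top A^{-1})}{1 + v^\top A^{-1} u}$$
for $1 + v^\top A^{-1} u \ne 0$. Recall the definitions from the hint,
$$A_0 = B_k, \qquad u_0 = \frac{\Delta g^{(k)}}{\Delta g^{(k)\top} \Delta x^{(k)}}, \qquad v_0^\top = \Delta g^{(k)\top},$$
and
$$A_1 = B_k + \frac{\Delta g^{(k)} \Delta g^{(k)\top}}{\Delta g^{(k)\top} \Delta x^{(k)}} = A_0 + u_0 v_0^\top, \qquad u_1 = -\frac{B_k \Delta x^{(k)}}{\Delta x^{(k)\top} B_k \Delta x^{(k)}}, \qquad v_1^\top = \Delta x^{(k)\top} B_k.$$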
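Using the above notation, we represent $B_{k+1}$ as
$$B_{k+1} = A_0 + u_0 v_0^\top + u_1 v_1^\top = A_1 + u_1 v_1^\top.$$
Applying to the above Lemma 11.1 gives
$$H^{BFGS}_{k+1} = B_{k+1}^{-1} = (A_1 + u_1 v_1^\top)^{-1} = A_1^{-1} - \frac{A_1^{-1} u_1 v_1^\top A_1^{-1}}{1 + v_1^\top A_1^{-1} u_1}.$$
Substituting into the above the expression for $A_1^{-1}$ yields
$$H^{BFGS}_{k+1} = A_0^{-1} - \frac{A_0^{-1} u_0 v_0^\top A_0^{-1}}{1 + v_0^\top A_0^{-1} u_0} - \frac{\left(A_0^{-1} - \frac{A_0^{-1} u_0 v_0^\top A_0^{-1}}{1 + v_0^\top A_0^{-1} u_0}\right) u_1 v_1^\top \left(A_0^{-1} - \frac{A_0^{-1} u_0 v_0^\top A_0^{-1}}{1 + v_0^\top A_0^{-1} u_0}\right)}{1 + v_1^\top \left(A_0^{-1} - \frac{A_0^{-1} u_0 v_0^\top A_0^{-1}}{1 + v_0^\top A_0^{-1} u_0}\right) u_1}.$$
Note that $A_0 = B_k$. Hence, $A_0^{-1} = B_k^{-1} = H_k$. Using this and the notation introduced at the beginning of the solution, and writing $\Delta x$ and $\Delta g$ for $\Delta x^{(k)}$ and $\Delta g^{(k)}$, we obtain
$$H^{BFGS}_{k+1} = H_k - \frac{H_k \Delta g \Delta g^\top H_k}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g} + \frac{\left(H_k - \frac{H_k \Delta g \Delta g^\top H_k}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g}\right) B_k \Delta x \Delta x^\top B_k \left(H_k - \frac{H_k \Delta g \Delta g^\top H_k}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g}\right)}{\Delta x^\top B_k \Delta x - \Delta x^\top B_k \left(H_k - \frac{H_k \Delta g \Delta g^\top H_k}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g}\right) B_k \Delta x}.$$
We next perform some multiplications taking into account that $H_k = B_k^{-1}$ and hence
$$H_k B_k = B_k H_k = I_n.$$
We obtain
$$H^{BFGS}_{k+1} = H_k - \frac{H_k \Delta g \Delta g^\top H_k}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g} + \frac{\left(I_n - \frac{H_k \Delta g \Delta g^\top}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g}\right) \Delta x \Delta x^\top \left(I_n - \frac{\Delta g \Delta g^\top H_k}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g}\right)}{\Delta x^\top B_k \Delta x - \Delta x^\top \left(B_k - \frac{\Delta g \Delta g^\top}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g}\right) \Delta x}.$$
We proceed with our manipulations. We first perform multiplications by $\Delta x$ and $\Delta x^\top$ to obtain
$$H^{BFGS}_{k+1} = H_k - \frac{H_k \Delta g \Delta g^\top H_k}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g} + \frac{\Delta x \Delta x^\top - \frac{H_k \Delta g \Delta g^\top \Delta x \Delta x^\top + \Delta x \Delta x^\top \Delta g \Delta g^\top H_k}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g} + \frac{H_k \Delta g \Delta g^\top \Delta x \Delta x^\top \Delta g \Delta g^\top H_k}{(\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g)^2}}{\Delta x^\top B_k \Delta x - \Delta x^\top B_k \Delta x + \frac{\Delta x^\top \Delta g \Delta g^\top \Delta x}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g}}.$$
Cancelling the terms in the denominator of the last term above and performing further multiplications gives
$$H^{BFGS}_{k+1} = H_k - \frac{H_k \Delta g \Delta g^\top H_k}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g} + \frac{H_k \Delta g (\Delta g^\top \Delta x)(\Delta x^\top \Delta g) \Delta g^\top H_k}{(\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g)(\Delta x^\top \Delta g)(\Delta g^\top \Delta x)} + \frac{\Delta x \Delta x^\top (\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g)}{(\Delta x^\top \Delta g)(\Delta g^\top \Delta x)} - \frac{H_k \Delta g (\Delta g^\top \Delta x) \Delta x^\top + \Delta x (\Delta x^\top \Delta g) \Delta g^\top H_k}{(\Delta x^\top \Delta g)(\Delta g^\top \Delta x)}.$$
Further simplification of the third and the fifth terms on the right-hand side of the above equation gives
$$H^{BFGS}_{k+1} = H_k - \frac{H_k \Delta g \Delta g^\top H_k}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g} + \frac{H_k \Delta g \Delta g^\top H_k}{\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g} + \frac{\Delta x \Delta x^\top (\Delta g^\top \Delta x + \Delta g^\top H_k \Delta g)}{(\Delta x^\top \Delta g)(\Delta g^\top \Delta x)} - \frac{H_k \Delta g \Delta x^\top + \Delta x \Delta g^\top H_k}{\Delta x^\top \Delta g}.$$
Note that the second and the third terms cancel out each other. We then represent the fourth term in alternative manner to obtain
$$H^{BFGS}_{k+1} = H_k + \left(1 + \frac{\Delta g^\top H_k \Delta g}{\Delta g^\top \Delta x}\right)\frac{\Delta x \Delta x^\top}{\Delta x^\top \Delta g} - \frac{H_k \Delta g \Delta x^\top + \Delta x \Delta g^\top H_k}{\Delta x^\top \Delta g},$$
which is the desired BFGS update formula.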
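The identity derived in Exercise 11.12 can be spot-checked numerically: update $B_k$ with the direct BFGS formula, invert the result, and compare with the inverse-form update applied to $H_k = B_k^{-1}$. The following is a sketch in exact rational arithmetic; the data ($B_k$, $\Delta x$, $\Delta g$) are arbitrary illustrative values chosen so that $\Delta g^\top \Delta x > 0$, and all function names are hypothetical.

```python
from fractions import Fraction as Fr

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def bfgs_direct(B, dx, dg):
    """B + dg*dg'/(dg'dx) - (B dx)(B dx)'/(dx' B dx)."""
    Bdx = matvec(B, dx)
    a, c = dot(dg, dx), dot(dx, Bdx)
    return [[B[i][j] + dg[i] * dg[j] / a - Bdx[i] * Bdx[j] / c
             for j in range(2)] for i in range(2)]

def bfgs_inverse(H, dx, dg):
    """H + (1 + dg'H dg/(dg'dx)) dx*dx'/(dx'dg) - (H dg dx' + dx dg' H)/(dx'dg)."""
    Hdg = matvec(H, dg)
    a = dot(dg, dx)
    scale = 1 + dot(dg, Hdg) / a
    return [[H[i][j] + scale * dx[i] * dx[j] / a
             - (Hdg[i] * dx[j] + dx[i] * Hdg[j]) / a
             for j in range(2)] for i in range(2)]

B = [[Fr(2), Fr(0)], [Fr(0), Fr(1)]]
H = inverse_2x2(B)
dx = [Fr(1), Fr(1)]
dg = [Fr(1), Fr(2)]
H_from_direct = inverse_2x2(bfgs_direct(B, dx, dg))
H_from_formula = bfgs_inverse(H, dx, dg)
print(H_from_direct == H_from_formula)
```

Because the arithmetic is exact, equality of the two matrices is a genuine check of the formula on this data rather than an agreement up to rounding.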
11.13
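The first step for both algorithms is clearly the same, since in either case we have
$$x^{(1)} = x^{(0)} - \alpha_0 g^{(0)}.$$
For the second step,
$$d^{(1)} = -H_1 g^{(1)} = -\left(I_n + \left(1 + \frac{\Delta g^{(0)\top} \Delta g^{(0)}}{\Delta g^{(0)\top} \Delta x^{(0)}}\right)\frac{\Delta x^{(0)} \Delta x^{(0)\top}}{\Delta x^{(0)\top} \Delta g^{(0)}} - \frac{\Delta g^{(0)} \Delta x^{(0)\top} + \Delta x^{(0)} \Delta g^{(0)\top}}{\Delta g^{(0)\top} \Delta x^{(0)}}\right) g^{(1)}$$
$$= -g^{(1)} - \left(1 + \frac{\Delta g^{(0)\top} \Delta g^{(0)}}{\Delta g^{(0)\top} \Delta x^{(0)}}\right)\frac{\Delta x^{(0)} \Delta x^{(0)\top} g^{(1)}}{\Delta x^{(0)\top} \Delta g^{(0)}} + \frac{\Delta g^{(0)} \Delta x^{(0)\top} g^{(1)} + \Delta x^{(0)} \Delta g^{(0)\top} g^{(1)}}{\Delta g^{(0)\top} \Delta x^{(0)}}.$$
Since the line search is exact, we have
$$\Delta x^{(0)\top} g^{(1)} = \alpha_0 d^{(0)\top} g^{(1)} = 0.$$
Hence,
$$d^{(1)} = -g^{(1)} + \frac{\Delta g^{(0)\top} g^{(1)}}{\Delta g^{(0)\top} \Delta x^{(0)}}\, \Delta x^{(0)} = -g^{(1)} + \frac{g^{(1)\top} \Delta g^{(0)}}{d^{(0)\top} \Delta g^{(0)}}\, d^{(0)} = -g^{(1)} + \beta_0 d^{(0)},$$
where
$$\beta_0 = \frac{g^{(1)\top} \Delta g^{(0)}}{d^{(0)\top} \Delta g^{(0)}} = \frac{g^{(1)\top}(g^{(1)} - g^{(0)})}{d^{(0)\top}(g^{(1)} - g^{(0)})}$$
is the Hestenes-Stiefel update formula for $\beta_0$. Since $d^{(0)} = -g^{(0)}$, and $g^{(1)\top} g^{(0)} = 0$, we have
$$\beta_0 = \frac{g^{(1)\top}(g^{(1)} - g^{(0)})}{g^{(0)\top} g^{(0)}},$$
which is the Polak-Ribiere formula. Applying $g^{(1)\top} g^{(0)} = 0$ again, we get
$$\beta_0 = \frac{g^{(1)\top} g^{(1)}}{g^{(0)\top} g^{(0)}},$$
which is the Fletcher-Reeves formula.
11.14
a. Suppose the three conditions hold whenever applied to a quadratic. We need to show that when applied to a quadratic, for $k = 0, \ldots, n-1$ and $i = 0, \ldots, k$, $H_{k+1} \Delta g^{(i)} = \Delta x^{(i)}$. For $i = k$, we have
$$H_{k+1} \Delta g^{(k)} = H_k \Delta g^{(k)} + U_k \Delta g^{(k)} \quad \text{(by condition 1)}$$
$$= H_k \Delta g^{(k)} + \Delta x^{(k)} - H_k \Delta g^{(k)} \quad \text{(by condition 2)}$$
$$= \Delta x^{(k)},$$
as required. For the rest of the proof ($i = 0, \ldots, k-1$), we use induction on $k$.
For $k = 0$, there is nothing to prove (covered by the $i = k$ case). So suppose the result holds for $k - 1$. To show the result for $k$, first fix $i \in \{0, \ldots, k-1\}$. We have
$$H_{k+1} \Delta g^{(i)} = H_k \Delta g^{(i)} + U_k \Delta g^{(i)}$$
$$= \Delta x^{(i)} + U_k \Delta g^{(i)} \quad \text{(by the induction hypothesis)}$$
$$= \Delta x^{(i)} + a^{(k)} \Delta x^{(k)\top} \Delta g^{(i)} + b^{(k)} \Delta g^{(k)\top} H_k \Delta g^{(i)} \quad \text{(by condition 3)}.$$
So it suffices to show that the second and third terms are both 0. For the second term,
$$\Delta x^{(k)\top} \Delta g^{(i)} = \Delta x^{(k)\top} Q \Delta x^{(i)} = \alpha_k \alpha_i d^{(k)\top} Q d^{(i)} = 0$$
because of the induction hypothesis, which implies $Q$-conjugacy (where $Q$ is the Hessian of the given quadratic). Similarly, for the third term,
$$\Delta g^{(k)\top} H_k \Delta g^{(i)} = \Delta g^{(k)\top} \Delta x^{(i)} \quad \text{(by the induction hypothesis)}$$
$$= \Delta x^{(k)\top} Q \Delta x^{(i)} = \alpha_k \alpha_i d^{(k)\top} Q d^{(i)} = 0,$$
again because of the induction hypothesis, which implies $Q$-conjugacy. This completes the proof.
b. All three algorithms satisfy the conditions in part a. Condition 1 holds, as described in class. Condition 2 is straightforward to check for all three algorithms. For the rank-one and DFP algorithms, this is shown in the book. For BFGS, some simple matrix algebra establishes that it holds. Condition 3 holds by appropriate definition of the vectors $a^{(k)}$ and $b^{(k)}$. In particular, for the rank-one algorithm,
$$a^{(k)} = \frac{\Delta x^{(k)} - H_k \Delta g^{(k)}}{(\Delta x^{(k)} - H_k \Delta g^{(k)})^\top \Delta g^{(k)}}, \qquad b^{(k)} = -\frac{\Delta x^{(k)} - H_k \Delta g^{(k)}}{(\Delta x^{(k)} - H_k \Delta g^{(k)})^\top \Delta g^{(k)}}.$$
For the DFP algorithm,
$$a^{(k)} = \frac{\Delta x^{(k)}}{\Delta x^{(k)\top} \Delta g^{(k)}}, \qquad b^{(k)} = -\frac{H_k \Delta g^{(k)}}{\Delta g^{(k)\top} H_k \Delta g^{(k)}}.$$
Finally, for the BFGS algorithm,
$$a^{(k)} = \left(1 + \frac{\Delta g^{(k)\top} H_k \Delta g^{(k)}}{\Delta g^{(k)\top} \Delta x^{(k)}}\right)\frac{\Delta x^{(k)}}{\Delta x^{(k)\top} \Delta g^{(k)}} - \frac{H_k \Delta g^{(k)}}{\Delta g^{(k)\top} \Delta x^{(k)}}, \qquad b^{(k)} = -\frac{\Delta x^{(k)}}{\Delta g^{(k)\top} \Delta x^{(k)}}.$$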
11.15
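a. Suppose we apply the algorithm to a quadratic. Then, by the quasi-Newton property of DFP, we have $H^{DFP}_{k+1} \Delta g^{(i)} = \Delta x^{(i)}$, $0 \le i \le k$. The same holds for BFGS. Thus, for the given $H_k$, we have for $0 \le i \le k$,
$$H_{k+1} \Delta g^{(i)} = \phi H^{DFP}_{k+1} \Delta g^{(i)} + (1 - \phi) H^{BFGS}_{k+1} \Delta g^{(i)} = \phi \Delta x^{(i)} + (1 - \phi) \Delta x^{(i)} = \Delta x^{(i)},$$
which shows that the above algorithm is a quasi-Newton algorithm and hence also a conjugate direction algorithm.
b. By Theorem 11.4 and the discussion on BFGS, we have $H^{DFP}_k > 0$ and $H^{BFGS}_k > 0$. Hence, for any $x \ne 0$,
$$x^\top H_k x = \phi\, x^\top H^{DFP}_k x + (1 - \phi)\, x^\top H^{BFGS}_k x > 0$$
since $\phi$ and $1 - \phi$ are nonnegative. Hence, $H_k > 0$, from which we conclude that the algorithm has the descent property if $\alpha_k$ is computed by line search (by Proposition 11.1).
11.16
To show the result, we will prove the following precise statement: In the quadratic case (with Hessian $Q$), suppose that $H_{k+1} \Delta g^{(i)} = \rho_i \Delta x^{(i)}$, $0 \le i \le k$, $k \le n-1$. If $\alpha_i \ne 0$, $0 \le i \le k$, then $d^{(0)}, \ldots, d^{(k+1)}$ are $Q$-conjugate.
We proceed by induction. We begin with the $k = 0$ case: that $d^{(0)}$ and $d^{(1)}$ are $Q$-conjugate. Because $\alpha_0 \ne 0$, we can write $d^{(0)} = \Delta x^{(0)}/\alpha_0$. Hence,
$$d^{(1)\top} Q d^{(0)} = -g^{(1)\top} H_1 Q d^{(0)} = -g^{(1)\top} H_1 \frac{Q \Delta x^{(0)}}{\alpha_0} = -g^{(1)\top} \frac{H_1 \Delta g^{(0)}}{\alpha_0} = -g^{(1)\top} \frac{\rho_0 \Delta x^{(0)}}{\alpha_0} = -\rho_0\, g^{(1)\top} d^{(0)}.$$
But $g^{(1)\top} d^{(0)} = 0$ as a consequence of $\alpha_0 > 0$ being the minimizer of $\phi(\alpha) = f(x^{(0)} + \alpha d^{(0)})$. Hence, $d^{(1)\top} Q d^{(0)} = 0$.
Assume that the result is true for $k - 1$ (where $k < n - 1$). We now prove the result for $k$, that is, that $d^{(0)}, \ldots, d^{(k+1)}$ are $Q$-conjugate. It suffices to show that $d^{(k+1)\top} Q d^{(i)} = 0$, $0 \le i \le k$. Given $i$, $0 \le i \le k$, using the same algebraic steps as in the $k = 0$ case, and using the assumption that $\alpha_i \ne 0$, we obtain
$$d^{(k+1)\top} Q d^{(i)} = -g^{(k+1)\top} H_{k+1} Q d^{(i)} = \cdots = -\rho_i\, g^{(k+1)\top} d^{(i)}.$$
Because $d^{(0)}, \ldots, d^{(k)}$ are $Q$-conjugate by assumption, we conclude from the expanding subspace lemma (Lemma 10.2) that $g^{(k+1)\top} d^{(i)} = 0$. Hence, $d^{(k+1)\top} Q d^{(i)} = 0$, which completes the proof.
11.17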
A MATLAB routine for the quasi-Newton algorithm with options for different formulas of $H_k$ is:
function [x,N]=quasinewton(grad,xnew,H,options);
% QUASINEWTON('grad',x0,H0);
% QUASINEWTON('grad',x0,H0,OPTIONS);
%
% x = QUASINEWTON('grad',x0,H0);
% x = QUASINEWTON('grad',x0,H0,OPTIONS);
%
% [x,N] = QUASINEWTON('grad',x0,H0);
% [x,N] = QUASINEWTON('grad',x0,H0,OPTIONS);
%
% The first variant finds the minimizer of a function whose gradient
% is described in grad (usually an M-file: grad.m), using initial point
% x0 and initial inverse Hessian approximation H0.
% The second variant allows a vector of optional parameters to be
% defined:
% OPTIONS(1) controls how much display output is given; set
% to 1 for a tabular display of results, (default is no display: 0).
% OPTIONS(2) is a measure of the precision required for the final point.
% OPTIONS(3) is a measure of the precision required of the gradient.
% OPTIONS(5) specifies the formula for the inverse Hessian update:
% 0=Rank One;
% 1=DFP;
% 2=BFGS;
% OPTIONS(14) is the maximum number of iterations.
% For more information type HELP FOPTIONS.
%
% The next two variants return the value of the final point.
% The last two variants return a vector of the final point and the
% number of iterations.
if nargin ~= 4
  options = [];
  if nargin ~= 3
    disp('Wrong number of arguments.');
    return;
  end
end
numvars = length(xnew);
if length(options) >= 14
  if options(14)==0
    options(14)=1000*numvars;
  end
else
  options(14)=1000*numvars;
end
clc;
format compact;
format short e;
options = foptions(options);
print = options(1);
epsilonx = options(2);
epsilong = options(3);
maxiter=options(14);
resetcnt = 0;
gcurr=feval(grad,xnew);
if norm(gcurr) <= epsilong
  disp('Terminating: Norm of initial gradient less than');
  disp(epsilong);
  x=xnew; N=0; %assign outputs before the early return
  return;
end %if
d=-H*gcurr;
for k = 1:maxiter,
  xcurr=xnew;
  alpha=linesearchsecant(grad,xcurr,d);
  xnew = xcurr+alpha*d;
  if print,
    disp('Iteration number k =')
    disp(k); %print iteration index k
    disp('alpha =');
    disp(alpha); %print alpha
    disp('Gradient = ');
    disp(gcurr'); %print gradient
    disp('New point =');
    disp(xnew'); %print new point
  end %if
  if norm(xnew-xcurr) <= epsilonx*norm(xcurr)
    disp('Terminating: Norm of difference between iterates less than');
    disp(epsilonx);
    break;
  end %if
  gold=gcurr;
  gcurr=feval(grad,xnew);
  if norm(gcurr) <= epsilong
    disp('Terminating: Norm of gradient less than');
    disp(epsilong);
    break;
  end %if
  p=alpha*d;      %p is Delta x
  q=gcurr-gold;   %q is Delta g
  resetcnt = resetcnt+1;
  if resetcnt == 3*numvars
    d=-gcurr;
    resetcnt = 0;
  else
    if options(5)==0 %Rank One
      % denominator: q'*(p-H*q)
      H = H+(p-H*q)*(p-H*q)'/(q'*(p-H*q));
    elseif options(5)==1 %DFP
      H = H+p*p'/(p'*q)-(H*q)*(H*q)'/(q'*H*q);
    else %BFGS
      H = H+(1+q'*H*q/(q'*p))*p*p'/(p'*q)-(H*q*p'+(H*q*p')')/(q'*p);
    end %if
    d=-H*gcurr;
  end
  if print,
    disp('New H =');
    disp(H);
    disp('New d =');
    disp(d);
  end
  if k == maxiter
    disp('Terminating with maximum number of iterations');
  end %if
end %for
if nargout >= 1
  x=xnew;
  if nargout == 2
    N=k;
  end
else
  disp('Final point =');
  disp(xnew');
  disp('Number of iterations =');
  disp(k);
end %if

We created the following M-file, g.m, for the gradient of Rosenbrock's function:
function y=g(x)
y=[-400*(x(2)-x(1).^2)*x(1)-2*(1-x(1)), 200*(x(2)-x(1).^2)]';
We tested the above routine as follows:
options(2)=10^(-7);
options(3)=10^(-7);
options(14)=100;
x0=[-2;2];
H0=eye(2);
options(5)=0;
quasinewton('g',x0,H0,options);
Terminating: Norm of difference between iterates less than
1.0000e-07
Final point =
1.0000e+00 1.0000e+00
Number of iterations =
8
options(5)=1;
quasinewton('g',x0,H0,options);
Terminating: Norm of difference between iterates less than
1.0000e-07
Final point =
1.0000e+00 1.0000e+00
Number of iterations =
8
options(5)=2;
quasinewton('g',x0,H0,options);
Terminating: Norm of difference between iterates less than
1.0000e-07
Final point =
1.0000e+00 1.0000e+00
Number of iterations =
8
The reader is again cautioned not to draw any conclusions about the superiority or inferiority of any of the formulas for $H_k$ based only on the above single numerical experiment.
11.18
a. The plot of the level sets of $f$ was obtained using the following MATLAB commands:
[X,Y]=meshdom(-2:0.1:2, -1:0.1:3);
Z=X.^4/4+Y.^2/2 -X.*Y +X-Y;
V=[-0.72, -0.6, -0.2, 0.5, 2];
contour(X,Y,Z,V)
The plot is depicted below:
[Figure: level sets of f; horizontal axis x1 from -2 to 2, vertical axis x2 from -1 to 3.]
b. With the initial condition $[0, 0]^\top$, the algorithm converges to $[-1, 0]^\top$, while with the initial condition $[1.5, 1]^\top$, the algorithm converges to $[1, 2]^\top$. These two points are the two strict local minimizers of $f$ (as can be checked using the SOSC). The algorithm apparently converges to the minimizer "closer" to the initial point.
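The SOSC check mentioned in part b takes only a few lines. The sketch below assumes the objective encoded in the MATLAB expression for Z, namely $f(x) = x_1^4/4 + x_2^2/2 - x_1x_2 + x_1 - x_2$; the function names are illustrative.

```python
def f(x1, x2):
    return x1**4 / 4 + x2**2 / 2 - x1 * x2 + x1 - x2

def grad(x1, x2):
    return (x1**3 - x2 + 1, x2 - x1 - 1)

def hessian(x1, x2):
    return ((3 * x1**2, -1), (-1, 1))

def is_strict_local_min(x1, x2):
    """SOSC in two variables: zero gradient and positive definite Hessian
    (both leading principal minors positive)."""
    g1, g2 = grad(x1, x2)
    (h11, h12), (_, h22) = hessian(x1, x2)
    return g1 == 0 and g2 == 0 and h11 > 0 and h11 * h22 - h12 * h12 > 0

# The three stationary points: x2 = x1 + 1 with x1 in {-1, 0, 1}.
for point in [(-1, 0), (0, 1), (1, 2)]:
    print(point, f(*point), is_strict_local_min(*point))
```

The points $[-1, 0]^\top$ and $[1, 2]^\top$ pass the test with the same objective value $-0.75$, while the stationary point $[0, 1]^\top$ fails it.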
12. Solving $Ax = b$

12.1
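Write the least squares cost in the usual notation $\|Ax - b\|^2$ where
$$A = \begin{bmatrix} 3 \\ 5 \\ 6 \end{bmatrix}, \qquad x = [m], \qquad b = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}.$$
The least squares estimate of the mass is
$$m^* = (A^\top A)^{-1} A^\top b = \frac{31}{70}.$$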
12.2
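Write the least squares cost in the usual notation $\|Ax - b\|^2$ where
$$A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 4 \end{bmatrix}, \qquad x = \begin{bmatrix} a \\ b \end{bmatrix}, \qquad b = \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix}.$$
The least squares estimate for $[a, b]^\top$ is
$$\begin{bmatrix} a^* \\ b^* \end{bmatrix} = (A^\top A)^{-1} A^\top b = \begin{bmatrix} 3 & 7 \\ 7 & 21 \end{bmatrix}^{-1}\begin{bmatrix} 12 \\ 31 \end{bmatrix} = \frac{1}{14}\begin{bmatrix} 21 & -7 \\ -7 & 3 \end{bmatrix}\begin{bmatrix} 12 \\ 31 \end{bmatrix} = \frac{1}{14}\begin{bmatrix} 35 \\ 9 \end{bmatrix} = \begin{bmatrix} 5/2 \\ 9/14 \end{bmatrix}.$$

12.3
a. We form
$$A = \begin{bmatrix} 1^2/2 \\ 2^2/2 \\ 3^2/2 \end{bmatrix}, \qquad b = \begin{bmatrix} 5.00 \\ 19.5 \\ 44.0 \end{bmatrix}.$$
The least squares estimate of $g$ is then given by
$$g = (A^\top A)^{-1} A^\top b = 9.776.$$
b. We start with $P_0 = 0.040816$, and $x^{(0)} = 9.776$. We have $a_1 = 4^2/2 = 8$, and $b^{(1)} = 78.5$. Using the RLS formula, we get $x^{(1)} = 9.802$, which is our updated estimate of $g$.
12.4
Let $x = [x_1, x_2, \ldots, x_n]^\top$ and $y = [y_1, y_2, \ldots, y_n]^\top$. This least-squares estimation problem can be expressed as
$$\text{minimize } \|\alpha x - y\|^2,$$
with $\alpha$ as the decision variable. Assuming that $x \ne 0$, the solution is unique and is given by
$$\alpha^* = (x^\top x)^{-1} x^\top y = \frac{x^\top y}{x^\top x} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}.$$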
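The numbers in Exercises 12.2 and 12.3 can be reproduced with a short script. This is a sketch only: `normal_equations_2` solves the 2-by-2 normal equations by Cramer's rule, `rls_scalar_update` is the scalar special case of the recursive least squares update, and both names are hypothetical.

```python
from fractions import Fraction as Fr

def normal_equations_2(rows, rhs):
    """Least squares for a p-by-2 matrix: solve (A'A) x = A'b by Cramer's rule."""
    s11 = sum(r[0] * r[0] for r in rows)
    s12 = sum(r[0] * r[1] for r in rows)
    s22 = sum(r[1] * r[1] for r in rows)
    t1 = sum(r[0] * y for r, y in zip(rows, rhs))
    t2 = sum(r[1] * y for r, y in zip(rows, rhs))
    det = s11 * s22 - s12 * s12
    return [(s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det]

def rls_scalar_update(P, x, a, b):
    """One recursive least squares step for a scalar unknown and a new data pair (a, b)."""
    P_new = 1.0 / (1.0 / P + a * a)
    return P_new, x + P_new * a * (b - a * x)

# Exercise 12.2: rows of A and right-hand side b, in exact arithmetic.
rows = [(Fr(1), Fr(1)), (Fr(1), Fr(2)), (Fr(1), Fr(4))]
estimate = normal_equations_2(rows, [Fr(3), Fr(4), Fr(5)])

# Exercise 12.3: batch estimate from three measurements, then one RLS update.
A = [0.5, 2.0, 4.5]
b = [5.00, 19.5, 44.0]
P0 = 1.0 / sum(a * a for a in A)
g0 = P0 * sum(a * y for a, y in zip(A, b))
P1, g1 = rls_scalar_update(P0, g0, 8.0, 78.5)
print(estimate, g0, g1)
```

The recursive update uses only the previous estimate and the new measurement, yet gives the same value a batch fit over all four measurements would.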

12.5
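The least squares estimate of $R$ is the least squares solution to
$$1 \cdot R = V_1, \quad \ldots, \quad 1 \cdot R = V_n.$$
Therefore, the least squares solution is
$$R^* = \left([1, \ldots, 1]\begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}\right)^{-1} [1, \ldots, 1]\begin{bmatrix} V_1 \\ \vdots \\ V_n \end{bmatrix} = \frac{V_1 + \cdots + V_n}{n}.$$
12.6
We represent the data in the table and the decision variables $a$ and $b$ using the usual least squares matrix notation:
$$A = \begin{bmatrix} 1 & 2 \\ 1 & 1 \\ 3 & 2 \end{bmatrix}, \qquad b = \begin{bmatrix} 6 \\ 4 \\ 5 \end{bmatrix}, \qquad x = \begin{bmatrix} a \\ b \end{bmatrix}.$$
The least squares estimate is given by
$$x^* = \begin{bmatrix} a^* \\ b^* \end{bmatrix} = (A^\top A)^{-1} A^\top b = \begin{bmatrix} 11 & 9 \\ 9 & 9 \end{bmatrix}^{-1}\begin{bmatrix} 25 \\ 26 \end{bmatrix} = \frac{1}{18}\begin{bmatrix} 9 & -9 \\ -9 & 11 \end{bmatrix}\begin{bmatrix} 25 \\ 26 \end{bmatrix} = \begin{bmatrix} -1/2 \\ 61/18 \end{bmatrix}.$$
12.7
The problem can be formulated as a least-squares problem with
$$A = \begin{bmatrix} 0.3 & 0.1 \\ 0.4 & 0.2 \\ 0.3 & 0.7 \end{bmatrix}, \qquad b = \begin{bmatrix} 5 \\ 3 \\ 4 \end{bmatrix},$$
where the decision variable is $x = [x_1, x_2]^\top$, and $x_1$ and $x_2$ are the amounts of A and B, respectively. After some algebra, we obtain the solution:
$$x^* = (A^\top A)^{-1} A^\top b = \frac{1}{(0.34)(0.54) - (0.32)^2}\begin{bmatrix} 0.54 & -0.32 \\ -0.32 & 0.34 \end{bmatrix}\begin{bmatrix} 3.9 \\ 3.9 \end{bmatrix}.$$
Since we are only interested in the ratio of the first component of $x^*$ to the second component, we need only explicitly compute:
$$\text{Ratio} = \frac{0.54 - 0.32}{-0.32 + 0.34} = \frac{0.22}{0.02} = 11.$$
12.8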
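For each $k$, we can write
$$y_k = a y_{k-1} + b u_k + v_k = a^2 y_{k-2} + ab u_{k-1} + a v_{k-1} + b u_k + v_k = \cdots = a^{k-1} b u_1 + a^{k-2} b u_2 + \cdots + b u_k + a^{k-1} v_1 + a^{k-2} v_2 + \cdots + v_k.$$
Write $u = [u_1, \ldots, u_n]^\top$, $v = [v_1, \ldots, v_n]^\top$, and $y = [y_1, \ldots, y_n]^\top$. Then, $y = Cu + Dv$, where
$$C = \begin{bmatrix} b & 0 & \cdots & 0 \\ ab & b & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ a^{n-1}b & \cdots & ab & b \end{bmatrix}, \qquad D = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ a & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ a^{n-1} & a^{n-2} & \cdots & 1 \end{bmatrix}.$$
Write $\mathbf{b} = D^{-1} y$ and $A = D^{-1} C$ so that $\mathbf{b} = Au + v$. Therefore, the linear least-squares estimate of $u$ given $y$ is
$$u^* = (A^\top A)^{-1} A^\top \mathbf{b} = (C^\top D^{-\top} D^{-1} C)^{-1} C^\top D^{-\top} D^{-1} y.$$
But $C = bD$. Hence,
$$u^* = \frac{1}{b} D^{-1} y = \frac{1}{b}\begin{bmatrix} 1 & 0 & \cdots & 0 \\ -a & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & -a & 1 \end{bmatrix} y.$$
(Notice that $D^{-1}$ has the simple form shown above.)
An alternative solution is first to define $z = [z_1, \ldots, z_n]^\top$ by $z_k = y_k - a y_{k-1}$. Then, we have $z = bu + v$. Therefore, the linear least-squares estimate of $u$ given $y$ (or, equivalently, $z$) is
$$u^* = \frac{1}{b} z = \frac{1}{b}\begin{bmatrix} 1 & 0 & \cdots & 0 \\ -a & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & -a & 1 \end{bmatrix} y.$$
12.9
Define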
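$$X = \begin{bmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_p & 1 \end{bmatrix}, \qquad y = \begin{bmatrix} y_1 \\ \vdots \\ y_p \end{bmatrix}.$$
Since the $x_i$ are not all equal, we have $\operatorname{rank} X = 2$. The objective function can be written as
$$f(a, b) = \left\| X \begin{bmatrix} a \\ b \end{bmatrix} - y \right\|^2.$$
Therefore, by Theorem 12.1 there exists a unique minimizer $[a^*, b^*]^\top$ given by
$$\begin{bmatrix} a^* \\ b^* \end{bmatrix} = (X^\top X)^{-1} X^\top y = \begin{bmatrix} \sum_{i=1}^{p} x_i^2 & \sum_{i=1}^{p} x_i \\ \sum_{i=1}^{p} x_i & p \end{bmatrix}^{-1}\begin{bmatrix} \sum_{i=1}^{p} x_i y_i \\ \sum_{i=1}^{p} y_i \end{bmatrix} = \begin{bmatrix} \overline{X^2} & \overline{X} \\ \overline{X} & 1 \end{bmatrix}^{-1}\begin{bmatrix} \overline{XY} \\ \overline{Y} \end{bmatrix}$$
$$= \frac{1}{\overline{X^2} - \overline{X}^2}\begin{bmatrix} 1 & -\overline{X} \\ -\overline{X} & \overline{X^2} \end{bmatrix}\begin{bmatrix} \overline{XY} \\ \overline{Y} \end{bmatrix} = \begin{bmatrix} \dfrac{\overline{XY} - \overline{X}\,\overline{Y}}{\overline{X^2} - \overline{X}^2} \\[2ex] \dfrac{\overline{X^2}\,\overline{Y} - \overline{X}\,\overline{XY}}{\overline{X^2} - \overline{X}^2} \end{bmatrix}.$$
As we can see, the solution does not depend on $\overline{Y^2}$.
12.10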
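a. We wish to find $\omega$ and $\theta$ such that
\[
\begin{aligned}
\sin(\omega t_1 + \theta) &= y_1 \\
&\;\;\vdots \\
\sin(\omega t_p + \theta) &= y_p .
\end{aligned}
\]
Taking arcsin, we get the following system of linear equations:
\[
\begin{aligned}
\omega t_1 + \theta &= \arcsin y_1 \\
&\;\;\vdots \\
\omega t_p + \theta &= \arcsin y_p .
\end{aligned}
\]
b. We may write the system of linear equations in part a as $Ax = b$, where
\[
A = \begin{bmatrix} t_1 & 1 \\ \vdots & \vdots \\ t_p & 1 \end{bmatrix}, \qquad
x = \begin{bmatrix} \omega \\ \theta \end{bmatrix}, \qquad
b = \begin{bmatrix} \arcsin y_1 \\ \vdots \\ \arcsin y_p \end{bmatrix}.
\]
Since the $t_i$ are not all equal, the first column of $A$ is not a scalar multiple of the second column. Therefore, $\operatorname{rank} A = 2$. Hence, the least squares solution is
\[
\begin{aligned}
x^* &= (A^\top A)^{-1} A^\top b
= \begin{bmatrix} \sum_{i=1}^p t_i^2 & \sum_{i=1}^p t_i \\ \sum_{i=1}^p t_i & p \end{bmatrix}^{-1}
\begin{bmatrix} \sum_{i=1}^p t_i \arcsin y_i \\ \sum_{i=1}^p \arcsin y_i \end{bmatrix} \\
&= \begin{bmatrix} \overline{T^2} & \overline{T} \\ \overline{T} & 1 \end{bmatrix}^{-1}
\begin{bmatrix} \overline{TY} \\ \overline{Y} \end{bmatrix}
= \frac{1}{\overline{T^2} - (\overline{T})^2}
\begin{bmatrix} 1 & -\overline{T} \\ -\overline{T} & \overline{T^2} \end{bmatrix}
\begin{bmatrix} \overline{TY} \\ \overline{Y} \end{bmatrix} \\
&= \frac{1}{\overline{T^2} - (\overline{T})^2}
\begin{bmatrix} \overline{TY} - \overline{T}\,\overline{Y} \\ -\overline{T}\,\overline{TY} + \overline{T^2}\,\overline{Y} \end{bmatrix},
\end{aligned}
\]
where the overbars denote averages over $i = 1,\ldots,p$, with $\arcsin y_i$ in the role of $Y$.
12.11
The given line can be expressed as the range of the matrix $A = [1, m]^\top$. Let $b = [x_0, y_0]^\top$ be the given point. Therefore, the problem is a linear least squares problem of minimizing $\|Ax - b\|^2$. The solution is given by
\[
x^* = (A^\top A)^{-1} A^\top b = \frac{x_0 + m y_0}{1 + m^2} .
\]
Therefore, the point on the straight line that is closest to the given point $[x_0, y_0]^\top$ is given by $[x^*, m x^*]^\top$.
12.12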
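a. Write
\[
A = \begin{bmatrix} x_1^\top & 1 \\ \vdots & \vdots \\ x_p^\top & 1 \end{bmatrix} \in \mathbb{R}^{p \times (n+1)}, \qquad
z = \begin{bmatrix} a \\ c \end{bmatrix} \in \mathbb{R}^{n+1}, \qquad
b = \begin{bmatrix} y_1 \\ \vdots \\ y_p \end{bmatrix} \in \mathbb{R}^p .
\]
The objective function can then be written as $\|Az - b\|^2$.
b. Let $X = [x_1,\ldots,x_p]^\top \in \mathbb{R}^{p \times n}$, and $e = [1,\ldots,1]^\top \in \mathbb{R}^p$. Then we may write $A = [X \;\; e]$. The solution to the problem is $(A^\top A)^{-1} A^\top b$. But
\[
A^\top A = \begin{bmatrix} X^\top X & X^\top e \\ e^\top X & p \end{bmatrix}
= \begin{bmatrix} X^\top X & 0 \\ 0 & p \end{bmatrix},
\]
since $X^\top e = x_1 + \cdots + x_p = 0$ by assumption. Also, writing $y = b$,
\[
A^\top y = \begin{bmatrix} X^\top y \\ e^\top y \end{bmatrix} = \begin{bmatrix} 0 \\ e^\top y \end{bmatrix},
\]
since $X^\top y = y_1 x_1 + \cdots + y_p x_p = 0$ by assumption. Therefore, the solution is given by
\[
z^* = (A^\top A)^{-1} A^\top b
= \begin{bmatrix} (X^\top X)^{-1} & 0 \\ 0 & \tfrac{1}{p} \end{bmatrix}
\begin{bmatrix} 0 \\ e^\top y \end{bmatrix}
= \begin{bmatrix} 0 \\ \tfrac{1}{p} e^\top y \end{bmatrix}.
\]
The affine function of best fit is the constant function $f(x) = c$, where
\[
c = \frac{1}{p} \sum_{i=1}^p y_i .
\]
12.13
a. Using the least squares formula, we have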
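\[
\hat{\theta}_n = \left( [u_1,\ldots,u_n] \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix} \right)^{-1}
[u_1,\ldots,u_n] \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}
= \frac{\sum_{k=1}^n u_k y_k}{\sum_{k=1}^n u_k^2} .
\]
b. Given $u_k = 1$ for all $k$, we have
\[
\hat{\theta}_n = \frac{1}{n}\sum_{k=1}^n y_k = \frac{1}{n}\sum_{k=1}^n (\theta + e_k) = \theta + \frac{1}{n}\sum_{k=1}^n e_k .
\]
Hence, $\hat{\theta}_n \to \theta$ if and only if $\lim_{n\to\infty} \frac{1}{n}\sum_{k=1}^n e_k = 0$.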
12.14
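We pose the problem as a least squares problem: minimize $\|Ax - \mathbf{b}\|^2$, where $x = [a, b]^\top$, and
\[
A = \begin{bmatrix} x_0 & 1 \\ x_1 & 1 \\ x_2 & 1 \end{bmatrix}, \qquad
\mathbf{b} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.
\]
We have
\[
A^\top A = \begin{bmatrix} \sum_{i=0}^2 x_i^2 & \sum_{i=0}^2 x_i \\ \sum_{i=0}^2 x_i & 3 \end{bmatrix}, \qquad
A^\top \mathbf{b} = \begin{bmatrix} \sum_{i=0}^2 x_i x_{i+1} \\ \sum_{i=0}^2 x_{i+1} \end{bmatrix}.
\]
Therefore, the least squares solution is
\[
\begin{bmatrix} a \\ b \end{bmatrix}
= \begin{bmatrix} \sum_{i=0}^2 x_i^2 & \sum_{i=0}^2 x_i \\ \sum_{i=0}^2 x_i & 3 \end{bmatrix}^{-1}
\begin{bmatrix} \sum_{i=0}^2 x_i x_{i+1} \\ \sum_{i=0}^2 x_{i+1} \end{bmatrix}
= \begin{bmatrix} 5 & 3 \\ 3 & 3 \end{bmatrix}^{-1} \begin{bmatrix} 18 \\ 11 \end{bmatrix}
= \begin{bmatrix} 7/2 \\ 1/6 \end{bmatrix}.
\]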

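12.15
We pose the problem as a least squares problem: minimize $\|Ax - \mathbf{b}\|^2$, where $x = [a, b]^\top$, and
\[
A = \begin{bmatrix} 0 & 1 \\ h_1 & 0 \\ \vdots & \vdots \\ h_{n-1} & 0 \end{bmatrix}, \qquad
\mathbf{b} = \begin{bmatrix} h_1 \\ h_2 \\ \vdots \\ h_n \end{bmatrix}
\]
(note that $h_0 = 0$). We have
\[
A^\top A = \begin{bmatrix} \sum_{i=1}^{n-1} h_i^2 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
A^\top \mathbf{b} = \begin{bmatrix} \sum_{i=1}^{n-1} h_i h_{i+1} \\ h_1 \end{bmatrix}.
\]
The matrix $A^\top A$ is nonsingular because we assume that at least one $h_k$ is nonzero. Therefore, the least squares solution is
\[
\begin{bmatrix} a \\ b \end{bmatrix}
= \begin{bmatrix} \sum_{i=1}^{n-1} h_i^2 & 0 \\ 0 & 1 \end{bmatrix}^{-1}
\begin{bmatrix} \sum_{i=1}^{n-1} h_i h_{i+1} \\ h_1 \end{bmatrix}
= \begin{bmatrix} \dfrac{\sum_{i=1}^{n-1} h_i h_{i+1}}{\sum_{i=1}^{n-1} h_i^2} \\[2ex] h_1 \end{bmatrix}.
\]
12.16
We pose the problem as a least squares problem: minimize $\|Ax - \mathbf{b}\|^2$, where $x = [a, b]^\top$, and
\[
A = \begin{bmatrix} 0 & 1 \\ s_1 & 1 \\ \vdots & \vdots \\ s_{n-1} & 1 \end{bmatrix}, \qquad
\mathbf{b} = \begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_n \end{bmatrix},
\]
where we use $s_0 = 0$. We have
\[
A^\top A = \begin{bmatrix} \sum_{i=1}^{n-1} s_i^2 & \sum_{i=1}^{n-1} s_i \\ \sum_{i=1}^{n-1} s_i & n \end{bmatrix}, \qquad
A^\top \mathbf{b} = \begin{bmatrix} \sum_{i=1}^{n-1} s_i s_{i+1} \\ \sum_{i=1}^{n} s_i \end{bmatrix}.
\]
The matrix $A^\top A$ is nonsingular because we assume that at least one $s_k$ is nonzero. Therefore, the least squares solution is
\[
\begin{aligned}
\begin{bmatrix} a \\ b \end{bmatrix}
&= \begin{bmatrix} \sum_{i=1}^{n-1} s_i^2 & \sum_{i=1}^{n-1} s_i \\ \sum_{i=1}^{n-1} s_i & n \end{bmatrix}^{-1}
\begin{bmatrix} \sum_{i=1}^{n-1} s_i s_{i+1} \\ \sum_{i=1}^{n} s_i \end{bmatrix} \\
&= \frac{1}{n \sum_{i=1}^{n-1} s_i^2 - \left(\sum_{i=1}^{n-1} s_i\right)^2}
\begin{bmatrix}
n \sum_{i=1}^{n-1} s_i s_{i+1} - \sum_{i=1}^{n-1} s_i \sum_{i=1}^{n} s_i \\[1ex]
-\sum_{i=1}^{n-1} s_i \sum_{i=1}^{n-1} s_i s_{i+1} + \sum_{i=1}^{n-1} s_i^2 \sum_{i=1}^{n} s_i
\end{bmatrix}.
\end{aligned}
\]
12.17
This least-squares estimation problem can be expressed as
\[
\text{minimize } \|a x - y\|^2 .
\]
If $x = 0$, then the problem has an infinite number of solutions: any $a$ solves the problem. Assuming that $x \neq 0$, the solution is unique and is given by
\[
a^* = (x^\top x)^{-1} x^\top y = \frac{x^\top y}{x^\top x} .
\]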

12.18
The solution to this problem is the same as the solution to:
\[
\begin{aligned}
\text{minimize} \quad & \tfrac{1}{2}\|x - b\|^2 \\
\text{subject to} \quad & x \in \mathcal{R}(A).
\end{aligned}
\]
Substituting $x = Ay$, we see that this is simply a linear least squares problem with decision variable $y$. The solution to the least squares problem is $y^* = (A^\top A)^{-1} A^\top b$, which implies that the solution to the given problem is $x^* = A(A^\top A)^{-1} A^\top b$.
12.19
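We solve the problem using two different methods. The first method is to use the Lagrange multiplier technique to solve the equivalent problem,
\[
\begin{aligned}
\text{minimize} \quad & \|x - x_0\|^2 \\
\text{subject to} \quad & h(x) = [1 \;\; 1 \;\; 1]\, x - 1 = 0 .
\end{aligned}
\]
The Lagrangian for the above problem has the form,
\[
l(x, \lambda) = x_1^2 + (x_2 + 3)^2 + x_3^2 + \lambda (x_1 + x_2 + x_3 - 1).
\]
Applying the FONC gives
\[
\nabla_x l = \begin{bmatrix} 2x_1 + \lambda \\ 2x_2 + 6 + \lambda \\ 2x_3 + \lambda \end{bmatrix} = 0
\quad \text{and} \quad x_1 + x_2 + x_3 - 1 = 0 .
\]
Solving the above yields
\[
x^* = \begin{bmatrix} 4/3 \\ -5/3 \\ 4/3 \end{bmatrix}.
\]
The second approach is to use the well-known solution to the minimum norm problem. We first derive a general solution formula for the problem,
\[
\begin{aligned}
\text{minimize} \quad & \|x - x_0\| \\
\text{subject to} \quad & Ax = b,
\end{aligned}
\]
where $A \in \mathbb{R}^{m \times n}$, $m \le n$, and $\operatorname{rank} A = m$. To proceed, we first transform the above problem from the $x$ coordinates into the $z = x - x_0$ coordinates to obtain,
\[
\begin{aligned}
\text{minimize} \quad & \|z\| \\
\text{subject to} \quad & Az = b - Ax_0 .
\end{aligned}
\]
The solution to the above problem has the form,
\[
z^* = A^\top (AA^\top)^{-1} (b - Ax_0) = A^\top (AA^\top)^{-1} b - A^\top (AA^\top)^{-1} A x_0 .
\]
Therefore, the solution to the original problem is
\[
\begin{aligned}
x^* &= A^\top (AA^\top)^{-1} (b - Ax_0) + x_0 \\
&= A^\top (AA^\top)^{-1} b - A^\top (AA^\top)^{-1} A x_0 + x_0 \\
&= A^\top (AA^\top)^{-1} b + \left(I_n - A^\top (AA^\top)^{-1} A\right) x_0 .
\end{aligned}
\]
We substitute into the above formula the given numerical data to obtain
\[
x^* = \frac{1}{3}\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
+ \frac{1}{3}\begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix}
\begin{bmatrix} 0 \\ -3 \\ 0 \end{bmatrix}
= \begin{bmatrix} 4/3 \\ -5/3 \\ 4/3 \end{bmatrix}.
\]
12.20
For each $x \in \mathbb{R}^n$, let $y = x - x_0$. Then, the original problem is equivalent to
\[
\begin{aligned}
\text{minimize} \quad & \|y\| \\
\text{subject to} \quad & Ay = b - Ax_0,
\end{aligned}
\]
in the sense that $y^*$ is a solution to the above problem if and only if $x^* = y^* + x_0$ is a solution to the original problem. By Theorem 12.2, the above problem has a unique solution given by
\[
y^* = A^\top (AA^\top)^{-1} (b - Ax_0) = A^\top (AA^\top)^{-1} b - A^\top (AA^\top)^{-1} A x_0 .
\]
Therefore, the solution to the original problem is
\[
x^* = A^\top (AA^\top)^{-1} b - A^\top (AA^\top)^{-1} A x_0 + x_0 = A^\top (AA^\top)^{-1} b + \left(I_n - A^\top (AA^\top)^{-1} A\right) x_0 .
\]
Note that
\[
\|x^* - x_0\| = \|y^*\| = \|A^\top (AA^\top)^{-1} (b - Ax_0)\| = \|A^\top (AA^\top)^{-1} b - A^\top (AA^\top)^{-1} A x_0\| .
\]
12.21
The objective function of the given problem can be written as
\[
f(x) = \|Bx - c\|^2,
\]
where
\[
B = \begin{bmatrix} A \\ \vdots \\ A \end{bmatrix}, \qquad c = \begin{bmatrix} b_1 \\ \vdots \\ b_p \end{bmatrix}.
\]
The solution is therefore
\[
x^* = (B^\top B)^{-1} B^\top c = (pA^\top A)^{-1} A^\top (b_1 + \cdots + b_p)
= \frac{1}{p}\sum_{i=1}^p (A^\top A)^{-1} A^\top b_i = \frac{1}{p}\sum_{i=1}^p x_i^* .
\]
Alternatively: Write
\[
\|Ax - b_i\|^2 = x^\top A^\top A x - 2 x^\top A^\top b_i + \|b_i\|^2 .
\]
Therefore, the given objective function can be written as
\[
p\, x^\top A^\top A x - 2 x^\top A^\top (b_1 + \cdots + b_p) + \left(\|b_1\|^2 + \cdots + \|b_p\|^2\right).
\]
The solution is therefore
\[
x^* = (pA^\top A)^{-1} A^\top (b_1 + \cdots + b_p) = \frac{1}{p}\sum_{i=1}^p x_i^* .
\]
Note that the original problem can be written as the least squares problem
\[
\text{minimize } \|Ax - b\|^2, \qquad \text{where } b = \frac{b_1 + \cdots + b_p}{p} .
\]
12.22
Write
\[
\|Ax - b_i\|^2 = x^\top A^\top A x - 2 x^\top A^\top b_i + \|b_i\|^2 .
\]
Therefore, the given objective function can be written as
\[
(\alpha_1 + \cdots + \alpha_p)\, x^\top A^\top A x - 2 x^\top A^\top (\alpha_1 b_1 + \cdots + \alpha_p b_p) + \left(\alpha_1\|b_1\|^2 + \cdots + \alpha_p\|b_p\|^2\right).
\]
The solution is therefore (by inspection)
\[
x^* = \frac{1}{\alpha_1 + \cdots + \alpha_p} (A^\top A)^{-1} A^\top (\alpha_1 b_1 + \cdots + \alpha_p b_p) = \sum_{i=1}^p \beta_i x_i^*,
\]
where $\beta_i = \alpha_i / (\alpha_1 + \cdots + \alpha_p)$. Note that the original problem can be written as the least squares problem
\[
\text{minimize } \|Ax - b\|^2, \qquad \text{where } b = \frac{\alpha_1 b_1 + \cdots + \alpha_p b_p}{\alpha_1 + \cdots + \alpha_p} .
\]
12.23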
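Let $x^* = A^\top (AA^\top)^{-1} b$. Suppose $y$ is a point in $\mathcal{R}(A^\top)$ that satisfies $Ay = b$. Then, there exists $z \in \mathbb{R}^m$ such that $y = A^\top z$. Then, subtracting the equation $AA^\top (AA^\top)^{-1} b = b$ from the equation $AA^\top z = b$, we get
\[
AA^\top \left(z - (AA^\top)^{-1} b\right) = 0 .
\]
Since $\operatorname{rank} A = m$, $AA^\top$ is nonsingular. Therefore, $z - (AA^\top)^{-1} b = 0$, which implies that
\[
y = A^\top z = A^\top (AA^\top)^{-1} b = x^* .
\]
Hence, $x^* = A^\top (AA^\top)^{-1} b$ is the only vector in $\mathcal{R}(A^\top)$ that satisfies $Ax = b$.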
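When $A$ has a single row $a^\top$, the matrix $AA^\top$ is the scalar $a^\top a$ and the formulas above can be evaluated by hand or in a few lines of code. The Python sketch below covers that special case only, including the shifted variant $x^* = x_0 + a\,(b - a^\top x_0)/(a^\top a)$ from Exercise 12.19. The function name is hypothetical, and the point $x_0 = [0, -3, 0]^\top$ is an assumption that reproduces the answer $[4/3, -5/3, 4/3]^\top$ obtained there.

```python
from fractions import Fraction

def closest_point_on_hyperplane(a, b, x0):
    """Point of {x : a.x = b} closest to x0 (single-row case of A^T (A A^T)^{-1})."""
    a = [Fraction(v) for v in a]
    x0 = [Fraction(v) for v in x0]
    aa = sum(ai * ai for ai in a)            # A A^T, a scalar in this case
    if aa == 0:
        raise ValueError("a must be nonzero (rank A = 1)")
    residual = Fraction(b) - sum(ai * xi for ai, xi in zip(a, x0))  # b - A x0
    return [xi + ai * residual / aa for ai, xi in zip(a, x0)]

# Assumed data of Exercise 12.19: plane x1 + x2 + x3 = 1, point x0 = [0, -3, 0].
x_star = closest_point_on_hyperplane([1, 1, 1], 1, [0, -3, 0])
print(x_star)

# With x0 = 0 the same formula gives the minimum-norm solution A^T (A A^T)^{-1} b.
x_min = closest_point_on_hyperplane([1, 1, 1], 1, [0, 0, 0])
print(x_min)
```

For a general $A$ with several rows, the scalar division would be replaced by solving a linear system with the matrix $AA^\top$.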
12.24
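a. We have
\[
x^{(0)} = (A_0^\top A_0)^{-1} A_0^\top b^{(0)} = G_0^{-1} A_0^\top b^{(0)} .
\]
Similarly,
\[
x^{(1)} = (A_1^\top A_1)^{-1} A_1^\top b^{(1)} = G_1^{-1} A_1^\top b^{(1)} .
\]
b. Now,
\[
G_0 = \begin{bmatrix} A_1^\top & a_1 \end{bmatrix} \begin{bmatrix} A_1 \\ a_1^\top \end{bmatrix}
= A_1^\top A_1 + a_1 a_1^\top = G_1 + a_1 a_1^\top .
\]
Hence,
\[
G_1 = G_0 - a_1 a_1^\top .
\]
c. Using the Sherman-Morrison formula,
\[
P_1 = G_1^{-1} = (G_0 - a_1 a_1^\top)^{-1}
= G_0^{-1} + \frac{G_0^{-1} a_1 a_1^\top G_0^{-1}}{1 - a_1^\top G_0^{-1} a_1}
= P_0 + \frac{P_0 a_1 a_1^\top P_0}{1 - a_1^\top P_0 a_1} .
\]
d. We have
\[
A_0^\top b^{(0)} = G_0 G_0^{-1} A_0^\top b^{(0)} = G_0 x^{(0)} = (G_1 + a_1 a_1^\top) x^{(0)} = G_1 x^{(0)} + a_1 a_1^\top x^{(0)} .
\]
e. Finally,
\[
\begin{aligned}
x^{(1)} &= G_1^{-1} A_1^\top b^{(1)} \\
&= G_1^{-1} \left(A_0^\top b^{(0)} - a_1 b_1\right) \\
&= G_1^{-1} \left(G_1 x^{(0)} + a_1 a_1^\top x^{(0)} - a_1 b_1\right) \\
&= x^{(0)} - G_1^{-1} a_1 \left(b_1 - a_1^\top x^{(0)}\right) \\
&= x^{(0)} - P_1 a_1 \left(b_1 - a_1^\top x^{(0)}\right).
\end{aligned}
\]
The general RLS algorithm for removals of rows is:
\[
\begin{aligned}
P_{k+1} &= P_k + \frac{P_k a_{k+1} a_{k+1}^\top P_k}{1 - a_{k+1}^\top P_k a_{k+1}}, \\
x^{(k+1)} &= x^{(k)} - P_{k+1} a_{k+1} \left(b_{k+1} - a_{k+1}^\top x^{(k)}\right).
\end{aligned}
\]
12.25
Using the notation of the proof of Theorem 12.3, we can write
\[
x^{(k+1)} = x^{(k)} + \mu \left(b_{R(k)+1} - a_{R(k)+1}^\top x^{(k)}\right) \frac{a_{R(k)+1}}{\|a_{R(k)+1}\|^2} .
\]
Hence,
\[
x^{(k)} = \sum_{i=0}^{k-1} \frac{\mu}{\|a_{R(i)+1}\|^2} \left(b_{R(i)+1} - a_{R(i)+1}^\top x^{(i)}\right) a_{R(i)+1},
\]
which means that $x^{(k)}$ is in $\operatorname{span}[a_1, \ldots, a_m] = \mathcal{R}(A^\top)$.
12.26
a. We claim that $x^*$ minimizes $\|x - x_0\|$ subject to $\{x : Ax = b\}$ if and only if $y^* = x^* - x_0$ minimizes $\|y\|$ subject to $Ay = b - Ax_0$.
To prove sufficiency, suppose $y^*$ minimizes $\|y\|$ subject to $Ay = b - Ax_0$. Let $x^* = y^* + x_0$. Consider any point $x_1 \in \{x : Ax = b\}$. Now,
\[
A(x_1 - x_0) = b - Ax_0 .
\]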
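Hence, by definition of $y^*$,
\[
\|x_1 - x_0\| \ge \|y^*\| = \|x^* - x_0\| .
\]
Therefore $x^*$ minimizes $\|x - x_0\|$ subject to $\{x : Ax = b\}$.
To prove necessity, suppose $x^*$ minimizes $\|x - x_0\|$ subject to $\{x : Ax = b\}$. Let $y^* = x^* - x_0$. Consider any point $y_1 \in \{y : Ay = b - Ax_0\}$. Now,
\[
A(y_1 + x_0) = b .
\]
Hence, by definition of $x^*$,
\[
\|y_1\| = \|(y_1 + x_0) - x_0\| \ge \|x^* - x_0\| = \|y^*\| .
\]
Therefore, $y^*$ minimizes $\|y\|$ subject to $Ay = b - Ax_0$.
By Theorem 12.2, there exists a unique vector $y^*$ minimizing $\|y\|$ subject to $Ay = b - Ax_0$. Hence, by the above claim, there exists a unique $x^*$ minimizing $\|x - x_0\|$ subject to $\{x : Ax = b\}$.
b. Using the notation of the proof of Theorem 12.3 (taking $\|a_i\| = 1$, as in that proof), Kaczmarz's algorithm is given by
\[
x^{(k+1)} = x^{(k)} + \mu \left(b_{R(k)+1} - a_{R(k)+1}^\top x^{(k)}\right) a_{R(k)+1} .
\]
Subtract $x_0$ from each side to give
\[
x^{(k+1)} - x_0 = x^{(k)} - x_0 + \mu \left( \left(b_{R(k)+1} - a_{R(k)+1}^\top x_0\right) - a_{R(k)+1}^\top \left(x^{(k)} - x_0\right) \right) a_{R(k)+1} .
\]
Writing $y^{(k)} = x^{(k)} - x_0$, we get
\[
y^{(k+1)} = y^{(k)} + \mu \left( \left(b_{R(k)+1} - a_{R(k)+1}^\top x_0\right) - a_{R(k)+1}^\top y^{(k)} \right) a_{R(k)+1} .
\]
Note that $y^{(0)} = 0$. By Theorem 12.3, the sequence $\{y^{(k)}\}$ converges to the unique point $y^*$ that minimizes $\|y\|$ subject to $Ay = b - Ax_0$. Hence $\{x^{(k)}\}$ converges to $y^* + x_0$. From the proof of part a, $x^* = y^* + x_0$ minimizes $\|x - x_0\|$ subject to $\{x : Ax = b\}$. This completes the proof.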
12.27
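Following the proof of Theorem 12.3, assuming $\|a\| = 1$ without loss of generality, we arrive at
\[
\|x^{(k+1)} - x^*\|^2 = \|x^{(k)} - x^*\|^2 - \mu(2 - \mu)\left(a^\top (x^{(k)} - x^*)\right)^2 .
\]
Since $x^{(k)}, x^* \in \mathcal{R}(A^\top) = \mathcal{R}([a])$ by Exercise 12.25, we have $x^{(k)} - x^* \in \mathcal{R}(A^\top)$. Hence, by the Cauchy-Schwarz inequality (which holds with equality here because $x^{(k)} - x^*$ is a scalar multiple of $a$),
\[
\left(a^\top (x^{(k)} - x^*)\right)^2 = \|a\|^2 \|x^{(k)} - x^*\|^2 = \|x^{(k)} - x^*\|^2,
\]
since $\|a\| = 1$ by assumption. Thus, we obtain
\[
\|x^{(k+1)} - x^*\|^2 = \left(1 - \mu(2 - \mu)\right) \|x^{(k)} - x^*\|^2 = \gamma^2 \|x^{(k)} - x^*\|^2,
\]
where $\gamma = \sqrt{1 - \mu(2 - \mu)}$. It is easy to check that $0 \le 1 - \mu(2 - \mu) < 1$ for all $\mu \in (0, 2)$. Hence, $0 \le \gamma < 1$.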
12.28
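In Kaczmarz's algorithm with $\mu = 1$, we may write
\[
x^{(k+1)} = x^{(k)} + \left(b_{R(k)+1} - a_{R(k)+1}^\top x^{(k)}\right) \frac{a_{R(k)+1}}{\|a_{R(k)+1}\|^2} .
\]
Subtracting $x^*$ and premultiplying both sides by $a_{R(k)+1}^\top$ yields
\[
\begin{aligned}
a_{R(k)+1}^\top \left(x^{(k+1)} - x^*\right)
&= a_{R(k)+1}^\top \left(x^{(k)} - x^*\right) + \left(b_{R(k)+1} - a_{R(k)+1}^\top x^{(k)}\right) \frac{a_{R(k)+1}^\top a_{R(k)+1}}{\|a_{R(k)+1}\|^2} \\
&= a_{R(k)+1}^\top x^{(k)} - a_{R(k)+1}^\top x^* + b_{R(k)+1} - a_{R(k)+1}^\top x^{(k)} \\
&= b_{R(k)+1} - a_{R(k)+1}^\top x^* \\
&= 0,
\end{aligned}
\]
where the last equality follows by substituting $a_{R(k)+1}^\top x^* = b_{R(k)+1}$. This is the desired result.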
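The identity just derived can be observed numerically. The Python sketch below implements the Kaczmarz iteration as written above, starting from $x^{(0)} = 0$ and cycling through the rows of $A$. The 2-by-3 system is illustrative data (an assumption, not taken from the exercises), and the function name `kaczmarz` is likewise a choice made here.

```python
def kaczmarz(A, b, mu=1.0, sweeps=200):
    """Kaczmarz iteration from x = 0, cycling through the rows of A."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for k in range(sweeps * m):
        row = A[k % m]                       # a_{R(k)+1}
        rhs = b[k % m]                       # b_{R(k)+1}
        norm_sq = sum(v * v for v in row)
        step = mu * (rhs - sum(r * xi for r, xi in zip(row, x))) / norm_sq
        x = [xi + step * r for xi, r in zip(x, row)]
    return x

# Illustrative full-row-rank system (assumed data).
A = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
b = [2.0, 2.0]

x = kaczmarz(A, b)
print(x)  # approaches the minimum-norm solution [2/3, 4/3, 2/3]

# After one sweep with mu = 1, the row used last is satisfied exactly.
x_one = kaczmarz(A, b, sweeps=1)
print(sum(r * xi for r, xi in zip(A[1], x_one)) - b[1])
```

For this system, $A^\top (AA^\top)^{-1} b = [2/3, 4/3, 2/3]^\top$, which is the limit predicted by Theorem 12.3.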
12.29
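We will prove this by contradiction. Suppose $Cx^*$ is not the minimizer of $\|By - b\|^2$ over $\mathbb{R}^r$. Let $\hat{y}$ be the minimizer of $\|By - b\|^2$ over $\mathbb{R}^r$. Then, $\|B\hat{y} - b\|^2 < \|BCx^* - b\|^2 = \|Ax^* - b\|^2$. Since $C$ is of full rank, there exists $\hat{x} \in \mathbb{R}^n$ such that $\hat{y} = C\hat{x}$. Therefore,
\[
\|A\hat{x} - b\|^2 = \|BC\hat{x} - b\|^2 = \|B\hat{y} - b\|^2 < \|Ax^* - b\|^2,
\]
which contradicts the assumption that $x^*$ is a minimizer of $\|Ax - b\|^2$ over $\mathbb{R}^n$.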
12.30
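a. Let $A = BC$ be a full rank factorization of $A$. Now, we have $A^\dagger = C^\dagger B^\dagger$, where $B^\dagger = (B^\top B)^{-1} B^\top$ and $C^\dagger = C^\top (CC^\top)^{-1}$. On the other hand, $A^\top = C^\top B^\top$. Since $A^\top = C^\top B^\top$ is a full rank factorization of $A^\top$, we have $(A^\top)^\dagger = (C^\top B^\top)^\dagger = (B^\top)^\dagger (C^\top)^\dagger$. Therefore, to show that $(A^\top)^\dagger = (A^\dagger)^\top$, it is enough to show that
\[
(B^\top)^\dagger = (B^\dagger)^\top, \qquad (C^\top)^\dagger = (C^\dagger)^\top .
\]
To this end, note that $(B^\dagger)^\top = B(B^\top B)^{-1}$, and $(C^\dagger)^\top = (CC^\top)^{-1} C$. On the other hand, $(B^\top)^\dagger = (B^\top)^\top \left(B^\top (B^\top)^\top\right)^{-1} = B(B^\top B)^{-1}$, and $(C^\top)^\dagger = \left((C^\top)^\top C^\top\right)^{-1} (C^\top)^\top = (CC^\top)^{-1} C$, which completes the proof.
b. Note that $A^\dagger = C^\dagger B^\dagger$, which is a full rank factorization of $A^\dagger$. Therefore, $(A^\dagger)^\dagger = (B^\dagger)^\dagger (C^\dagger)^\dagger$. Hence, to show that $(A^\dagger)^\dagger = A$, it is enough to show that
\[
(B^\dagger)^\dagger = B, \qquad (C^\dagger)^\dagger = C .
\]
To this end, note that $(B^\dagger)^\dagger = \left((B^\top B)^{-1} B^\top\right)^\dagger = B$ since $B$ is a full rank matrix. Similarly, $(C^\dagger)^\dagger = \left(C^\top (CC^\top)^{-1}\right)^\dagger = C$ since $C$ is a full rank matrix. This completes the proof.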
12.31
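$\Rightarrow$: We prove properties 1-4 in turn.
1. This is immediate.
2. Let $A = BC$ be a full rank factorization of $A$. We have $A^\dagger = C^\dagger B^\dagger$, where $B^\dagger = (B^\top B)^{-1} B^\top$ and $C^\dagger = C^\top (CC^\top)^{-1}$. Note that $B^\dagger B = I$ and $CC^\dagger = I$. Now,
\[
A^\dagger A A^\dagger = C^\dagger B^\dagger B C C^\dagger B^\dagger = C^\dagger B^\dagger = A^\dagger .
\]
3. We have
\[
\begin{aligned}
(AA^\dagger)^\top &= (BCC^\dagger B^\dagger)^\top = (BB^\dagger)^\top = (B^\dagger)^\top B^\top \\
&= \left((B^\top B)^{-1} B^\top\right)^\top B^\top = B(B^\top B)^{-1} B^\top = BB^\dagger \\
&= BCC^\dagger B^\dagger = AA^\dagger .
\end{aligned}
\]
4. We have
\[
\begin{aligned}
(A^\dagger A)^\top &= (C^\dagger B^\dagger BC)^\top = (C^\dagger C)^\top = C^\top (C^\dagger)^\top \\
&= C^\top \left(C^\top (CC^\top)^{-1}\right)^\top = C^\top (CC^\top)^{-1} C = C^\dagger C \\
&= C^\dagger B^\dagger BC = A^\dagger A .
\end{aligned}
\]
$\Leftarrow$: By property 1, we immediately have $AA^\dagger A = A$. Therefore, it remains to show that there exist matrices $U$ and $V$ such that $A^\dagger = UA^\top$ and $A^\dagger = A^\top V$.
For this, we note from property 2 that $A^\dagger = A^\dagger A A^\dagger$. But from property 3, $AA^\dagger = (AA^\dagger)^\top = (A^\dagger)^\top A^\top$. Hence, $A^\dagger = A^\dagger (A^\dagger)^\top A^\top$. Setting $U = A^\dagger (A^\dagger)^\top$, we get that $A^\dagger = UA^\top$.
Similarly, we note from property 4 that $A^\dagger A = (A^\dagger A)^\top = A^\top (A^\dagger)^\top$. Substituting this back into property 2 yields $A^\dagger = A^\dagger A A^\dagger = A^\top (A^\dagger)^\top A^\dagger$. Setting $V = (A^\dagger)^\top A^\dagger$ yields $A^\dagger = A^\top V$. This completes the proof.
12.32
(Taken from [23, p. 24]) Let
\[
A_1 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \qquad
A_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]
We compute
\[
A_1^\dagger = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & -1 \end{bmatrix}, \qquad
A_2^\dagger = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} = A_2 .
\]
We have
\[
A_1 A_2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \end{bmatrix}
= \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 \end{bmatrix},
\]
which is a full rank factorization. Therefore,
\[
(A_1 A_2)^\dagger = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1/2 & 1/2 \\ 0 & 0 & 0 \end{bmatrix}.
\]
But
\[
A_2^\dagger A_1^\dagger = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}.
\]
Hence, $(A_1 A_2)^\dagger \neq A_2^\dagger A_1^\dagger$.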
13. Unconstrained Optimization and Feedforward Neural Networks
13.1
a. The gradient of $f$ is given by
\[
\nabla f(w) = -X_d \left(y_d - X_d^\top w\right).
\]
b. The Conjugate Gradient algorithm applied to our training problem is:
1. Set $k := 0$; select the initial point $w^{(0)}$.
2. $g^{(0)} = -X_d(y_d - X_d^\top w^{(0)})$. If $g^{(0)} = 0$, stop, else set $d^{(0)} = -g^{(0)}$.
3. $\alpha_k = -\dfrac{d^{(k)\top} g^{(k)}}{d^{(k)\top} X_d X_d^\top d^{(k)}}$
4. $w^{(k+1)} = w^{(k)} + \alpha_k d^{(k)}$
5. $g^{(k+1)} = -X_d(y_d - X_d^\top w^{(k+1)})$. If $g^{(k+1)} = 0$, stop.
6. $\beta_k = \dfrac{g^{(k+1)\top} X_d X_d^\top d^{(k)}}{d^{(k)\top} X_d X_d^\top d^{(k)}}$
7. $d^{(k+1)} = -g^{(k+1)} + \beta_k d^{(k)}$
8. Set $k := k + 1$; go to 3.
c. We form the matrix $X_d$ as
\[
X_d = \begin{bmatrix}
-0.5 & -0.5 & -0.5 & 0 & 0 & 0 & 0.5 & 0.5 & 0.5 \\
-0.5 & 0 & 0.5 & -0.5 & 0 & 0.5 & -0.5 & 0 & 0.5
\end{bmatrix}
\]
and the vector $y_d$ as
\[
y_d = [-0.42074, -0.47943, -0.42074, 0, 0, 0, 0.42074, 0.47943, 0.42074]^\top .
\]
Running the Conjugate Gradient algorithm, we get a solution of $w^* = [0.8806, 0.000]^\top$.
d. The level sets are shown in the figure below.
[Figure: level sets of the objective function in the $(w_1, w_2)$ plane.]
The solution in part c agrees with the level sets.
e. The plot of the error function is depicted below.
[Figure: error plotted as a function of $x_1$ and $x_2$.]
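Steps 1-8 of the Conjugate Gradient algorithm in part b can be traced with a short Python sketch. It assumes the objective $f(w) = \frac{1}{2}\|y_d - X_d^\top w\|^2$, which is consistent with the gradient in part a, and it assumes that the training inputs form the grid $x_1, x_2 \in \{-0.5, 0, 0.5\}$ with targets $\sin(x_1)\cos(x_2)$; these assumptions reproduce the listed entries of $y_d$ and the reported solution. The function name is illustrative.

```python
import math

def conjugate_gradient_ls(Xd, yd, w0, tol=1e-12):
    """CG steps 1-8 for f(w) = 0.5*||yd - Xd^T w||^2, with Q = Xd Xd^T."""
    n, p = len(Xd), len(yd)
    Q = [[sum(Xd[i][t] * Xd[j][t] for t in range(p)) for j in range(n)]
         for i in range(n)]
    Xy = [sum(Xd[i][t] * yd[t] for t in range(p)) for i in range(n)]

    def grad(w):  # g = -Xd (yd - Xd^T w) = Q w - Xd yd
        return [sum(Q[i][j] * w[j] for j in range(n)) - Xy[i] for i in range(n)]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = list(w0)
    g = grad(w)
    d = [-gi for gi in g]
    for _ in range(n):                       # at most n steps for a quadratic
        if math.sqrt(dot(g, g)) <= tol:
            break
        Qd = [sum(Q[i][j] * d[j] for j in range(n)) for i in range(n)]
        alpha = -dot(g, d) / dot(d, Qd)      # step 3
        w = [wi + alpha * di for wi, di in zip(w, d)]   # step 4
        g = grad(w)                          # step 5
        beta = dot(g, Qd) / dot(d, Qd)       # step 6
        d = [-gi + beta * di for gi, di in zip(g, d)]   # step 7
    return w

# Assumed training data: 3x3 grid of inputs, targets sin(x1)*cos(x2).
pts = [(x1, x2) for x1 in (-0.5, 0.0, 0.5) for x2 in (-0.5, 0.0, 0.5)]
Xd = [[pt[0] for pt in pts], [pt[1] for pt in pts]]
yd = [math.sin(x1) * math.cos(x2) for x1, x2 in pts]

w = conjugate_gradient_ls(Xd, yd, [0.0, 0.0])
print([round(v, 4) for v in w])
```

Because $X_d X_d^\top$ is a multiple of the identity for this grid, the iteration reaches the minimizer after a single step from $w^{(0)} = 0$.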
13.2
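a. The expression we seek is
\[
e_{k+1} = (1 - \mu) e_k .
\]
To derive the above, we write
\[
e_{k+1} - e_k = \left(y_d - x_d^\top w^{(k+1)}\right) - \left(y_d - x_d^\top w^{(k)}\right) = -x_d^\top \left(w^{(k+1)} - w^{(k)}\right).
\]
Substituting for $w^{(k+1)} - w^{(k)}$ from the Widrow-Hoff algorithm yields
\[
e_{k+1} - e_k = -x_d^\top \left(\mu \frac{e_k x_d}{x_d^\top x_d}\right) = -\mu e_k .
\]
Hence, $e_{k+1} = (1 - \mu) e_k$.
b. For $e_k \to 0$, it is necessary and sufficient that $|1 - \mu| < 1$, which is equivalent to $0 < \mu < 2$.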
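The recursion $e_{k+1} = (1 - \mu) e_k$ is easy to observe numerically. The Python sketch below runs the Widrow-Hoff update $w^{(k+1)} = w^{(k)} + \mu e_k x_d / (x_d^\top x_d)$ for a single training pair and prints the ratios of successive errors. The training pair, initial weights, and step size are illustrative assumptions, not values from the exercise.

```python
def widrow_hoff_errors(xd, yd, w0, mu, steps):
    """Run w <- w + mu * e_k * xd / (xd^T xd) and record e_k = yd - xd^T w."""
    norm_sq = sum(v * v for v in xd)
    w = list(w0)
    errors = []
    for _ in range(steps):
        e = yd - sum(xi * wi for xi, wi in zip(xd, w))
        errors.append(e)
        w = [wi + mu * e * xi / norm_sq for wi, xi in zip(w, xd)]
    return errors

# Illustrative single training pair (assumed values).
errs = widrow_hoff_errors(xd=[1.0, 2.0], yd=3.0, w0=[0.0, 0.0], mu=0.5, steps=6)
ratios = [errs[k + 1] / errs[k] for k in range(len(errs) - 1)]
print(ratios)  # each ratio should be close to 1 - mu = 0.5
```

Choosing $\mu$ outside $(0, 2)$ in this sketch makes the errors grow or stay constant in magnitude, in line with part b.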
13.3
a. The error satisfies
\[
e^{(k+1)} = (I_p - \mu) e^{(k)} .
\]
To derive the above expression, we write
\[
e^{(k+1)} - e^{(k)} = \left(y_d - X_d^\top w^{(k+1)}\right) - \left(y_d - X_d^\top w^{(k)}\right) = -X_d^\top \left(w^{(k+1)} - w^{(k)}\right).
\]
Substituting for $w^{(k+1)} - w^{(k)}$ from the algorithm yields
\[
e^{(k+1)} - e^{(k)} = -X_d^\top X_d (X_d^\top X_d)^{-1} \mu e^{(k)} = -\mu e^{(k)} .
\]
Hence, $e^{(k+1)} = (I_p - \mu) e^{(k)}$.
b. From part a, we see that $e^{(k)} = (I_p - \mu)^k e^{(0)}$. Hence, by Lemma 5.1, a necessary and sufficient condition for $e^{(k)} \to 0$ for any $e^{(0)}$ is that all the eigenvalues of $I_p - \mu$ must be located in the open unit circle. From Exercise 3.6, it follows that the above condition holds if and only if $|1 - \lambda_i| < 1$ for each eigenvalue $\lambda_i$ of $\mu$. This is true if and only if $0 < \lambda_i < 2$ for each eigenvalue $\lambda_i$ of $\mu$.
13.4
We modified the MATLAB routine of Exercise 8.25, by fixing the step size at a value 100. We need the following M-file for the gradient:
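function y=Dfbp(w);
% Gradient of the squared output error with respect to the six weights
wh11=w(1);
wh21=w(2);
wh12=w(3);
wh22=w(4);
wo11=w(5);
wo12=w(6);
xd1=0; xd2=1; yd=1;
% forward pass through the hidden and output layers
v1=wh11*xd1+wh12*xd2;
v2=wh21*xd1+wh22*xd2;
z1=sigmoid(v1);
z2=sigmoid(v2);
y1=sigmoid(wo11*z1+wo12*z2);
% output delta, then the gradient components
d1=(yd-y1)*y1*(1-y1);
y(1)=-d1*wo11*z1*(1-z1)*xd1;
y(2)=-d1*wo12*z2*(1-z2)*xd1;
y(3)=-d1*wo11*z1*(1-z1)*xd2;
y(4)=-d1*wo12*z2*(1-z2)*xd2;
y(5)=-d1*z1;
y(6)=-d1*z2;
y=y';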
After 20 iterations of the backpropagation algorithm, we get the following weights:
$w^{o(20)}_{11} = 2.883$
$w^{o(20)}_{12} = 3.194$
$w^{h(20)}_{11} = 0.1000$
$w^{h(20)}_{12} = 0.8179$
$w^{h(20)}_{21} = 0.3000$
$w^{h(20)}_{22} = 1.106$.
The corresponding output of the network is $y_1^{(20)} = 0.9879$.
13.5
We used the following MATLAB routine:
function x,Nbackpropgrad,xnew,options;
BACKPROPgrad,x0;
BACKPROPgrad,x0,OPTIONS;

x BACKPROPgrad,x0;
x BACKPROPgrad,x0,OPTIONS;

x,N BACKPROPgrad,x0;
x,N BACKPROPgrad,x0,OPTIONS;

The first variant trains a net whose gradient
is described in grad usually an Mfile: grad.m, using a backprop
algorithm with initial point x0.
The second variant allows a vector of optional parameters to
defined. OPTIONS1 controls how much display output is given; set
to 1 for a tabular display of results, default is no display: 0.
OPTIONS2 is a measure of the precision required for the final point.
OPTIONS3 is a measure of the precision required of the gradient.
OPTIONS14 is the maximum number of iterations.
For more information type HELP FOPTIONS.

The next two variants returns the value of the final point.
The last two variants returns a vector of the final point and the
number of iterations.
if nargin 3
options ;
if nargin 2
dispWrong number of arguments.;
return; end
end
if lengthoptions 14
if options140
options141000lengthxnew;
end
else
options141000lengthxnew;
end
clc;
format compact;
format short e;
options foptionsoptions;
print options1;
epsilonx options2;
epsilong options3;
maxiteroptions14;
for k 1:maxiter,
xcurrxnew;
gcurrfevalgrad,xcurr;
if normgcurr epsilong
dispTerminating: Norm of gradient less than;
dispepsilong;
kk1;
break;
end if
alpha10.0;
xnew xcurralphagcurr;
if print,
dispIteration number k
dispk; print iteration index k
dispalpha ;
dispalpha; print alpha
dispGradient ;
dispgcurr; print gradient
dispNew point ;
dispxnew; print new point
end if
if normxnewxcurr epsilonxnormxcurr
dispTerminating: Norm of difference between iterates less than;
dispepsilonx;
break;
end if
if k maxiter
dispTerminating with maximum number of iterations;
end if
end for
if nargout 1
xxnew;
if nargout 2
Nk;
end else
dispFinal point ;
dispxnew;
dispNumber of iterations ;
dispk;
end if

To apply the above routine, we need the following Mfile for the gradient.
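function y=grad(w,xd,yd);
% Gradient of the squared output error with respect to weights and thresholds
wh11=w(1);
wh21=w(2);
wh12=w(3);
wh22=w(4);
wo11=w(5);
wo12=w(6);
t1=w(7);
t2=w(8);
t3=w(9);
xd1=xd(1); xd2=xd(2);
% forward pass; the thresholds t1, t2, t3 are subtracted from the activations
v1=wh11*xd1+wh12*xd2-t1;
v2=wh21*xd1+wh22*xd2-t2;
z1=sigmoid(v1);
z2=sigmoid(v2);
y1=sigmoid(wo11*z1+wo12*z2-t3);
d1=(yd-y1)*y1*(1-y1);
y(1)=-d1*wo11*z1*(1-z1)*xd1;
y(2)=-d1*wo12*z2*(1-z2)*xd1;
y(3)=-d1*wo11*z1*(1-z1)*xd2;
y(4)=-d1*wo12*z2*(1-z2)*xd2;
y(5)=-d1*z1;
y(6)=-d1*z2;
y(7)=d1*wo11*z1*(1-z1);
y(8)=d1*wo12*z2*(1-z2);
y(9)=d1;
y=y';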
We applied our MATLAB routine as follows.
options2107;
options3107;
options1410000;
w00.1,0.3,0.3,0.4,0.4,0.6,0.1,0.1,0.1;
wstar,Nbackpropgrad,w0,options
Terminating with maximum number of iterations
wstar
7.7771e00
5.5932e00
8.4027e00
5.6384e00
1.1010e01
1.0918e01
3.2773e00
8.3565e00
5.2606e00 N
10000
As we can see from the above, the results coincide with Example 13.3. The table of the outputs of the trained network corresponding to the training input data is shown in Table 13.2.
14. Global Search Algorithms
14.1
The MATLAB program is as follows.
function outputargs nmsimplex inputargs
NelderMead simplex method
Based on the program by the Spring 2007 ECE580 student, Hengzhou Ding
disp We minimize a function using the NelderMead method.
disp There are two initial conditions.
disp You can enter your own starting point.
disp
dispSelect one of the starting points
disp 0.55;0.7 or 0.9;0.5
x0input
disp
clear
close all;
dispSelect one of the starting points, or enter your own point
disp0.55;0.7 or 0.9;0.5
dispCopy one of the above points and paste it at the prompt
x0input
hold on
axis square
Plot the contours of the objective function
X1,X2meshgrid1:0.01:1;
YX2X1.412.X1.X2X1X23;
C,h contourX1,X2,Y,20;
clabelC,h;
Initialize all parameters
lambda0.1;
rho1;
chi2;
gamma12;
sigma12;
e11 0;
e20 1;
x00.55 0.7;
x00.9 0.5;
Plot initial point and initialize the simplex
plotx01,x02,;
x:,3×0;
x:,1x0lambdae1;
x:,2x0lambdae2;
while 1
Check the size of simplex for stopping criterion
simpsizenormx:,1x:,2normx:,2x:,3normx:,3x:,1;
ifsimpsize1e6
break;
end
lastptx:,3;
Sort the simplex
xsortpointsx,3;
Reflection
centro12x:,1x:,2;
xrcentrorhocentrox:,3;
Accept condition
ifobjfunxrobjfunx:,1 objfunxrobjfunx:,2
x:,3xr;
Expand condition
elseifobjfunxrobjfunx:,1
xecentrorhochicentrox:,3;
ifobjfunxeobjfunxr
x:,3xe;
else
x:,3xr;
end
Outside contraction or shrink
elseifobjfunxrobjfunx:,2
objfunxrobjfunx:,3
xccentrogammarhocentrox:,3;
ifobjfunxcobjfunx:,3
x:,3xc;
else
xshrinkx,sigma;
end
Inside contraction or shrink
else
xcccentrogammacentrox:,3;
ifobjfunxccobjfunx:,3
x:,3xcc;
else
xshrinkx,sigma;
end
end
Plot the new point and connect
plotlastpt1,x1,3,lastpt2,x2,3,;
end
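% Output the final simplex (minimizer)
x(:,1)
% objfun
function y = objfun(x)
y=(x(2)-x(1))^4+12*x(1)*x(2)-x(1)+x(2)-3;
% sortpoints: order the simplex vertices by increasing objective value
function y = sortpoints(x,N)
for i=1:(N-1)
for j=1:(N-i)
if(objfun(x(:,j))>objfun(x(:,j+1)))
tmp=x(:,j);
x(:,j)=x(:,j+1);
x(:,j+1)=tmp;
end
end
end
y=x;
% shrink: pull the second and third vertices toward the best vertex
function y = shrink(x,sigma)
x(:,2)=x(:,1)+sigma*(x(:,2)-x(:,1));
x(:,3)=x(:,1)+sigma*(x(:,3)-x(:,1));
y=x;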
When we run the MATLAB code above with initial condition [0.55, 0.7], we obtain the following plot:
[Figure: contour plot of the objective function with the path taken by the simplex.]
When we run the MATLAB code above with initial condition [0.9, 0.5], we obtain the following plot:
[Figure: contour plot of the objective function with the path taken by the simplex.]
Note that this function has two local minimizers. The algorithm terminates at these two minimizers with the two different initial conditions. This behavior depends on the value of lambda, which determines the initial simplex. It is possible to reach both minimizers starting from the same initial point by using different values of lambda. In the solution above, the initial simplex is small: lambda is just 0.1.
14.2
A MATLAB routine for a naive random search algorithm is given by the Mfile rsdemo shown below:
function x,Nrandomsearchfuncname,xnew,options;
Naive random search demo
x,Nrandomsearchfuncname,xnew,options;
print options1;
alpha options18;
if nargin 3
options ;
if nargin 2
dispWrong number of arguments.;
return; end
end
if lengthoptions 14
if options140
options141000lengthxnew;
end
else
options141000lengthxnew;
end
if lengthoptions 18
options181.0; optional step size
end
format compact;
format short e;
options foptionsoptions;
print options1;
epsilonx options2;
epsilong options3;
maxiteroptions14;
alpha0 options18;
if funcname fr,
roscnt
elseif funcname fp,
pkscnt;
end if
if lengthxnew 2
plotxnew1,xnew2,o
textxnew1,xnew2,Start Point
xlower 2;1;
xupper 2;3;
end
f0fevalfuncname,xnew;
xbestcurr xnew;
xbestold xnew;
fbestfevalfuncname,xnew;
fbest10signfbestfbest;
for k 1:maxiter,
xcurrxbestcurr;
fcurrfevalfuncname,xcurr;
alpha alpha0;
xnew xcurr alpha2randlengthxcurr,11;
for i1:lengthxnew,
xnewi maxxnewi,xloweri;
xnewi minxnewi,xupperi;
end for
fnewfevalfuncname,xnew;
if fnew fbest,
xbestold xbestcurr;
xbestcurr xnew;
fbest fnew;
end
if print,
dispIteration number k
dispk; print iteration index k
dispalpha ;
dispalpha; print alpha
dispNew point ;
dispxnew; print new point
dispFunction value ;
dispfnew; print func value at new point
end if
if normxnewxbestold epsilonxnormxbestold
dispTerminating: Norm of difference between iterates less than;
dispepsilonx;
break;
end if
pltptsxbestcurr,xbestold;
if k maxiter
dispTerminating with maximum number of iterations;
end if
end for
if nargout 1
xxnew;
if nargout 2
Nk;
end else
dispFinal point ;
dispxbestcurr;
dispNumber of iterations ;
dispk;
end if
A MATLAB routine for a simulated annealing algorithm is given by the Mfile sademo shown below:
function x,Nsimulatedannealingfuncname,xnew,options;
Simulated annealing demo
randomsearchfuncname,xnew,options;
print options1;
gamma options15;
alpha options18;
if nargin 3
options ;
if nargin 2
dispWrong number of arguments.;
return; end
end
if lengthoptions 14
if options140
options141000lengthxnew;
end
else
options141000lengthxnew;
end
if lengthoptions 15
options155.0;
end
if options150
options155.0;
end
if lengthoptions 18
options180.5; optional step size
end
format compact;
format short e;
options foptionsoptions;
print options1;
epsilonx options2;
epsilong options3;
maxiteroptions14;
alpha options18;
gamma options15;
k02;
if funcname fr,
roscnt
elseif funcname fp,
pkscnt;
end if
if lengthxnew 2
plotxnew1,xnew2,o
textxnew1,xnew2,Start Point
xlower 2;1;
xupper 2;3;
end
f0fevalfuncname,xnew;
xbestcurr xnew;
xbestold xnew;
xcurr xnew;
fbestfevalfuncname,xnew;
fbest10signfbestfbest;
for k 1:maxiter,
fcurrfevalfuncname,xcurr;
xnew xcurr alpha2randlengthxcurr,11;
for i1:lengthxnew,
xnewi maxxnewi,xloweri;
xnewi minxnewi,xupperi;
end for
fnewfevalfuncname,xnew;
if fnew fcurr,
xcurr xnew;
fcurr fnew;
else
cointoss rand1;
Temp gammalogkk0;
Prob expfnewfcurrTemp;
if cointoss Prob,
xcurr xnew;
fcurr fnew;
end
end
if fnew fbest,
xbestold xbestcurr;
xbestcurr xnew;
fbest fnew;
end
if print,
dispIteration number k
dispk; print iteration index k
dispalpha ;
dispalpha; print alpha
dispNew point ;
dispxnew; print new point
dispFunction value ;
dispfnew; print func value at new point
end if
if normxnewxbestold epsilonxnormxbestold
dispTerminating: Norm of difference between iterates less
than;
dispepsilonx;
break;
end if
pltptsxbestcurr,xbestold;
if k maxiter
dispTerminating with maximum number of iterations;
end if
end for
if nargout 1
xxnew;
if nargout 2
Nk;
end else
dispFinal point ;
dispxbestcurr;
dispObjective function value ;
dispfbest;
dispNumber of iterations ;
dispk;
end if
To use the above routines, we also need the following Mfiles: pltpts.m:
function outpltptsxnew,xcurr
plotxcurr1,xnew1,xcurr2,xnew2,r,xnew1,xnew2,o,Erasemode,
none;
drawnow; Draws current graph now
pause1
out ;
fp.m:
function yfpx;
y31x1.2.expx1.2×21.2
10.x15x1.3×2.5.exp
x1.2×2.2 expx11.2×2.23;
yy;
pkscnt.m:
echo off
X 3:0.2:3;
Y 3:0.2:3;
x,ymeshgridX,Y ;
func 31x.2.expx.2y1.2
10.x5x.3y.5.expx.2y.2
expx1.2y.23;
func func;
clf
levels exp5:10;
levels 5:0.9:10;
contourX,Y,func,levels,k
xlabelx1
ylabelx2
titleMinimization of Peaks function
drawnow;
hold on
plot0.0303,1.5455,o
text0.0303,1.5455,Solution
To run the naive random search algorithm, we first pick a value of $\alpha = 0.5$, which involves setting options(18)=0.5. We then use the command rsdemo('fp',[0;2],options). The resulting plot of the algorithm trajectory is given below. As we can see, the algorithm is stuck at a local minimizer. By running the algorithm several times, the reader can verify that this nonconvergent behavior is typical.
[Figure: "Minimization of Peaks function"; trajectory on the contour plot in the $(x_1, x_2)$ plane, with the start point and the solution marked.]
Next, we try $\alpha = 1.5$, which involves setting options(18)=1.5. We then use the command rsdemo('fp',[0;2],options) again, to obtain the plot shown below. This time, the algorithm reaches the global minimizer.
[Figure: "Minimization of Peaks function"; trajectory on the contour plot in the $(x_1, x_2)$ plane, with the start point and the solution marked.]
Finally, we again set $\alpha = 0.5$, using options(18)=0.5. We then run the simulated annealing code using sademo('fp',[0;2],options). The algorithm can be seen to converge to the global minimizer, as plotted below.
[Figure: "Minimization of Peaks function"; trajectory on the contour plot in the $(x_1, x_2)$ plane, with the start point and the solution marked.]
14.3
A MATLAB routine for a particle swarm algorithm is:
A particle swarm optimizer
to find the minimummaximum of the MATLABs peaks function
D of inputs to the function dimension of problem
clear
Parameters
ps10;
D2;
pslb3;
psub3;
vellb1;
velub1;
iterationn 50;
range 3, 3; 3, 3; Range of the input variables
Plot contours of peaks function
x, y, z peaks;
pcolorx,y,z; shading interp; hold on;
contourx, y, z, 20, r;
meshx,y,z
hold off;
colormapgray;
setgca,Fontsize,14
axis3 3 3 3 9 9
axis square;
xlabelx1,Fontsize,14;
ylabelx2,Fontsize,14;
zlabelfx1,x2,Fontsize,14;
hold on
upper zerositerationn, 1;
average zerositerationn, 1;
lower zerositerationn, 1;
initialize population of particles and their velocities at time
zero,
format of pos particle, dimension
construct random population positions bounded by VR
need to bound positions
pspospslb psubpslb.randps,D;
need to bound velocities between mv,mv
psvelvellb velubvellb.randps,D;
initial pbest positions
pbest pspos;
returns column of cost values 1 for each particle
f131psposi,12exppsposi,12psposi,212;
f210psposi,15psposi,13psposi,25exppsposi,12psposi,22;
f313exppsposi,112psposi,22;
pbestfitzerosps,1;
for i1:ps
g1i31psposi,12exppsposi,12psposi,212;
g2i10psposi,15psposi,13psposi,25exppsposi,12psposi,22;
g3i13exppsposi,112psposi,22;
pbestfitig1ig2ig3i;
end
pbestfit;
handp3plot3pspos:,1,pspos:,2,pbestfit,k,markersize,15,erase,xor;
initial gbest
gbestval,gbestidx maxpbestfit;
gbestval,gbestidx minpbestfit; this is to minimize
gbestpsposgbestidx,:;
get new velocities, positions this is the heart of the PSO
algorithm
for k1:iterationn
for count1:ps
psvelcount,: 0.729psvelcount,:…
prev vel
1.494randpbestcount,:psposcount,:…
1.494randgbestpsposcount,:;
end
psvel;
update new position
pspos pspos psvel;
update pbest
for i1:ps
independent
social
g1i31psposi,12exppsposi,12psposi,212;
g2i10psposi,15psposi,13psposi,25exppsposi,12psposi,22;
g3i13exppsposi,112psposi,22;
pscurrentfitig1ig2ig3i;
if pscurrentfitipbestfiti
pbestfitipscurrentfiti;
pbesti,:psposi,:;
end end
pbestfit;
update gbest
gbestval,gbestidx maxpbestfit;
gbestpsposgbestidx,:;
Fill objective function vectors
upperk maxpbestfit;
averagek meanpbestfit;
lowerk minpbestfit;
sethandp3,xdata,pspos:,1,ydata,pspos:,2,zdata,pscurrentfit;
drawnow
pause
end
gbest
gbestval
figure;
x 1:iterationn;
plotx, upper, o, x, average, x, x, lower, ;
hold on;
plotx, upper average lower;
hold off;
legendBest, Average, Poorest;
xlabelIterations; ylabelObjective function value;
When we run the MATLAB code above, we obtain a plot of the initial set of particles, as shown below.
[Figure: particles on the surface $f(x_1, x_2)$ over the $(x_1, x_2)$ plane.]
Then, after 30 iterations, we obtain:
[Figure: particles on the surface $f(x_1, x_2)$ over the $(x_1, x_2)$ plane.]
Finally, after 50 iterations, we obtain:
[Figure: particles on the surface $f(x_1, x_2)$ over the $(x_1, x_2)$ plane.]
A plot of the objective function values (best, average, and poorest) is shown below.
[Figure: objective function value versus iterations, with curves labeled Best, Average, and Poorest.]
14.4
a. Expanding the right hand side of the second expression gives the desired result.
b. Applying the algorithm, we get a binary representation of 11111001011, i.e.,
\[
1995 = 2^{10} + 2^9 + 2^8 + 2^7 + 2^6 + 2^3 + 2^1 + 2^0 .
\]
c. Applying the algorithm, we get a binary representation of 0.1011101, i.e.,
\[
0.7265625 = 2^{-1} + 2^{-3} + 2^{-4} + 2^{-5} + 2^{-7} .
\]
d. We have $19 = 2^4 + 2^1 + 2^0$, i.e., the binary representation for 19 is 10011. For the fractional part, we need at least 7 bits to keep at least the same accuracy. We have $0.95 = 2^{-1} + 2^{-2} + 2^{-3} + 2^{-4} + 2^{-7} + \cdots$, i.e., the binary representation is 0.1111001. Therefore, the binary representation of 19.95 with at least the same degree of accuracy is 10011.1111001.
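The conversions in parts b through d can be reproduced with the short Python sketch below. It assumes the usual procedures (repeated division by 2 for the integer part, repeated doubling for the fractional part), which may differ in presentation from the algorithm stated in the exercise; the function names are illustrative.

```python
from fractions import Fraction

def int_to_binary(n):
    """Binary digits of a nonnegative integer by repeated division by 2."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))
        n //= 2
    return "".join(reversed(digits))

def frac_to_binary(x, bits):
    """First `bits` binary digits of 0 <= x < 1 by repeated doubling."""
    x = Fraction(x)          # exact arithmetic, so no rounding in the doubling
    digits = []
    for _ in range(bits):
        x *= 2
        bit = int(x >= 1)
        digits.append(str(bit))
        x -= bit
    return "".join(digits)

print(int_to_binary(1995))                                   # 11111001011
print(frac_to_binary("0.7265625", 7))                        # 1011101
print(int_to_binary(19) + "." + frac_to_binary("0.95", 7))   # 10011.1111001
```

The fractional part is passed as a decimal string so that `Fraction` represents 0.95 exactly rather than as a rounded floating-point value.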
14.5
It suffices to prove the result for the case where only one symbol is swapped, since the general case is obtained by repeating the argument. We have two scenarios. First, suppose the symbol swapped is at a position corresponding to a don't care symbol in $H$. Clearly, after the swap, both chromosomes will still be in $H$. Second, suppose the symbol swapped is at a position corresponding to a fixed symbol in $H$. Since both chromosomes are in $H$, their symbols at that position must be identical. Hence, the swap does not change the chromosomes. This completes the proof.
14.6
Consider a given chromosome in $M(k) \cap H$. The probability that it is chosen for crossover is $q_c$. If neither of its offspring is in $H$, then at least one of the crossover points must be between the corresponding first and last fixed symbols of $H$. The probability of this is $1 - \left(1 - \frac{\ell(H)}{L - 1}\right)^2$. To see this, note that the probability that each crossover point is not between the corresponding first and last fixed symbols is $1 - \frac{\ell(H)}{L - 1}$, and thus the probability that both crossover points are not between the corresponding first and last fixed symbols of $H$ is $\left(1 - \frac{\ell(H)}{L - 1}\right)^2$. Hence, the probability that the given chromosome is chosen for crossover and neither of its offspring is in $H$ is bounded above by
\[
q_c \left(1 - \left(1 - \frac{\ell(H)}{L - 1}\right)^2\right).
\]
14.7
As for two-point crossover, the $n$-point crossover operation is a composition of $n$ one-point crossover operations (i.e., $n$ one-point crossover operations in succession). The required result for this case is as follows.
Lemma: Given a chromosome in $M(k) \cap H$, the probability that it is chosen for crossover and neither of its offspring is in $H$ is bounded above by
\[
q_c \left(1 - \left(1 - \frac{\ell(H)}{L - 1}\right)^n\right).
\]
For the proof, proceed as in the solution of Exercise 14.6, replacing 2 by $n$.
14.8
function M=roulettewheel(fitness);
%function M=roulettewheel(fitness)
%fitness = vector of fitness values of chromosomes in population
%M = vector of indices indicating which chromosome in the
% given population should appear in the mating pool
M = [];
fitness = fitness - min(fitness); % to keep the fitness positive
if sum(fitness) == 0,
disp('Population has identical chromosomes -- STOP');
return;
else
fitness = fitness/sum(fitness);
end
cumfitness = cumsum(fitness);
for i = 1:length(fitness),
tmp = find(cumfitness-rand>0);
M(i) = tmp(1);
end
14.9
% parent1, parent2 = two binary parent chromosomes (row vectors)
L = length(parent1);
crossoverpt = ceil(rand*(L-1));
offspring1 = [parent1(1:crossoverpt) parent2(crossoverpt+1:L)];
offspring2 = [parent2(1:crossoverpt) parent1(crossoverpt+1:L)];
14.10
% matingpool = matrix of 0-1 elements; each row represents a chromosome
% pm = probability of mutation
N = size(matingpool,1);
L = size(matingpool,2);
mutationpoints = rand(N,L) < pm;
newpopulation = xor(matingpool,mutationpoints);
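For readers without MATLAB, the three operators of Exercises 14.8 to 14.10 can be sketched in Python as follows. This is a loose translation under stated assumptions, not the routine used in the solutions: chromosomes are lists of 0/1 integers, a `random.Random` instance is passed in explicitly so that runs are repeatable, and all function names are illustrative.

```python
import random

def roulette_wheel(fitness, rng):
    """Indices of the chromosomes selected for the mating pool."""
    shifted = [f - min(fitness) for f in fitness]   # keep fitness nonnegative
    total = sum(shifted)
    if total == 0:
        raise ValueError("population has identical fitness values")
    cum, acc = [], 0.0
    for f in shifted:
        acc += f / total
        cum.append(acc)
    picks = []
    for _ in fitness:
        r = rng.random()
        # first index whose cumulative fitness exceeds r (last index as a guard)
        picks.append(next((i for i, c in enumerate(cum) if c > r), len(cum) - 1))
    return picks

def one_point_crossover(p1, p2, rng):
    """Swap the tails of two equal-length chromosomes at a random cut point."""
    cut = rng.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(pool, pm, rng):
    """Flip each bit independently with probability pm."""
    return [[bit ^ (rng.random() < pm) for bit in chrom] for chrom in pool]

rng = random.Random(0)
pool = [[0, 0, 0, 0], [1, 1, 1, 1]]
print(roulette_wheel([1.0, 3.0, 2.0], rng))
print(one_point_crossover(pool[0], pool[1], rng))
print(mutate(pool, 0.25, rng))
```

As in the MATLAB routine, subtracting the minimum fitness means that the least fit chromosome is never selected by the roulette wheel.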
14.11
A MATLAB routine for a genetic algorithm with binary encoding is: 110
qc11 Hn. L1
2

function winner,bestfitness gaL,N,fitfunc,options
function winner GAL,N,fitfunc
Function call: GAL,N,f
L length of chromosomes
N population size must be an even number
f name of fitness value function

Options:
print options1;
selection options5;
maxiteroptions14;
pc options18;
pm pc100;

Selection:
options5 0 for roulette wheel, 1 for tournament
clf;
if nargin 4
options ;
if nargin 3
dispWrong number of arguments.;
return; end
end
if lengthoptions 14
if options140
options143N;
end
else
options143N;
end
if lengthoptions 18
options180.75; optional crossover rate
end
format compact;
format short e;
options foptionsoptions;
print options1;
selection options5;
maxiteroptions14;
pc options18;
pm pc100;
P randN,L0.5;
bestvaluesofar 0;
Initial evaluation
for i 1:N,
fitnessi fevalfitfunc,Pi,:;
end
bestvalue,best maxfitness;
if bestvalue bestvaluesofar,
bestsofar Pbest,:;
bestvaluesofar bestvalue;
end
for k 1:maxiter,
Selection
fitness fitness minfitness; to keep the fitness positive
if sumfitness 0,
dispPopulation has identical chromosomes STOP;
dispNumber of iterations:;
dispk;
for i k:maxiter,
upperiupperi1;
averageiaveragei1;
loweriloweri1;
end
break; else
fitness fitnesssumfitness;
end
if selection 0,
roulettewheel
cumfitness cumsumfitness;
for i 1:N,
tmp findcumfitnessrand0;
mi tmp1;
end
else
tournament
for i 1:N,
fighter1ceilrandN;
fighter2ceilrandN;
if fitnessfighter1fitnessfighter2,
mi fighter1;
else
mi fighter2;
end
end end
M zerosN,L;
for i 1:N,
Mi,: Pmi,:;
end
Crossover
Mnew M;
for i 1:N2
ind1 ceilrandN;
ind2 ceilrandN;
parent1 Mind1,:;
parent2 Mind2,:;
if rand pc
crossoverpt ceilrandL1;
offspring1 parent11:crossoverpt parent2crossoverpt1:L;
offspring2 parent21:crossoverpt parent1crossoverpt1:L;
Mnewind1,: offspring1;
Mnewind2,: offspring2;
end end
Mutation
mutationpoints randN,L pm;
P xorMnew,mutationpoints;
Evaluation
for i 1:N,
fitnessi fevalfitfunc,Pi,:;
end
bestvalue,best maxfitness;
if bestvalue bestvaluesofar,
bestsofar Pbest,:;
bestvaluesofar bestvalue;
end
upperk bestvalue;
averagek meanfitness;
lowerk minfitness;
end for
if k maxiter,
dispAlgorithm terminated after maximum number of iterations:;
dispmaxiter;
end
winner bestsofar;
bestfitness bestvaluesofar;
if print,
iter 1:maxiter;
plotiter,upper,o:,iter,average,x,iter,lower,;
legendBest, Average, Worst;
xlabelGenerations,Fontsize,14;
ylabelObjective Function Value,Fontsize,14;
setgca,Fontsize,14;
hold off;
end
a. To run the routine, we create the following M-files.

function dec = bin2dec(bin,range);
% function dec = bin2dec(bin,range);
% Function to convert from binary (bin) to decimal (dec) in a given range
index = polyval(bin,2);
dec = index*((range(2)-range(1))/(2^length(bin)-1)) + range(1);

function y=fmanymax(x);
y=-15*(sin(2*x))^2-(x-2)^2+160;

function y=fitfunc1(binchrom);
% 1-D fitness function
f='fmanymax';
range=[-10,10];
x=bin2dec(binchrom,range);
y=feval(f,x);
We use the following script to run the algorithm:

clear;
options(1)=1;
[x,y]=ga(8,10,'fitfunc1',options);
f='fmanymax';
range=[-10,10];
disp('GA Solution:');
disp(bin2dec(x,range));
disp('Objective function value:');
disp(y);
Running the above algorithm, we obtain a solution of x = 1.6078, and an objective function value of 159.7640. The figure below shows a plot of the best, average, and worst solution from each generation of the population.
[Figure: best, average, and worst objective function values versus generations.]
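The decoding rule used by bin2dec and the reported objective value can be cross-checked with a short Python sketch (the routines in this solution are MATLAB; this port is illustrative only). The 8-bit chromosome below is a hypothetical example chosen because it decodes to the reported x; it is not taken from the original run.

```python
import math

def bin2dec(bits, lo, hi):
    """Map a list of 0/1 bits (most significant first) onto the interval [lo, hi]."""
    index = 0
    for bit in bits:
        index = 2 * index + bit          # same value as polyval(bits, 2) in MATLAB
    return lo + index * (hi - lo) / (2 ** len(bits) - 1)

def f_manymax(x):
    """Objective of part a: -15 sin^2(2x) - (x - 2)^2 + 160."""
    return -15 * math.sin(2 * x) ** 2 - (x - 2) ** 2 + 160

chromosome = [1, 0, 0, 1, 0, 1, 0, 0]    # hypothetical chromosome (integer index 148)
x = bin2dec(chromosome, -10, 10)
print(round(x, 4), round(f_manymax(x), 4))
```

With 8 bits the interval [-10, 10] is sampled at 256 equally spaced points, so the genetic algorithm can only return values on that grid.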
b. To run the routine, we create the following M-files (we also use the routine bin2dec from part a).

function y=fpeaks(x1,x2);
y=3*(1-x1).^2.*exp(-(x1.^2)-(x2+1).^2) ...
  - 10.*(x1/5-x1.^3-x2.^5).*exp(-x1.^2-x2.^2) ...
  - exp(-(x1+1).^2-x2.^2)/3;

function y=fitfunc2(binchrom);
% 2-D fitness function
f='fpeaks';
xrange=[-3,3];
yrange=[-3,3];
L=length(binchrom);
x1=bin2dec(binchrom(1:L/2),xrange);
x2=bin2dec(binchrom(L/2+1:L),yrange);
y=feval(f,x1,x2);
We use the following script to run the algorithm:

clear;
options(1)=1;
[x,y]=ga(16,20,'fitfunc2',options);
f='fpeaks';
xrange=[-3,3];
yrange=[-3,3];
L=length(x);
x1=bin2dec(x(1:L/2),xrange);
x2=bin2dec(x(L/2+1:L),yrange);
disp('GA Solution:');
disp([x1,x2]);
disp('Objective function value:');
disp(y);
A plot of the objective function is shown below.
[Figure: plot of the objective function f(x) versus x.]
Running the above algorithm, we obtain a solution of 0.0353, 1.4941, and an x 0.0588, 1.5412, and an objective function value of 7.9815. Compare this solution with that of Example 14.3. The figure below shows a plot of the best, average, and worst solution from each generation of the population.
[Figure: best, average, and worst objective function values versus generations.]
14.12
A MATLAB routine for a real-number genetic algorithm:

function [winner,bestfitness] = gar(Domain,N,fitfunc,options)
% function winner = GAR(Domain,N,fitfunc)
% Function call: GAR(Domain,N,'f')
% Domain = search space; e.g., [-2,2;-3,3] for the space [-2,2]x[-3,3]
%          (number of rows of Domain = dimension of search space)
% N = population size (must be an even number)
% f = name of fitness value function
%
% Options:
% print = options(1);
% selection = options(5);
% maxiter = options(14);
% pc = options(18);
% pm = pc/100;
%
% Selection:
% options(5) = 0 for roulette wheel, 1 for tournament
clf;
if nargin ~= 4
  options = [];
  if nargin ~= 3
    disp('Wrong number of arguments.');
    return;
  end
end
if length(options) >= 14
  if options(14)==0
    options(14)=3*N;
  end
else
  options(14)=3*N;
end
if length(options) < 18
  options(18)=0.75; % optional crossover rate
end
format compact;
format short e;
options = foptions(options);
print = options(1);
selection = options(5);
maxiter = options(14);
pc = options(18);
pm = pc/100;
n = size(Domain,1);
lowb = Domain(:,1)';
upb = Domain(:,2)';
bestvaluesofar = 0;
for i = 1:N,
  P(i,:) = lowb + rand(1,n).*(upb-lowb);
  % Initial evaluation
  fitness(i) = feval(fitfunc,P(i,:));
end
[bestvalue,best] = max(fitness);
if bestvalue > bestvaluesofar,
  bestsofar = P(best,:);
  bestvaluesofar = bestvalue;
end
for k = 1:maxiter,
  % Selection
  fitness = fitness - min(fitness); % to keep the fitness positive
  if sum(fitness) == 0,
    disp('Population has identical chromosomes -- STOP');
    disp('Number of iterations:');
    disp(k);
    for i = k:maxiter,
      upper(i)=upper(i-1);
      average(i)=average(i-1);
      lower(i)=lower(i-1);
    end
    break;
  else
    fitness = fitness/sum(fitness);
  end
  if selection == 0,
    % roulette wheel
    cumfitness = cumsum(fitness);
    for i = 1:N,
      tmp = find(cumfitness-rand>0);
      m(i) = tmp(1);
    end
  else
    % tournament
    for i = 1:N,
      fighter1=ceil(rand*N);
      fighter2=ceil(rand*N);
      if fitness(fighter1)>fitness(fighter2),
        m(i) = fighter1;
      else
        m(i) = fighter2;
      end
    end
  end
  M = zeros(N,n);
  for i = 1:N,
    M(i,:) = P(m(i),:);
  end
  % Crossover
  Mnew = M;
  for i = 1:N/2
    ind1 = ceil(rand*N);
    ind2 = ceil(rand*N);
    parent1 = M(ind1,:);
    parent2 = M(ind2,:);
    if rand < pc
      a = rand;
      offspring1 = a*parent1+(1-a)*parent2+(rand(1,n)-0.5).*(upb-lowb)/10;
      offspring2 = a*parent2+(1-a)*parent1+(rand(1,n)-0.5).*(upb-lowb)/10;
      % do projection
      for j = 1:n,
        if offspring1(j)<lowb(j),
          offspring1(j)=lowb(j);
        elseif offspring1(j)>upb(j),
          offspring1(j)=upb(j);
        end
        if offspring2(j)<lowb(j),
          offspring2(j)=lowb(j);
        elseif offspring2(j)>upb(j),
          offspring2(j)=upb(j);
        end
      end
      Mnew(ind1,:) = offspring1;
      Mnew(ind2,:) = offspring2;
    end
  end
  % Mutation
  for i = 1:N,
    if rand < pm,
      a = rand;
      Mnew(i,:) = a*Mnew(i,:) + (1-a)*(lowb + rand(1,n).*(upb-lowb));
    end
  end
  P = Mnew;
  % Evaluation
  for i = 1:N,
    fitness(i) = feval(fitfunc,P(i,:));
  end
  [bestvalue,best] = max(fitness);
  if bestvalue > bestvaluesofar,
    bestsofar = P(best,:);
    bestvaluesofar = bestvalue;
  end
  upper(k) = bestvalue;
  average(k) = mean(fitness);
  lower(k) = min(fitness);
end % for
if k == maxiter,
  disp('Algorithm terminated after maximum number of iterations:');
  disp(maxiter);
end
winner = bestsofar;
bestfitness = bestvaluesofar;
if print,
  iter = [1:maxiter]';
  plot(iter,upper,'o:',iter,average,'x-',iter,lower,'*--');
  legend('Best', 'Average', 'Worst');
  xlabel('Generations','Fontsize',14);
  ylabel('Objective Function Value','Fontsize',14);
  set(gca,'Fontsize',14);
  hold off;
end
To run the routine, we create the following M-file for the given function.

function y=fwave(x);
y=x(1)*sin(x(1)) + x(2)*sin(5*x(2));

We use the following script to run the algorithm:

options(1)=1;
options(14)=50;
[x,y]=gar([0,10;4,6],20,'fwave',options);
disp('GA Solution:');
disp(x);
disp('Objective function value:');
disp(y);
Running the above algorithm, we obtain a solution of x = [7.9711, 5.3462], and an objective function value of 13.2607. The figure below shows a plot of the best, average, and worst solution from each generation of the population.
[Figure: best, average, and worst objective function values versus generations.]
Using the MATLAB function fminunc from the Optimization Toolbox, we found the optimal point to be [7.9787, 5.3482], with objective function value 13.2612. We can see that this solution agrees with the solution obtained using our real-number genetic algorithm.
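The agreement between the two reported points can be checked numerically. The following Python sketch (illustrative only; the function name f_wave is an assumption mirroring the M-file above) evaluates the objective x1 sin(x1) + x2 sin(5 x2) at both points.

```python
import math

def f_wave(x1, x2):
    """Objective of Exercise 14.12: x1*sin(x1) + x2*sin(5*x2)."""
    return x1 * math.sin(x1) + x2 * math.sin(5 * x2)

ga_value = f_wave(7.9711, 5.3462)        # point reported for the genetic algorithm
fminunc_value = f_wave(7.9787, 5.3482)   # point reported for fminunc
print(round(ga_value, 4), round(fminunc_value, 4))
```

The two values differ only in the fourth decimal place, which is consistent with the genetic algorithm having landed close to the local maximizer that fminunc refines.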

15. Introduction to Linear Programming
15.1
minimize $-2x_1 - x_2$
subject to $x_1 + x_3 = 2$
$x_1 + x_2 + x_4 = 3$
$x_1 + 2x_2 + x_5 = 5$
$x_1, \dots, x_5 \ge 0.$
15.2
We have
$x_2 = a x_1 + b u_1 = a^2 x_0 + ab u_0 + b u_1 = a^2 + [ab, b]u,$
where $u = [u_0, u_1]^\top$ is the decision variable. We can write the constraint $|u_i| \le 1$ as $u_i \le 1$ and $-u_i \le 1$. Hence, the problem is:
minimize $a^2 + [ab, b]u$
subject to $-1 \le u_i \le 1$, $i = 0, 1$.
Since $a^2$ is a constant, we can remove it from the objective function without changing the solution. Introducing slack variables $v_1, v_2, v_3, v_4$, we obtain the standard form problem
minimize $[ab, b]u$
subject to $u_0 + v_1 = 1$
$-u_0 + v_2 = 1$
$u_1 + v_3 = 1$
$-u_1 + v_4 = 1.$
15.3
Let $x_i^+, x_i^- \ge 0$ be such that $|x_i| = x_i^+ + x_i^-$, $x_i = x_i^+ - x_i^-$. Substituting into the original problem, we have
minimize $c_1(x_1^+ + x_1^-) + c_2(x_2^+ + x_2^-) + \cdots + c_n(x_n^+ + x_n^-)$
subject to $A(x^+ - x^-) = b$
$x^+, x^- \ge 0,$
where $x^+ = [x_1^+, \dots, x_n^+]^\top$ and $x^- = [x_1^-, \dots, x_n^-]^\top$. Rewriting, we get
minimize $[c^\top, c^\top]z$
subject to $[A, -A]z = b$
$z \ge 0,$
which is an equivalent linear programming problem in standard form.
Note that although the variables $x_i^+$ and $x_i^-$ in the solution are required to satisfy $x_i^+ x_i^- = 0$, we do not need to explicitly include this in the constraint because any optimal solution to the above transformed problem automatically satisfies the condition $x_i^+ x_i^- = 0$. To see this, suppose we have an optimal solution with both $x_i^+ > 0$ and $x_i^- > 0$. In this case, note that $c_i > 0$ (for otherwise we can add any arbitrary constant to both $x_i^+$ and $x_i^-$ and still satisfy feasibility, but decrease the objective function value). Then, by subtracting $\min(x_i^+, x_i^-)$ from $x_i^+$ and $x_i^-$, we have a new feasible point with lower objective function value, contradicting the optimality assumption. See also M. A. Dahleh and I. J. Diaz-Bobillo, Control of Uncertain Systems: A Linear Programming Approach, Prentice Hall, 1995, pp. 189-190.
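The substitution above can be illustrated with a small Python sketch. The helper name split_parts and the sample vector are assumptions introduced for illustration; the sketch only shows the canonical split in which at most one of the two parts is nonzero for each component.

```python
def split_parts(x):
    """Return (x_plus, x_minus) with x = x_plus - x_minus and both parts nonnegative."""
    x_plus = [max(v, 0) for v in x]
    x_minus = [max(-v, 0) for v in x]
    return x_plus, x_minus

x = [3, -2, 0, -5]                             # hypothetical sample vector
xp, xm = split_parts(x)
abs_values = [p + m for p, m in zip(xp, xm)]   # equals |x_i| componentwise
print(xp, xm, abs_values)
```

For this split the product of the two parts is zero in every component, which is the complementarity condition that an optimal solution of the transformed problem satisfies automatically.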

15.4
Not every linear programming problem in standard form has a nonempty feasible set. Example:
minimize $x_1$
subject to $x_1 = -1$
$x_1 \ge 0.$
Not every linear programming problem in standard form (even assuming a nonempty feasible set) has an optimal solution. Example:
minimize $-x_1$
subject to $x_2 = 1$
$x_1, x_2 \ge 0.$
15.5
Let x1 and x2 represent the number of units to be shipped from A to C and to D, respectively, and x3 and x4 represent the number of units to be shipped from B to C and to D, respectively. Then, the given problem can be formulated as the following linear program:
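minimize $x_1 + 2x_2 + 3x_3 + 4x_4$
subject to $x_1 + x_3 = 50$
$x_2 + x_4 = 60$
$x_1 + x_2 \le 70$
$x_3 + x_4 \le 80$
$x_1, x_2, x_3, x_4 \ge 0.$
Introducing slack variables $x_5$ and $x_6$, we have the standard form problem
minimize $x_1 + 2x_2 + 3x_3 + 4x_4$
subject to $x_1 + x_3 = 50$
$x_2 + x_4 = 60$
$x_1 + x_2 + x_5 = 70$
$x_3 + x_4 + x_6 = 80$
$x_1, x_2, x_3, x_4, x_5, x_6 \ge 0.$
15.6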
We can see that there are two paths from A to E (ACDE and ACBFDE), and two paths from B to F (BCDF and BF). Let $x_1$ and $x_2$ be the data rates for the two paths from A to E, respectively, and $x_3$ and $x_4$ the data rates for the two paths from B to F, respectively. The total revenue is then $2(x_1 + x_2) + 3(x_3 + x_4)$. For each link, we have a data rate constraint on the sum of all $x_i$s passing through that link. For example, for link BC, there are two paths passing through it, with total data rate $x_2 + x_3$. Hence, the constraint for link BC is $x_2 + x_3 \le 7$. Hence, the optimization problem is the following linear programming problem:
maximize $2(x_1 + x_2) + 3(x_3 + x_4)$
subject to $x_1 + x_2 \le 10$
$x_1 + x_2 \le 12$
$x_1 + x_3 \le 8$
$x_2 + x_3 \le 7$
$x_2 + x_3 \le 4$
$x_2 + x_4 \le 3$
$x_1, \dots, x_4 \ge 0.$
Converting this to standard form:
minimize $-2(x_1 + x_2) - 3(x_3 + x_4)$
subject to $x_1 + x_2 + x_5 = 10$
$x_1 + x_2 + x_6 = 12$
$x_1 + x_3 + x_7 = 8$
$x_2 + x_3 + x_8 = 7$
$x_2 + x_3 + x_9 = 4$
$x_2 + x_4 + x_{10} = 3$
$x_1, \dots, x_{10} \ge 0.$
15.7
Let $x_i \ge 0$, $i = 1, \dots, 4$, be the weight in pounds of item $i$ to be used. Then, the total weight is $x_1 + x_2 + x_3 + x_4$. To satisfy the percentage content of fiber, fat, and sugar, and the total weight of 1000, we need
$3x_1 + 8x_2 + 16x_3 + 4x_4 = 10(x_1 + x_2 + x_3 + x_4)$
$6x_1 + 46x_2 + 9x_3 + 9x_4 = 2(x_1 + x_2 + x_3 + x_4)$
$20x_1 + 5x_2 + 4x_3 + 0x_4 = 5(x_1 + x_2 + x_3 + x_4)$
$x_1 + x_2 + x_3 + x_4 = 1000.$
The total cost is $2x_1 + 4x_2 + x_3 + 2x_4$. Therefore, the problem is:
minimize $2x_1 + 4x_2 + x_3 + 2x_4$
subject to $-7x_1 - 2x_2 + 6x_3 - 6x_4 = 0$
$4x_1 + 44x_2 + 7x_3 + 7x_4 = 0$
$15x_1 - x_3 - 5x_4 = 0$
$x_1 + x_2 + x_3 + x_4 = 1000$
$x_1, x_2, x_3, x_4 \ge 0.$
Alternatively, we could have simply replaced $x_1 + x_2 + x_3 + x_4$ in the first three equality constraints above by 1000, to obtain:
$3x_1 + 8x_2 + 16x_3 + 4x_4 = 10000$
$6x_1 + 46x_2 + 9x_3 + 9x_4 = 2000$
$20x_1 + 5x_2 + 4x_3 + 0x_4 = 5000$
$x_1 + x_2 + x_3 + x_4 = 1000.$
Note that the only vector satisfying the above linear equations is $[179, -175, 573, 422]^\top$ (rounded to integers), which is not feasible because its second component is negative. Therefore, the constraint set does not have any feasible points, which means that the problem does not have a solution.
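The infeasibility claim can be verified by solving the four linear equations exactly. The sketch below is a plain Gauss-Jordan elimination over rationals in Python (the helper name solve is an assumption, and the routine assumes the coefficient matrix is nonsingular); it is a check of the arithmetic, not part of the original solution.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A x = b exactly by Gauss-Jordan elimination.

    Assumes A is nonsingular (a zero column below the diagonal raises StopIteration).
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # find a row at or below the diagonal with a nonzero entry in this column
        pivot_row = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot_row] = M[pivot_row], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [row[n] for row in M]

A = [[3, 8, 16, 4], [6, 46, 9, 9], [20, 5, 4, 0], [1, 1, 1, 1]]
b = [10000, 2000, 5000, 1000]
x = solve(A, b)
print([round(v) for v in x])
```

Because the system has a unique solution and that solution has a negative component, no nonnegative vector can satisfy all four equality constraints.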

15.8
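The objective function is $p_1 + \cdots + p_n$. The constraint for the $i$th location is: $g_{i,1}p_1 + \cdots + g_{i,n}p_n \ge P$. Hence, the optimization problem is:
minimize $p_1 + \cdots + p_n$
subject to $g_{i,1}p_1 + \cdots + g_{i,n}p_n \ge P$, $i = 1, \dots, m$
$p_1, \dots, p_n \ge 0.$
By defining the notation $G = [g_{i,j}]$ ($m \times n$), $e_n = [1, \dots, 1]^\top$ (with $n$ components), and $p = [p_1, \dots, p_n]^\top$, we can rewrite the problem as
minimize $e_n^\top p$
subject to $Gp \ge P e_m$
$p \ge 0.$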
15.9
It is easy to check (using MATLAB, for example) that the matrix
$$A = \begin{bmatrix} 2 & -1 & 2 & -1 & 3 \\ 1 & 2 & 3 & 1 & 0 \\ 1 & 0 & -2 & 0 & -5 \end{bmatrix}$$
is of full rank (i.e., rank A = 3). Therefore, the system has basic solutions. To find the basic solutions, we first select bases. Each basis consists of three linearly independent columns of A. These columns correspond to basic variables of the basic solution. The remaining variables are nonbasic and are set to 0. The matrix A has 5 columns; therefore, we have $\binom{5}{3} = 10$ possible candidate basic solutions (corresponding to the 10 combinations of 3 columns out of 5). It turns out that all 10 combinations of 3 columns of A are linearly independent. Therefore, we have 10 basic solutions. These are tabulated as follows:

Columns    Basic Solutions
1,2,3      [-4/17, -80/17, 83/17, 0, 0]^T
1,2,4      [-10, 49, 0, -83, 0]^T
1,2,5      [105/31, 25/31, 0, 0, 83/31]^T
1,3,4      [-12/11, 0, 49/11, -80/11, 0]^T
1,3,5      [100/35, 0, 25/35, 0, 80/35]^T
1,4,5      [65/18, 0, 0, 25/18, 49/18]^T
2,3,4      [0, -6, 5, 2, 0]^T
2,3,5      [0, -100/23, 105/23, 0, 4/23]^T
2,4,5      [0, 13, 0, -21, 2]^T
3,4,5      [0, 0, 65/19, -100/19, 12/19]^T

15.10
In the figure below, the shaded region corresponds to the feasible set. We then translate the line $2x_1 + 5x_2 = 0$ across the shaded region until the line just touches the region at one point, and the line is as far as possible from the origin. The point of contact is the solution to the problem. In this case, the solution is $[2, 6]^\top$, and the corresponding cost is 34.
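[Figure: feasible set in the (x1, x2) plane with corner points [2, 6], [0, 6], [4, 4], and [4, 0], together with the lines x1 + x2 = 8, 2x1 + 5x2 = 34, and 2x1 + 5x2 = 0.]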
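The graphical argument amounts to comparing the objective $2x_1 + 5x_2$ at the corner points of the feasible set. A minimal Python sketch follows; the corner points are those labeled in the figure, and the origin is added as an assumed fifth corner.

```python
vertices = [(0, 0), (4, 0), (4, 4), (2, 6), (0, 6)]   # corner points; (0, 0) is assumed

def objective(point):
    """Value of 2*x1 + 5*x2 at the given point."""
    x1, x2 = point
    return 2 * x1 + 5 * x2

best = max(vertices, key=objective)
print(best, objective(best))
```

The largest value among the corners is attained at the point where the translated line last touches the region.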
15.11
We use the following MATLAB commands:

f=[0,-10,0,-6,-20];
A=[1,-1,-1,0,0; 0,0,1,-1,-1];
b=[0;0];
vlb=zeros(5,1);
vub=[4;3;3;2;2];
x0=zeros(5,1);
neqcstr=2;
x=linprog(f,A,b,vlb,vub,x0,neqcstr)

x =
    4.0000
    2.0000
    2.0000
    0.0000
    2.0000
The solution is $[4, 2, 2, 0, 2]^\top$.

16. The Simplex Method

16.1
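a. Performing a sequence of elementary row operations, we obtain
$$A = \begin{bmatrix} 1 & 2 & -1 & 3 & 2 \\ 2 & -1 & 3 & 0 & 1 \\ 3 & 1 & 2 & 3 & 3 \\ 1 & 2 & 3 & 1 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & 2 & -1 & 3 & 2 \\ 0 & -5 & 5 & -6 & -3 \\ 0 & -5 & 5 & -6 & -3 \\ 0 & 0 & 4 & -2 & -1 \end{bmatrix} \to \begin{bmatrix} 1 & 2 & -1 & 3 & 2 \\ 0 & -5 & 5 & -6 & -3 \\ 0 & 0 & 4 & -2 & -1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} = B.$$
Because elementary row operations do not change the rank of a matrix, rank A = rank B. Therefore rank A = 3.
b. Performing a sequence of elementary operations, we obtain
$$A = \begin{bmatrix} 1 & \gamma & -1 & 2 \\ 2 & -1 & \gamma & 5 \\ 1 & 10 & -6 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & 10 & -6 & 1 \\ 1 & \gamma & -1 & 2 \\ 2 & -1 & \gamma & 5 \end{bmatrix} \to \begin{bmatrix} 1 & 10 & -6 & 1 \\ 0 & \gamma - 10 & 5 & 1 \\ 0 & -21 & \gamma + 12 & 3 \end{bmatrix}$$
$$\to \begin{bmatrix} 1 & 1 & 10 & -6 \\ 0 & 1 & \gamma - 10 & 5 \\ 0 & 3 & -21 & \gamma + 12 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 10 & -6 \\ 0 & 1 & \gamma - 10 & 5 \\ 0 & 0 & -3(\gamma - 3) & \gamma - 3 \end{bmatrix} = B.$$
Because these operations do not change the rank of a matrix, rank A = rank B. Therefore rank A = 3 if $\gamma \ne 3$ and rank A = 2 if $\gamma = 3$.
16.2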
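a.
$$A = \begin{bmatrix} 3 & 1 & 0 & 1 \\ 6 & 2 & 1 & 1 \end{bmatrix}, \quad b = \begin{bmatrix} 4 \\ 5 \end{bmatrix}, \quad c = [2, -1, -1, 0]^\top.$$
b. Pivoting the problem tableau about the elements (1, 4) and (2, 3), we obtain
$$\begin{array}{rrrr|r} 3 & 1 & 0 & 1 & 4 \\ 3 & 1 & 1 & 0 & 1 \\ \hline 5 & 0 & 0 & 0 & 1 \end{array}$$
c. Basic feasible solution: $x = [0, 0, 1, 4]^\top$, $c^\top x = -1$.
d. $[r_1, r_2, r_3, r_4] = [5, 0, 0, 0]$.
e. Since the reduced cost coefficients are all $\ge 0$, the basic feasible solution in part c is optimal.
f. The original problem does indeed have a feasible solution, because the artificial problem has an optimal feasible solution with objective function value 0, as shown in the final phase I tableau.
g. Extract the submatrices corresponding to A and b, append the last row $[c^\top, 0]$, and pivot about the (2, 1)th element to obtain
$$\begin{array}{rrrr|r} 0 & 0 & -1 & 1 & 3 \\ 1 & 1/3 & 1/3 & 0 & 1/3 \\ \hline 0 & -5/3 & -5/3 & 0 & -2/3 \end{array}$$
16.3
The problem in standard form is:
minimize $-x_1 - x_2 - 3x_3$
subject to $x_1 + x_3 = 1$
$x_2 + x_3 = 2$
$x_1, x_2, x_3 \ge 0.$
We form the tableau for the problem:
$$\begin{array}{rrr|r} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 2 \\ \hline -1 & -1 & -3 & 0 \end{array}$$
Performing necessary row operations, we obtain a tableau in canonical form:
$$\begin{array}{rrr|r} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 2 \\ \hline 0 & 0 & -1 & 3 \end{array}$$
We pivot about the (1, 3)th element to get:
$$\begin{array}{rrr|r} 1 & 0 & 1 & 1 \\ -1 & 1 & 0 & 1 \\ \hline 1 & 0 & 0 & 4 \end{array}$$
The reduced cost coefficients are all nonnegative. Hence, the current basic feasible solution is optimal: $[0, 1, 1]^\top$. The optimal cost is $-4$.
16.4
The problem in standard form is:
minimize $-2x_1 - x_2$
subject to $x_1 + x_3 = 5$
$x_2 + x_4 = 7$
$x_1 + x_2 + x_5 = 9$
$x_1, \dots, x_5 \ge 0.$
We form the tableau for the problem:
$$\begin{array}{rrrrr|r} 1 & 0 & 1 & 0 & 0 & 5 \\ 0 & 1 & 0 & 1 & 0 & 7 \\ 1 & 1 & 0 & 0 & 1 & 9 \\ \hline -2 & -1 & 0 & 0 & 0 & 0 \end{array}$$
The above tableau is already in canonical form, and therefore we can proceed with the simplex procedure. We first pivot about the (1, 1)th element, to get
$$\begin{array}{rrrrr|r} 1 & 0 & 1 & 0 & 0 & 5 \\ 0 & 1 & 0 & 1 & 0 & 7 \\ 0 & 1 & -1 & 0 & 1 & 4 \\ \hline 0 & -1 & 2 & 0 & 0 & 10 \end{array}$$
Next, we pivot about the (3, 2)th element to get
$$\begin{array}{rrrrr|r} 1 & 0 & 1 & 0 & 0 & 5 \\ 0 & 0 & 1 & 1 & -1 & 3 \\ 0 & 1 & -1 & 0 & 1 & 4 \\ \hline 0 & 0 & 1 & 0 & 1 & 14 \end{array}$$
The reduced cost coefficients are all nonnegative. Hence, the optimal solution to the problem in standard form is $[5, 4, 0, 3, 0]^\top$. The corresponding optimal cost is $-14$.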
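The two pivots of Exercise 16.4 can be replayed with exact rational arithmetic. The Python sketch below is illustrative only: the helper name pivot is an assumption, and the tableau entries (including signs) reflect this sketch's reading of the standard form problem, minimize $-2x_1 - x_2$ subject to the three equality constraints.

```python
from fractions import Fraction

def pivot(T, p, q):
    """Pivot tableau T about the (p, q)th element, using 1-based indices as in the text."""
    p, q = p - 1, q - 1
    pivot_row = [v / T[p][q] for v in T[p]]
    result = []
    for i, row in enumerate(T):
        if i == p:
            result.append(pivot_row)
        else:
            # subtract the multiple of the scaled pivot row that zeros out column q
            result.append([a - row[q] * b for a, b in zip(row, pivot_row)])
    return result

rows = [[1, 0, 1, 0, 0, 5],
        [0, 1, 0, 1, 0, 7],
        [1, 1, 0, 0, 1, 9],
        [-2, -1, 0, 0, 0, 0]]
T = [[Fraction(v) for v in row] for row in rows]
T = pivot(T, 1, 1)
T = pivot(T, 3, 2)
print([int(v) for v in T[-1]])
```

The last row holds the reduced cost coefficients followed by the negative of the current cost, so a final entry of 14 corresponds to an optimal cost of -14.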

16.5
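a. Let $B = [a_2, a_1]$ represent the first two columns of A ordered according to the basis corresponding to the given canonical tableau, and D the second two columns. Then,
$$B^{-1}D = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}.$$
Hence,
$$B = D \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}^{-1} = \begin{bmatrix} 3/2 & -1/2 \\ -2 & 1 \end{bmatrix}.$$
Hence,
$$A = \begin{bmatrix} -1/2 & 3/2 & 0 & 1 \\ 1 & -2 & 1 & 0 \end{bmatrix}.$$
An alternative approach is to realize that the canonical tableau is obtained from the problem tableau via elementary row operations. Therefore, we can obtain the entries of A from the $2 \times 4$ upper-left submatrix of the canonical tableau via elementary row operations also. Specifically, start with
$$\begin{bmatrix} 0 & 1 & 1 & 2 \\ 1 & 0 & 3 & 4 \end{bmatrix}$$
and then do two pivoting operations, one about (1, 4) and the other about (2, 3).
b. The right half of c is given by
$$c_D^\top = r_D^\top + c_B^\top B^{-1}D = [-1, 1] + [7, 8]\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = [-1, 1] + [31, 46] = [30, 47].$$
So $c = [8, 7, 30, 47]^\top$.
c. First we calculate $B^{-1}b$, giving us the basic variable values:
$$B^{-1}b = \begin{bmatrix} 2 & 1 \\ 4 & 3 \end{bmatrix}\begin{bmatrix} 5 \\ 6 \end{bmatrix} = \begin{bmatrix} 16 \\ 38 \end{bmatrix}.$$
Hence, the BFS is $[38, 16, 0, 0]^\top$.
d. The first two entries are 16 and 38, respectively. The last component is $-c_B^\top B^{-1}b = -(7(16) + 8(38)) = -416$. Hence, the last column is the vector $[16, 38, -416]^\top$.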
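The matrix arithmetic in parts a to c can be checked with a few lines of Python. The helpers inv2 and matmul are assumptions written for this sketch, and the numerical data (D, b, r_D, c_B) are this sketch's reading of the quantities used in the solution.

```python
from fractions import Fraction as F

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

Y = [[F(1), F(2)], [F(3), F(4)]]     # B^{-1} D, read off the canonical tableau
D = [[F(0), F(1)], [F(1), F(0)]]     # last two columns of A (assumed reading)
B = matmul(D, inv2(Y))               # since B^{-1} D = Y, we have B = D Y^{-1}
x_B = matmul(inv2(B), [[F(5)], [F(6)]])                  # basic variable values B^{-1} b
c_D = [r + s for r, s in zip([-1, 1], matmul([[F(7), F(8)]], Y)[0])]
print(B, x_B, c_D)
```

The computed basic variable values appear in the order of the basis $[a_2, a_1]$, which is why the BFS lists 38 before 16.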
16.6
The columns in the constraint matrix A corresponding to $x_i^+$ and $x_i^-$ are linearly dependent. Hence they cannot both enter a basis at the same time. This means that only one variable, $x_i^+$ or $x_i^-$, can assume a nonnegative value; the nonbasic variable is necessarily zero.
16.7
a. From the given information, we have the 4 6 canonical tableau
26 1 0 0 1 2 1 37 60 1 0 0 27 40 0 1 0 35 0 1 0 0 1 6
Explanations:
The given vector $x$ indicates that A is $3 \times 5$.
In the above tableau, we assume that the basis is $[a_1, a_3, a_4]$, in this order. Other permutations of orders will result in interchanging rows among the first three rows of the tableau.
The fifth column represents the coordinates of $a_5$ with respect to the basis $[a_1, a_3, a_4]$. Because $[-2, 0, 0, 0, 4]^\top$ lies in the nullspace of A, we deduce that $-2a_1 + 4a_5 = 0$, which can be rewritten as $a_5 = (1/2)a_1 + 0a_3 + 0a_4$, and hence the coordinate vector is $[1/2, 0, 0]^\top$.
b. Let $d^0 = [-2, 0, 0, 0, 4]^\top$. Then, $Ad^0 = 0$. Therefore, the vector $x' = x + \varepsilon d^0$ also satisfies $Ax' = b$. Now, $x' = x + \varepsilon d^0 = [1 - 2\varepsilon, 0, 2, 3, 4\varepsilon]^\top$. For $x'$ to be feasible, we must have $\varepsilon \le 1/2$. Moreover, the objective function value of $x'$ is $c^\top x' = z_0 + r_5 x'_5 = 6 - 4\varepsilon$, where $z_0$ is the objective function value of $x$. So, if we pick any $\varepsilon \in (0, 1/2]$, then $x'$ will be a feasible solution with objective function value strictly less than 6. For example, with $\varepsilon = 1/2$, $x' = [0, 0, 2, 3, 2]^\top$ is such a point. (We could also have obtained this solution by pivoting about the element (1, 5) in the tableau of part a.)
16.8
a. The BFS is $[6, 0, 7, 5, 0]^\top$, with objective function value $-8$.
b. $r = [0, 4, 0, 0, -4]^\top$.
c. Yes, because the 5th column has all negative entries.
d. We pivot about the element (3, 2). The new canonical tableau is:
$$\begin{array}{rrrrr|r} 0 & 0 & -1/3 & 1 & 0 & 8/3 \\ 1 & 0 & -2/3 & 0 & 0 & 4/3 \\ 0 & 1 & 1/3 & 0 & -1 & 7/3 \\ \hline 0 & 0 & -4/3 & 0 & 0 & -4/3 \end{array}$$
e. First note that based on the 5th column, the following point is feasible:
$$x = [6, 0, 7, 5, 0]^\top + \varepsilon [2, 0, 3, 1, 1]^\top.$$
Note that $x_5 = \varepsilon$. Now, any solution of the form $x = [\ast, 0, \ast, \ast, \varepsilon]^\top$ has an objective function value given by
$$z = z_0 + r_5 \varepsilon,$$
where $z_0 = -8$ and $r_5 = -4$ from parts a and b. If $z = -100$, then $\varepsilon = 23$. Hence, the following point has objective function value $z = -100$:
$$x = [6, 0, 7, 5, 0]^\top + 23\,[2, 0, 3, 1, 1]^\top = [52, 0, 76, 28, 23]^\top.$$
f. The entries of the 2nd column of the given canonical tableau are the coordinates of $a_2$ with respect to the basis $[a_4, a_1, a_3]$. Therefore,
$$a_2 = a_4 + 2a_1 + 3a_3.$$
Therefore, the vector $[2, -1, 3, 1, 0]^\top$ lies in the nullspace of A. Similarly, using the entries of the 5th column, we deduce that $[2, 0, 3, 1, 1]^\top$ also lies in the nullspace of A. These two vectors are linearly independent. Because A has rank 3, the dimension of the nullspace of A is 2. Hence, these two vectors form a basis for the nullspace of A.

16.9
a. We can convert the problem to standard form by multiplying the objective function by $-1$ and introducing a surplus variable $x_3$. We obtain:
minimize $x_1 + 2x_2$
subject to $x_2 - x_3 = 1$
$x_1, x_2, x_3 \ge 0.$
Note that we do not need to deal with the absence of the constraint $x_2 \ge 0$ in the original problem, since $x_2 \ge 1$ implies that $x_2 \ge 0$ also. Had we used the rule of writing $x_2 = u - v$ with $u, v \ge 0$, we would obtain the standard form problem:
minimize $x_1 + 2u - 2v$
subject to $u - v - x_3 = 1$
$x_1, u, v, x_3 \ge 0.$
b. For phase I, we set up the artificial problem tableau as:
$$\begin{array}{rrrr|r} 0 & 1 & -1 & 1 & 1 \\ \hline 0 & 0 & 0 & 1 & 0 \end{array}$$
Pivoting about element (1, 4), we obtain the canonical tableau:
$$\begin{array}{rrrr|r} 0 & 1 & -1 & 1 & 1 \\ \hline 0 & -1 & 1 & 0 & -1 \end{array}$$
Pivoting now about element (1, 2), we obtain the next canonical tableau:
$$\begin{array}{rrrr|r} 0 & 1 & -1 & 1 & 1 \\ \hline 0 & 0 & 0 & 1 & 0 \end{array}$$
Hence, phase I terminates, and we use $x_2$ as our initial basic variable for phase II. For phase II, we set up the problem tableau as:
$$\begin{array}{rrr|r} 0 & 1 & -1 & 1 \\ \hline 1 & 2 & 0 & 0 \end{array}$$
Pivoting about element (1, 2), we obtain
$$\begin{array}{rrr|r} 0 & 1 & -1 & 1 \\ \hline 1 & 0 & 2 & -2 \end{array}$$
Hence, the BFS $[0, 1, 0]^\top$ is optimal, with objective function value 2. Therefore, the optimal solution to the original problem is $[0, 1]^\top$ with objective function value $-2$.
16.10
a. 1,0
b.h1 1 1i 1 1 1
Note that the answer is not 0 1 1 , which is the canonical tableau.
c. We choose $q = 2$ because the only negative RCC value is $r_2$. However, $y_{1,2} < 0$. Therefore, the simplex algorithm terminates with the condition that the problem is unbounded.
d. Any vector of the form $[x_1, x_1 - 1]^\top$, $x_1 \ge 1$, is feasible. Therefore the first component can take arbitrarily large (positive) values. Hence, the objective function, which is $-x_1$, can take arbitrarily negative values.

16.11
The problem in standard form is:
minimize $x_1 + x_2$
subject to $x_1 + 2x_2 - x_3 = 3$
$2x_1 + x_2 - x_4 = 3$
$x_1, x_2, x_3, x_4 \ge 0.$
We will use $x_1$ and $x_2$ as initial basic variables. Therefore, Phase I is not needed, and we immediately proceed with Phase II. The tableau for the problem is:

        a1   a2   a3   a4    b
         1    2   -1    0    3
         2    1    0   -1    3
c^T      1    1    0    0    0

We compute
$$\lambda^\top = c_B^\top B^{-1} = [1, 1]\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}^{-1} = [1/3, 1/3],$$
$$r_D^\top = c_D^\top - \lambda^\top D = [0, 0] - [1/3, 1/3]\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} = [1/3, 1/3] = [r_3, r_4].$$
The reduced cost coefficients are all nonnegative. Hence, the solution to the standard form problem is $[1, 1, 0, 0]^\top$. Therefore, the solution to the original problem is $[1, 1]^\top$, and the corresponding cost is 2.
16.12
a. The problem in standard form is:
minimize $4x_1 + 3x_2$
subject to $5x_1 + x_2 - x_3 = 11$
$2x_1 + x_2 - x_4 = 8$
$x_1 + 2x_2 - x_5 = 7$
$x_1, \dots, x_5 \ge 0.$
We do not have an apparent basic feasible solution. Therefore, we will need to use the two phase method.
Phase I: We introduce artificial variables $x_6, x_7, x_8$ and form the following tableau.

        a1   a2   a3   a4   a5   a6   a7   a8    b
         5    1   -1    0    0    1    0    0   11
         2    1    0   -1    0    0    1    0    8
         1    2    0    0   -1    0    0    1    7
c^T      0    0    0    0    0    1    1    1    0

We then form the following revised tableau:

Variable    B^{-1}              y0
x6           1     0     0      11
x7           0     1     0       8
x8           0     0     1       7

We compute:
$\lambda^\top = [1, 1, 1]$
$r_D^\top = [r_1, r_2, r_3, r_4, r_5] = [-8, -4, 1, 1, 1].$
We form the augmented revised tableau by introducing $y_1 = B^{-1}a_1 = a_1$:

Variable    B^{-1}              y0     y1
x6           1     0     0      11      5
x7           0     1     0       8      2
x8           0     0     1       7      1

We now pivot about the first component of $y_1$ to get

Variable    B^{-1}              y0
x1           1/5   0     0      11/5
x7          -2/5   1     0      18/5
x8          -1/5   0     1      24/5

We compute
$\lambda^\top = [-3/5, 1, 1]$
$r_D^\top = [r_2, r_3, r_4, r_5, r_6] = [-12/5, -3/5, 1, 1, 8/5].$
We bring $y_2 = B^{-1}a_2$ into the basis to get

Variable    B^{-1}              y0      y2
x1           1/5   0     0      11/5    1/5
x7          -2/5   1     0      18/5    3/5
x8          -1/5   0     1      24/5    9/5

We pivot about the third component of $y_2$ to get

Variable    B^{-1}              y0
x1           2/9   0    -1/9    5/3
x7          -1/3   1    -1/3    2
x2          -1/9   0     5/9    8/3

We compute
$\lambda^\top = [-1/3, 1, -1/3]$
$r_D^\top = [r_3, r_4, r_5, r_6, r_8] = [-1/3, 1, -1/3, 4/3, 4/3].$
We bring $y_3 = B^{-1}a_3$ into the basis to get

Variable    B^{-1}              y0      y3
x1           2/9   0    -1/9    5/3    -2/9
x7          -1/3   1    -1/3    2       1/3
x2          -1/9   0     5/9    8/3     1/9

We pivot about the second component of $y_3$ to obtain

Variable    B^{-1}              y0
x1           0     2/3  -1/3    3
x3          -1     3    -1      6
x2           0    -1/3   2/3    2

We compute
$\lambda^\top = [0, 0, 0]$
$r_D^\top = [r_4, r_5, r_6, r_7, r_8] = [0, 0, 1, 1, 1] \ge 0^\top.$
Thus, Phase I is complete, and the initial basic feasible solution is $[3, 2, 6, 0, 0]^\top$.
Phase II: We form the tableau for the original problem:

        a1   a2   a3   a4   a5    b
         5    1   -1    0    0   11
         2    1    0   -1    0    8
         1    2    0    0   -1    7
c^T      4    3    0    0    0    0

The initial revised tableau for Phase II is the final revised tableau for Phase I. We compute
$\lambda^\top = [0, 5/3, 2/3]$
$r_D^\top = [r_4, r_5] = [5/3, 2/3] > 0^\top.$
Hence, the optimal solution to the original problem is $[3, 2]^\top$.
b. The problem in standard form is:
minimize $-6x_1 - 4x_2 - 7x_3 - 5x_4$
subject to $x_1 + 2x_2 + x_3 + 2x_4 + x_5 = 20$
$6x_1 + 5x_2 + 3x_3 + 2x_4 + x_6 = 100$
$3x_1 + 4x_2 + 9x_3 + 12x_4 + x_7 = 75$
$x_1, \dots, x_7 \ge 0.$
We have an apparent basic feasible solution: $[0, 0, 0, 0, 20, 100, 75]^\top$, corresponding to $B = I_3$. We form the revised tableau corresponding to this basic feasible solution:

Variable    B^{-1}              y0
x5           1     0     0      20
x6           0     1     0     100
x7           0     0     1      75

We compute
$\lambda^\top = [0, 0, 0]$
$r_D^\top = [r_1, r_2, r_3, r_4] = [-6, -4, -7, -5].$
We bring $y_3 = B^{-1}a_3 = a_3$ into the basis to obtain

Variable    B^{-1}              y0     y3
x5           1     0     0      20      1
x6           0     1     0     100      3
x7           0     0     1      75      9

We pivot about the third component of $y_3$ to get

Variable    B^{-1}              y0
x5           1     0    -1/9    35/3
x6           0     1    -1/3    75
x3           0     0     1/9    25/3

We compute
$\lambda^\top = [0, 0, -7/9]$
$r_D^\top = [r_1, r_2, r_4, r_7] = [-11/3, -8/9, 13/3, 7/9].$
We bring $y_1 = B^{-1}a_1$ into the basis to obtain

Variable    B^{-1}              y0      y1
x5           1     0    -1/9    35/3    2/3
x6           0     1    -1/3    75      5
x3           0     0     1/9    25/3    1/3

We pivot about the second component of $y_1$ to obtain

Variable    B^{-1}                  y0
x5           1    -2/15  -1/15      5/3
x1           0     1/5   -1/15      15
x3           0    -1/15   2/15      10/3

We compute
$\lambda^\top = [0, -11/15, -8/15]$
$r_D^\top = [r_2, r_4, r_6, r_7] = [27/15, 43/15, 11/15, 8/15] > 0^\top.$
The optimal solution to the original problem is therefore $[15, 0, 10/3, 0]^\top$.
16.13
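a. By inspection of $r^\top$, we conclude that the basic variables are $x_1, x_3, x_4$, and the basis matrix is
$$B = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}.$$
Since $r^\top \ge 0^\top$, the basic feasible solution corresponding to the basis B is optimal. This optimal basic feasible solution is $[8, 0, 9, 7]^\top$.
b. An optimal solution to the dual is given by
$$\lambda^\top = c_B^\top B^{-1},$$
where $c_B^\top = [6, 4, 5]$, and
$$B^{-1} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}.$$
We obtain $\lambda^\top = [5, 6, 4]$.
c. We have $r_D^\top = c_D^\top - \lambda^\top D$, where $r_D^\top = [1]$, $c_D^\top = [c_2]$, $\lambda^\top = [5, 6, 4]$, and $D = [2, 1, 3]^\top$. We get $1 = c_2 - 10 - 6 - 12$, which yields $c_2 = 29$.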
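The dual solution and the value of $c_2$ follow from two small matrix products, which the Python sketch below reproduces. The variable names are assumptions introduced for illustration, and the numerical data are those used in parts b and c.

```python
B_inv = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # inverse of the basis matrix
c_B = [6, 4, 5]                              # costs of the basic variables
D = [2, 1, 3]                                # nonbasic column of A
r_2 = 1                                      # reduced cost of the nonbasic variable

# lambda^T = c_B^T B^{-1}: one entry per column of B^{-1}
lam = [sum(c_B[i] * B_inv[i][j] for i in range(3)) for j in range(3)]
# r_2 = c_2 - lambda^T D, so c_2 = r_2 + lambda^T D
c_2 = r_2 + sum(l * d for l, d in zip(lam, D))
print(lam, c_2)
```

Since the basis matrix is a permutation matrix, multiplying by its inverse only reorders the entries of $c_B$.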
16.14
a. There are two basic feasible solutions: $[1, 0]^\top$ and $[0, 2]^\top$.
b. The feasible set in $R^2$ for this problem is the line segment joining the two basic feasible solutions $[1, 0]^\top$ and $[0, 2]^\top$. Therefore, if the problem has an optimal feasible solution that is not basic, then all points in the feasible set are optimal. For this, we need
$$c_1 = 2\alpha, \quad c_2 = \alpha,$$
where $\alpha \in R$.
c. Since all basic feasible solutions are optimal, the relative cost coefficients are all zero.

16.15
a. 2 0, 0, 0, and anything. b. 2 0, 7, and anything.
c. 2 0, 0, either 0 or 5 4, and anything.
16.16
a. The value of must be 0, because the objective function value is 0 lower right corner, and is the value of an artificial variable.
The value of must be 0, because it is the RCC value corresponding to a basic column.
The value of must be 2, because it must be a positive value. Otherwise, there is a feasible solution to the artificial problem with objective function value smaller than 0, which is impossible.
The value of must be 0, because we must be able to bring the fourth column into the basis without changing the objective function value.
b. The given linear programming problem does indeed have a feasible solution: 0, 5, 6, 0. We obtain this by noticing that the rightmost column is a linear combination of the second and third columns, with coefficients 5 and 6.
16.17
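First, we convert the inequality constraint $Ax \ge b$ into standard form. To do this, we introduce a vector $w \in R^m$ of surplus variables to convert the inequality constraint into the following equivalent constraint:
$$[A, -I]\begin{bmatrix} x \\ w \end{bmatrix} = b, \quad w \ge 0.$$
Next, we introduce variables $u, v \in R^n$ to replace the free variable $x$ by $u - v$. We then obtain the following equivalent constraint:
$$[A, -A, -I]\begin{bmatrix} u \\ v \\ w \end{bmatrix} = b, \quad u, v, w \ge 0.$$
This form of the constraint is now in standard form. So we can now use Phase I of the simplex method to implement an algorithm to find vectors $u$, $v$, and $w$ satisfying the above constraint, if such exist, or to declare that none exists. If such exist, we output $x = u - v$; otherwise, we declare that no $x$ exists such that $Ax \ge b$. By construction, this algorithm is guaranteed to behave in the way specified by the question.
16.18
a. We form the tableau for the problem:
$$\begin{array}{rrrrrrr|r} 1 & 0 & 0 & 1/4 & -8 & -1 & 9 & 0 \\ 0 & 1 & 0 & 1/2 & -12 & -1/2 & 3 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ \hline 0 & 0 & 0 & -3/4 & 20 & -1/2 & 6 & 0 \end{array}$$
The above tableau is already in canonical form, and therefore we can proceed with the simplex procedure. We first pivot about the (1, 4)th element, to get
$$\begin{array}{rrrrrrr|r} 4 & 0 & 0 & 1 & -32 & -4 & 36 & 0 \\ -2 & 1 & 0 & 0 & 4 & 3/2 & -15 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ \hline 3 & 0 & 0 & 0 & -4 & -7/2 & 33 & 0 \end{array}$$
Pivoting about the (2, 5)th element, we get
$$\begin{array}{rrrrrrr|r} -12 & 8 & 0 & 1 & 0 & 8 & -84 & 0 \\ -1/2 & 1/4 & 0 & 0 & 1 & 3/8 & -15/4 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ \hline 1 & 1 & 0 & 0 & 0 & -2 & 18 & 0 \end{array}$$
Pivoting about the (1, 6)th element, we get
$$\begin{array}{rrrrrrr|r} -3/2 & 1 & 0 & 1/8 & 0 & 1 & -21/2 & 0 \\ 1/16 & -1/8 & 0 & -3/64 & 1 & 0 & 3/16 & 0 \\ 3/2 & -1 & 1 & -1/8 & 0 & 0 & 21/2 & 1 \\ \hline -2 & 3 & 0 & 1/4 & 0 & 0 & -3 & 0 \end{array}$$
Pivoting about the (2, 7)th element, we get
$$\begin{array}{rrrrrrr|r} 2 & -6 & 0 & -5/2 & 56 & 1 & 0 & 0 \\ 1/3 & -2/3 & 0 & -1/4 & 16/3 & 0 & 1 & 0 \\ -2 & 6 & 1 & 5/2 & -56 & 0 & 0 & 1 \\ \hline -1 & 1 & 0 & -1/2 & 16 & 0 & 0 & 0 \end{array}$$
Pivoting about the (1, 1)th element, we get
$$\begin{array}{rrrrrrr|r} 1 & -3 & 0 & -5/4 & 28 & 1/2 & 0 & 0 \\ 0 & 1/3 & 0 & 1/6 & -4 & -1/6 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ \hline 0 & -2 & 0 & -7/4 & 44 & 1/2 & 0 & 0 \end{array}$$
Pivoting about the (2, 2)th element, we get
$$\begin{array}{rrrrrrr|r} 1 & 0 & 0 & 1/4 & -8 & -1 & 9 & 0 \\ 0 & 1 & 0 & 1/2 & -12 & -1/2 & 3 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ \hline 0 & 0 & 0 & -3/4 & 20 & -1/2 & 6 & 0 \end{array}$$
which is identical to the initial tableau. Therefore, cycling occurs.
b. We start with the initial tableau of part a, and pivot about the (1, 4)th element to obtain
$$\begin{array}{rrrrrrr|r} 4 & 0 & 0 & 1 & -32 & -4 & 36 & 0 \\ -2 & 1 & 0 & 0 & 4 & 3/2 & -15 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ \hline 3 & 0 & 0 & 0 & -4 & -7/2 & 33 & 0 \end{array}$$
Pivoting about the (2, 5)th element, we get
$$\begin{array}{rrrrrrr|r} -12 & 8 & 0 & 1 & 0 & 8 & -84 & 0 \\ -1/2 & 1/4 & 0 & 0 & 1 & 3/8 & -15/4 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ \hline 1 & 1 & 0 & 0 & 0 & -2 & 18 & 0 \end{array}$$
Pivoting about the (1, 6)th element, we get
$$\begin{array}{rrrrrrr|r} -3/2 & 1 & 0 & 1/8 & 0 & 1 & -21/2 & 0 \\ 1/16 & -1/8 & 0 & -3/64 & 1 & 0 & 3/16 & 0 \\ 3/2 & -1 & 1 & -1/8 & 0 & 0 & 21/2 & 1 \\ \hline -2 & 3 & 0 & 1/4 & 0 & 0 & -3 & 0 \end{array}$$
Pivoting about the (2, 1)th element, we get
$$\begin{array}{rrrrrrr|r} 0 & -2 & 0 & -1 & 24 & 1 & -6 & 0 \\ 1 & -2 & 0 & -3/4 & 16 & 0 & 3 & 0 \\ 0 & 2 & 1 & 1 & -24 & 0 & 6 & 1 \\ \hline 0 & -1 & 0 & -5/4 & 32 & 0 & 3 & 0 \end{array}$$
Pivoting about the (3, 2)th element, we get
$$\begin{array}{rrrrrrr|r} 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1/4 & -8 & 0 & 9 & 1 \\ 0 & 1 & 1/2 & 1/2 & -12 & 0 & 3 & 1/2 \\ \hline 0 & 0 & 1/2 & -3/4 & 20 & 0 & 6 & 1/2 \end{array}$$
Pivoting about the (3, 4)th element, we get
$$\begin{array}{rrrrrrr|r} 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 1 & -1/2 & 3/4 & 0 & -2 & 0 & 15/2 & 3/4 \\ 0 & 2 & 1 & 1 & -24 & 0 & 6 & 1 \\ \hline 0 & 3/2 & 5/4 & 0 & 2 & 0 & 21/2 & 5/4 \end{array}$$
The reduced cost coefficients are all nonnegative. Hence, the optimal solution to the problem is $[3/4, 0, 0, 1, 0, 1, 0]^\top$. The corresponding optimal cost is $-5/4$.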
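The contrast between the two pivoting rules can be reproduced with a short Python sketch using exact rational arithmetic. The function names and the tie-breaking convention (the first row attaining the minimum ratio wins) are assumptions of this sketch, and the tableau entries, including signs, are this sketch's reading of the problem data.

```python
from fractions import Fraction as F

def pivot(T, p, q):
    """Return the tableau obtained by pivoting T about row p, column q (0-based)."""
    scaled = [v / T[p][q] for v in T[p]]
    return [scaled if i == p else [a - row[q] * b for a, b in zip(row, scaled)]
            for i, row in enumerate(T)]

def simplex(T, bland, max_pivots):
    """Run at most max_pivots pivots; return (tableau, list of 1-based pivot positions)."""
    m, n = len(T) - 1, len(T[0]) - 1
    pivots = []
    for _ in range(max_pivots):
        r = T[m][:n]
        negative = [j for j in range(n) if r[j] < 0]
        if not negative:
            break                                  # all reduced costs nonnegative
        # Bland: lowest index; otherwise: most negative reduced cost
        q = negative[0] if bland else min(negative, key=lambda j: r[j])
        rows = [i for i in range(m) if T[i][q] > 0]
        if not rows:
            break                                  # problem unbounded
        p = min(rows, key=lambda i: T[i][n] / T[i][q])   # first row wins ties
        T = pivot(T, p, q)
        pivots.append((p + 1, q + 1))
    return T, pivots

data = [[1, 0, 0, "1/4", -8, -1, 9, 0],
        [0, 1, 0, "1/2", -12, "-1/2", 3, 0],
        [0, 0, 1, 0, 0, 1, 0, 1],
        [0, 0, 0, "-3/4", 20, "-1/2", 6, 0]]
T0 = [[F(v) for v in row] for row in data]

cycled, dantzig_pivots = simplex(T0, bland=False, max_pivots=6)
final, bland_pivots = simplex(T0, bland=True, max_pivots=50)
print(cycled == T0, bland_pivots, final[-1][-1])
```

With the most-negative rule the tableau returns to its starting point after six pivots, whereas the lowest-index rule leaves the degenerate vertex and stops at a tableau whose lower-right entry is the negative of the optimal cost.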
16.19
a. We have
$$Ad^0 = A(x^1 - x^0)/\alpha_0 = (b - b)/\alpha_0 = 0.$$
Hence, $d^0 \in N(A)$.
b. From our discussion of moving from one BFS to an adjacent BFS, we deduce that
$$d^0 = \begin{bmatrix} -y_q \\ e_{q-m} \end{bmatrix}.$$
In other words, the first $m$ components of $d^0$ are $-y_{1q}, \dots, -y_{mq}$, and all the other components are 0 except the $q$th component, which is 1.
16.20
The following is a MATLAB function that implements the simplex algorithm.

function [x,v]=simplex(c,A,b,v,options)
% SIMPLEX(c,A,b,v);
% SIMPLEX(c,A,b,v,options);
%
% x = SIMPLEX(c,A,b,v);
% x = SIMPLEX(c,A,b,v,options);
%
% [x,v] = SIMPLEX(c,A,b,v);
% [x,v] = SIMPLEX(c,A,b,v,options);
%
% SIMPLEX(c,A,b,v) solves the following linear program using the
% Simplex Method:
%   min c'x subject to Ax=b, x>=0,
% where [A b] is in canonical form, and v is the vector of indices of
% basic columns. Specifically, the v(i)-th column of A is the i-th
% standard basis vector.
% The second variant allows a vector of optional parameters to be
% defined:
% OPTIONS(1) controls how much display output is given; set
% to 1 for a tabular display of results (default is no display: 0).
% OPTIONS(5) specifies how the pivot element is selected;
%   0=choose the most negative relative cost coefficient;
%   1=use Bland's rule.
if nargin ~= 5
  options = [];
  if nargin ~= 4
    disp('Wrong number of arguments.');
    return;
  end
end
format compact;
format short e;
options = foptions(options);
print = options(1);
n=length(c);
m=length(b);
cB=c(v(:));
r = c'-cB'*A; % row vector of relative cost coefficients
cost = -cB'*b;
tabl=[A b;r cost];
if print,
  disp(' ');
  disp('Initial tableau:');
  disp(tabl);
end % if
while ones(1,n)*(r' >= zeros(n,1)) ~= n
  if options(5) == 0;
    [rq,q] = min(r);
  else
    % Bland's rule
    q=1;
    while r(q) >= 0
      q=q+1;
    end
  end % if
  minratio = inf;
  p=0;
  for i=1:m,
    if tabl(i,q)>0
      if tabl(i,n+1)/tabl(i,q) < minratio
        minratio = tabl(i,n+1)/tabl(i,q);
        p = i;
      end % if
    end % if
  end % for
  if p == 0
    disp('Problem unbounded');
    break;
  end % if
  tabl=pivot(tabl,p,q);
  if print,
    disp('Pivot point:');
    disp([p,q]);
    disp('New tableau:');
    disp(tabl);
  end % if
  v(p) = q;
  r = tabl(m+1,1:n);
end % while
x=zeros(n,1);
x(v(:))=tabl(1:m,n+1);
The above function makes use of the following function that implements pivoting:

function Mnew=pivot(M,p,q)
% Mnew=pivot(M,p,q)
% Returns the matrix Mnew resulting from pivoting about the
% (p,q)th element of the given matrix M.
for i=1:size(M,1),
  if i==p
    Mnew(p,:)=M(p,:)/M(p,q);
  else
    Mnew(i,:)=M(i,:)-M(p,:)*(M(i,q)/M(p,q));
  end % if
end % for

We now apply the simplex algorithm to the problem in Example 16.2, as follows:

A=[1 0 1 0 0; 0 1 0 1 0; 1 1 0 0 1];
b=[4;6;8];
c=[-2;-5;0;0;0];
v=[3;4;5];
options(1)=1;
[x,v]=simplex(c,A,b,v,options);

Initial Tableau:
     1     0     1     0     0     4
     0     1     0     1     0     6
     1     1     0     0     1     8
    -2    -5     0     0     0     0
Pivot point:
     2     2
New tableau:
     1     0     1     0     0     4
     0     1     0     1     0     6
     1     0     0    -1     1     2
    -2     0     0     5     0    30
Pivot point:
     3     1
New tableau:
     0     0     1     1    -1     2
     0     1     0     1     0     6
     1     0     0    -1     1     2
     0     0     0     3     2    34

disp(x');
     2     6     2     0     0
disp(v');
     3     2     1

As indicated above, the solution to the problem in standard form is $[2, 6, 2, 0, 0]^\top$, and the objective function value is $-34$. The optimal cost for the original maximization problem is 34.
16.21
The following is a MATLAB routine that implements the two-phase simplex method, using the MATLAB function from Exercise 16.20.

function [x,v]=tpsimplex(c,A,b,options)
% TPSIMPLEX(c,A,b);
% TPSIMPLEX(c,A,b,options);
%
% x = TPSIMPLEX(c,A,b);
% x = TPSIMPLEX(c,A,b,options);
%
% [x,v] = TPSIMPLEX(c,A,b);
% [x,v] = TPSIMPLEX(c,A,b,options);
%
% TPSIMPLEX(c,A,b) solves the following linear program using the
% two-phase simplex method:
%   min c'x subject to Ax=b, x>=0.
% The second variant allows a vector of optional parameters to be
% defined:
% OPTIONS(1) controls how much display output is given; set
% to 1 for a tabular display of results (default is no display: 0).
% OPTIONS(5) specifies how the pivot element is selected;
%   0=choose the most negative relative cost coefficient;
%   1=use Bland's rule.
if nargin ~= 4
  options = [];
  if nargin ~= 3
    disp('Wrong number of arguments.');
    return;
  end
end
clc;
format compact;
format short e;
options = foptions(options);
print = options(1);
n=length(c);
m=length(b);
% Phase I
if print,
  disp(' ');
  disp('Phase I');
  disp('-------');
end
v=n*ones(m,1);
for i=1:m
  v(i)=v(i)+i;
end
[x,v]=simplex([zeros(n,1);ones(m,1)],[A eye(m)],b,v,options);
if all(v<=n),
  % Phase II
  if print
    disp(' ');
    disp('Phase II');
    disp('--------');
    disp('Basic columns:')
    disp(v')
  end
  % Convert [A b] into canonical augmented matrix
  Binv=inv(A(:,[v]));
  A=Binv*A;
  b=Binv*b;
  [x,v]=simplex(c,A,b,v,options);
  if print
    disp(' ');
    disp('Final solution:');
    disp(x');
  end
else
  % assumes nondegeneracy
  disp('Terminating: problem has no feasible solution.');
end

We now apply the above MATLAB routine to the problem in Example 16.5, as follows:
>> A=[1 1 1 0; 5 3 0 -1];
>> b=[4;8];
>> c=[-3;-5;0;0];
>> options(1)=1;
>> format rat;
>> tpsimplex(c,A,b,options);
Phase I
-------
Initial Tableau:
    1     1     1     0     1     0     4
    5     3     0    -1     0     1     8
   -6    -4    -1     1     0     0   -12
Pivot point:
    2     1
New tableau:
    0    2/5    1    1/5    1   -1/5   12/5
    1    3/5    0   -1/5    0    1/5    8/5
    0   -2/5   -1   -1/5    0    6/5  -12/5
Pivot point:
    1     3
New tableau:
    0    2/5    1    1/5    1   -1/5   12/5
    1    3/5    0   -1/5    0    1/5    8/5
    0011
Pivot point:
    2     2
New tableau:
  -2/3    0     1    1/3    1   -1/3    4/3
   5/3    1     0   -1/3    0    1/3    8/3
    0011
Pivot point:
    1     4
New tableau:
   -2     0     3     1     3    -1     4
    1     1     1     0     1     0     4
    0011
Basic columns:
    4     2
Phase II
--------
Initial Tableau:
   -2     0     3     1     4
    1     1     1     0     4
    2     0     5     0    20
Final solution:
    0     4     0     4
16.22
The following is a MATLAB function that implements the revised simplex algorithm.
function [x,v,Binv]=revsimp(c,A,b,v,Binv,options)
%   REVSIMP(c,A,b,v,Binv);
%   REVSIMP(c,A,b,v,Binv,options);
%
%   x = REVSIMP(c,A,b,v,Binv);
%   x = REVSIMP(c,A,b,v,Binv,options);
%
%   [x,v,Binv] = REVSIMP(c,A,b,v,Binv);
%   [x,v,Binv] = REVSIMP(c,A,b,v,Binv,options);
%
%REVSIMP(c,A,b,v,Binv) solves the following linear program using the
%revised simplex method:
%  min c'x subject to Ax=b, x>=0,
%where v is the vector of indices of basic columns, and Binv is the
%inverse of the basis matrix. Specifically, the v(i)-th column of
%A is the i-th column of the basis matrix.
%The second variant allows a vector of optional parameters to be
%defined:
%OPTIONS(1) controls how much display output is given; set
%to 1 for a tabular display of results (default is no display: 0).
%OPTIONS(5) specifies how the pivot element is selected;
%  0=choose the most negative relative cost coefficient;
%  1=use Bland's rule.
if nargin ~= 6
  options = [];
  if nargin ~= 5
    disp('Wrong number of arguments.');
    return;
  end
end
format compact;
%format short e;
options = foptions(options);
print = options(1);
n=length(c);
m=length(b);
cB=c(v(:));
y0 = Binv*b;
lambdaT=cB'*Binv;
r = c'-lambdaT*A; %row vector of relative cost coefficients
if print,
  disp(' ');
  disp('Initial revised tableau [v B^(-1) y0]:');
  disp([v Binv y0]);
  disp('Relative cost coefficients:');
  disp(r);
end %if
while ones(1,n)*(r' >= zeros(n,1)) ~= n
  if options(5) == 0;
    [rq,q] = min(r);
  else
    %Bland's rule
    q=1;
    while r(q) >= 0
      q=q+1;
    end
  end %if
  yq = Binv*A(:,q);
  minratio = inf;
  p=0;
  for i=1:m,
    if yq(i)>0
      if y0(i)/yq(i) < minratio
        minratio = y0(i)/yq(i);
        p = i;
      end %if
    end %if
  end %for
  if p == 0
    disp('Problem unbounded');
    break;
  end %if
  if print,
    disp('Augmented revised tableau [v B^(-1) y0 yq]:')
    disp([v Binv y0 yq]);
    disp('(p,q):');
    disp([p,q]);
  end
  augrevtabl=pivot([Binv y0 yq],p,m+2);
  Binv=augrevtabl(:,1:m);
  y0=augrevtabl(:,m+1);
  v(p) = q;
  cB=c(v(:));
  lambdaT=cB'*Binv;
  r = c'-lambdaT*A; %row vector of relative cost coefficients
  if print,
    disp('New revised tableau [v B^(-1) y0]:');
    disp([v Binv y0]);
    disp('Relative cost coefficients:');
    disp(r);
  end %if
end %while
x=zeros(n,1);
x(v(:))=y0;
The function makes use of the pivoting function in Exercise 16.20.
We now apply the revised simplex algorithm to the problem in Example 16.2, as follows:
>> A=[1 0 1 0 0; 0 1 0 1 0; 1 1 0 0 1];
>> b=[4;6;8];
>> c=[-2;-5;0;0;0];
>> v=[3;4;5];
>> Binv=eye(3);
>> options(1)=1;
>> [x,v,Binv]=revsimp(c,A,b,v,Binv,options);
Initial revised tableau [v B^(-1) y0]:
   3   1   0   0   4
   4   0   1   0   6
   5   0   0   1   8
Relative cost coefficients:
  -2  -5   0   0   0
Augmented revised tableau [v B^(-1) y0 yq]:
   3   1   0   0   4   0
   4   0   1   0   6   1
   5   0   0   1   8   1
(p,q):
   2   2
New revised tableau [v B^(-1) y0]:
   3   1   0   0   4
   2   0   1   0   6
   5   0  -1   1   2
Relative cost coefficients:
  -2   0   0   5   0
Augmented revised tableau [v B^(-1) y0 yq]:
   3   1   0   0   4   1
   2   0   1   0   6   0
   5   0  -1   1   2   1
(p,q):
   3   1
New revised tableau [v B^(-1) y0]:
   3   1   1  -1   2
   2   0   1   0   6
   1   0  -1   1   2
Relative cost coefficients:
   0   0   0   3   2
>> disp(x');
   2   6   2   0   0
>> disp(v');
   3   2   1
>> disp(Binv);
   1   1  -1
   0   1   0
   0  -1   1
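The final revised tableau can be verified independently from the definitions $y_0 = B^{-1}b$, $\lambda^T = c_B^T B^{-1}$ and $r^T = c^T - \lambda^T A$. The sketch below is Python with exact fractions; the helper `inverse` (a plain Gauss-Jordan routine) is a hypothetical name introduced here and is not part of the manual's MATLAB code.

```python
from fractions import Fraction

def inverse(B):
    """Gauss-Jordan inverse of a nonsingular square matrix, in exact arithmetic."""
    n = len(B)
    M = [[Fraction(a) for a in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(B)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [a / M[col][col] for a in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

A = [[1, 0, 1, 0, 0], [0, 1, 0, 1, 0], [1, 1, 0, 0, 1]]
b = [4, 6, 8]
c = [-2, -5, 0, 0, 0]
v = [3, 2, 1]                      # final basic columns, 1-based as in the output
m, n = len(A), len(c)

B = [[A[i][j - 1] for j in v] for i in range(m)]   # basis matrix
Binv = inverse(B)
y0 = [sum(Binv[i][k] * b[k] for k in range(m)) for i in range(m)]
cB = [c[j - 1] for j in v]
lam = [sum(cB[i] * Binv[i][k] for i in range(m)) for k in range(m)]
r = [c[j] - sum(lam[k] * A[k][j] for k in range(m)) for j in range(n)]
print(Binv, y0, r)
```

All relative cost coefficients are nonnegative, which is the stopping condition of the `while` loop in `revsimp`.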
16.23
The following is a MATLAB routine that implements the two-phase revised simplex method, using the MATLAB function from Exercise 16.22.
function [x,v]=tprevsimp(c,A,b,options)
%   TPREVSIMP(c,A,b);
%   TPREVSIMP(c,A,b,options);
%
%   x = TPREVSIMP(c,A,b);
%   x = TPREVSIMP(c,A,b,options);
%
%   [x,v] = TPREVSIMP(c,A,b);
%   [x,v] = TPREVSIMP(c,A,b,options);
%
%TPREVSIMP(c,A,b) solves the following linear program using the
%two-phase revised simplex method:
%  min c'x subject to Ax=b, x>=0.
%The second variant allows a vector of optional parameters to be
%defined:
%OPTIONS(1) controls how much display output is given; set
%to 1 for a tabular display of results (default is no display: 0).
%OPTIONS(5) specifies how the pivot element is selected;
%  0=choose the most negative relative cost coefficient;
%  1=use Bland's rule.
if nargin ~= 4
  options = [];
  if nargin ~= 3
    disp('Wrong number of arguments.');
    return;
  end
end
clc;
format compact;
%format short e;
options = foptions(options);
print = options(1);
n=length(c);
m=length(b);
%Phase I
if print,
  disp(' ');
  disp('Phase I');
  disp('-------');
end
v=n*ones(m,1);
for i=1:m
  v(i)=v(i)+i;
end
[x,v,Binv]=revsimp([zeros(n,1);ones(m,1)],[A eye(m)],b,v,eye(m),options);
%Phase II
if print
  disp(' ');
  disp('Phase II');
  disp('--------');
end
[x,v,Binv]=revsimp(c,A,b,v,Binv,options);
if print
  disp(' ');
  disp('Final solution:');
  disp(x');
end

We now apply the above MATLAB routine to the problem in Example 16.5, as follows:
>> A=[4 2 -1 0; 1 4 0 -1];
>> b=[12;6];
>> c=[2;3;0;0];
>> options(1)=1;
>> format rat;
>> tprevsimp(c,A,b,options);
Phase I
-------
Initial revised tableau [v B^(-1) y0]:
    5     1     0    12
    6     0     1     6
Relative cost coefficients:
   -5    -6     1     1     0     0
Augmented revised tableau [v B^(-1) y0 yq]:
    5     1     0    12     2
    6     0     1     6     4
(p,q):
    2     2
New revised tableau [v B^(-1) y0]:
    5     1   -1/2    9
    2     0    1/4   3/2
Relative cost coefficients:
  -7/2    0     1   -1/2    0    3/2
Augmented revised tableau [v B^(-1) y0 yq]:
    5     1   -1/2    9    7/2
    2     0    1/4   3/2   1/4
(p,q):
    1     1
New revised tableau [v B^(-1) y0]:
    1    2/7  -1/7  18/7
    2   -1/14  2/7   6/7
Relative cost coefficients:
    0     0     0     0     1     1
Phase II
--------
Initial revised tableau [v B^(-1) y0]:
    1    2/7  -1/7  18/7
    2   -1/14  2/7   6/7
Relative cost coefficients:
    0     0    5/14  4/7
Final solution:
   18/7   6/7    0     0
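As a quick consistency check of the Phase II tableau above, the basis inverse, the basic solution and the relative costs can be recomputed by hand for the basis formed by columns 1 and 2. The following Python sketch uses the closed-form inverse of a 2-by-2 matrix; all variable names are illustrative assumptions.

```python
from fractions import Fraction as F

A = [[4, 2, -1, 0], [1, 4, 0, -1]]
b = [12, 6]
c = [2, 3, 0, 0]

# Basis matrix from columns 1 and 2, inverted with the 2x2 adjugate formula.
(p, q), (s, t) = (A[0][0], A[0][1]), (A[1][0], A[1][1])
det = p * t - q * s
Binv = [[F(t, det), F(-q, det)], [F(-s, det), F(p, det)]]

xB = [Binv[i][0] * b[0] + Binv[i][1] * b[1] for i in range(2)]
lam = [c[0] * Binv[0][k] + c[1] * Binv[1][k] for k in range(2)]
r = [c[j] - (lam[0] * A[0][j] + lam[1] * A[1][j]) for j in range(4)]

primal_value = c[0] * xB[0] + c[1] * xB[1]
dual_value = lam[0] * b[0] + lam[1] * b[1]
print(Binv, xB, r, primal_value, dual_value)
```

Equality of `primal_value` and `dual_value` is the duality-theorem certificate that the basis is optimal.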
17. Duality
17.1
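Since $x$ and $\lambda$ are feasible, we have $Ax \ge b$, $x \ge 0$, and $\lambda^T A \le c^T$, $\lambda \ge 0$. Postmultiplying both sides of $\lambda^T A \le c^T$ by $x \ge 0$ yields
$$\lambda^T A x \le c^T x.$$
Since $Ax \ge b$ and $\lambda^T \ge 0^T$, we have $\lambda^T A x \ge \lambda^T b$. Hence, $\lambda^T b \le c^T x$.
17.2
The primal problem is:
$$\text{minimize } e_n^T p \quad \text{subject to } Gp \ge P e_m,\ p \ge 0,$$
where $G = [g_{i,j}]$, $e_n = [1,\ldots,1]^T$ (with $n$ components), and $p = [p_1,\ldots,p_n]^T$. The dual of the problem is (using symmetric duality):
$$\text{maximize } P\lambda^T e_m \quad \text{subject to } \lambda^T G \le e_n^T,\ \lambda \ge 0.$$
17.3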
a. We first transform the problem into standard form:
$$\text{minimize } -2x_1 - 3x_2 \quad \text{subject to } x_1 + 2x_2 + x_3 = 4,\ 2x_1 + x_2 + x_4 = 5,\ x_1, x_2, x_3, x_4 \ge 0.$$
The initial tableau is:
    1     2     1     0     4
    2     1     0     1     5
   -2    -3     0     0     0
We now pivot about the (1, 2)th element to get:
   1/2    1    1/2    0     2
   3/2    0   -1/2    1     3
  -1/2    0    3/2    0     6
Pivoting now about the (2, 1)th element gives:
    0     1    2/3  -1/3    1
    1     0   -1/3   2/3    2
    0     0    4/3   1/3    7
Thus, the solution to the standard form problem is $x_1 = 2$, $x_2 = 1$, $x_3 = 0$, $x_4 = 0$. The solution to the original problem is $x_1 = 2$, $x_2 = 1$.
b. The dual to the standard form problem is
$$\text{maximize } 4\lambda_1 + 5\lambda_2 \quad \text{subject to } \lambda_1 + 2\lambda_2 \le -2,\ 2\lambda_1 + \lambda_2 \le -3,\ \lambda_1, \lambda_2 \le 0.$$
From the discussion before Example 17.6, it follows that the solution to the dual is $\lambda^T = c_I^T - r_I^T = [-4/3, -1/3]$.
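The rule $\lambda^T = c_I^T - r_I^T$ reads the dual solution off the final tableau: $I$ indexes the columns that formed the identity in the initial tableau (here the slack columns 3 and 4), and $r$ is the last row of the final tableau. A minimal Python sketch of that bookkeeping, with illustrative variable names, also confirms dual feasibility and equality of the two objective values:

```python
from fractions import Fraction as F

A = [[1, 2, 1, 0], [2, 1, 0, 1]]
b = [4, 5]
c = [-2, -3, 0, 0]

r = [F(0), F(0), F(4, 3), F(1, 3)]   # relative costs in the final tableau
I = [2, 3]                           # 0-based indices of the initial identity columns
lam = [c[j] - r[j] for j in I]       # lambda^T = c_I^T - r_I^T

# Dual feasibility: lambda^T A <= c^T, checked column by column.
dual_feasible = all(
    sum(lam[i] * A[i][j] for i in range(2)) <= c[j] for j in range(4)
)
x = [2, 1, 0, 0]
primal_value = sum(cj * xj for cj, xj in zip(c, x))
dual_value = sum(li * bi for li, bi in zip(lam, b))
print(lam, dual_feasible, primal_value, dual_value)
```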
17.4
The dual problem is
$$\text{maximize } 11\lambda_1 + 8\lambda_2 + 7\lambda_3 \quad \text{subject to } 5\lambda_1 + 2\lambda_2 + \lambda_3 \le 4,\ \lambda_1 + \lambda_2 + 2\lambda_3 \le 3,\ \lambda_1, \lambda_2, \lambda_3 \ge 0.$$
Note that we may arrive at the above in one of two ways: by applying the asymmetric form of duality, or by applying the symmetric form of duality to the original problem in standard form. From the solution of Exercise 16.11a, we have that the solution to the dual is $\lambda^T = c_B^T B^{-1} = [0, 5/3, 2/3]$ (using the proof of the duality theorem).
17.5
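We represent the primal in the form
$$\text{minimize } c^T x \quad \text{subject to } Ax = b,\ x \ge 0.$$
The corresponding dual is
$$\text{maximize } \lambda^T b \quad \text{subject to } \lambda^T A \le c^T,$$
that is,
$$\text{maximize } 2\lambda_1 + 7\lambda_2 + 3\lambda_3 \quad \text{subject to } [\lambda_1\ \lambda_2\ \lambda_3] \begin{bmatrix} -2 & 1 & 1 & 0 & 0 \\ -1 & 2 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 \end{bmatrix} \le [-1\ {-2}\ 0\ 0\ 0].$$
The solution to the dual can be obtained using the formula $\lambda^T = c_B^T B^{-1}$, where
$$c_B^T = [-1\ {-2}\ 0] \quad \text{and} \quad B = \begin{bmatrix} -2 & 1 & 1 \\ -1 & 2 & 0 \\ 1 & 0 & 0 \end{bmatrix}.$$
Note that because the last element in $c_B$ is zero, we do not need to calculate the last row of $B^{-1}$ when computing $\lambda^T$; that is, these elements are "don't care" elements that we denote using the asterisk. Hence,
$$\lambda^T = c_B^T B^{-1} = [-1\ {-2}\ 0] \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1/2 & 1/2 \\ * & * & * \end{bmatrix} = [0\ {-1}\ {-2}].$$
Note that $c^T x^* = \lambda^{*T} b = -13$, as expected.
17.6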
a. Multiplying the objective function by $-1$, we see that the problem is of the form of the dual in the asymmetric form of duality. Therefore, the dual to the problem is of the form of the primal in the asymmetric form:
$$\text{minimize } \lambda^T b \quad \text{subject to } \lambda^T A = c^T,\ \lambda \ge 0.$$
b. The given vector $y$ is feasible in the dual. Since $b = 0$, any feasible point in the dual is optimal. Thus, $y$ is optimal in the dual, and the objective function value for $y$ is 0. Therefore, by the Duality Theorem, the primal also has an optimal feasible solution, and the corresponding objective function value is 0. Since the vector $0$ is feasible in the primal and has objective function value 0, the vector $0$ is a solution to the primal.
17.7
We introduce two sets of nonnegative variables: $x_i^+ \ge 0$, $x_i^- \ge 0$, $i = 1, 2, 3$. We can then represent the optimization problem in the form
$$\text{minimize } (x_1^+ + x_1^-) + (x_2^+ + x_2^-) + (x_3^+ + x_3^-)$$
$$\text{subject to } \begin{bmatrix} 1 & 1 & -1 & -1 & -1 & 1 \\ 0 & -1 & 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1^+ \\ x_2^+ \\ x_3^+ \\ x_1^- \\ x_2^- \\ x_3^- \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}, \quad x_i^+ \ge 0,\ x_i^- \ge 0.$$
We form the initial tableau,
        x1+   x2+   x3+   x1-   x2-   x3-    b
         1     1    -1    -1    -1     1     2
         0    -1     0     0     1     0     1
  c^T    1     1     1     1     1     1     0
There is no apparent basic feasible solution. We add the second row to the first one to obtain,
        x1+   x2+   x3+   x1-   x2-   x3-    b
         1     0    -1    -1     0     1     3
         0    -1     0     0     1     0     1
  c^T    1     1     1     1     1     1     0
We next calculate the reduced cost coefficients,
        x1+   x2+   x3+   x1-   x2-   x3-    b
         1     0    -1    -1     0     1     3
         0    -1     0     0     1     0     1
  c^T    0     2     2     2     0     0    -4
We have zeros under the basic columns. The reduced cost coefficients are all nonnegative. The optimal solution is
$$x^* = [3\ 0\ 0\ 0\ 1\ 0]^T.$$
The optimal solution to the original problem is $x^* = [3, -1, 0]^T$.
The dual of the above linear program is
$$\text{maximize } 2\lambda_1 + \lambda_2 \quad \text{subject to } [\lambda_1\ \lambda_2] \begin{bmatrix} 1 & 1 & -1 & -1 & -1 & 1 \\ 0 & -1 & 0 & 0 & 1 & 0 \end{bmatrix} \le [1\ 1\ 1\ 1\ 1\ 1].$$
The optimal solution to the dual is
$$\lambda^{*T} = c_B^T B^{-1} = [1\ 1] \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}^{-1} = [1\ 1] \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = [1\ 2].$$
17.8
a. The dual (asymmetric form) is
$$\text{maximize } \lambda \quad \text{subject to } \lambda a_i \le 1,\ i = 1,\ldots,n.$$
We can write the constraint as
$$\lambda \le \min\{1/a_i : i = 1,\ldots,n\} = 1/a_n.$$
Therefore, the solution to the dual problem is $\lambda^* = 1/a_n$.
b. Duality Theorem: If the primal problem has an optimal solution, then so does the dual, and the optimal values of their respective objective functions are equal.
By the duality theorem, the primal has an optimal solution, and the optimal value of the objective function is $1/a_n$. The only feasible point in the primal with this objective function value is the basic feasible solution $[0,\ldots,0,1/a_n]^T$.
c. Suppose we start at a nonoptimal initial basic feasible solution, $[0,\ldots,1/a_i,\ldots,0]^T$, where $1 \le i \le n-1$. The relative cost coefficient for the $q$th column, $q \ne i$, is
$$r_q = 1 - \frac{a_q}{a_i}.$$
Since $a_n > a_j$ for any $j \ne n$, $r_q$ is the most negative relative cost coefficient if and only if $q = n$.
17.9
a. By asymmetric duality, the dual is given by
$$\text{minimize } \lambda \quad \text{subject to } \lambda \ge c_i,\ i = 1,\ldots,n.$$
b. The constraint in part a implies that $\lambda$ is feasible if and only if $\lambda \ge c_4$. Hence, the solution is $\lambda^* = c_4$.
c. By the duality theorem, the optimal objective function value for the given problem is $c_4$. The only solution that achieves this value is $x_4^* = 1$ and $x_i^* = 0$ for all $i \ne 4$.

17.10
a. The dual is
$$\text{minimize } \lambda^T 0 \quad \text{subject to } \lambda^T A \ge c^T,\ \lambda \ge 0.$$
b. By the duality theorem, we conclude that the optimal value of the objective function is 0. The only vector satisfying $x \ge 0$ that has an objective function value of 0 is $x = 0$. Therefore, the solution is $x^* = 0$.
c. The constraint set contains only the vector $0$. Any other vector $x$ satisfying $x \ge 0$ has at least one positive component, and consequently has a positive objective function value. But this contradicts the fact that the optimal solution has an objective function value of 0.
17.11
a. The artificial problem is:
$$\text{minimize } [0^T, e^T] z \quad \text{subject to } [A, I] z = b,\ z \ge 0,$$
where $e = [1,\ldots,1]^T$ and $z = [x^T, y^T]^T$.
b. The dual to the artificial problem is:
$$\text{maximize } \lambda^T b \quad \text{subject to } \lambda^T A \le 0^T,\ \lambda^T \le e^T.$$
c. Suppose the given original linear programming problem has a feasible solution. By the FTLP, the original LP problem has a BFS. Then, by a theorem given in class, the artificial problem has an optimal feasible solution with $y = 0$. Hence, by the Duality Theorem, the dual of the artificial problem also has an optimal feasible solution.
17.12
a. Possible. This situation arises if the primal is unbounded, which by the Weak Duality Lemma implies that the dual has no feasible solution.
b. Impossible, because the Duality Theorem requires that if the primal has an optimal feasible solution, then so does the dual.
c. Impossible, because the Duality Theorem requires that if the dual has an optimal feasible solution, then so does the primal. Also, the Weak Duality Lemma requires that if the primal is unbounded (i.e., has a feasible solution but no optimal feasible solution), then the dual must have no feasible solution.
17.13
To prove the result, we use Theorem 17.3 (Complementary Slackness). Since $\mu \ge 0$, we have $\lambda^T A = c^T - \mu^T \le c^T$. Hence, $\lambda$ is a feasible solution to the dual. Now, $(c^T - \lambda^T A)x = \mu^T x = 0$. Therefore, by Theorem 17.3, $x$ and $\lambda$ are optimal for their respective problems.
17.14
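To use the symmetric form of duality, we need to rewrite the problem as
$$\text{minimize } c^T(u - v) \quad \text{subject to } A(u - v) \ge b,\ u, v \ge 0,$$
which we represent in the form
$$\text{minimize } [c^T, -c^T] \begin{bmatrix} u \\ v \end{bmatrix} \quad \text{subject to } [A, -A] \begin{bmatrix} u \\ v \end{bmatrix} \ge b,\ \begin{bmatrix} u \\ v \end{bmatrix} \ge 0.$$
By the symmetric form of duality, the dual is:
$$\text{maximize } \lambda^T b \quad \text{subject to } \lambda^T [A, -A] \le [c^T, -c^T],\ \lambda \ge 0.$$
Note that for the constraint involving $A$, we have
$$\lambda^T [A, -A] \le [c^T, -c^T] \iff \lambda^T A \le c^T \text{ and } -\lambda^T A \le -c^T \iff \lambda^T A = c^T.$$
Therefore, we can represent the dual as
$$\text{maximize } \lambda^T b \quad \text{subject to } \lambda^T A = c^T,\ \lambda \ge 0.$$
17.15
The corresponding dual can be written as:
$$\text{maximize } 3\lambda_1 + 3\lambda_2 \quad \text{subject to } \lambda_1 + 2\lambda_2 \le 1,\ 2\lambda_1 + \lambda_2 \le 1,\ \lambda_1, \lambda_2 \ge 0.$$
To solve the dual, we refer back to the solution of Exercise 16.11. Using the idea of the proof of the duality theorem (Theorem 17.2), we obtain the solution to the dual as $\lambda^{*T} = c_B^T B^{-1} = [1/3, 1/3]$. The cost of the dual problem is 2, which verifies the duality theorem.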
17.16
The dual to the above linear program (asymmetric form) is
$$\text{maximize } \lambda^T 0 \quad \text{subject to } \lambda^T O \le c^T.$$
The above dual problem has a feasible solution if and only if $c \ge 0$. Since any feasible solution to the dual is also optimal, the dual has an optimal solution if and only if $c \ge 0$. Therefore, by the duality theorem, the primal problem has a solution if and only if $c \ge 0$.
If the solution to the dual exists, then the optimal value of the objective function in the primal is equal to that of the dual, which is clearly 0. In this case, $0$ is optimal, since $c^T 0 = 0$.
17.17
Consider the primal problem
$$\text{minimize } 0^T x \quad \text{subject to } Ax \ge b,\ x \ge 0,$$
and its corresponding dual
$$\text{maximize } y^T b \quad \text{subject to } y^T A \le 0^T,\ y \ge 0.$$
$\Rightarrow$: By assumption, there exists a feasible solution to the primal problem. Note that any feasible solution is also optimal, and has objective function value 0. Suppose $y$ satisfies $A^T y \le 0$ and $y \ge 0$. Then, $y$ is a feasible solution to the dual. Therefore, by the Weak Duality Lemma, $b^T y \le 0$.
$\Leftarrow$: Note that the feasible region for the dual is nonempty, since $0$ is a feasible point. Also, by assumption, $0$ is an optimal solution, since any other feasible point $y$ satisfies $b^T y \le b^T 0 = 0$. Hence, by the duality theorem, the primal problem has an optimal feasible solution.
17.18
a. The dual is
$$\text{maximize } y^T b \quad \text{subject to } y^T A \le 0^T.$$
b. The feasible set of the dual problem is always nonempty, because $0$ is clearly guaranteed to be feasible.
c. Suppose $y$ is feasible in the dual. Then, by assumption, $b^T y \le 0$. But the point $0$ is feasible and has objective function value 0. Hence, $0$ is optimal in the dual.
d. By parts b and c, the dual has an optimal feasible solution. Hence, by the duality theorem, the primal problem also has an optimal feasible solution.
e. By assumption, there exists a feasible solution to the primal problem. Note that any feasible solution in the primal has objective function value 0 (and hence so does the given solution). Suppose $y$ satisfies $A^T y \le 0$. Then, $y$ is a feasible solution to the dual. Therefore, by weak duality, $b^T y \le 0$.
17.19
Consider the primal problem
$$\text{minimize } y^T b \quad \text{subject to } y^T A = 0^T,\ y \ge 0,$$
and its corresponding dual
$$\text{maximize } 0^T x \quad \text{subject to } Ax \le b.$$
$\Rightarrow$: By assumption, there exists a feasible solution to the dual problem. Note that any feasible solution is also optimal, and has objective function value 0. Suppose $y$ satisfies $A^T y = 0$ and $y \ge 0$. Then, $y$ is a feasible solution to the primal. Therefore, by the Weak Duality Lemma, $b^T y \ge 0$.
$\Leftarrow$: Note that the feasible region for the primal is nonempty, since $0$ is a feasible point. Also, by assumption, $0$ is an optimal solution, since any other feasible point $y$ satisfies $b^T y \ge b^T 0 = 0$. Hence, by the duality theorem, the dual problem has an optimal feasible solution.
17.20
Let $e = [1, \ldots, 1]^T$. Consider the primal problem
$$\text{minimize } 0^T x \quad \text{subject to } Ax \ge e,$$
and its corresponding dual
$$\text{maximize } e^T y \quad \text{subject to } y^T A = 0^T,\ y \ge 0.$$
$\Rightarrow$: Suppose there exists $x$ such that $Ax > 0$. Then, the vector $x' = x/\min_i\{(Ax)_i\}$ is a feasible solution to the primal problem. Note that any feasible solution is also optimal, and has objective function value 0. Suppose $y$ satisfies $A^T y = 0$, $y \ge 0$. Then, $y$ is a feasible solution to the dual. Therefore, by the Weak Duality Lemma, $e^T y \le 0$. Since $y \ge 0$, we conclude that $y = 0$.
$\Leftarrow$: Suppose $0$ is the only feasible solution to the dual. Then, $0$ is clearly also optimal. Hence, by the duality theorem, the primal problem has an optimal feasible solution $x$. Since $Ax \ge e$ and $e > 0$, we get $Ax > 0$.
17.21
a. Rewrite the primal as
$$\text{minimize } (-e)^T x \quad \text{subject to } (P^T - I)x = 0,\ x \ge 0.$$
By asymmetric duality, the dual is
$$\text{maximize } \lambda^T 0 \quad \text{subject to } \lambda^T (P^T - I) \le -e^T.$$
b. To make the notation simpler, we rewrite the dual as:
$$\text{maximize } 0 \quad \text{subject to } (P - I)y \ge e.$$
Suppose the dual is feasible. Then, there exists a $y$ such that $Py \ge y + e > y$. Let $y_i$ be the largest element of $y$, and $p_i$ the $i$th row of $P$. Then, $p_i y > y_i$. But, by definition of $y_i$, $y \le y_i e$. Hence, $p_i y \le y_i p_i e = y_i$, which contradicts the inequality $p_i y > y_i$. Hence, the dual is not feasible.
c. The primal is certainly feasible, because $0$ is a feasible point. Therefore, by part b and strong duality, the primal must also be unbounded.
d. Because 0 is an achievable objective function value (it is the objective function value of $0$), and the problem is unbounded, we deduce that 1 is also achievable. Hence, there exists a feasible $x$ such that $x^T e = 1$. This proves the desired result.
17.22
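Write the LP problem
$$\text{minimize } c^T x \quad \text{subject to } Ax \ge b,\ x \ge 0,$$
and the corresponding dual problem
$$\text{maximize } \lambda^T b \quad \text{subject to } \lambda^T A \le c^T,\ \lambda \ge 0.$$
By a theorem on duality, if we can find feasible points $x$ and $\lambda$ for the primal and dual, respectively, such that $c^T x = \lambda^T b$, then $x$ and $\lambda$ are optimal for their respective problems. We can rewrite the previous set of relations as
$$\begin{bmatrix} c^T & -b^T \\ -c^T & b^T \\ A & 0 \\ I_n & 0 \\ 0 & -A^T \\ 0 & I_m \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} \ge \begin{bmatrix} 0 \\ 0 \\ b \\ 0 \\ -c \\ 0 \end{bmatrix}.$$
Therefore, writing the above as $\hat{A} y \ge \hat{b}$, where $\hat{A} \in R^{(2m+2n+2) \times (m+n)}$ and $\hat{b} \in R^{2m+2n+2}$, we have that the first $n$ components of the vector obtained for the inputs $(2m + 2n + 2,\ m + n,\ \hat{A},\ \hat{b})$ is a solution to the given linear programming problem.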
17.23
a. Consider the dual; $b$ does not appear in the constraint (but it does appear in the dual objective function). Thus, provided the level sets of the dual objective function do not exactly align with one of the faces of the constraint set (polyhedron), the optimal dual vector will not change if we perturb $b$ very slightly. Now, by the duality theorem, $z(b) = \lambda^T b$. Because $\lambda$ is constant in a neighborhood of $b$, we deduce that $\nabla z(b) = \lambda$.
b. By part a, we deduce that the optimal objective function value will change by $3\Delta b_1$.
17.24
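a. Weak duality lemma: if $x_0$ and $y_0$ are feasible points in the primal and dual, respectively, then $f_1(x_0) \ge f_2(y_0)$.
Proof: Because $y_0 \ge 0$ and $Ax_0 - b \le 0$, we have $y_0^T(Ax_0 - b) \le 0$. Therefore,
$$f_1(x_0) \ge f_1(x_0) + y_0^T(Ax_0 - b) = \tfrac{1}{2} x_0^T x_0 + y_0^T A x_0 - y_0^T b.$$
Now, we know that
$$\tfrac{1}{2} x_0^T x_0 + y_0^T A x_0 \ge \tfrac{1}{2} x^{*T} x^* + y_0^T A x^*,$$
where $x^* = -A^T y_0$. Hence,
$$\tfrac{1}{2} x_0^T x_0 + y_0^T A x_0 \ge \tfrac{1}{2} y_0^T A A^T y_0 - y_0^T A A^T y_0 = -\tfrac{1}{2} y_0^T A A^T y_0.$$
Combining this with the above, we have
$$f_1(x_0) \ge -\tfrac{1}{2} y_0^T A A^T y_0 - y_0^T b = f_2(y_0).$$
Alternatively, notice that
$$f_1(x_0) - f_2(y_0) = \tfrac{1}{2} x_0^T x_0 + \tfrac{1}{2} y_0^T A A^T y_0 + b^T y_0 \ge \tfrac{1}{2} x_0^T x_0 + \tfrac{1}{2} y_0^T A A^T y_0 + x_0^T A^T y_0 = \tfrac{1}{2} \|x_0 + A^T y_0\|^2 \ge 0.$$
b. Suppose $f_1(x_0) = f_2(y_0)$ for feasible points $x_0$ and $y_0$. Let $x$ be any feasible point in the primal. Then, by part a, $f_1(x) \ge f_2(y_0) = f_1(x_0)$. Hence $x_0$ is optimal in the primal.
Similarly, let $y$ be any feasible point in the dual. Then, by part a, $f_2(y) \le f_1(x_0) = f_2(y_0)$. Hence $y_0$ is optimal in the dual.
18. Non-Simplex Methods
18.1
The following is a MATLAB function that implements the affine scaling algorithm.
function [x,N] = affscale(c,A,b,u,options);
%   AFFSCALE(c,A,b,u);
%   AFFSCALE(c,A,b,u,options);
%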
%   x = AFFSCALE(c,A,b,u);
%   x = AFFSCALE(c,A,b,u,options);
%
%   [x,N] = AFFSCALE(c,A,b,u);
%   [x,N] = AFFSCALE(c,A,b,u,options);
%
%AFFSCALE(c,A,b,u) solves the following linear program using the
%affine scaling Method:
%  min c'x subject to Ax=b, x>=0,
%where u is a strictly feasible initial solution.
%The second variant allows a vector of optional parameters to be
%defined:
%OPTIONS(1) controls how much display output is given; set
%to 1 for a tabular display of results (default is no display: 0).
%OPTIONS(2) is a measure of the precision required for the final point.
%OPTIONS(3) is a measure of the precision required cost value.
%OPTIONS(14) = max number of iterations.
%OPTIONS(18) = alpha.
if nargin ~= 5
  options = [];
  if nargin ~= 4
    disp('Wrong number of arguments.');
    return;
  end
end
xnew=u;
if length(options) >= 14
  if options(14)==0
    options(14)=1000*length(xnew);
  end
else
  options(14)=1000*length(xnew);
end
if length(options) < 18
  options(18)=0.99; %optional step size
end
clc;
format compact;
format short e;
options = foptions(options);
print = options(1);
epsilonx = options(2);
epsilonf = options(3);
maxiter=options(14);
alpha=options(18);
n=length(c);
m=length(b);
for k = 1:maxiter,
  xcurr=xnew;
  D = diag(xcurr);
  Abar = A*D;
  Pbar = eye(n) - Abar'*inv(Abar*Abar')*Abar;
  d = -D*Pbar*D*c;
  if any(d ~= 0), %direction is nonzero
    nonzd = find(d<0);
    r = min(-xcurr(nonzd)./d(nonzd));
  else
    disp('Terminating: d = 0');
    break;
  end
  xnew = xcurr+alpha*r*d;
  if print,
    disp('Iteration number k =')
    disp(k); %print iteration index k
    disp('alpha_k =');
    disp(alpha*r); %print alpha_k
    disp('New point =');
    disp(xnew'); %print new point
  end %if
  if norm(xnew-xcurr) <= epsilonx*norm(xcurr)
    disp('Terminating: Relative difference between iterates <');
    disp(epsilonx);
    break;
  end %if
  if abs(c'*(xnew-xcurr)) < epsilonf*abs(c'*xcurr),
    disp('Terminating: Relative change in objective function < ');
    disp(epsilonf);
    break;
  end %if
  if k == maxiter
    disp('Terminating with maximum number of iterations');
  end %if
end %for
if nargout >= 1
  x=xnew;
  if nargout == 2
    N=k;
  end
else
  disp('Final point =');
  disp(xnew');
  disp('Number of iterations =');
  disp(k);
end %if
We now apply the affine scaling algorithm to the problem in Example 16.2, as follows:
>> A=[1 0 1 0 0; 0 1 0 1 0; 1 1 0 0 1];
>> b=[4;6;8];
>> c=[-2;-5;0;0;0];
>> u=[2;3;2;3;3];
>> options(1)=0;
>> options(2)=10^(-7);
>> options(3)=10^(-7);
>> affscale(c,A,b,u,options);
Terminating: Relative difference between iterates <
   1.0000e-07
Final point =
   2.0000e+00   6.0000e+00   2.0000e+00   1.0837e-09   1.7257e-08
Number of iterations =
   8
The result obtained after 8 iterations as indicated above agrees with the solution in Example 16.2: $[2, 6, 2, 0, 0]^T$.
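The iteration $d = -D\bar{P}Dc$, $x \leftarrow x + \alpha r d$ can also be sketched without any matrix library. The Python version below is an illustrative assumption rather than a port of the MATLAB routine: it avoids forming $\bar{P}$ explicitly by solving $(\bar{A}\bar{A}^T)w = \bar{A}Dc$ with a small Gaussian elimination helper (`solve`, a hypothetical name), and it uses only the relative-step stopping test.

```python
def solve(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z

def affscale(c, A, u, alpha=0.99, eps=1e-7, max_iter=1000):
    """Affine scaling from a strictly feasible point u (A u = b, u > 0)."""
    m, n = len(A), len(c)
    x = [float(ui) for ui in u]
    for k in range(1, max_iter + 1):
        Abar = [[A[i][j] * x[j] for j in range(n)] for i in range(m)]  # A*D
        Dc = [x[j] * c[j] for j in range(n)]
        G = [[sum(Abar[i][j] * Abar[l][j] for j in range(n)) for l in range(m)]
             for i in range(m)]                                         # Abar*Abar'
        w = solve(G, [sum(Abar[i][j] * Dc[j] for j in range(n)) for i in range(m)])
        proj = [Dc[j] - sum(Abar[i][j] * w[i] for i in range(m)) for j in range(n)]
        d = [-x[j] * proj[j] for j in range(n)]                         # -D*Pbar*D*c
        steps = [-x[j] / d[j] for j in range(n) if d[j] < 0]
        if not steps:
            break                                                       # no blocking component
        r = min(steps)
        xnew = [x[j] + alpha * r * d[j] for j in range(n)]
        moved = sum((p - q) ** 2 for p, q in zip(xnew, x)) ** 0.5
        size = sum(q * q for q in x) ** 0.5
        x = xnew
        if moved <= eps * size:
            break
    return x, k

A = [[1, 0, 1, 0, 0], [0, 1, 0, 1, 0], [1, 1, 0, 0, 1]]
c = [-2, -5, 0, 0, 0]
x, iterations = affscale(c, A, [2, 3, 2, 3, 3])
print([round(xi, 6) for xi in x], iterations)
```

Because every step direction lies in the null space of $A$, the iterates stay on $Ax = b$; the factor $\alpha < 1$ keeps them strictly positive.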
18.2
The following is a MATLAB routine that implements the two-phase affine scaling method, using the MATLAB function from Exercise 18.1.
function [x,N]=tpaffscale(c,A,b,options)
% March 28, 2000
%
%   TPAFFSCALE(c,A,b);
%   TPAFFSCALE(c,A,b,options);
%
%   x = TPAFFSCALE(c,A,b);
%   x = TPAFFSCALE(c,A,b,options);
%
%   [x,N] = TPAFFSCALE(c,A,b);
%   [x,N] = TPAFFSCALE(c,A,b,options);
%
%TPAFFSCALE(c,A,b) solves the following linear program using the
%Two-Phase Affine Scaling Method:
%  min c'x subject to Ax=b, x>=0.
%The second variant allows a vector of optional parameters to be
%defined:
%OPTIONS(1) controls how much display output is given; set
%to 1 for a tabular display of results (default is no display: 0).
%OPTIONS(2) is a measure of the precision required for the final point.
%OPTIONS(3) is a measure of the precision required cost value.
%OPTIONS(14) = max number of iterations.
%OPTIONS(18) = alpha.
if nargin ~= 4
  options = [];
  if nargin ~= 3
    disp('Wrong number of arguments.');
    return;
  end
end
clc;
format compact;
format short e;
options = foptions(options);
print = options(1);
n=length(c);
m=length(b);
%Phase I
if print,
  disp(' ');
  disp('Phase I');
  disp('-------');
end
u = rand(n,1);
v = b-A*u;
if any(v ~= 0),
  u = affscale([zeros(1,n),1]',[A v],b,[u' 1]',options);
else
  u = [u; 0]; %random point already satisfies Au=b
end
if print
  disp(' ')
  disp('Initial condition for Phase II:')
  disp(u)
end
if u(n+1) < options(2),
  %Phase II
  u(n+1) = []; %drop the artificial variable
  if print
    disp(' ');
    disp('Phase II');
    disp('--------');
    disp('Initial condition for Phase II:');
    disp(u);
  end
  [x,N]=affscale(c,A,b,u,options);
  if nargout == 0
    disp('Final point =');
    disp(x');
    disp('Number of iterations =');
    disp(N);
  end %if
else
  disp('Terminating: problem has no feasible solution.');
end

We now apply the above MATLAB routine to the problem in Example 16.5, as follows:
>> A=[1 1 1 0; 5 3 0 -1];
>> b=[4;8];
>> c=[-3;-5;0;0];
>> options(1)=0;
>> tpaffscale(c,A,b,options);
Terminating: Relative difference between iterates <
   1.0000e-07
Terminating: Relative difference between iterates <
   1.0000e-07
Final point =
   4.0934e-09   4.0000e+00   9.4280e-09   4.0000e+00
Number of iterations =
   7
The result obtained above agrees with the solution in Example 16.5: $[0, 4, 0, 4]^T$.
18.3
The following is a MATLAB routine that implements the affine scaling method applied to LP problems of the form given in the question by converting the given problem into Karmarkar's artificial form and then using the MATLAB function from Exercise 18.1.
function [x,N]=karaffscale(c,A,b,options)
%
%   KARAFFSCALE(c,A,b);
%   KARAFFSCALE(c,A,b,options);
%
%   x = KARAFFSCALE(c,A,b);
%   x = KARAFFSCALE(c,A,b,options);
%
%   [x,N] = KARAFFSCALE(c,A,b);
%   [x,N] = KARAFFSCALE(c,A,b,options);
%
%KARAFFSCALE(c,A,b) solves the following linear program using the
%Affine Scaling Method:
%  min c'x subject to Ax>=b, x>=0.
%We use Karmarkar's artificial problem to convert the above problem into
%a form usable by the affine scaling method.
%The second variant allows a vector of optional parameters to be
%defined:
%OPTIONS(1) controls how much display output is given; set
%to 1 for a tabular display of results (default is no display: 0).
%OPTIONS(2) is a measure of the precision required for the final point.
%OPTIONS(3) is a measure of the precision required cost value.
%OPTIONS(14) = max number of iterations.
%OPTIONS(18) = alpha.
if nargin ~= 4
  options = [];
  if nargin ~= 3
    disp('Wrong number of arguments.');
    return;
  end
end
clc;
format compact;
format short e;
options = foptions(options);
print = options(1);
n=length(c);
m=length(b);
%Convert to Karmarkar's artificial problem
x0 = ones(n,1);
l0 = ones(m,1);
u0 = ones(n,1);
v0 = ones(m,1);
%Note: with u0 = ones(n,1), the third block row below holds at y0 only
%if its last column is (c-A'*l0-u0); it is kept here as listed.
AA = [
  c' -b' zeros(1,n) zeros(1,m) (-c'*x0+b'*l0);
  A zeros(m,m) zeros(m,n) -eye(m) (b-A*x0+v0);
  zeros(n,n) A' eye(n) zeros(n,m) (c-A'*l0)
];
bb = [0; b; c];
cc = [zeros(2*m+2*n,1); 1];
y0 = [x0; l0; u0; v0; 1];
[y,N]=affscale(cc,AA,bb,y0,options);
if cc'*y <= options(3),
  x = y(1:n);
  if nargout == 0
    disp('Final point =');
    disp(x');
    disp('Final cost =');
    disp(c'*x);
    disp('Number of iterations =');
    disp(N);
  end %if
else
  disp('Terminating: problem has no optimal feasible solution.');
end
We now apply the above MATLAB routine to the problem in Example 15.15, as follows:
>> c=[-3;-5];
>> A=[-1 -5; -2 -1; -1 -1];
>> b=[-40;-20;-12];
>> options(2)=10^(-4);
>> karaffscale(c,A,b,options);
Terminating: Relative difference between iterates <
   1.0000e-04
Final point =
   5.1992e+00   6.5959e+00
Final cost =
  -4.8577e+01
Number of iterations =
   3
The solution from Example 15.15 is $[5, 7]^T$. The accuracy of the result obtained above is disappointing. We believe that the inaccuracy here may be caused by our particularly simple numerical implementation of the affine scaling method. This illustrates the numerical issues that must be dealt with in any practically useful implementation of the affine scaling method.
18.4
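a. Suppose $T(x) = T(y)$. Then, $T_i(x) = T_i(y)$ for $i = 1,\ldots,n+1$. Note that for $i = 1,\ldots,n$, $T_i(x) = (x_i/a_i)T_{n+1}(x)$ and $T_i(y) = (y_i/a_i)T_{n+1}(y)$. Therefore,
$$T_i(x) = (x_i/a_i)T_{n+1}(x) = T_i(y) = (y_i/a_i)T_{n+1}(y) = (y_i/a_i)T_{n+1}(x),$$
which implies that $x_i = y_i$, $i = 1,\ldots,n$. Hence $x = y$.
b. Let $y \in \{x \in \Delta : x_{n+1} > 0\}$. Hence $y_{n+1} > 0$. Define $x = [x_1,\ldots,x_n]^T$ by $x_i = a_i y_i/y_{n+1}$, $i = 1,\ldots,n$. Then, $T(x) = y$. To see this, note that
$$T_{n+1}(x) = \frac{1}{y_1/y_{n+1} + \cdots + y_n/y_{n+1} + 1} = \frac{y_{n+1}}{y_1 + \cdots + y_n + y_{n+1}} = y_{n+1}.$$
Also, for $i = 1,\ldots,n$,
$$T_i(x) = (y_i/y_{n+1})T_{n+1}(x) = y_i.$$
c. An immediate consequence of the solution to part b.
d. We have
$$T_{n+1}(a) = \frac{1}{a_1/a_1 + \cdots + a_n/a_n + 1} = \frac{1}{n+1},$$
and, for $i = 1,\ldots,n$,
$$T_i(a) = (a_i/a_i)T_{n+1}(a) = \frac{1}{n+1}.$$
e. Since $y = T(x)$, we have that for $i = 1,\ldots,n$, $y_i = (x_i/a_i)y_{n+1}$. Therefore, $x'_i = y_i a_i = x_i y_{n+1}$, which implies that $x' = y_{n+1}x$. Hence, $Ax' = y_{n+1}Ax = b\,y_{n+1}$.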
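Parts a, b and d can be exercised numerically. The sketch below assumes the form of the transformation used in the solution, $T_i(x) = (x_i/a_i)/(x_1/a_1 + \cdots + x_n/a_n + 1)$ for $i \le n$ and $T_{n+1}(x) = 1/(x_1/a_1 + \cdots + x_n/a_n + 1)$; the function names `T` and `T_inv` and the sample vectors are illustrative choices.

```python
from fractions import Fraction

def T(x, a):
    """Projective transformation of x in R^n (x >= 0) into the simplex in R^(n+1)."""
    denom = sum(Fraction(xi) / ai for xi, ai in zip(x, a)) + 1
    return [Fraction(xi) / ai / denom for xi, ai in zip(x, a)] + [1 / denom]

def T_inv(y, a):
    """Inverse map from part b: x_i = a_i * y_i / y_{n+1}, valid when y_{n+1} > 0."""
    return [ai * yi / y[-1] for ai, yi in zip(a, y[:-1])]

a = [2, 3, 5]
x = [1, 4, 7]
y = T(x, a)
center = T(a, a)           # part d: every component equals 1/(n+1)
recovered = T_inv(y, a)    # parts a and b: T is one-to-one with this inverse
print(y, center, recovered)
```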
18.5
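Let $x \in R^n$, and $y = T(x)$. Let $\mathbf{a}_i$ be the $i$th column of $A$, $i = 1,\ldots,n$. As in the hint, let $A'$ be given by
$$A' = [a_1\mathbf{a}_1, \ldots, a_n\mathbf{a}_n, -b].$$
Then,
$$Ax = b \iff Ax - b = 0 \iff [\mathbf{a}_1, \ldots, \mathbf{a}_n, -b] \begin{bmatrix} x_1 \\ \vdots \\ x_n \\ 1 \end{bmatrix} = 0 \iff [a_1\mathbf{a}_1, \ldots, a_n\mathbf{a}_n, -b] \begin{bmatrix} x_1/a_1 \\ \vdots \\ x_n/a_n \\ 1 \end{bmatrix} = 0$$
$$\iff A' \begin{bmatrix} (x_1/a_1)y_{n+1} \\ \vdots \\ (x_n/a_n)y_{n+1} \\ y_{n+1} \end{bmatrix} = 0 \iff A'y = 0.$$
18.6
The result follows from Exercise 18.5 by setting $A := c^T$ and $b := 0$.
18.7
Consider the set $\{x \in R^n : e^T x = 1, x \ge 0, x_1 = 0\}$, which can be written as $\{x \in R^n : Ax = b, x \ge 0\}$, where
$$A = \begin{bmatrix} e^T \\ e_1^T \end{bmatrix}, \quad b = \begin{bmatrix} 1 \\ 0 \end{bmatrix},$$
with $e = [1,\ldots,1]^T$, $e_1 = [1,0,\ldots,0]^T$. Let $a_0 = e/n$. By Exercise 12.20, the closest point on the set $\{x : Ax = b\}$ to the point $a_0$ is
$$x^* = A^T(AA^T)^{-1}(b - Aa_0) + a_0 = \left[0, \frac{1}{n-1}, \ldots, \frac{1}{n-1}\right]^T.$$
Since $x^* \in \{x : Ax = b, x \ge 0\} \subset \{x : Ax = b\}$, the point $x^*$ is also the closest point on the set $\{x : Ax = b, x \ge 0\}$ to the point $a_0$.
Let $r = \|a_0 - x^*\|$. Then, the sphere of radius $r$ is inscribed in $\Delta$. Note that
$$r = \|a_0 - x^*\| = \frac{1}{\sqrt{n(n-1)}}.$$
Hence, the radius of the largest sphere inscribed in $\Delta$ is larger than or equal to $1/\sqrt{n(n-1)}$. It remains to show that this largest radius is less than or equal to $1/\sqrt{n(n-1)}$. To this end, we show that this largest radius is less than or equal to $1/\sqrt{n(n-1)} + \varepsilon$ for any $\varepsilon > 0$. For this, it suffices to show that the sphere of radius $1/\sqrt{n(n-1)} + \varepsilon$ is not inscribed in $\Delta$. To show this, consider the point
$$x = x^* + \varepsilon \frac{x^* - a_0}{\|x^* - a_0\|} = x^* + \varepsilon\sqrt{n(n-1)}\,(x^* - a_0) = \left[-\varepsilon\sqrt{\frac{n-1}{n}},\ \frac{1}{n-1} + \frac{\varepsilon}{\sqrt{n(n-1)}},\ \ldots,\ \frac{1}{n-1} + \frac{\varepsilon}{\sqrt{n(n-1)}}\right]^T.$$
It is easy to verify that the point $x$ above is on the sphere of radius $1/\sqrt{n(n-1)} + \varepsilon$. However, clearly the first component of $x$ is negative. Therefore, the sphere of radius $1/\sqrt{n(n-1)} + \varepsilon$ is not inscribed in $\Delta$. Our proof is thus completed.
18.8
We first consider the constraints. We claim that $x \in \Delta \iff \bar{x} \in \Delta$. To see this, note that if $x \in \Delta$, then $e^T x = 1$ and hence
$$e^T\bar{x} = e^T D^{-1}x/(e^T D^{-1}x) = 1,$$
which means that $\bar{x} \in \Delta$. The same argument can be used for the converse. Next, we claim $Ax = 0 \iff AD\bar{x} = 0$. To see this, write
$$Ax = ADD^{-1}x = AD\bar{x}\,(e^T D^{-1}x).$$
Since $e^T D^{-1}x > 0$, we have $Ax = 0 \iff AD\bar{x} = 0$. Finally, we claim that if $x^*$ is an optimal solution to the original problem, then $\bar{x}^* = U(x^*)$ is an optimal solution to the transformed problem. To see this, recall that the problem is a Karmarkar's restricted problem, and hence by Assumption (B) we have $c^T x^* = 0$. We now note that the minimum value of the objective function $c^T D\bar{x}$ in the transformed problem is zero. This is because $c^T D\bar{x} = c^T x/(e^T D^{-1}x)$, and $e^T D^{-1}x > 0$. Finally, we observe that at the point $\bar{x}^* = U(x^*)$ the objective function value for the transformed problem is zero. Indeed,
$$c^T D\bar{x}^* = c^T DD^{-1}x^*/(e^T D^{-1}x^*) = 0.$$
Therefore, the two problems are equivalent.
18.9
Let $v \in R^{m+1}$ be such that $v^T B = 0^T$. We will show that
$$v^T \begin{bmatrix} A \\ e^T \end{bmatrix} = 0^T$$
and hence $v = 0$ by virtue of the assumption that
$$\operatorname{rank} \begin{bmatrix} A \\ e^T \end{bmatrix} = m + 1.$$
This in turn gives us the desired result. To proceed, write $v$ as
$$v = \begin{bmatrix} u \\ v_{m+1} \end{bmatrix},$$
where $u \in R^m$ constitute the first $m$ components of $v$. Then,
$$v^T B = u^T AD + v_{m+1}e^T = 0^T.$$
Postmultiplying the above by $e$, and using the facts that $De = x_0$, $Ax_0 = 0$, and $e^T e = n$, we get
$$u^T Ax_0 + v_{m+1}n = v_{m+1}n = 0,$$
which implies that $v_{m+1} = 0$. Hence, $u^T AD = 0^T$, which after postmultiplying by $D^{-1}$ gives $u^T A = 0^T$. Hence,
$$v^T \begin{bmatrix} A \\ e^T \end{bmatrix} = 0^T,$$
which implies that $v = 0$. Hence, $\operatorname{rank} B = m + 1$.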
18.10
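We proceed by induction. For $k = 0$, the result is true because $x^{(0)} = a_0$. Now suppose that $x^{(k)}$ is a strictly interior point of $\Delta$. We first show that $\bar{x}^{(k+1)}$ is a strictly interior point. Now,
$$\bar{x}^{(k+1)} = a_0 - \alpha r \hat{c}^{(k)}.$$
Then, since $\alpha \in (0,1)$ and $\|\hat{c}^{(k)}\| = 1$, we have
$$\|\bar{x}^{(k+1)} - a_0\| = \|\alpha r \hat{c}^{(k)}\| < r.$$
Since $r$ is the radius of the largest sphere inscribed in $\Delta$, $\bar{x}^{(k+1)}$ is a strictly interior point of $\Delta$. To complete the proof, we write
$$x^{(k+1)} = U_k^{-1}(\bar{x}^{(k+1)}) = \frac{D_k\bar{x}^{(k+1)}}{e^T D_k\bar{x}^{(k+1)}}.$$
We already know that $x^{(k+1)} \in \Delta$. It therefore remains to show that it is strictly interior, i.e., $x^{(k+1)} > 0$. To see this, note that $e^T D_k\bar{x}^{(k+1)} > 0$. Furthermore, we can write
$$D_k\bar{x}^{(k+1)} = \begin{bmatrix} \bar{x}_1^{(k+1)}x_1^{(k)} \\ \vdots \\ \bar{x}_n^{(k+1)}x_n^{(k)} \end{bmatrix}.$$
Since $x^{(k)} = [x_1^{(k)},\ldots,x_n^{(k)}]^T > 0$ by the induction hypothesis, and $\bar{x}^{(k+1)} = [\bar{x}_1^{(k+1)},\ldots,\bar{x}_n^{(k+1)}]^T > 0$ by the above, $x^{(k+1)} > 0$ and hence it is a strictly interior point of $\Delta$.
19. Integer Linear Programming
19.1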
The result follows from the simple observation that if M is a submatrix of A, then any submatrix of M is also a submatrix of A. Therefore, any property involving all submatrices of A also applies to all submatrices of M.
19.2
The result follows from the simple observation that any submatrix of $A^T$ is the transpose of a submatrix of $A$, and that the determinant of the transpose of a matrix equals the determinant of the original matrix.
19.3
The claim that $A$ is totally unimodular if $[A, I]$ is totally unimodular follows from Exercise 19.1. To show the converse, suppose that $A$ is totally unimodular. We will show that any $p \times p$ invertible submatrix of $[A, I]$, $p \le \min(m, n)$, has determinant $\pm 1$. We first note that any $p \times p$ invertible submatrix of $[A, I]$ that consists only of columns of $A$ has determinant $\pm 1$ because $A$ is totally unimodular. Moreover, any $p \times p$ invertible submatrix of $I$ has determinant 1.
Consider now a $p \times p$ invertible submatrix of $[A, I]$ composed of $k$ columns of $A$ and $p - k$ columns of $I$. Without loss of generality, suppose that this submatrix is composed of the first $p$ rows of $[A, I]$, the last $k$ columns of $A$, and the first $p - k$ columns of $I$. This choice of rows and columns is without loss of generality because we can exchange rows and columns to arrive at this form, and each exchange only changes the sign of the determinant. We now proceed as in the proof of Proposition 19.1.
19.4
The result follows from these properties of determinants: (1) exchanging columns only changes the sign of the determinant; (2) the determinant of a block triangular matrix is the product of the determinants of the diagonal blocks; and (3) the determinant of the identity matrix is 1. See also Exercise 2.4.
19.5
19.6
The vectors x and z together satisfy
which means that z b Ax. Because the righthand side involves only integers, z is an integer vector.
Ax z b, The following MATLAB code generates the figures.

% The vertices of the feasible set
x = [2/5 1; 2/5 -2/5]\[3; 1];
X = [0 0 x(1) 2.5];
Y = [0 3 x(2) 0];
fs = 16; % Fontsize
% Draw the feasible set for x1, x2 in R.
vi = convhull(X,Y);
plot(X,Y,'o');
axis on; axis equal;
axis([-0.2 4.2 -0.2 3.2]);
hold on
fill(X(vi), Y(vi), 'b', 'facealpha', 0.2);
text(.1,.5,'\fontsize{48}\Omega','position',[1.5 1.25])
set(gca,'Fontsize',fs)
hold off
% The optimal solution has to be one of the extreme points
c = [-3 -4];
% Draw the feasible set for the noninteger problem
figure
axis on; axis equal;
x = -0.5:0.1:x(1);
y1 = -x*0.4+3;
y2 = x-2.5;
fs = 16; % Fontsize
plot(x,y1,'--b',x,y2,'--b','LineWidth',2);
axis([-0.2 4.2 -0.2 3.2]);
set(gca,'Fontsize',fs)
hold on
% Integer points of the feasible set
X = [zeros(1,4) ones(1,3) 2*ones(1,3) 3];
Y = [0:max(floor(Y)) 0:max(floor(Y))-1 0:max(floor(Y))-1 1];
plot(X,Y,'bs','LineWidth',2,...
'MarkerEdgeColor','k',...
'MarkerFaceColor','g',...
'MarkerSize',10)
% Plot of the cost function
xc = -1:0.5:5;
yc = -0.75*xc+14/4*ones(1,length(xc));
yc0 = -0.75*xc+17.5/4*ones(1,length(xc));
fs = 16; % Fontsize
plot(xc, yc, 'r', xc, yc0, '--ok', 'LineWidth',2);
set(gca,'Fontsize',fs)
text(.1,.5,'\fontsize{48}\Omega','position',[1.5 1.25])
[~,Xmin] = min(c*[X; Y]);
str = sprintf('The maximizer is (%d, %d) and the maximum is %.4f',...
X(Xmin), Y(Xmin), -c*[X(Xmin); Y(Xmin)]);
disp(str);


19.7
It suffices to show the following claim: If we introduce the equation
$$x_i + \sum_{j=m+1}^{n} \lfloor y_{ij} \rfloor x_j + x_{n+1} = \lfloor y_{i0} \rfloor$$
into the original constraint, then the result holds. The reason this suffices is that the Gomory cut is obtained by subtracting this equation from an equation obtained by elementary row operations on $[A, b]$ (hence is equivalent to premultiplication by an invertible matrix).
To show the above claim, let $x_{n+1}$ satisfy this constraint with an integer vector $x$. Then,
$$x_i + \sum_{j=m+1}^{n} \lfloor y_{ij} \rfloor x_j + x_{n+1} = \lfloor y_{i0} \rfloor,$$
which implies that
$$x_{n+1} = \lfloor y_{i0} \rfloor - x_i - \sum_{j=m+1}^{n} \lfloor y_{ij} \rfloor x_j.$$
Because the right-hand side involves only integers, $x_{n+1}$ is an integer.
19.8
If there is only one Gomory cut, then the result follows directly from Exercise 19.7. The general result follows by induction on the number of Gomory cuts, using Exercise 19.7 at each inductive step.
19.9
The result follows from Exercises 19.5 and 19.8.
19.10
The dual problem is:
$$\begin{aligned}
\text{minimize} \quad & 3\lambda_1 + \lambda_2 \\
\text{subject to} \quad & \tfrac{2}{5}\lambda_1 + \tfrac{2}{5}\lambda_2 \ge 3 \\
& \lambda_1 - \tfrac{2}{5}\lambda_2 \ge 4 \\
& \lambda_1, \lambda_2 \ge 0, \quad \lambda_1, \lambda_2 \in \mathbb{Z}.
\end{aligned}$$
The problem is solved graphically using the same approach as in Example 19.5. We proceed by calculating the extreme points of the feasible set. We first assume that $\lambda_1, \lambda_2 \in \mathbb{R}$. The extreme points are calculated by intersecting the given constraints, and they are:
$$\lambda^{(1)} = [5, 5/2]^\top, \quad \lambda^{(2)} = [15/2, 0]^\top.$$
In Figure 24.10, we show the feasible set for the case when $\lambda_1, \lambda_2 \in \mathbb{R}$.
Next we sketch the feasible set for the case when $\lambda_1, \lambda_2 \in \mathbb{Z}$ and solve the problem graphically. The graphical solution is depicted in Figure 24.11. We can see in Figure 24.11 that the optimal integer solution is
$$\lambda^* = [6, 2]^\top.$$
Figure 24.10 Feasible set for the case when $\lambda_1, \lambda_2 \in \mathbb{R}$ in Example 19.5.
Figure 24.11 Real feasible set with $\lambda_1, \lambda_2 \in \mathbb{Z}$.
The following MATLAB code generates the figures.
% The vertices of the feasible set are:
x = [2/5 2/5; 1 -2/5]\[3; 4];
X = [7.5 x(1) 20 20];
Y = [0 x(2) 0 40];
fs = 16; % Fontsize
% Now we draw the set Omega, supposing x1, x2 in R.
vi = convhull(X,Y);
plot(X(1:2),Y(1:2), 'rX', 'LineWidth',4);
axis on; axis equal;
axis([-.2 18 -0.2 18]);
set(gca,'Fontsize',fs)
title('Feasible set supposing x1, x2 in R.','Fontsize',14,'Fontname','Avantgarde');
hold on
fill(X(vi), Y(vi), 'b', 'facealpha', 0.2);
text(.1,.5,'\fontsize{48}\Omega','position',[12 5])
hold off
% We know the optimal solution has to be one of the extreme points.
c = [3 1];
% Now we draw the real feasible set for the problem.
figure
axis on; axis equal;
axis([-.2 18 -0.2 18]);
set(gca,'Fontsize',fs)
title('Feasible set and cost function','Fontsize',14,'Fontname','Avantgarde');
hold on
% Collect the integer points that satisfy both constraints
X = []; Y = [];
for i = 1:18
    j = 0;
    while (j <= (i-4)*2.5) && (j <= 18)
        if j >= 7.5-i
            X = [X i];
            Y = [Y j];
        end
        j = j+1;
    end
end
plot(X,Y,'bs','LineWidth',1,...
'MarkerEdgeColor','k',...
'MarkerFaceColor','g',...
'MarkerSize',10)
x = 5:0.1:18;
y1 = (x-4)*2.5;
y2 = 7.5-x;
plot(x,y1,'--b',x,y2,'--b','LineWidth',2);
set(gca,'Fontsize',fs)
text(.1,.5,'\fontsize{48}\Omega','position',[12 5])
% Plot of the cost function at level 17.5
xc = -1:0.5:18;
yc = 35/2*ones(1,length(xc))-3*xc;
plot(xc, yc, '--dk', 'LineWidth',2);
% Plot of the cost function
xc = -1:0.5:18;
yc = 20*ones(1,length(xc))-3*xc;
plot(xc, yc, 'r', 'LineWidth',2);
[~,Xmin] = min(c*[X; Y]);
str = sprintf('The minimizer is (%d, %d) and the minimum is %.4f',...
X(Xmin), Y(Xmin), c*[X(Xmin); Y(Xmin)]);
disp(str);
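The graphical answer of Exercise 19.10 can also be checked by brute force. The following Python sketch is not part of the original solution; it assumes the dual problem is to minimize $3\lambda_1 + \lambda_2$ subject to $\tfrac{2}{5}\lambda_1 + \tfrac{2}{5}\lambda_2 \ge 3$, $\lambda_1 - \tfrac{2}{5}\lambda_2 \ge 4$, with nonnegative integer variables, and it uses a hypothetical search bound `max_val`.

```python
def solve_dual_ip(max_val=40):
    """Enumerate integer (l1, l2) in [0, max_val]^2; return (cost, l1, l2) of the best feasible point."""
    best = None
    for l1 in range(max_val + 1):
        for l2 in range(max_val + 1):
            # Both constraints are multiplied by 5 to stay in integer arithmetic:
            #   (2/5) l1 + (2/5) l2 >= 3   ->   2 l1 + 2 l2 >= 15
            #   l1 - (2/5) l2 >= 4         ->   5 l1 - 2 l2 >= 20
            if 2 * l1 + 2 * l2 >= 15 and 5 * l1 - 2 * l2 >= 20:
                cost = 3 * l1 + l2
                if best is None or cost < best[0]:
                    best = (cost, l1, l2)
    return best


best_cost, best_l1, best_l2 = solve_dual_ip()
print(best_cost, best_l1, best_l2)  # prints: 20 6 2
```

Under these assumptions the enumeration agrees with the point $[6, 2]^\top$ read from the figure, with cost 20.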


20. Problems with Equality Constraints
20.1
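The feasible set consists of the points
$$x = [2, a]^\top, \quad a \le -1.$$
We next find the gradients:
$$\nabla h(x) = \begin{bmatrix} 2(x_1 - 2) \\ 0 \end{bmatrix} \quad \text{and} \quad \nabla g(x) = \begin{bmatrix} 0 \\ 3(x_2 + 1)^2 \end{bmatrix}.$$
All feasible points are not regular because at the above points the gradients of $h$ and $g$ are not linearly independent. There are no regular points of the constraints.
20.2
a. As usual, let $f$ be the objective function, and $h$ the constraint function. We form the Lagrangian $l(x, \lambda) = f(x) + \lambda^\top h(x)$, and then find critical points by solving the following equations (Lagrange condition):
$$Dl(x, \lambda) = 0^\top.$$
We obtain
$$\begin{bmatrix} 2 & 2 & 0 & 1 & 4 \\ 2 & 6 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 5 \\ 1 & 2 & 0 & 0 & 0 \\ 4 & 0 & 5 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \lambda_1 \\ \lambda_2 \end{bmatrix} = \begin{bmatrix} -4 \\ -5 \\ -6 \\ 3 \\ 6 \end{bmatrix}.$$
The unique solution to the above system is
$$x^* = [16/5, -1/10, -34/25]^\top, \quad \lambda^* = [-27/5, -6/5]^\top.$$
Note that $x^*$ is a regular point. We now apply the SOSC. We compute
$$L(x^*, \lambda^*) = F(x^*) + [\lambda^* H(x^*)] = \begin{bmatrix} 2 & 2 & 0 \\ 2 & 6 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
The tangent plane is
$$T(x^*) = \left\{ y \in \mathbb{R}^3 : \begin{bmatrix} 1 & 2 & 0 \\ 4 & 0 & 5 \end{bmatrix} y = 0 \right\} = \left\{ a[-5/4, 5/8, 1]^\top : a \in \mathbb{R} \right\}.$$
Let $y = a[-5/4, 5/8, 1]^\top \in T(x^*)$, $a \ne 0$. We have
$$y^\top L(x^*, \lambda^*) y = \tfrac{75}{32} a^2 > 0.$$
Therefore, $x^*$ is a strict local minimizer.
b. The Lagrange condition for this problem is
$$\begin{aligned} 4 + 2\lambda x_1 &= 0 \\ 2x_2 + 2\lambda x_2 &= 0 \\ x_1^2 + x_2^2 - 9 &= 0. \end{aligned}$$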

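We have four points satisfying the Lagrange condition:
$$\begin{aligned}
x^{(1)} &= [3, 0]^\top, & \lambda^{(1)} &= -2/3 \\
x^{(2)} &= [-3, 0]^\top, & \lambda^{(2)} &= 2/3 \\
x^{(3)} &= [2, \sqrt{5}]^\top, & \lambda^{(3)} &= -1 \\
x^{(4)} &= [2, -\sqrt{5}]^\top, & \lambda^{(4)} &= -1.
\end{aligned}$$
Note that all four points $x^{(1)}, \ldots, x^{(4)}$ are regular. We now apply the SOSC. We have
$$L(x, \lambda) = \begin{bmatrix} 0 & 0 \\ 0 & 2 \end{bmatrix} + \lambda \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}, \qquad T(x) = \{ y : [2x_1, 2x_2] y = 0 \}.$$
For the first point, we have
$$L(x^{(1)}, \lambda^{(1)}) = \begin{bmatrix} -4/3 & 0 \\ 0 & 2/3 \end{bmatrix}, \qquad T(x^{(1)}) = \{ a[0, 1]^\top : a \in \mathbb{R} \}.$$
Let $y = a[0, 1]^\top \in T(x^{(1)})$, $a \ne 0$. Then
$$y^\top L(x^{(1)}, \lambda^{(1)}) y = \tfrac{2}{3} a^2 > 0.$$
Hence, $x^{(1)}$ is a strict local minimizer. For the second point, we have
$$L(x^{(2)}, \lambda^{(2)}) = \begin{bmatrix} 4/3 & 0 \\ 0 & 10/3 \end{bmatrix} > 0.$$
Hence, $x^{(2)}$ is a strict local minimizer. For the third point, we have
$$L(x^{(3)}, \lambda^{(3)}) = \begin{bmatrix} -2 & 0 \\ 0 & 0 \end{bmatrix}, \qquad T(x^{(3)}) = \{ a[-\sqrt{5}, 2]^\top : a \in \mathbb{R} \}.$$
Let $y = a[-\sqrt{5}, 2]^\top \in T(x^{(3)})$, $a \ne 0$. Then
$$y^\top L(x^{(3)}, \lambda^{(3)}) y = -10a^2 < 0.$$
Hence, $x^{(3)}$ is a strict local maximizer.
For the fourth point, we have
$$L(x^{(4)}, \lambda^{(4)}) = \begin{bmatrix} -2 & 0 \\ 0 & 0 \end{bmatrix}, \qquad T(x^{(4)}) = \{ a[\sqrt{5}, 2]^\top : a \in \mathbb{R} \}.$$
Let $y = a[\sqrt{5}, 2]^\top \in T(x^{(4)})$, $a \ne 0$. Then
$$y^\top L(x^{(4)}, \lambda^{(4)}) y = -10a^2 < 0.$$
Hence, $x^{(4)}$ is a strict local maximizer.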
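The four critical points of part b can be checked numerically. The Python sketch below is an illustration rather than part of the original solution; it assumes the Lagrange condition $4 + 2\lambda x_1 = 0$, $2x_2 + 2\lambda x_2 = 0$, $x_1^2 + x_2^2 - 9 = 0$, and the objective $f(x) = 4x_1 + x_2^2$ that this condition implies.

```python
import math


def lagrange_residual(x1, x2, lam):
    """Residuals of 4 + 2*lam*x1 = 0, 2*x2 + 2*lam*x2 = 0, x1^2 + x2^2 - 9 = 0."""
    return (4 + 2 * lam * x1, 2 * x2 + 2 * lam * x2, x1 ** 2 + x2 ** 2 - 9)


def objective(x1, x2):
    # Assumed objective, read off from the Lagrange condition.
    return 4 * x1 + x2 ** 2


# (x1, x2, lambda) for the four candidate points.
candidates = [
    (3.0, 0.0, -2 / 3),
    (-3.0, 0.0, 2 / 3),
    (2.0, math.sqrt(5), -1.0),
    (2.0, -math.sqrt(5), -1.0),
]
max_residual = max(abs(r) for c in candidates for r in lagrange_residual(*c))
values = [objective(x1, x2) for x1, x2, _ in candidates]
```

The residuals vanish up to rounding, and the objective values are 12, -12, 13, 13, so the global minimizer on the circle is the second point and the two maximizers share the value 13.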

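c. The Lagrange condition for this problem is
$$\begin{aligned} x_2 + 2\lambda x_1 &= 0 \\ x_1 + 8\lambda x_2 &= 0 \\ x_1^2 + 4x_2^2 - 1 &= 0. \end{aligned}$$
We have four points satisfying the Lagrange condition:
$$\begin{aligned}
x^{(1)} &= [1/\sqrt{2}, 1/(2\sqrt{2})]^\top, & \lambda^{(1)} &= -1/4 \\
x^{(2)} &= [-1/\sqrt{2}, -1/(2\sqrt{2})]^\top, & \lambda^{(2)} &= -1/4 \\
x^{(3)} &= [1/\sqrt{2}, -1/(2\sqrt{2})]^\top, & \lambda^{(3)} &= 1/4 \\
x^{(4)} &= [-1/\sqrt{2}, 1/(2\sqrt{2})]^\top, & \lambda^{(4)} &= 1/4.
\end{aligned}$$
Note that all four points $x^{(1)}, \ldots, x^{(4)}$ are regular. We now apply the SOSC. We have
$$L(x, \lambda) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + \lambda \begin{bmatrix} 2 & 0 \\ 0 & 8 \end{bmatrix}, \qquad T(x) = \{ y : [2x_1, 8x_2] y = 0 \}.$$
Note that
$$L(x, -1/4) = \begin{bmatrix} -1/2 & 1 \\ 1 & -2 \end{bmatrix}, \qquad L(x, 1/4) = \begin{bmatrix} 1/2 & 1 \\ 1 & 2 \end{bmatrix}.$$
After standard manipulations, we conclude that the first two points are strict local maximizers, while the last two points are strict local minimizers.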
20.3
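We form the Lagrangian
$$l(x, \lambda) = (\mathbf{a}^\top x)(\mathbf{b}^\top x) + \lambda_1 (x_1 + x_2) + \lambda_2 (x_2 + x_3).$$
The Lagrange conditions take the form
$$\nabla_x l = (\mathbf{a}\mathbf{b}^\top + \mathbf{b}\mathbf{a}^\top) x + [\nabla h_1(x) \ \nabla h_2(x)] \lambda = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} x + \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 0 & 1 \end{bmatrix} \lambda = \begin{bmatrix} x_2 + \lambda_1 \\ x_1 + x_3 + \lambda_1 + \lambda_2 \\ x_2 + \lambda_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},$$
$$h(x) = \begin{bmatrix} x_1 + x_2 \\ x_2 + x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$
It is easy to see that $x^* = 0$ and $\lambda^* = 0$ satisfy the Lagrange (FONC) conditions.
The Hessian of the Lagrangian is
$$L(x^*, \lambda^*) = \mathbf{a}\mathbf{b}^\top + \mathbf{b}\mathbf{a}^\top = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix},$$
and the tangent space is
$$T(x^*) = \{ y : y = a[1, -1, 1]^\top, \ a \in \mathbb{R} \}.$$
To verify if the critical point satisfies the SOSC, we evaluate
$$y^\top L(x^*, \lambda^*) y = -4a^2 < 0.$$
Thus the critical point is a strict local maximizer.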
20.4
By the Lagrange condition, $x^* = [x_1^*, x_2^*]^\top$ satisfies
$$\begin{aligned} x_1^* + \lambda^* &= 0 \\ x_1^* + 4 + 4\lambda^* &= 0. \end{aligned}$$
Eliminating $\lambda^*$ we get
$$3x_1^* - 4 = 0,$$
which implies that $x_1^* = 4/3$. Therefore, $\nabla f(x^*) = [4/3, 16/3]^\top$.
20.5
a. The Lagrange condition for this problem is:
$$\begin{aligned} 2(x^* - x_0) + 2\lambda^* x^* &= 0 \\ \|x^*\|^2 &= 9, \end{aligned}$$
where $\lambda^* \in \mathbb{R}$. Rewriting the first equation we get $(1 + \lambda^*) x^* = x_0$, which when combined with the second equation gives two values for $1 + \lambda^*$: $1 + \lambda_1^* = 2/3$ and $1 + \lambda_2^* = -2/3$. Hence there are two solutions to the Lagrange condition: $x^{(1)} = \tfrac{3}{2}[1, \sqrt{3}]^\top$ and $x^{(2)} = -\tfrac{3}{2}[1, \sqrt{3}]^\top$.
b. We have $L(x^{(i)}, \lambda_i^*) = 2(1 + \lambda_i^*) I$. To apply the SONC Theorem, we need to check regularity. This is easy, since the gradient of the constraint function at any point $x$ is $2x$, which is nonzero at both the points in part a.
For the second point, $1 + \lambda_2^* = -2/3$, which implies that the point is not a local minimizer because the SONC does not hold.
On the other hand, the first point satisfies the SOSC since $1 + \lambda_1^* = 2/3$, which implies that it is a strict local minimizer.
20.6
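a. Let $x_1$, $x_2$, and $x_3$ be the dimensions of the closed box. The problem is
$$\begin{aligned} \text{minimize} \quad & 2(x_1 x_2 + x_2 x_3 + x_3 x_1) \\ \text{subject to} \quad & x_1 x_2 x_3 = V. \end{aligned}$$
We denote $f(x) = 2(x_1 x_2 + x_2 x_3 + x_3 x_1)$, and $h(x) = x_1 x_2 x_3 - V$. We have $\nabla f(x) = 2[x_2 + x_3, x_1 + x_3, x_1 + x_2]^\top$ and $\nabla h(x) = [x_2 x_3, x_1 x_3, x_1 x_2]^\top$. By the Lagrange condition, the dimensions $[a, b, c]^\top$ of the box with minimum surface area satisfy
$$\begin{aligned} 2(b + c) + \lambda bc &= 0 \\ 2(a + c) + \lambda ac &= 0 \\ 2(a + b) + \lambda ab &= 0 \\ abc &= V, \end{aligned}$$
where $\lambda \in \mathbb{R}$.
b. Regularity of $x^*$ means $\nabla h(x^*) \ne 0$ (since there is only one scalar equality constraint in this case). Since $x^* = [a, b, c]^\top$ is a feasible point, we must have $a, b, c \ne 0$ (for otherwise the volume will be 0). Hence, $\nabla h(x^*) \ne 0$, which implies that $x^*$ is regular.
c. Multiplying the first equation by $a$ and the second equation by $b$, and then subtracting the first from the second, we obtain:
$$c(a - b) = 0.$$
Since $c \ne 0$ (see part b), we conclude that $a = b$. By a similar procedure on the second and third equations, we conclude that $b = c$. Hence, substituting into the fourth (constraint) equation, we obtain
$$a = b = c = V^{1/3},$$
with $\lambda = -4V^{-1/3}$.
d. The Hessian of the Lagrangian is given by
$$L(x^*, \lambda) = \begin{bmatrix} 0 & 2 + \lambda c & 2 + \lambda b \\ 2 + \lambda c & 0 & 2 + \lambda a \\ 2 + \lambda b & 2 + \lambda a & 0 \end{bmatrix} = \begin{bmatrix} 0 & -2 & -2 \\ -2 & 0 & -2 \\ -2 & -2 & 0 \end{bmatrix} = -2 \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}.$$
The matrix $L(x^*, \lambda)$ is not positive definite (there are several ways to check this: we could use Sylvester's criterion, or we could compute the eigenvalues of $L(x^*, \lambda)$, which are $2, 2, -4$). Therefore, we need to compute the tangent space $T(x^*)$. Note that
$$Dh(x^*) = \nabla h(x^*)^\top = [bc, ac, ab] = V^{2/3}[1, 1, 1].$$
Hence,
$$T(x^*) = \{ y : Dh(x^*) y = 0 \} = \{ y : [1, 1, 1] y = 0 \} = \{ y : y_3 = -(y_1 + y_2) \}.$$
Let $y \in T(x^*)$, $y \ne 0$. Note that either $y_1 \ne 0$ or $y_2 \ne 0$. We have
$$y^\top L(x^*, \lambda) y = -2 y^\top \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} y = -4(y_1 y_2 + y_1 y_3 + y_2 y_3).$$
Substituting $y_3 = -(y_1 + y_2)$, we obtain
$$y^\top L(x^*, \lambda) y = -4\left(y_1 y_2 - y_1(y_1 + y_2) - y_2(y_1 + y_2)\right) = 4(y_1^2 + y_1 y_2 + y_2^2) = 4 z^\top Q z,$$
where $z = [y_1, y_2]^\top \ne 0$ and
$$Q = \begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 \end{bmatrix} > 0.$$
Therefore, $y^\top L(x^*, \lambda) y > 0$, which shows that the SOSC is satisfied.
An alternative (simpler) calculation:
$$y^\top L(x^*, \lambda) y = -2 y^\top \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} y = -2\left(y_1(y_2 + y_3) + y_2(y_1 + y_3) + y_3(y_1 + y_2)\right).$$
Substituting $y_1 = -(y_2 + y_3)$, $y_2 = -(y_1 + y_3)$, and $y_3 = -(y_1 + y_2)$ in the first, second, and third terms, respectively, we obtain
$$y^\top L(x^*, \lambda) y = 2(y_1^2 + y_2^2 + y_3^2) > 0.$$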
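The cube solution and the sign of the quadratic form on the tangent space can be spot-checked numerically. The following Python sketch is an illustration only; it assumes the Lagrange system $2(b+c) + \lambda bc = 0$, $2(a+c) + \lambda ac = 0$, $2(a+b) + \lambda ab = 0$, $abc = V$ with $a = b = c = V^{1/3}$, $\lambda = -4V^{-1/3}$, and uses a hypothetical volume $V = 8$ and hand-picked tangent vectors.

```python
def box_lagrange_residuals(V):
    """Residuals of the four Lagrange equations at a = b = c = V^(1/3), lam = -4 / V^(1/3)."""
    a = b = c = V ** (1.0 / 3.0)
    lam = -4.0 / a
    return (
        2 * (b + c) + lam * b * c,
        2 * (a + c) + lam * a * c,
        2 * (a + b) + lam * a * b,
        a * b * c - V,
    )


def quad_form_on_tangent(y1, y2):
    """y^T L y for L = -2*[[0,1,1],[1,0,1],[1,1,0]], with y3 = -(y1 + y2)."""
    y3 = -(y1 + y2)
    return -4.0 * (y1 * y2 + y1 * y3 + y2 * y3)


residuals = box_lagrange_residuals(8.0)  # hypothetical volume
samples = [(1.0, 0.0), (0.0, 1.0), (1.0, -1.0), (2.0, 3.0), (-1.5, 0.5)]
forms = [quad_form_on_tangent(y1, y2) for y1, y2 in samples]
```

Each value in `forms` equals $4(y_1^2 + y_1 y_2 + y_2^2)$ for the corresponding sample, which is positive for every nonzero tangent vector.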

20.7
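a. We first compute critical points by applying the Lagrange conditions. These are:
$$\begin{aligned} 2x_1 + 2\lambda x_1 &= 0 \\ 6x_2 + 2\lambda x_2 &= 0 \\ 1 + 2\lambda x_3 &= 0 \\ x_1^2 + x_2^2 + x_3^2 - 16 &= 0. \end{aligned}$$
There are six points satisfying the Lagrange condition:
$$\begin{aligned}
x^{(1)} &= [\sqrt{63}/2, 0, 1/2]^\top, & \lambda^{(1)} &= -1 \\
x^{(2)} &= [-\sqrt{63}/2, 0, 1/2]^\top, & \lambda^{(2)} &= -1 \\
x^{(3)} &= [0, 0, 4]^\top, & \lambda^{(3)} &= -1/8 \\
x^{(4)} &= [0, 0, -4]^\top, & \lambda^{(4)} &= 1/8 \\
x^{(5)} &= [0, \sqrt{575}/6, 1/6]^\top, & \lambda^{(5)} &= -3 \\
x^{(6)} &= [0, -\sqrt{575}/6, 1/6]^\top, & \lambda^{(6)} &= -3.
\end{aligned}$$
All the above points are regular. We now apply second-order conditions to establish their nature. For this, we compute
$$F(x) = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad H(x) = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix},$$
and
$$T(x) = \{ y \in \mathbb{R}^3 : [2x_1, 2x_2, 2x_3] y = 0 \}.$$
For the first point, we have
$$L(x^{(1)}, \lambda^{(1)}) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & -2 \end{bmatrix}, \qquad T(x^{(1)}) = \{ [-a/\sqrt{63}, b, a]^\top : a, b \in \mathbb{R} \}.$$
Let $y = [-a/\sqrt{63}, b, a]^\top \in T(x^{(1)})$, where $a$ and $b$ are not both zero. Then
$$y^\top L(x^{(1)}, \lambda^{(1)}) y = 4b^2 - 2a^2 \ \begin{cases} > 0 & \text{if } |a| < |b|\sqrt{2} \\ = 0 & \text{if } |a| = |b|\sqrt{2} \\ < 0 & \text{if } |a| > |b|\sqrt{2}. \end{cases}$$
From the above, we see that $x^{(1)}$ does not satisfy the SONC. Therefore, $x^{(1)}$ cannot be an extremizer. Performing similar calculations for $x^{(2)}$, we conclude that $x^{(2)}$ cannot be an extremizer either.
For the third point, we have
$$L(x^{(3)}, \lambda^{(3)}) = \begin{bmatrix} 7/4 & 0 & 0 \\ 0 & 23/4 & 0 \\ 0 & 0 & -1/4 \end{bmatrix}, \qquad T(x^{(3)}) = \{ [a, b, 0]^\top : a, b \in \mathbb{R} \}.$$
Let $y = [a, b, 0]^\top \in T(x^{(3)})$, where $a$ and $b$ are not both zero. Then
$$y^\top L(x^{(3)}, \lambda^{(3)}) y = \tfrac{7}{4} a^2 + \tfrac{23}{4} b^2 > 0.$$
Hence, $x^{(3)}$ is a strict local minimizer. Performing similar calculations for the remaining points, we conclude that $x^{(4)}$ is a strict local minimizer, and $x^{(5)}$ and $x^{(6)}$ are both strict local maximizers.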
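b. The Lagrange condition for the problem is:
$$\begin{aligned} 2x_1 + \lambda(6x_1 + 4x_2) &= 0 \\ 2x_2 + \lambda(4x_1 + 12x_2) &= 0 \\ 3x_1^2 + 4x_1 x_2 + 6x_2^2 - 140 &= 0. \end{aligned}$$
We represent the first two equations as
$$\begin{bmatrix} 2 + 6\lambda & 4\lambda \\ 4\lambda & 2 + 12\lambda \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$
From the constraint equation, we note that $x = [0, 0]^\top$ cannot satisfy the Lagrange condition. Therefore, the determinant of the above matrix must be zero. Solving for $\lambda$ yields two possible values: $-1/7$ and $-1/2$. We then have four points satisfying the Lagrange condition:
$$\begin{aligned}
x^{(1)} &= [2, 4]^\top, & \lambda^{(1)} &= -1/7 \\
x^{(2)} &= [-2, -4]^\top, & \lambda^{(2)} &= -1/7 \\
x^{(3)} &= [-2\sqrt{14}, \sqrt{14}]^\top, & \lambda^{(3)} &= -1/2 \\
x^{(4)} &= [2\sqrt{14}, -\sqrt{14}]^\top, & \lambda^{(4)} &= -1/2.
\end{aligned}$$
Applying the SOSC, we conclude that $x^{(1)}$ and $x^{(2)}$ are strict local minimizers, and $x^{(3)}$ and $x^{(4)}$ are strict local maximizers.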
20.8
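a. We can represent the problem as
$$\begin{aligned} \text{minimize} \quad & f(x) \\ \text{subject to} \quad & h(x) = 0, \end{aligned}$$
where $f(x) = 2x_1 + 3x_2 - 4$, and $h(x) = x_1 x_2 - 6$. We have $Df(x) = [2, 3]$, and $Dh(x) = [x_2, x_1]$. Note that $0$ is not a feasible point. Therefore, any feasible point is regular. If $x^*$ is a local extremizer, then by the Lagrange multiplier theorem, there exists $\lambda^* \in \mathbb{R}$ such that $Df(x^*) + \lambda^* Dh(x^*) = 0^\top$, or
$$\begin{aligned} 2 + \lambda^* x_2^* &= 0 \\ 3 + \lambda^* x_1^* &= 0. \end{aligned}$$
Solving, we get two possible extremizers: $x^{(1)} = [3, 2]^\top$, with corresponding Lagrange multiplier $\lambda^{(1)} = -1$, and $x^{(2)} = [-3, -2]^\top$, with corresponding Lagrange multiplier $\lambda^{(2)} = 1$.
b. We have $F(x) = O$, and
$$H(x) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.$$
First, consider the point $x^{(1)} = [3, 2]^\top$, with corresponding Lagrange multiplier $\lambda^{(1)} = -1$. We have
$$L(x^{(1)}, \lambda^{(1)}) = -\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},$$
and
$$T(x^{(1)}) = \{ y : [2, 3] y = 0 \} = \{ \alpha[-3, 2]^\top : \alpha \in \mathbb{R} \}.$$
Let $y = \alpha[-3, 2]^\top \in T(x^{(1)})$, $\alpha \ne 0$. We have
$$y^\top L(x^{(1)}, \lambda^{(1)}) y = 12\alpha^2 > 0.$$
Therefore, by the SOSC, $x^{(1)} = [3, 2]^\top$ is a strict local minimizer.
Next, consider the point $x^{(2)} = [-3, -2]^\top$, with corresponding Lagrange multiplier $\lambda^{(2)} = 1$. We have
$$L(x^{(2)}, \lambda^{(2)}) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},$$
and
$$T(x^{(2)}) = \{ y : [-2, -3] y = 0 \} = \{ \alpha[-3, 2]^\top : \alpha \in \mathbb{R} \} = T(x^{(1)}).$$
Let $y = \alpha[-3, 2]^\top \in T(x^{(2)})$, $\alpha \ne 0$. We have
$$y^\top L(x^{(2)}, \lambda^{(2)}) y = -12\alpha^2 < 0.$$
Therefore, by the SOSC, $x^{(2)} = [-3, -2]^\top$ is a strict local maximizer.
c. Note that $f(x^{(1)}) = 8$, while $f(x^{(2)}) = -16$. Therefore, $x^{(1)}$, although a strict local minimizer, is not a global minimizer. Likewise, $x^{(2)}$, although a strict local maximizer, is not a global maximizer.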
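The observation in part c, that the local minimum value exceeds the local maximum value, can be illustrated numerically. The sketch below is not part of the original solution; it assumes $f(x) = 2x_1 + 3x_2 - 4$ and the constraint $x_1 x_2 = 6$, and it samples a few feasible points near each critical point (the step sizes are arbitrary).

```python
def f(x1, x2):
    return 2 * x1 + 3 * x2 - 4


def on_constraint(x1):
    """Return the feasible point (x1, 6/x1) on the curve x1 * x2 = 6."""
    return (x1, 6.0 / x1)


f_min_candidate = f(3, 2)      # value at the strict local minimizer
f_max_candidate = f(-3, -2)    # value at the strict local maximizer

# Nearby feasible points on the same branch of the hyperbola.
steps = (-0.1, -0.01, 0.01, 0.1)
near_min = [f(*on_constraint(3 + d)) for d in steps]
near_max = [f(*on_constraint(-3 + d)) for d in steps]
```

All sampled values near $[3, 2]^\top$ exceed 8, and all sampled values near $[-3, -2]^\top$ fall below $-16$, even though $8 > -16$; the two points lie on different branches of the feasible set.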
20.9
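We observe that $f(x_1, x_2)$ is a ratio of two quadratic functions, that is, we can represent $f(x_1, x_2)$ as
$$f(x_1, x_2) = \frac{x^\top Q x}{x^\top P x}.$$
Therefore, if a point $x$ is a maximizer of $f(x_1, x_2)$ then so is any nonzero multiple of this point because
$$\frac{(tx)^\top Q (tx)}{(tx)^\top P (tx)} = \frac{t^2 x^\top Q x}{t^2 x^\top P x} = \frac{x^\top Q x}{x^\top P x}.$$
Thus any nonzero multiple of a solution is also a solution. To proceed, represent the original problem in an equivalent form,
$$\begin{aligned} \text{maximize} \quad & x^\top Q x = 18x_1^2 - 8x_1 x_2 + 12x_2^2 \\ \text{subject to} \quad & x^\top P x = 2x_1^2 + 2x_2^2 = 1. \end{aligned}$$
Thus, we wish to maximize $f(x_1, x_2) = 18x_1^2 - 8x_1 x_2 + 12x_2^2$ subject to the equality constraint $h(x_1, x_2) = 1 - 2x_1^2 - 2x_2^2 = 0$. We apply Lagrange's method to solve the problem. We form the Lagrangian function,
$$l(x, \lambda) = f + \lambda h,$$
compute its gradient and find critical points. We have
$$\nabla_x l = \nabla_x \left( x^\top \begin{bmatrix} 18 & -4 \\ -4 & 12 \end{bmatrix} x + \lambda \left( 1 - x^\top \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} x \right) \right) = 2 \begin{bmatrix} 18 & -4 \\ -4 & 12 \end{bmatrix} x - 2\lambda \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} x = 0.$$
We represent the above in an equivalent form,
$$\left( \lambda I_2 - \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}^{-1} \begin{bmatrix} 18 & -4 \\ -4 & 12 \end{bmatrix} \right) x = 0.$$
That is, solving the problem is being reduced to solving an eigenvalue-eigenvector problem,
$$\left( \lambda I_2 - \begin{bmatrix} 9 & -2 \\ -2 & 6 \end{bmatrix} \right) x = \begin{bmatrix} \lambda - 9 & 2 \\ 2 & \lambda - 6 \end{bmatrix} x = 0.$$
The characteristic polynomial is
$$\det \begin{bmatrix} \lambda - 9 & 2 \\ 2 & \lambda - 6 \end{bmatrix} = \lambda^2 - 15\lambda + 50 = (\lambda - 5)(\lambda - 10).$$
The eigenvalues are 5 and 10. Because we are interested in finding a maximizer, we conclude that the value of the maximized function is 10, while the corresponding maximizer corresponds to an appropriately scaled (to satisfy the constraint) eigenvector of this eigenvalue. An eigenvector can easily be found by taking any nonzero column of the adjoint matrix of
$$10 I_2 - \begin{bmatrix} 9 & -2 \\ -2 & 6 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}.$$
Performing simple manipulations gives
$$\operatorname{adj} \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} = \begin{bmatrix} 4 & -2 \\ -2 & 1 \end{bmatrix}.$$
Thus,
$$\sqrt{0.1} \begin{bmatrix} -2 \\ 1 \end{bmatrix}$$
is a maximizer for the equivalent problem. Any multiple of the above vector is a solution of the original maximization problem.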
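The eigenvalue computation can be reproduced in a few lines. The Python sketch below is an illustration under the assumption that the ratio is $(18x_1^2 - 8x_1 x_2 + 12x_2^2)/(2x_1^2 + 2x_2^2)$, so that the relevant matrix is $P^{-1}Q = \begin{bmatrix} 9 & -2 \\ -2 & 6 \end{bmatrix}$; the helper names are hypothetical.

```python
import math


def eig2_symmetric(a, b, d):
    """Eigenvalues of [[a, b], [b, d]], obtained from the characteristic polynomial."""
    tr, det = a + d, a * d - b * b
    disc = math.sqrt(tr * tr - 4 * det)
    return (tr - disc) / 2, (tr + disc) / 2


def ratio(x1, x2):
    """The assumed objective: a ratio of two quadratic forms."""
    num = 18 * x1 ** 2 - 8 * x1 * x2 + 12 * x2 ** 2
    den = 2 * x1 ** 2 + 2 * x2 ** 2
    return num / den


lam_min, lam_max = eig2_symmetric(9, -2, 6)
s = math.sqrt(0.1)
x_star = (-2 * s, 1 * s)  # scaled eigenvector for the larger eigenvalue
constraint_value = 2 * x_star[0] ** 2 + 2 * x_star[1] ** 2
best_ratio = ratio(*x_star)
```

The scaled eigenvector satisfies the normalization $2x_1^2 + 2x_2^2 = 1$ and attains the larger eigenvalue as the value of the ratio; rescaling `x_star` by any nonzero factor leaves `ratio` unchanged.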
20.10
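We use the technique of Example 20.8. First, we write the objective function in the form $x^\top Q x$, where
$$Q = Q^\top = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}.$$
The characteristic polynomial of $Q$ is $\lambda^2 - 6\lambda + 5$, and the eigenvalues of $Q$ are 1 and 5. The solutions to the problem are the unit length eigenvectors of $Q$ corresponding to the eigenvalue 5, which are $\pm[1, 1]^\top / \sqrt{2}$.
20.11
Consider the problem
$$\begin{aligned} \text{minimize} \quad & \|Ax\|^2 \\ \text{subject to} \quad & \|x\|^2 = 1. \end{aligned}$$
The optimal objective function value of this problem is the smallest value that $\|y\|^2$ can take. The above can be solved easily using Lagrange multipliers. The Lagrange conditions are
$$\begin{aligned} x^\top A^\top A - \lambda x^\top &= 0^\top \\ 1 - x^\top x &= 0. \end{aligned}$$
The first equation can be rewritten as $A^\top A x = \lambda x$, which implies that $\lambda$ is an eigenvalue of $A^\top A$. Moreover, premultiplying by $x^\top$ yields $x^\top A^\top A x = \lambda x^\top x = \lambda$, which indicates that the Lagrange multiplier is equal to the optimal objective function value. The same argument applied to the maximization problem gives the largest eigenvalue. Hence, the range of values that $\|y\| = \|Ax\|$ can take is 1 to $\sqrt{20}$.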
20.12
Consider the following optimization problem (we need to use squared norms to make the functions differentiable):
$$\begin{aligned} \text{minimize} \quad & -\|Ax\|^2 \\ \text{subject to} \quad & \|x\|^2 = 1. \end{aligned}$$
As usual, write $f(x) = -\|Ax\|^2$ and $h(x) = \|x\|^2 - 1$. We have $\nabla f(x) = -2A^\top A x$ and $\nabla h(x) = 2x$. Note that all feasible solutions are regular. Let $x^*$ be an optimal solution. Note that the optimal value of the objective function is $f(x^*) = -\|A\|^2$. The Lagrange condition for the above problem is:
$$\begin{aligned} -2A^\top A x^* + 2\lambda^* x^* &= 0 \\ \|x^*\|^2 &= 1. \end{aligned}$$
From the first equation, we see that
$$A^\top A x^* = \lambda^* x^*,$$
which implies that $\lambda^*$ is an eigenvalue of $A^\top A$, and $x^*$ is the corresponding eigenvector. Premultiplying the above equation by $x^{*\top}$ and combining the result with the constraint equation, we obtain
$$\lambda^* = x^{*\top} A^\top A x^* = \|Ax^*\|^2 = -f(x^*) = \|A\|^2.$$
Therefore, because $x^*$ minimizes $f(x)$, we deduce that $\lambda^*$ must be the largest eigenvalue of $A^\top A$; i.e., $\lambda^* = \lambda_1$. Therefore,
$$\|A\| = \sqrt{\lambda_1}.$$
20.13
Let $h(x) = 1 - x^\top P x = 0$. Let $x_0$ be such that $h(x_0) = 0$. Then, $x_0 \ne 0$. For $x_0$ to be a regular point, we need to show that $\{\nabla h(x_0)\}$ is a linearly independent set, i.e., $\nabla h(x_0) \ne 0$. Now, $\nabla h(x) = -2Px$. Since $P$ is nonsingular, and $x_0 \ne 0$, then $\nabla h(x_0) = -2Px_0 \ne 0$.
20.14
Note that the point $[1, 1]^\top$ is a regular point. Applying the Lagrange multiplier theorem gives
$$\begin{aligned} a + 2\lambda^* &= 0 \\ b + 2\lambda^* &= 0. \end{aligned}$$
Hence, $a = b$.
20.15
a. Denote the solution by $[x_1^*, x_2^*]^\top$. The Lagrange condition for this problem has the form
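$$\begin{aligned} x_2^* - 2 + 2\lambda^* x_1^* &= 0 \\ x_1^* - 2\lambda^* x_2^* &= 0 \\ (x_1^*)^2 - (x_2^*)^2 &= 0. \end{aligned}$$
From the first and third equations it follows that $x_1^*, x_2^* \ne 0$. Then, combining the first and second equations, we obtain
$$\lambda^* = \frac{2 - x_2^*}{2x_1^*} = \frac{x_1^*}{2x_2^*},$$
which implies that $2x_2^* - (x_2^*)^2 = (x_1^*)^2$. Hence, $x_2^* = 1$, and by the third Lagrange equation, $(x_1^*)^2 = 1$. Thus, the only two points satisfying the Lagrange condition are $[1, 1]^\top$ and $[-1, 1]^\top$. Note that both points are regular.
b. Consider the point $x^* = [-1, 1]^\top$. The corresponding Lagrange multiplier is $\lambda^* = -1/2$. The Hessian of the Lagrangian is
$$L(x^*, \lambda^*) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} - \frac{1}{2} \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix} = \begin{bmatrix} -1 & 1 \\ 1 & 1 \end{bmatrix}.$$
The tangent plane is given by
$$T(x^*) = \{ y : [-2, -2] y = 0 \} = \{ [a, -a]^\top : a \in \mathbb{R} \}.$$
Let $y \in T(x^*)$, $y \ne 0$. Then, $y = [a, -a]^\top$ for some $a \ne 0$. We have $y^\top L(x^*, \lambda^*) y = -2a^2 < 0$. Hence, SONC does not hold in this case, and therefore $x^* = [-1, 1]^\top$ cannot be a local minimizer. In fact, the point is a strict local maximizer.
c. Consider the point $x^* = [1, 1]^\top$. The corresponding Lagrange multiplier is $\lambda^* = 1/2$. The Hessian of the Lagrangian is
$$L(x^*, \lambda^*) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + \frac{1}{2} \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}.$$
The tangent plane is given by
$$T(x^*) = \{ y : [2, -2] y = 0 \} = \{ [a, a]^\top : a \in \mathbb{R} \}.$$
Let $y \in T(x^*)$, $y \ne 0$. Then, $y = [a, a]^\top$ for some $a \ne 0$. We have $y^\top L(x^*, \lambda^*) y = 2a^2 > 0$. Hence, by the SOSC, the point $x^* = [1, 1]^\top$ is a strict local minimizer.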
20.16
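a. The point $x^*$ is the solution to the optimization problem
$$\begin{aligned} \text{minimize} \quad & \tfrac{1}{2}\|x - x_0\|^2 \\ \text{subject to} \quad & Ax = 0. \end{aligned}$$
Since $\operatorname{rank} A = m$, any feasible point is regular. By the Lagrange multiplier theorem, there exists $\lambda^* \in \mathbb{R}^m$ such that
$$(x^* - x_0)^\top + \lambda^{*\top} A = 0^\top.$$
Postmultiplying both sides by $x^*$ and using the fact that $Ax^* = 0$, we get
$$(x^* - x_0)^\top x^* = 0.$$
b. From part a, we have
$$x^* - x_0 = -A^\top \lambda^*.$$
Premultiplying both sides by $A$ (and using $Ax^* = 0$) we get
$$Ax_0 = AA^\top \lambda^*,$$
from which we conclude that $\lambda^* = (AA^\top)^{-1} A x_0$. Hence,
$$x^* = x_0 - A^\top \lambda^* = x_0 - A^\top (AA^\top)^{-1} A x_0 = (I_n - A^\top (AA^\top)^{-1} A) x_0.$$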
20.17
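a. The Lagrange condition is (omitting all superscript $*$ for convenience):
$$\begin{aligned} (Ax - b)^\top A + \lambda^\top C &= 0^\top \\ Cx &= d. \end{aligned}$$
For simplicity, write $Q = A^\top A$, which is positive definite. From the first equation, we have
$$x = Q^{-1} A^\top b - Q^{-1} C^\top \lambda.$$
Multiplying both sides by $C$ and using the second equation, we have
$$d = C Q^{-1} A^\top b - C Q^{-1} C^\top \lambda,$$
from which we obtain
$$\lambda = (C Q^{-1} C^\top)^{-1} (C Q^{-1} A^\top b - d).$$
Substituting back into the equation for $x$, we obtain
$$x = Q^{-1} A^\top b - Q^{-1} C^\top (C Q^{-1} C^\top)^{-1} (C Q^{-1} A^\top b - d).$$
b. Rewrite the objective function as
$$\tfrac{1}{2} x^\top A^\top A x - b^\top A x + \tfrac{1}{2} \|b\|^2.$$
As before, write $Q = A^\top A$. Completing the squares and setting $y = x - Q^{-1} A^\top b$, the objective function can be written as
$$\tfrac{1}{2} y^\top Q y + \text{const}.$$
Hence, the problem can be converted to the equivalent QP:
$$\begin{aligned} \text{minimize} \quad & \tfrac{1}{2} y^\top Q y \\ \text{subject to} \quad & Cy = d - C Q^{-1} A^\top b. \end{aligned}$$
The solution to this QP is
$$y^* = Q^{-1} C^\top (C Q^{-1} C^\top)^{-1} (d - C Q^{-1} A^\top b).$$
Hence, the solution to the original problem is:
$$\begin{aligned} x^* &= Q^{-1} A^\top b + Q^{-1} C^\top (C Q^{-1} C^\top)^{-1} (d - C Q^{-1} A^\top b) \\ &= (A^\top A)^{-1} A^\top b + (A^\top A)^{-1} C^\top (C (A^\top A)^{-1} C^\top)^{-1} (d - C (A^\top A)^{-1} A^\top b), \end{aligned}$$
which agrees with the solution obtained in part a.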
20.18
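Write $f(x) = \tfrac{1}{2} x^\top Q x - c^\top x + d$ (actually, we could have ignored $d$) and $h(x) = b - Ax$. We have
$$Df(x) = x^\top Q - c^\top, \qquad Dh(x) = -A.$$
The Lagrange condition is
$$\begin{aligned} x^{*\top} Q - c^\top - \lambda^{*\top} A &= 0^\top \\ b - Ax^* &= 0. \end{aligned}$$
From the first equation we get
$$x^* = Q^{-1} (A^\top \lambda^* + c).$$
Multiplying both sides by $A$ and using the second equation (constraint), we get
$$A Q^{-1} A^\top \lambda^* + A Q^{-1} c = b.$$
Since $Q > 0$ and $A$ is of full rank, we can write
$$\lambda^* = (A Q^{-1} A^\top)^{-1} (b - A Q^{-1} c).$$
Hence,
$$x^* = Q^{-1} c + Q^{-1} A^\top (A Q^{-1} A^\top)^{-1} (b - A Q^{-1} c).$$
Alternatively, we could have rewritten the given problem in our usual quadratic programming form with variable $y = x - Q^{-1} c$.
20.19
Clearly, we have $M = \mathcal{R}(B)$, i.e., $y \in M$ if and only if there exists $x \in \mathbb{R}^m$ such that $y = Bx$. Hence
$$\begin{aligned} L \text{ is positive semidefinite on } M \ &\iff \ \text{for all } y \in M, \ y^\top L y \ge 0 \\ &\iff \ \text{for all } x \in \mathbb{R}^m, \ (Bx)^\top L (Bx) \ge 0 \\ &\iff \ \text{for all } x \in \mathbb{R}^m, \ x^\top (B^\top L B) x \ge 0 \\ &\iff \ \text{for all } x \in \mathbb{R}^m, \ x^\top L_M x \ge 0 \\ &\iff \ L_M \ge 0. \end{aligned}$$
For positive definiteness, the same argument applies, with $\ge$ replaced by $>$.
20.20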
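a. By simple manipulations, we can write
$$x_2 = a^2 x_0 + ab u_0 + b u_1.$$
Therefore, the problem is
$$\begin{aligned} \text{minimize} \quad & \tfrac{1}{2}(u_0^2 + u_1^2) \\ \text{subject to} \quad & a^2 x_0 + ab u_0 + b u_1 = 0. \end{aligned}$$
Alternatively, we may use a vector notation: writing $u = [u_0, u_1]^\top$, we have
$$\begin{aligned} \text{minimize} \quad & f(u) \\ \text{subject to} \quad & h(u) = 0, \end{aligned}$$
where $f(u) = \tfrac{1}{2}\|u\|^2$, and $h(u) = a^2 x_0 + [ab, b] u$. Since the vector $\nabla h(u) = [ab, b]^\top$ is nonzero for any $u$, then any feasible point is regular. Therefore, by the Lagrange multiplier theorem, there exists $\lambda^* \in \mathbb{R}$ such that
$$\begin{aligned} u_0^* + \lambda^* ab &= 0 \\ u_1^* + \lambda^* b &= 0 \\ a^2 x_0 + ab u_0^* + b u_1^* &= 0. \end{aligned}$$
We have three linear equations in three unknowns, that upon solving yields
$$u_0^* = -\frac{a^3 x_0}{b(1 + a^2)}, \qquad u_1^* = -\frac{a^2 x_0}{b(1 + a^2)}.$$
b. The Hessians of $f$ and $h$ are $F(u) = I_2$ ($2 \times 2$ identity matrix) and $H(u) = O$, respectively. Hence, the Hessian of the Lagrangian is $L(u^*, \lambda^*) = I_2$, which is positive definite. Therefore, $u^*$ satisfies the SOSC, and is therefore a strict local minimizer.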
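The closed-form controls can be checked with exact rational arithmetic. The Python sketch below is an illustration, not part of the original solution: it assumes the problem of minimizing $\tfrac{1}{2}(u_0^2 + u_1^2)$ subject to $a^2 x_0 + ab u_0 + b u_1 = 0$, it also returns the multiplier $\lambda^* = a^2 x_0 / (b^2(1 + a^2))$ obtained by solving the same three linear equations, and the sample data are hypothetical.

```python
from fractions import Fraction


def optimal_controls(a, b, x0):
    """Closed-form (u0, u1, lam) for min (u0^2 + u1^2)/2 s.t. a^2 x0 + a b u0 + b u1 = 0."""
    a, b, x0 = Fraction(a), Fraction(b), Fraction(x0)
    denom = b * (1 + a * a)
    u0 = -a ** 3 * x0 / denom
    u1 = -a ** 2 * x0 / denom
    lam = a ** 2 * x0 / (b * b * (1 + a * a))
    return u0, u1, lam


a, b, x0 = 2, 3, 5  # hypothetical sample data
u0, u1, lam = optimal_controls(a, b, x0)
x2 = a * a * x0 + a * b * u0 + b * u1  # final state, which the constraint drives to zero
```

For these sample values the controls are $u_0 = -8/3$ and $u_1 = -4/3$, the final state is exactly zero, and both stationarity equations hold.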
20.21
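Letting $z = [x_2, u_1, u_2]^\top$, the objective function is $z^\top Q z$, where
$$Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 1/3 \end{bmatrix}.$$
The linear constraint on $z$ is obtained by writing
$$x_2 = 2x_1 + u_2 = 2(2 + u_1) + u_2,$$
which can be written as $Az = b$, where
$$A = [1, -2, -1], \qquad b = 4.$$
Hence, using the method of Section 20.6, the solution is
$$z^* = Q^{-1} A^\top (A Q^{-1} A^\top)^{-1} b = \begin{bmatrix} 1 \\ -4 \\ -3 \end{bmatrix} (12)^{-1} 4 = \begin{bmatrix} 1/3 \\ -4/3 \\ -1 \end{bmatrix}.$$
Thus, the optimal controls are $u_1^* = -4/3$ and $u_2^* = -1$.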
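The formula $z^* = Q^{-1} A^\top (A Q^{-1} A^\top)^{-1} b$ is easy to evaluate exactly when $Q$ is diagonal and there is a single constraint. The following Python sketch is an illustration under the assumption that $Q = \operatorname{diag}(1, 1/2, 1/3)$, $A = [1, -2, -1]$, and $b = 4$; the function name is hypothetical.

```python
from fractions import Fraction


def min_quadratic_diag(q_diag, a_row, b):
    """Minimize z^T Q z subject to a_row . z = b, for diagonal Q > 0 and one constraint."""
    # Q^{-1} A^T, computed entrywise because Q is diagonal.
    q_inv_a = [Fraction(a) / Fraction(q) for a, q in zip(a_row, q_diag)]
    # A Q^{-1} A^T is a scalar when A has a single row.
    s = sum(Fraction(a) * v for a, v in zip(a_row, q_inv_a))
    return [v * Fraction(b) / s for v in q_inv_a]


A_ROW = [1, -2, -1]
z_star = min_quadratic_diag([1, Fraction(1, 2), Fraction(1, 3)], A_ROW, 4)
constraint_value = sum(a * z for a, z in zip(A_ROW, z_star))
```

The result is $[1/3, -4/3, -1]^\top$, and it satisfies $Az = 4$ exactly.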

20.22
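The composite input vector is
$$u = [u_0, u_1, u_2]^\top.$$
The performance index $J$ is $J = \tfrac{1}{2} u^\top u$. To obtain the constraint $Au = b$, where $A \in \mathbb{R}^{1 \times 3}$, we proceed as follows. First, we write
$$x_2 = x_1 + 2u_1 = x_0 + 2u_0 + 2u_1.$$
Using the above, we obtain
$$x_3 = 9 = x_2 + 2u_2 = x_0 + 2u_0 + 2u_1 + 2u_2.$$
We represent the above in the format $Au = b$ as follows:
$$\begin{bmatrix} 2 & 2 & 2 \end{bmatrix} \begin{bmatrix} u_0 \\ u_1 \\ u_2 \end{bmatrix} = 6.$$
Thus we formulated the problem of finding the optimal control sequence as a constrained optimization problem
$$\begin{aligned} \text{minimize} \quad & \tfrac{1}{2} u^\top u \\ \text{subject to} \quad & Au = b. \end{aligned}$$
To solve the above problem, we form the Lagrangian
$$l(u, \lambda) = \tfrac{1}{2} u^\top u + \lambda (Au - b),$$
where $\lambda$ is the Lagrange multiplier. Applying the Lagrange first-order condition yields
$$u^\top + \lambda A = 0^\top \quad \text{and} \quad Au = b.$$
From the first of the above conditions, we calculate $u = -\lambda A^\top$. Substituting the above into the second of the Lagrange conditions gives
$$\lambda = -(AA^\top)^{-1} b.$$
Combining the last two equations, we obtain a closed-form formula for the optimal input sequence
$$u = A^\top (AA^\top)^{-1} b.$$
In our problem,
$$u = \begin{bmatrix} u_0 \\ u_1 \\ u_2 \end{bmatrix} = A^\top (AA^\top)^{-1} b = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.$$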
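The minimum-norm formula $u = A^\top (AA^\top)^{-1} b$ can be evaluated and then verified by simulating the state recursion. The Python sketch below is an illustration; it assumes the recursion $x_{k+1} = x_k + 2u_k$, the target $x_3 = 9$, and an initial state $x_0 = 3$ (inferred from the right-hand side $b = 9 - x_0 = 6$, not stated explicitly in the solution).

```python
from fractions import Fraction


def min_norm_input(a_row, b):
    """Return u = A^T (A A^T)^{-1} b for a single-row A, so that A A^T is a scalar."""
    aat = sum(Fraction(a) * a for a in a_row)
    return [Fraction(a) * b / aat for a in a_row]


def simulate(x0, u, gain=2):
    """Run x_{k+1} = x_k + gain * u_k over the input sequence and return the final state."""
    x = Fraction(x0)
    for uk in u:
        x = x + gain * uk
    return x


u_star = min_norm_input([2, 2, 2], 6)
x3 = simulate(3, u_star)  # x0 = 3 is an assumption, see above
```

The computed sequence is $[1, 1, 1]^\top$, and the simulated final state equals 9.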
21. Problems With Inequality Constraints

21.1
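a. We form the Lagrangian function,
$$l(x, \mu) = x_1^2 + 4x_2^2 + \mu(4 - x_1^2 - 2x_2^2).$$
The KKT conditions take the form,
$$\begin{aligned} D_x l(x, \mu) = [2x_1 - 2\mu x_1, \ 8x_2 - 4\mu x_2] &= 0^\top \\ \mu(4 - x_1^2 - 2x_2^2) &= 0 \\ \mu &\ge 0 \\ 4 - x_1^2 - 2x_2^2 &\le 0. \end{aligned}$$
From the first of the above equalities, we obtain
$$(1 - \mu) x_1 = 0, \qquad (2 - \mu) x_2 = 0.$$
We first consider the case when $\mu = 0$. Then, we obtain the point $x^{(1)} = 0$, which does not satisfy the constraints.
The next case is when $\mu = 1$. Then we have to have $x_2 = 0$, and using $\mu(4 - x_1^2 - 2x_2^2) = 0$ gives
$$x^{(2)} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} \quad \text{and} \quad x^{(3)} = \begin{bmatrix} -2 \\ 0 \end{bmatrix}.$$
For the case when $\mu = 2$, we have to have $x_1 = 0$ and we get
$$x^{(4)} = \begin{bmatrix} 0 \\ \sqrt{2} \end{bmatrix} \quad \text{and} \quad x^{(5)} = \begin{bmatrix} 0 \\ -\sqrt{2} \end{bmatrix}.$$
b. The Hessian of $l$ is
$$L = \begin{bmatrix} 2 & 0 \\ 0 & 8 \end{bmatrix} + \mu \begin{bmatrix} -2 & 0 \\ 0 & -4 \end{bmatrix}.$$
When $\mu = 1$,
$$L = \begin{bmatrix} 0 & 0 \\ 0 & 4 \end{bmatrix}.$$
We next find the subspace
$$\tilde{T} = T = \{ y : [\mp 4, 0] y = 0 \} = \{ y = a[0, 1]^\top : a \in \mathbb{R} \}.$$
We then check for positive definiteness of $L$ on $\tilde{T}$,
$$y^\top L y = a^2 [0, 1] \begin{bmatrix} 0 & 0 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = 4a^2 > 0.$$
Hence, $x^{(2)}$ and $x^{(3)}$ satisfy the SOSC to be strict local minimizers.
When $\mu = 2$,
$$L = \begin{bmatrix} -2 & 0 \\ 0 & 0 \end{bmatrix},$$
and
$$\tilde{T} = \{ y = a[1, 0]^\top : a \in \mathbb{R} \}.$$
We have
$$y^\top L y = -2a^2 < 0.$$
Thus, $x^{(4)}$ and $x^{(5)}$ do not satisfy the SONC to be minimizers.
In summary, only $x^{(2)}$ and $x^{(3)}$ are strict local minimizers.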
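The four KKT candidates can be checked by substituting them into the stationarity and complementarity conditions. The Python sketch below is an illustration; it assumes the Lagrangian $l(x, \mu) = x_1^2 + 4x_2^2 + \mu(4 - x_1^2 - 2x_2^2)$, that is, the objective $x_1^2 + 4x_2^2$ with the constraint $x_1^2 + 2x_2^2 \ge 4$.

```python
import math


def kkt_residual(x1, x2, mu):
    """Stationarity and complementarity residuals for l = x1^2 + 4 x2^2 + mu (4 - x1^2 - 2 x2^2)."""
    g = 4 - x1 ** 2 - 2 * x2 ** 2
    return (2 * x1 - 2 * mu * x1, 8 * x2 - 4 * mu * x2, mu * g)


def f(x1, x2):
    return x1 ** 2 + 4 * x2 ** 2


# (x1, x2, mu) for the candidates with an active constraint.
points = {
    "x2": (2.0, 0.0, 1.0),
    "x3": (-2.0, 0.0, 1.0),
    "x4": (0.0, math.sqrt(2), 2.0),
    "x5": (0.0, -math.sqrt(2), 2.0),
}
worst = max(abs(r) for p in points.values() for r in kkt_residual(*p))
values = {name: f(x1, x2) for name, (x1, x2, _) in points.items()}
```

All residuals vanish up to rounding. The objective equals 4 at the two points with $\mu = 1$ and 8 at the two points with $\mu = 2$, which is consistent with only the former being minimizers.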
21.2
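a. We first find critical points by applying the Karush-Kuhn-Tucker conditions, which are
$$\begin{aligned} 2x_1 - 2 - 2\mu_1 x_1 + 5\mu_2 &= 0 \\ 2x_2 - 10 + \tfrac{1}{5}\mu_1 + \tfrac{1}{2}\mu_2 &= 0 \\ \mu_1 \left( \tfrac{1}{5} x_2 - x_1^2 \right) + \mu_2 \left( 5x_1 + \tfrac{1}{2} x_2 - 5 \right) &= 0 \\ \mu &\ge 0. \end{aligned}$$
We have to check four possible combinations.
Case 1: ($\mu_1 = 0$, $\mu_2 = 0$) Solving the first and second Karush-Kuhn-Tucker equations yields $x^{(1)} = [1, 5]^\top$. However, this point is not feasible and is therefore not a candidate minimizer.
Case 2: ($\mu_1 > 0$, $\mu_2 = 0$) We have two possible solutions:
$$\begin{aligned} x^{(2)} &= [-0.98, 4.8]^\top, & \mu_1^{(2)} &= 2.02 \\ x^{(3)} &= [-0.02, 0]^\top, & \mu_1^{(3)} &= 50. \end{aligned}$$
Both $x^{(2)}$ and $x^{(3)}$ satisfy the constraints, and are therefore candidate minimizers.
Case 3: ($\mu_1 = 0$, $\mu_2 > 0$) Solving the corresponding equation yields:
$$x^{(4)} = [0.5050, 4.9505]^\top, \qquad \mu_2^{(4)} = 0.198.$$
The point $x^{(4)}$ is not feasible, and hence is not a candidate minimizer.
Case 4: ($\mu_1 \ne 0$, $\mu_2 \ne 0$) We have two solutions:
$$\begin{aligned} x^{(5)} &= [0.732, 2.679]^\top, & \mu^{(5)} &= [13.246, 3.986]^\top \\ x^{(6)} &= [-2.73, 37.32]^\top, & \mu^{(6)} &= [188.8, -204]^\top. \end{aligned}$$
The point $x^{(5)}$ satisfies the KKT conditions, but $x^{(6)}$ does not, because $\mu_2^{(6)} < 0$.
We are left with three candidate minimizers: $x^{(2)}$, $x^{(3)}$, and $x^{(5)}$. It is easy to check that they are regular. We now check if each satisfies the second-order conditions. For this, we compute
$$L(x, \mu) = \begin{bmatrix} 2 - 2\mu_1 & 0 \\ 0 & 2 \end{bmatrix}.$$
For $x^{(2)}$, we have
$$L(x^{(2)}, \mu^{(2)}) = \begin{bmatrix} -2.04 & 0 \\ 0 & 2 \end{bmatrix}, \qquad \tilde{T}(x^{(2)}) = \{ a[-0.1021, 1]^\top : a \in \mathbb{R} \}.$$
Let $y = a[-0.1021, 1]^\top \in \tilde{T}(x^{(2)})$ with $a \ne 0$. Then
$$y^\top L(x^{(2)}, \mu^{(2)}) y = 1.979a^2 > 0.$$
Thus, by the SOSC, $x^{(2)}$ is a strict local minimizer. For $x^{(3)}$, we have
$$L(x^{(3)}, \mu^{(3)}) = \begin{bmatrix} -97.958 & 0 \\ 0 & 2 \end{bmatrix}, \qquad \tilde{T}(x^{(3)}) = \{ a[-4.898, 1]^\top : a \in \mathbb{R} \}.$$
Let $y = a[-4.898, 1]^\top \in \tilde{T}(x^{(3)})$ with $a \ne 0$. Then,
$$y^\top L(x^{(3)}, \mu^{(3)}) y = -2347.9a^2 < 0.$$
Thus, $x^{(3)}$ does not satisfy the SOSC. In fact, in this case, we have $T(x^{(3)}) = \tilde{T}(x^{(3)})$, and hence $x^{(3)}$ does not satisfy the SONC either. We conclude that $x^{(3)}$ is not a local minimizer. We can easily check that $x^{(3)}$ is not a local maximizer either.
For $x^{(5)}$, we have
$$L(x^{(5)}, \mu^{(5)}) = \begin{bmatrix} -24.4919 & 0 \\ 0 & 2 \end{bmatrix}, \qquad \tilde{T}(x^{(5)}) = \{ 0 \}.$$
The SOSC is trivially satisfied, and therefore $x^{(5)}$ is a strict local minimizer.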
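The two Case 2 candidates of part a can be recomputed directly. The Python sketch below is an illustration under stated assumptions: with $\mu_2 = 0$ and the first constraint active ($x_2 = 5x_1^2$), the stationarity equations $2x_1 - 2 - 2\mu_1 x_1 = 0$ and $2x_2 - 10 + \mu_1/5 = 0$ give $\mu_1 = 50(1 - x_1^2)$ and, after discarding the root $x_1 = 1$, the quadratic $100x_1^2 + 100x_1 + 2 = 0$.

```python
import math


def case2_candidates():
    """Roots of 100 x1^2 + 100 x1 + 2 = 0, with x2 = 5 x1^2 and mu1 = 50 (1 - x1^2)."""
    disc = math.sqrt(100 ** 2 - 4 * 100 * 2)
    roots = [(-100 - disc) / 200, (-100 + disc) / 200]
    return [(x1, 5 * x1 ** 2, 50 * (1 - x1 ** 2)) for x1 in roots]


def stationarity(x1, x2, mu1, mu2=0.0):
    """Residuals of the two stationarity equations of the KKT system."""
    return (2 * x1 - 2 - 2 * mu1 * x1 + 5 * mu2,
            2 * x2 - 10 + mu1 / 5 + mu2 / 2)


candidates = case2_candidates()
residuals = [stationarity(*c) for c in candidates]
```

Rounded to two decimals, the first candidate has $x_1 = -0.98$ with $\mu_1 = 2.02$ and the second has $x_1 = -0.02$ with $\mu_1 \approx 50$, in agreement with the values quoted for this case.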
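b. The Karush-Kuhn-Tucker conditions are:
$$\begin{aligned} 2x_1 - \mu_1 - \mu_3 &= 0 \\ 2x_2 - \mu_2 - \mu_3 &= 0 \\ x_1 &\ge 0 \\ x_2 &\ge 0 \\ x_1 + x_2 - 5 &\ge 0 \\ \mu_1 x_1 + \mu_2 x_2 + \mu_3 (x_1 + x_2 - 5) &= 0 \\ \mu_1, \mu_2, \mu_3 &\ge 0. \end{aligned}$$
It is easy to verify that the only combination of Karush-Kuhn-Tucker multipliers resulting in a feasible point is $\mu_1 = \mu_2 = 0$, $\mu_3 > 0$. For this case, we obtain $x^* = [2.5, 2.5]^\top$, $\mu^* = [0, 0, 5]^\top$. We have
$$L(x^*, \mu^*) = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} > 0.$$
Hence, $x^*$ is a strict local minimizer (in fact, the only one for this problem).
c. The Karush-Kuhn-Tucker conditions are:
$$\begin{aligned} 2x_1 + 6x_2 - 4 + 2\mu_1 x_1 + 2\mu_2 &= 0 \\ 6x_1 - 2 + 2\mu_1 - 2\mu_2 &= 0 \\ x_1^2 + 2x_2 - 1 &\le 0 \\ 2x_1 - 2x_2 - 1 &\le 0 \\ \mu_1 (x_1^2 + 2x_2 - 1) + \mu_2 (2x_1 - 2x_2 - 1) &= 0 \\ \mu_1, \mu_2 &\ge 0. \end{aligned}$$
It is easy to verify that the only combination of Karush-Kuhn-Tucker multipliers resulting in a feasible point is $\mu_1 = 0$, $\mu_2 > 0$. For this case, we obtain $x^* = [9/14, 2/14]^\top$, $\mu^* = [0, 13/14]^\top$. We have
$$L(x^*, \mu^*) = \begin{bmatrix} 2 & 6 \\ 6 & 0 \end{bmatrix}, \qquad \tilde{T}(x^*) = \{ a[1, 1]^\top : a \in \mathbb{R} \}.$$
Let $y = a[1, 1]^\top \in \tilde{T}(x^*)$ with $a \ne 0$. Then
$$y^\top L(x^*, \mu^*) y = 14a^2 > 0.$$
Hence, $x^*$ is a strict local minimizer (in fact, the only one for this problem).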

21.3
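The Karush-Kuhn-Tucker conditions are:
$$\begin{aligned} 2x_1 + \lambda(2x_1 + 2x_2) + 2\mu x_1 &= 0 \\ 2x_2 + \lambda(2x_1 + 2x_2) - \mu &= 0 \\ x_1^2 + 2x_1 x_2 + x_2^2 - 1 &= 0 \\ x_1^2 - x_2 &\le 0 \\ \mu(x_1^2 - x_2) &= 0 \\ \mu &\ge 0. \end{aligned}$$
We have two cases to consider.
Case 1: ($\mu > 0$) Substituting $x_2 = x_1^2$ into the third equation and combining the result with the first two yields two possible points:
$$\begin{aligned} x^{(1)} &= [-1.618, 2.618]^\top, & \mu^{(1)} &= -3.7889 \\ x^{(2)} &= [0.618, 0.382]^\top, & \mu^{(2)} &= -0.2111. \end{aligned}$$
Note that the resulting $\mu$ values violate the condition $\mu > 0$. Hence, neither of the points are minimizers (although they are candidates for maximizers).
Case 2: ($\mu = 0$) Subtracting the second equation from the first yields $x_1 = x_2$, which upon substituting into the third equation gives two possible points:
$$x^{(3)} = [1/2, 1/2]^\top, \qquad x^{(4)} = [-1/2, -1/2]^\top.$$
Note that $x^{(4)}$ is not a feasible point, and is therefore not a candidate minimizer.
Therefore, the only remaining candidate is $x^{(3)}$, with corresponding $\lambda^{(3)} = -1/2$ (and $\mu = 0$). We now check second-order conditions. We have
$$L(x^{(3)}, 0, \lambda^{(3)}) = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}, \qquad \tilde{T}(x^{(3)}) = \{ a[1, -1]^\top : a \in \mathbb{R} \}.$$
Let $y = a[1, -1]^\top \in \tilde{T}(x^{(3)})$ with $a \ne 0$. Then
$$y^\top L(x^{(3)}, 0, \lambda^{(3)}) y = 4a^2 > 0.$$
Therefore, by the SOSC, $x^{(3)}$ is a strict local minimizer.
21.4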
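The optimization problem is:
$$\begin{aligned} \text{minimize} \quad & e_n^\top p \\ \text{subject to} \quad & Gp \ge P e_m \\ & p \ge 0, \end{aligned}$$
where $G = [g_{i,j}]$, $e_n = [1, \ldots, 1]^\top$ (with $n$ components), and $p = [p_1, \ldots, p_n]^\top$. The KKT condition for this problem is:
$$\begin{aligned} e_n^\top - \mu_1^\top G - \mu_2^\top &= 0^\top \\ \mu_1^\top (P e_m - Gp) - \mu_2^\top p &= 0 \\ Gp &\ge P e_m \\ \mu_1, \mu_2, p &\ge 0. \end{aligned}$$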

21.5
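a. We have $f(x) = x_2 - (x_1 - 2)^3 + 3$ and $g(x) = 1 - x_2$. Hence, $\nabla f(x) = [-3(x_1 - 2)^2, 1]^\top$ and $\nabla g(x) = [0, -1]^\top$. The KKT condition is
$$\begin{aligned} \mu &\ge 0 \\ -3(x_1 - 2)^2 &= 0 \\ 1 - \mu &= 0 \\ \mu(1 - x_2) &= 0 \\ 1 - x_2 &\le 0. \end{aligned}$$
The only solution to the above conditions is $x^* = [2, 1]^\top$, $\mu^* = 1$.
To check if $x^*$ is regular, we note that the constraint is active. We have $\nabla g(x^*) = [0, -1]^\top$, which is nonzero. Hence, $x^*$ is regular.
b. We have
$$L(x^*, \mu^*) = F(x^*) + \mu^* G(x^*) = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.$$
Hence, the point $x^*$ satisfies the SONC.
c. Since $\mu^* > 0$, we have $\tilde{T}(x^*, \mu^*) = T(x^*) = \{ y : [0, -1] y = 0 \} = \{ y : y_2 = 0 \}$, which means that $\tilde{T}$ contains nonzero vectors. Hence, the SOSC does not hold at $x^*$.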
21.6
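a. Write $f(x) = x_2$, $g(x) = -x_2 - (x_1 - 1)^2 + 3$. We have $\nabla f(x) = [0, 1]^\top$ and $\nabla g(x) = [-2(x_1 - 1), -1]^\top$. The KKT conditions are:
$$\begin{aligned} \mu &\ge 0 \\ -2\mu(x_1 - 1) &= 0 \\ 1 - \mu &= 0 \\ \mu(-x_2 - (x_1 - 1)^2 + 3) &= 0 \\ -x_2 - (x_1 - 1)^2 + 3 &\le 0. \end{aligned}$$
From the third equation we get $\mu^* = 1$. The second equation then gives $x_1^* = 1$, and the fourth equation gives $x_2^* = 3$. Therefore, the only point that satisfies the KKT condition is $x^* = [1, 3]^\top$, with a KKT multiplier of $\mu^* = 1$.
b. Note that the constraint $-x_2 - (x_1 - 1)^2 + 3 \le 0$ is active at $x^*$. We have $T(x^*) = \{ y : [0, -1] y = 0 \} = \{ y : y_2 = 0 \}$, and $N(x^*) = \{ y : y = [0, -1]^\top z, \ z \in \mathbb{R} \} = \{ y : y_1 = 0 \}$. Because $\mu^* > 0$, we have $\tilde{T}(x^*, \mu^*) = T(x^*) = \{ y : y_2 = 0 \}$.
c. We have
$$L(x^*, \mu^*) = O + 1 \cdot \begin{bmatrix} -2 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} -2 & 0 \\ 0 & 0 \end{bmatrix}.$$
From part b, $T(x^*) = \{ y : y_2 = 0 \}$. Therefore, for any nonzero $y \in T(x^*)$, $y^\top L(x^*, \mu^*) y = -2y_1^2 < 0$, which means that $x^*$ does not satisfy the SONC.
21.7
a. We need to consider two optimization problems. We first consider the minimization problem
$$\begin{aligned} \text{minimize} \quad & (x_1 - 2)^2 + (x_2 - 1)^2 \\ \text{subject to} \quad & x_1^2 - x_2 \le 0 \\ & x_1 + x_2 - 2 \le 0 \\ & -x_1 \le 0. \end{aligned}$$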

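Then, we form the Lagrangian function
$$l(x, \mu) = (x_1 - 2)^2 + (x_2 - 1)^2 + \mu_1 (x_1^2 - x_2) + \mu_2 (x_1 + x_2 - 2) + \mu_3 (-x_1).$$
The KKT condition takes the form
$$\begin{aligned} \nabla_x l(x, \mu)^\top = [2(x_1 - 2) + 2\mu_1 x_1 + \mu_2 - \mu_3, \ 2(x_2 - 1) - \mu_1 + \mu_2] &= 0^\top \\ \mu_1 (x_1^2 - x_2) + \mu_2 (x_1 + x_2 - 2) + \mu_3 (-x_1) &= 0 \\ \mu_i &\ge 0. \end{aligned}$$
The point $x^* = 0$ satisfies the first two conditions only for $\mu_1^* = -2$, $\mu_2^* = 0$, and $\mu_3^* = -4$, which violates $\mu_i \ge 0$. Thus the point $x^*$ does not satisfy the KKT conditions for minimum.
We next consider the maximization problem
$$\begin{aligned} \text{minimize} \quad & -(x_1 - 2)^2 - (x_2 - 1)^2 \\ \text{subject to} \quad & x_1^2 - x_2 \le 0 \\ & x_1 + x_2 - 2 \le 0 \\ & -x_1 \le 0. \end{aligned}$$
The Lagrangian function for the above problem is,
$$l(x, \mu) = -(x_1 - 2)^2 - (x_2 - 1)^2 + \mu_1 (x_1^2 - x_2) + \mu_2 (x_1 + x_2 - 2) + \mu_3 (-x_1).$$
The KKT condition takes the form
$$\begin{aligned} \nabla_x l(x, \mu)^\top = [-2(x_1 - 2) + 2\mu_1 x_1 + \mu_2 - \mu_3, \ -2(x_2 - 1) - \mu_1 + \mu_2] &= 0^\top \\ \mu_1 (x_1^2 - x_2) + \mu_2 (x_1 + x_2 - 2) + \mu_3 (-x_1) &= 0 \\ \mu_i &\ge 0. \end{aligned}$$
The point $x^* = 0$ satisfies the above conditions for $\mu_1^* = 2$, $\mu_2^* = 0$, and $\mu_3^* = 4$. Hence, the point $x^*$ satisfies the KKT conditions for maximum.
b. We next compute the Hessian, with respect to $x$, of the Lagrangian to obtain
$$L = F + \mu_1^* G_1 = \begin{bmatrix} -2 & 0 \\ 0 & -2 \end{bmatrix} + \begin{bmatrix} 4 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix},$$
which is indefinite on $\mathbb{R}^2$. We next find the subspace
$$\tilde{T} = \left\{ y : \begin{bmatrix} \nabla g_1(x^*)^\top \\ \nabla g_3(x^*)^\top \end{bmatrix} y = 0 \right\} = \left\{ y : \begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix} y = 0 \right\} = \{ 0 \}.$$
That is, $\tilde{T}$ is a trivial subspace that consists only of the zero vector. Thus the SOSC for $x^*$ to be a strict local maximizer is trivially satisfied.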
21.8
a. Write $h(x) = x_1 - x_2$, $g(x) = -x_1$. We have $Df(x) = [x_2^2, 2x_1 x_2]$, $Dh(x) = [1, -1]$, $Dg(x) = [-1, 0]$.
Note that all feasible points are regular. The KKT condition is:
$$\begin{aligned} x_2^2 + \lambda - \mu &= 0 \\ 2x_1 x_2 - \lambda &= 0 \\ \mu x_1 &= 0 \\ \mu &\ge 0 \\ x_1 - x_2 &= 0 \\ x_1 &\ge 0. \end{aligned}$$
We first try $x_1^* = 0$ (active inequality constraint). Substituting and manipulating, we have the solution $x_1^* = x_2^* = 0$ with $\lambda^* = 0$, $\mu^* = 0$, which is a legitimate solution. If we then try $x_1^* > 0$ (inactive inequality constraint), we find that there is no consistent solution to the KKT condition. Thus, there is only one point satisfying the KKT condition: $x^* = 0$.
c. The tangent space at $x^* = 0$ is given by
$$T(0) = \{ y : [1, -1] y = 0, \ [-1, 0] y = 0 \} = \{ 0 \}.$$
Therefore, the SONC holds for the solution in part a.
d. We have
$$L(x, \lambda, \mu) = \begin{bmatrix} 0 & 2x_2 \\ 2x_2 & 2x_1 \end{bmatrix}.$$
Hence, at $x^* = 0$, we have $L(0, 0, 0) = O$. Since the active constraint at $x^* = 0$ is degenerate, we have
$$\tilde{T}(0, 0) = \{ y : [1, -1] y = 0 \},$$
which is nontrivial. Hence, for any nonzero vector $y \in \tilde{T}(0, 0)$, we have $y^\top L(0, 0, 0) y = 0 \not> 0$. Thus, the SOSC does not hold for the solution in part a.
21.9
a. The KKT condition for the problem is:
$$\begin{aligned} (Ax - b)^\top A + \lambda e^\top - \mu^\top &= 0^\top \\ \mu^\top x &= 0 \\ \mu &\ge 0 \\ e^\top x - 1 &= 0 \\ x &\ge 0, \end{aligned}$$
where $e = [1, \ldots, 1]^\top$.
b. A feasible point $x^*$ is regular in this problem if the vectors $e$, $e_i$, $i \in J(x^*)$ are linearly independent, where $J(x^*) = \{ i : x_i^* = 0 \}$ and $e_i$ is the vector with 0 in all components except the $i$th component, which is 1.
In this problem, all feasible points are regular. To see this, note that $0$ is not feasible. Therefore, any feasible point results in the set $J(x^*)$ having fewer than $n$ elements, which implies that the vectors $e$, $e_i$, $i \in J(x^*)$ are linearly independent.
21.10
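By the KKT Theorem, there exists $\mu^* \ge 0$ such that
$$\begin{aligned} (x^* - x_0) + \mu^* \nabla g(x^*) &= 0 \\ \mu^* g(x^*) &= 0. \end{aligned}$$
Premultiplying both sides of the first equation by $(x^* - x_0)^\top$, we obtain
$$\|x^* - x_0\|^2 + \mu^* (x^* - x_0)^\top \nabla g(x^*) = 0.$$
Since $\|x^* - x_0\|^2 > 0$ (because $g(x_0) > 0$) and $\mu^* \ge 0$, we deduce that $(x^* - x_0)^\top \nabla g(x^*) < 0$ and $\mu^* > 0$. From the second KKT condition above, we conclude that $g(x^*) = 0$.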
21.11
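a. By inspection, we guess the point $[2, 2]^\top$ (drawing a picture may help).
b. We write $f(x) = (x_1 - 3)^2 + (x_2 - 4)^2$, $g_1(x) = -x_1$, $g_2(x) = -x_2$, $g_3(x) = x_1 - 2$, $g_4(x) = x_2 - 2$, $g = [g_1, g_2, g_3, g_4]^\top$. The problem becomes
$$\begin{aligned} \text{minimize} \quad & f(x) \\ \text{subject to} \quad & g(x) \le 0. \end{aligned}$$
We now check the SOSC for the point $x^* = [2, 2]^\top$. We have two active constraints: $g_3$, $g_4$. Regularity holds, since $\nabla g_3(x^*) = [1, 0]^\top$ and $\nabla g_4(x^*) = [0, 1]^\top$. We have $\nabla f(x^*) = [-2, -4]^\top$. We need to find a $\mu^* \in \mathbb{R}^4$, $\mu^* \ge 0$, satisfying FONC. From the condition $\mu^{*\top} g(x^*) = 0$, we deduce that $\mu_1^* = \mu_2^* = 0$. Hence, $Df(x^*) + \mu^{*\top} Dg(x^*) = 0^\top$ if and only if $\mu^* = [0, 0, 2, 4]^\top$. Now,
$$F(x^*) = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}, \qquad [\mu^* G(x^*)] = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.$$
Hence
$$L(x^*, \mu^*) = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix},$$
which is positive definite on $\mathbb{R}^2$. Hence, SOSC is satisfied, and $x^*$ is a strict local minimizer.
21.12
The KKT condition is
$$\begin{aligned} x^{*\top} Q + \mu^{*\top} A &= 0^\top \\ \mu^{*\top} (Ax^* - b) &= 0 \\ \mu^* &\ge 0 \\ Ax^* - b &\le 0. \end{aligned}$$
Postmultiplying the first equation by $x^*$ gives
$$x^{*\top} Q x^* + \mu^{*\top} A x^* = 0.$$
We note from the second equation that $\mu^{*\top} A x^* = \mu^{*\top} b$. Hence,
$$x^{*\top} Q x^* + \mu^{*\top} b = 0.$$
Since $Q > 0$, the first term is nonnegative. Also, the second term is nonnegative because $\mu^* \ge 0$ and $b \ge 0$. Hence, we conclude that both terms must be zero. Because $Q > 0$, we must have $x^* = 0$.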
Aside: Actually, we can deduce that the only solution to the KKT condition must be 0, as follows. The problem is convex; thus, the only points satisfying the KKT condition are global minimizers. However, we see that 0 is a feasible point, and is the only point for which the objective function value is 0. Further, the objective function is bounded below by 0. Hence, 0 is the only global minimizer.
21.13
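a. We have one scalar equality constraint with $h(x) = [c, d] x - e$ and two scalar inequality constraints with $g(x) = -x$. Hence, there exist $\mu^* \in \mathbb{R}^2$ and $\lambda^* \in \mathbb{R}$ such that
$$\begin{aligned} a + \lambda^* c - \mu_1^* &= 0 \\ b + \lambda^* d - \mu_2^* &= 0 \\ \mu^{*\top} x^* &= 0 \\ \mu^* &\ge 0 \\ c x_1^* + d x_2^* &= e \\ x^* &\ge 0. \end{aligned}$$
b. Because $x^*$ is a basic feasible solution, and the equality constraint precludes the point $0$, exactly one of the inequality constraints is active. The vectors $\nabla h(x^*) = [c, d]^\top$ and $\nabla g_1(x^*) = [-1, 0]^\top$ are linearly independent. Similarly, the vectors $\nabla h(x^*) = [c, d]^\top$ and $\nabla g_2(x^*) = [0, -1]^\top$ are linearly independent. Hence, $x^*$ must be regular.
c. The tangent space is given by
$$T(x^*) = \{ y \in \mathbb{R}^n : Dh(x^*) y = 0, \ Dg_j(x^*) y = 0, \ j \in J(x^*) \} = \mathcal{N}(M),$$
where $M$ is a matrix with the first row equal to $Dh(x^*) = [c, d]$, and the second row is either $Dg_1(x^*) = [-1, 0]$ or $Dg_2(x^*) = [0, -1]$. But, as we have seen in part b, $\operatorname{rank} M = 2$. Hence, $T(x^*) = \{ 0 \}$.
d. Recall that we can take $\mu^*$ to be the relative cost coefficient vector (i.e., the KKT conditions are satisfied with $\mu^*$ being the relative cost coefficient vector). If the relative cost coefficients of all nonbasic variables are strictly positive, then $\mu_j^* > 0$ for all $j \in J(x^*)$. Hence, $\tilde{T}(x^*, \mu^*) = T(x^*) = \{ 0 \}$, which implies that $L(x^*, \lambda^*, \mu^*) > 0$ on $\tilde{T}(x^*, \mu^*)$. Hence, the SOSC is satisfied.
21.14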
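Let $x^*$ be a solution. Since $A$ is of full rank, $x^*$ is regular. The KKT Theorem states that $x^*$ satisfies:
\[ \mu^* \ge 0, \qquad c^\top + \mu^{*\top} A = 0^\top, \qquad \mu^{*\top} A x^* = 0. \]
If we postmultiply the second equation by $x^*$ and subtract the third from the result, we get
\[ c^\top x^* = 0. \]
21.15
a. We can write the LP as
\[ \text{minimize } f(x) \quad \text{subject to } h(x) = 0,\ g(x) \le 0, \]
where $f(x) = c^\top x$, $h(x) = Ax - b$, and $g(x) = -x$. Thus, we have $Df(x) = c^\top$, $Dh(x) = A$, and $Dg(x) = -I$. The Karush-Kuhn-Tucker conditions for the above problem have the form: if $x^*$ is a local minimizer, then there exist $\lambda^*$ and $\mu^*$ such that
\[ \mu^* \ge 0, \qquad c^\top + \lambda^{*\top} A - \mu^{*\top} = 0^\top, \qquad \mu^{*\top} x^* = 0. \]
b. Let $x^*$ be an optimal feasible solution. Then, $x^*$ satisfies the Karush-Kuhn-Tucker conditions listed in part a. Since $\mu^* \ge 0$, then from the second condition in part a, we obtain $(-\lambda^*)^\top A \le c^\top$. Hence, $-\lambda^*$ is a feasible solution to the dual (see Chapter 17). Postmultiplying the second condition in part a by $x^*$, we have
\[ 0 = c^\top x^* + \lambda^{*\top} A x^* - \mu^{*\top} x^* = c^\top x^* + \lambda^{*\top} b, \]
which gives
\[ c^\top x^* = (-\lambda^*)^\top b. \]
Hence, $-\lambda^*$ achieves the same objective function value for the dual as $x^*$ for the primal.
c. From part a, we have $\mu^{*\top} = c^\top + \lambda^{*\top} A$. Substituting this into $\mu^{*\top} x^* = 0$ yields the desired result.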

21.16
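By definition of $J(x^*)$, we have $g_i(x^*) < 0$ for all $i \notin J(x^*)$. Since by assumption $g_i$ is continuous for all $i$, there exists $\varepsilon > 0$ such that $g_i(x) < 0$ for all $i \notin J(x^*)$ and all $x$ in the set $B = \{x : \|x - x^*\| < \varepsilon\}$. Let $S_1 = \{x : h(x) = 0,\ g_j(x) \le 0,\ j \in J(x^*)\}$. We claim that $S \cap B = S_1 \cap B$. To see this, note that clearly $S \cap B \subset S_1 \cap B$. To show that $S_1 \cap B \subset S \cap B$, suppose $x \in S_1 \cap B$. Then, by definition of $S_1$ and $B$, we have $h(x) = 0$, $g_j(x) \le 0$ for all $j \in J(x^*)$, and $g_i(x) < 0$ for all $i \notin J(x^*)$. Hence, $x \in S \cap B$.
Since $x^*$ is a local minimizer of $f$ over $S$, and $S \cap B \subset S$, $x^*$ is also a local minimizer of $f$ over $S \cap B = S_1 \cap B$. Hence, we conclude that $x^*$ is a regular local minimizer of $f$ on $S_1$. Note that $S' \subset S_1$, and $x^* \in S'$. Therefore, $x^*$ is a regular local minimizer of $f$ on $S'$.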
21.17
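Write $f(x) = x_1^2 + x_2^2$, $g_1(x) = x_1^2 - x_2 - 4$, $g_2(x) = x_2 - x_1 - 2$, and $g = [g_1, g_2]^\top$. We have $\nabla f(x) = [2x_1, 2x_2]^\top$, $\nabla g_1(x) = [2x_1, -1]^\top$, $\nabla g_2(x) = [-1, 1]^\top$, and $D^2 f(x) = \operatorname{diag}[2, 2]$. We compute
\[ \nabla f(x) + \mu^\top \nabla g(x) = [2x_1 + 2\mu_1 x_1 - \mu_2,\ 2x_2 - \mu_1 + \mu_2]^\top. \]
We use the FONC to find critical points. Rewriting $\nabla f(x) + \mu^\top \nabla g(x) = 0$, we obtain
\[ x_1 = \frac{\mu_2}{2 + 2\mu_1}, \qquad x_2 = \frac{\mu_1 - \mu_2}{2}. \]
We also use $\mu^\top g(x) = 0$ and $\mu \ge 0$, giving
\[ \mu_1 (x_1^2 - x_2 - 4) = 0, \qquad \mu_2 (x_2 - x_1 - 2) = 0. \]
The vector $\mu$ has two components; therefore, we try four different cases.
Case 1: ($\mu_1 > 0$, $\mu_2 > 0$) We have
\[ x_1^2 - x_2 - 4 = 0, \qquad x_2 - x_1 - 2 = 0. \]
We obtain two solutions: $x^{(1)} = [-2, 0]^\top$ and $x^{(2)} = [3, 5]^\top$. For $x^{(1)}$, the two FONC equations give $\mu_1 = \mu_2$ and $-2(2 + 2\mu_1) = \mu_1$, which yield $\mu_1 = \mu_2 = -4/5$. This is not a legitimate solution since we require $\mu > 0$. For $x^{(2)}$, the two FONC equations give $\mu_1 - \mu_2 = 10$ and $3(2 + 2\mu_1) = \mu_2$, which yield $\mu = [-16/5, -66/5]^\top$. Again, this is not a legitimate solution.
Case 2: ($\mu_1 = 0$, $\mu_2 > 0$) We have
\[ x_2 - x_1 - 2 = 0, \qquad x_1 = \frac{\mu_2}{2}, \qquad x_2 = -\frac{\mu_2}{2}. \]
Hence, $x_1 = -x_2$, and thus $x = [-1, 1]^\top$, $\mu_2 = -2$. This is not a legitimate solution since we require $\mu \ge 0$.
Case 3: ($\mu_1 > 0$, $\mu_2 = 0$) We have
\[ x_1^2 - x_2 - 4 = 0, \qquad x_1 = 0, \qquad x_2 = \frac{\mu_1}{2}. \]
Therefore, $x_2 = -4$, $\mu_1 = -8$, and again we don't have a legitimate solution.
Case 4: ($\mu_1 = 0$, $\mu_2 = 0$) We have $x_1 = x_2 = 0$, and all constraints are inactive. This is a legitimate candidate for the minimizer. We now apply the SOSC. Note that since the candidate is an interior point of the constraint set, the SOSC for the problem is equivalent to the SOSC for unconstrained optimization. The Hessian matrix $D^2 f(x) = \operatorname{diag}[2, 2]$ is symmetric and positive definite. Hence, by the SOSC, the point $x^* = [0, 0]^\top$ is the strict local minimizer (in fact, it is easy to see that it is a global minimizer).
21.18
Write $f(x) = x_1^2 + x_2^2$, $g_1(x) = -x_1 + x_2^2 + 4$, $g_2(x) = x_1 - 10$, and $g = [g_1, g_2]^\top$. We have $\nabla f(x) = [2x_1, 2x_2]^\top$, $\nabla g_1(x) = [-1, 2x_2]^\top$, $\nabla g_2(x) = [1, 0]^\top$, $D^2 f(x) = \operatorname{diag}[2, 2]$, $D^2 g_1(x) = \operatorname{diag}[0, 2]$, and $D^2 g_2(x) = O$. We compute
\[ \nabla f(x) + \mu^\top \nabla g(x) = [2x_1 - \mu_1 + \mu_2,\ 2x_2 + 2\mu_1 x_2]^\top. \]
We use the FONC to find critical points. Rewriting $\nabla f(x) + \mu^\top \nabla g(x) = 0$, we obtain
\[ x_1 = \frac{\mu_1 - \mu_2}{2}, \qquad x_2 (1 + \mu_1) = 0. \]
Since we require $\mu \ge 0$, we deduce that $x_2 = 0$. Using $\mu^\top g(x) = 0$ gives $\mu_1(-x_1 + 4) = 0$ and $\mu_2 (x_1 - 10) = 0$. The second condition forces $\mu_2 = 0$: if $\mu_2 > 0$, then $x_1 = 10$, so $\mu_1 = 0$ by the first condition, and then $x_1 = (\mu_1 - \mu_2)/2 < 0$, a contradiction.
We are left with two cases.
Case 1: ($\mu_1 > 0$, $\mu_2 = 0$) We have $x_1 - 4 = 0$, and $\mu_1 = 8$, which is a legitimate candidate.
Case 2: ($\mu_1 = 0$, $\mu_2 = 0$) We have $x_1 = x_2 = 0$, which is not a legitimate candidate, since it is not a feasible point.
We now apply the SOSC to our candidate $x^* = [4, 0]^\top$, $\mu^* = [8, 0]^\top$. Now,
\[ L([4, 0]^\top, [8, 0]^\top) = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} + 8 \begin{bmatrix} 0 & 0 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 18 \end{bmatrix}, \]
which is positive definite on all of $\mathbb{R}^2$. The point $[4, 0]^\top$ is clearly regular. Hence, by the SOSC, $x^* = [4, 0]^\top$ is a strict local minimizer.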
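The candidate found above can be checked numerically. The Python sketch below is only an illustration: it assumes the problem data $f(x) = x_1^2 + x_2^2$, $g_1(x) = -x_1 + x_2^2 + 4$, $g_2(x) = x_1 - 10$ (a reconstruction consistent with the gradients and Hessians quoted in the solution), and the helper names are hypothetical.

```python
# Sketch: check the KKT conditions and the Lagrangian Hessian at x* = [4, 0].
# Assumed problem data (reconstructed, not quoted from the source):
#   f(x)  = x1^2 + x2^2
#   g1(x) = -x1 + x2^2 + 4 <= 0
#   g2(x) = x1 - 10 <= 0

def grad_f(x):
    return [2.0 * x[0], 2.0 * x[1]]

def g(x):
    return [-x[0] + x[1] ** 2 + 4.0, x[0] - 10.0]

def grad_g(x):
    # Rows are the gradients of g1 and g2.
    return [[-1.0, 2.0 * x[1]], [1.0, 0.0]]

def kkt_residual(x, mu):
    """Return (stationarity residual, complementarity value mu^T g(x))."""
    gf, gg = grad_f(x), grad_g(x)
    stat = [gf[i] + sum(mu[j] * gg[j][i] for j in range(2)) for i in range(2)]
    comp = sum(m * gi for m, gi in zip(mu, g(x)))
    return stat, comp

x_star, mu_star = [4.0, 0.0], [8.0, 0.0]
stationarity, complementarity = kkt_residual(x_star, mu_star)

# Hessian of the Lagrangian: F + mu1*G1 + mu2*G2 with F = diag(2, 2),
# G1 = diag(0, 2), G2 = O. It is diagonal, so positivity is read off directly.
hessian_diag = [2.0 + mu_star[0] * 0.0, 2.0 + mu_star[0] * 2.0]

print(stationarity, complementarity, hessian_diag)
```

Both diagonal entries of the Lagrangian Hessian are positive, which is the positive definiteness used in the SOSC argument.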
21.19
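Write $f(x) = x_1^2 + x_2^2$, $g_1(x) = -x_1 - x_2^2 + 4$, $g_2(x) = 3x_2 - x_1$, $g_3(x) = -3x_2 - x_1$, and $g = [g_1, g_2, g_3]^\top$. We have $\nabla f(x) = [2x_1, 2x_2]^\top$, $\nabla g_1(x) = [-1, -2x_2]^\top$, $\nabla g_2(x) = [-1, 3]^\top$, $\nabla g_3(x) = [-1, -3]^\top$, $D^2 f(x) = \operatorname{diag}[2, 2]$, $D^2 g_1(x) = \operatorname{diag}[0, -2]$, and $D^2 g_2(x) = D^2 g_3(x) = O$.
From the figure, we see that the two candidates are $x^{(1)} = [3, 1]^\top$ and $x^{(2)} = [3, -1]^\top$. Both points are easily verified to be regular.
For $x^{(1)}$, we have $\mu_3 = 0$. Now,
\[ Df(x^{(1)}) + \mu^\top Dg(x^{(1)}) = [6 - \mu_1 - \mu_2,\ 2 - 2\mu_1 + 3\mu_2] = 0^\top, \]
which yields $\mu_1 = 4$, $\mu_2 = 2$. Now, $\tilde{T}(x^{(1)}, \mu) = \{0\}$. Therefore, any matrix is positive definite on $\tilde{T}(x^{(1)}, \mu)$. Hence, by the SOSC, $x^{(1)}$ is a strict local minimizer.
For $x^{(2)}$, we have $\mu_2 = 0$. Now,
\[ Df(x^{(2)}) + \mu^\top Dg(x^{(2)}) = [6 - \mu_1 - \mu_3,\ -2 + 2\mu_1 - 3\mu_3] = 0^\top, \]
which yields $\mu_1 = 4$, $\mu_3 = 2$. Now, again we have $\tilde{T}(x^{(2)}, \mu) = \{0\}$. Therefore, any matrix is positive definite on $\tilde{T}(x^{(2)}, \mu)$. Hence, by the SOSC, $x^{(2)}$ is a strict local minimizer.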
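The multipliers quoted for the two candidates can be recomputed by solving the two-by-two stationarity system at each point. The sketch below is illustrative only; it assumes the constraint functions $g_1(x) = -x_1 - x_2^2 + 4$, $g_2(x) = 3x_2 - x_1$, $g_3(x) = -3x_2 - x_1$ (a reconstruction from the gradients listed above), and the function names are hypothetical.

```python
# Sketch: solve grad f(x) + mu_a * grad g_a(x) + mu_b * grad g_b(x) = 0
# for the two active constraints at each candidate point of Exercise 21.19.
# Assumed data: f(x) = x1^2 + x2^2, grad g1 = [-1, -2*x2],
# grad g2 = [-1, 3], grad g3 = [-1, -3].

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)

def multipliers(x, grad_ga, grad_gb):
    """Multipliers of the two active constraints at x."""
    gf = (2.0 * x[0], 2.0 * x[1])
    return solve_2x2(grad_ga[0], grad_gb[0],
                     grad_ga[1], grad_gb[1],
                     -gf[0], -gf[1])

# Candidate x(1) = [3, 1]: g1 and g2 are active.
x1 = (3.0, 1.0)
mu_x1 = multipliers(x1, (-1.0, -2.0 * x1[1]), (-1.0, 3.0))

# Candidate x(2) = [3, -1]: g1 and g3 are active.
x2 = (3.0, -1.0)
mu_x2 = multipliers(x2, (-1.0, -2.0 * x2[1]), (-1.0, -3.0))

print(mu_x1, mu_x2)
```

In both cases the two active multipliers come out strictly positive, which is why the tangent space used in the SOSC reduces to the origin.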
21.20
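a. Write $f(x) = 3x_1$ and $g(x) = 2 - x_1 - x_2^2$. We have $\nabla f(x^*) = [3, 0]^\top$ and $\nabla g(x^*) = [-1, 0]^\top$. Hence, letting $\mu^* = 3$, we have $\nabla f(x^*) + \mu^* \nabla g(x^*) = 0$. Note also that $\mu^* \ge 0$ and $\mu^* g(x^*) = 0$. Hence, $x^* = [2, 0]^\top$ satisfies the KKT (first-order necessary) condition.
b. We have $F(x^*) = O$ and $G(x^*) = \operatorname{diag}[0, -2]$. Hence, $L(x^*, \mu^*) = O + 3 \operatorname{diag}[0, -2] = \operatorname{diag}[0, -6]$. Also, $T(x^*) = \{y : [-1, 0] y = 0\} = \{y : y_1 = 0\}$. For any $y \in T(x^*)$ with $y_2 \ne 0$ we get $y^\top L(x^*, \mu^*) y = -6 y_2^2 < 0$. Hence, $x^* = [2, 0]^\top$ does not satisfy the second-order necessary condition.
c. No. Consider points of the form $x = [-x_2^2 + 2, x_2]^\top$, $x_2 \in \mathbb{R}$. Such points are feasible, and could be arbitrarily close to $x^*$. However, for such points $x \ne x^*$,
\[ f(x) = 3(-x_2^2 + 2) = 6 - 3x_2^2 < 6 = f(x^*). \]
Hence, $x^*$ is not a local minimizer.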
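The argument in part c can be illustrated numerically. The following sketch assumes $f(x) = 3x_1$ and $g(x) = 2 - x_1 - x_2^2$ as in part a; the sample values of $x_2$ and the helper names are arbitrary choices made for illustration.

```python
# Sketch: points [2 - t^2, t] lie on the constraint boundary g(x) = 0 and
# have a strictly smaller objective value than x* = [2, 0] whenever t != 0.

def f(x):
    return 3.0 * x[0]

def g(x):
    return 2.0 - x[0] - x[1] ** 2

def curve_point(t):
    """Point of the form [-t^2 + 2, t] used in part c."""
    return (2.0 - t * t, t)

x_star = (2.0, 0.0)
samples = [curve_point(t) for t in (0.5, 0.1, 0.01)]
values = [f(p) for p in samples]
residuals = [g(p) for p in samples]  # zero up to rounding error

print(values, f(x_star))
```

The sampled points approach $x^*$ as $t \to 0$ while their objective values stay below $f(x^*) = 6$.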
21.21
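The KKT condition for the problem is
\[ \mu^* \ge 0, \qquad x^* + \lambda^* a - \mu^* = 0, \qquad \mu^{*\top} x^* = 0. \]
Premultiplying the second KKT condition above by $\mu^{*\top}$ and using the third condition, we get
\[ \lambda^* \mu^{*\top} a = \|\mu^*\|^2. \]
Also, premultiplying the second KKT condition above by $x^{*\top}$ and using the feasibility condition $a^\top x^* = b$, we get
\[ \|x^*\|^2 + \lambda^* b = 0. \]
We conclude that $\mu^* = 0$. For if not, the equation $\lambda^* \mu^{*\top} a = \|\mu^*\|^2$, together with $\|x^*\|^2 + \lambda^* b = 0$, implies that $\mu^{*\top} a < 0$, which contradicts $\mu^* \ge 0$ and $a \ge 0$.
Rewriting the second KKT condition with $\mu^* = 0$ yields
\[ x^* = -\lambda^* a. \]
Using the feasibility condition $a^\top x^* = b$, we get
\[ x^* = \frac{a\, b}{\|a\|^2}. \]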
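The closed-form expression above is easy to evaluate. The sketch below is an illustration under stated assumptions: the data `a = [1, 2, 2]` and `b = 3` are made up, the function name is hypothetical, and the comparison point is just one other feasible vector.

```python
# Sketch: evaluate x* = a*b/||a||^2 and check feasibility for sample data.

def min_norm_point(a, b):
    """Return a*b/||a||^2, the candidate derived from the KKT condition."""
    norm_sq = sum(ai * ai for ai in a)
    return [ai * b / norm_sq for ai in a]

a, b = [1.0, 2.0, 2.0], 3.0
x_star = min_norm_point(a, b)

constraint_value = sum(ai * xi for ai, xi in zip(a, x_star))  # should equal b
norm_sq_star = sum(xi * xi for xi in x_star)

# Another feasible point, [3, 0, 0], has a larger squared norm.
other = [3.0, 0.0, 0.0]
norm_sq_other = sum(xi * xi for xi in other)

print(x_star, constraint_value, norm_sq_star, norm_sq_other)
```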
21.22
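a. Suppose $(x_1^*)^2 + (x_2^*)^2 < 1$. Then, the point $x^* = [x_1^*, x_2^*]^\top$ lies in the interior of the constraint set $\Omega = \{x : \|x\|^2 \le 1\}$. Hence, by the FONC for unconstrained optimization, we have that $\nabla f(x^*) = 0$, where $f(x) = \|x - [a, b]^\top\|^2$ is the objective function. Now, $\nabla f(x^*) = 2(x^* - [a, b]^\top) = 0$, which implies that $x^* = [a, b]^\top$, which violates the assumption $(x_1^*)^2 + (x_2^*)^2 < 1$.
b. First, we need to show that $x^*$ is a regular point. For this, note that if we write the constraint as $g(x) = \|x\|^2 - 1 \le 0$, then $\nabla g(x^*) = 2x^* \ne 0$. Therefore, $x^*$ is a regular point. Hence, by the Karush-Kuhn-Tucker theorem, there exists $\mu \in \mathbb{R}$, $\mu \ge 0$, such that
\[ \nabla f(x^*) + \mu \nabla g(x^*) = 0, \]
which gives
\[ x^* = \frac{1}{1 + \mu} \begin{bmatrix} a \\ b \end{bmatrix}. \]
Hence, $x^*$ is unique, and we can write $x_1^* = \alpha a$, $x_2^* = \alpha b$, where $\alpha = 1/(1 + \mu) > 0$.
c. Using part b and the fact that $\|x^*\| = 1$, we get $\|x^*\|^2 = \alpha^2 \|[a, b]^\top\|^2 = 1$, which gives $\alpha = 1/\|[a, b]^\top\| = 1/\sqrt{a^2 + b^2}$.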
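The scaling factor from part c gives a direct recipe for the minimizer. The sketch below is illustrative: it assumes $a^2 + b^2 \ge 1$ (so that the constraint is active, as argued in part a), uses the sample point $[3, 4]^\top$, and the function name is hypothetical.

```python
import math

# Sketch: closest point of the unit disk to [a, b] via alpha = 1/sqrt(a^2 + b^2),
# together with the multiplier mu recovered from alpha = 1/(1 + mu).

def project_to_unit_disk(a, b):
    """Return (x*, alpha, mu) for the point [a, b], assuming a^2 + b^2 >= 1."""
    alpha = 1.0 / math.sqrt(a * a + b * b)
    mu = 1.0 / alpha - 1.0
    return (alpha * a, alpha * b), alpha, mu

a, b = 3.0, 4.0
x_star, alpha, mu = project_to_unit_disk(a, b)

# Stationarity residual of grad f(x*) + mu * grad g(x*) = 2(x* - [a, b]) + 2*mu*x*.
residual = [2.0 * (x_star[0] - a) + 2.0 * mu * x_star[0],
            2.0 * (x_star[1] - b) + 2.0 * mu * x_star[1]]

print(x_star, alpha, mu, residual)
```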
21.23
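a. The Karush-Kuhn-Tucker conditions for this problem are
\[ 2x_1^* + \mu^* \exp(x_1^*) = 0, \]
\[ 2(x_2^* + 1) - \mu^* = 0, \]
\[ \mu^* (\exp(x_1^*) - x_2^*) = 0, \]
\[ \exp(x_1^*) \le x_2^*, \]
\[ \mu^* \ge 0. \]
b. From the second equation in part a, we obtain $\mu^* = 2(x_2^* + 1)$. Since $x_2^* \ge \exp(x_1^*) > 0$, then $\mu^* > 0$. Hence, by the third equation in part a, we obtain $x_2^* = \exp(x_1^*)$.
c. Since $\mu^* = 2(x_2^* + 1) = 2(\exp(x_1^*) + 1)$, then by the first equation in part a, we have
\[ 2x_1^* + 2(\exp(x_1^*) + 1)\exp(x_1^*) = 0, \]
which implies
\[ x_1^* = -(\exp(2x_1^*) + \exp(x_1^*)). \]
Since $\exp(x_1^*), \exp(2x_1^*) > 0$, then $x_1^* < 0$, and hence $\exp(x_1^*), \exp(2x_1^*) < 1$. Therefore, $x_1^* > -2$.
21.24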
a. We rewrite the problem as
minimize f x subject to gx 0,
otherwise it would not be feasible, and therefore it is a regular point. By the KKT theorem, there exists 0suchthatcx andgx0. Sincec60,wemusthave 60. Therefore,gx0, which implies that kxk2 2. p
where fx cx and gx 1kxk2 1. Hence, rfx c and rgx x. Note that x 6 0 for 2
b. Fromparta,wehave2kek2 2. Sincekek2 n,wehave 2n.
To find c, we use
4cx p8kxk2 82 8, and thus 2. Hence, c 2e 2 2ne.
21.25
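We can represent the equivalent problem as
\[ \text{minimize } f(x) \quad \text{subject to } g(x) \le 0, \]
where $g(x) = \frac{1}{2}\|h(x)\|^2$. Note that
\[ \nabla g(x) = Dh(x)^\top h(x). \]
Therefore, the KKT condition is:
\[ \mu^* \ge 0, \qquad \nabla f(x^*) + \mu^* Dh(x^*)^\top h(x^*) = 0, \qquad \mu^* \|h(x^*)\| = 0. \]
Note that for a feasible point $x^*$, we have $h(x^*) = 0$. Therefore, the KKT condition becomes
\[ \mu^* \ge 0, \qquad \nabla f(x^*) = 0. \]
Note that $\nabla g(x^*) = 0$. Therefore, any feasible point $x^*$ is not regular. Hence, the KKT theorem cannot be applied in this case. This should be clear, since obviously $\nabla f(x^*) = 0$ is not necessary for optimality in general.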
22. Convex Optimization Problems
22.1
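The given function is a quadratic, which we represent in the form
\[ f(x) = x^\top \begin{bmatrix} -1 & -\alpha & 1 \\ -\alpha & -1 & -2 \\ 1 & -2 & -5 \end{bmatrix} x. \]
A quadratic form is concave if and only if it is negative semidefinite, or equivalently, if and only if its negative is positive semidefinite. On the other hand, a symmetric matrix is positive semidefinite if and only if all its principal minors, not just the leading principal minors, are nonnegative. Thus we will determine the range of the parameter $\alpha$ for which
\[ -f(x) = x^\top \begin{bmatrix} 1 & \alpha & -1 \\ \alpha & 1 & 2 \\ -1 & 2 & 5 \end{bmatrix} x = x^\top F x \]
is positive semidefinite. It is easy to see that the three first-order principal minors (diagonal elements of $F$) are all positive. There are three second-order principal minors. Only one of them, the leading principal minor, is a function of the parameter $\alpha$,
\[ \det F(1{:}2, 1{:}2) = \det \begin{bmatrix} 1 & \alpha \\ \alpha & 1 \end{bmatrix} = 1 - \alpha^2. \]
The above second-order leading principal minor is nonnegative if and only if $\alpha \in [-1, 1]$.
The other second-order principal minors are
\[ \det F(\{1,3\}, \{1,3\}) \quad \text{and} \quad \det F(2{:}3, 2{:}3), \]
and they are positive. There is only one third-order principal minor, $\det F$, where
\[ \det F = \det \begin{bmatrix} 1 & 2 \\ 2 & 5 \end{bmatrix} - \alpha \det \begin{bmatrix} \alpha & 2 \\ -1 & 5 \end{bmatrix} - \det \begin{bmatrix} \alpha & 1 \\ -1 & 2 \end{bmatrix} = 1 - \alpha(5\alpha + 2) - (2\alpha + 1) = -5\alpha^2 - 4\alpha. \]
The third-order principal minor is nonnegative if and only if $\alpha(5\alpha + 4) \le 0$, that is, if and only if $\alpha \in [-4/5, 0]$.
Combining this with $\alpha \in [-1, 1]$ from above, we conclude that the quadratic form $f$ is negative semidefinite, or equivalently, the quadratic function $f$ is concave, if and only if
\[ \alpha \in [-4/5, 0]. \]
22.2
We have
\[ \phi(\alpha) = \tfrac{1}{2}(x + \alpha d)^\top Q (x + \alpha d) - (x + \alpha d)^\top b = \tfrac{1}{2}(d^\top Q d)\alpha^2 + d^\top (Qx - b)\alpha + \left(\tfrac{1}{2} x^\top Q x - x^\top b\right). \]
This is a quadratic function of $\alpha$. Since $Q > 0$, then
\[ \frac{d^2 \phi}{d\alpha^2}(\alpha) = d^\top Q d > 0, \]
and hence by Theorem 22.5, $\phi$ is strictly convex.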
22.3
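Write $f(x) = x^\top Q x$, where
\[ Q = \frac{1}{2}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}. \]
Let $x, y \in \Omega$. Then, $x = a_1 [1, m]^\top$ and $y = a_2 [1, m]^\top$ for some $a_1, a_2 \in \mathbb{R}$. By Proposition 22.1, it is enough to show that $(y - x)^\top Q (y - x) \ge 0$. By substitution,
\[ (y - x)^\top Q (y - x) = m (a_2 - a_1)^2 \ge 0, \]
which completes the proof.
22.4
Let $x, y \in \Omega$ and $\alpha \in (0, 1)$. Then, $h(x) = h(y) = c$. By convexity of $\Omega$, $h(\alpha x + (1 - \alpha) y) = c$. Therefore,
\[ h(\alpha x + (1 - \alpha) y) = \alpha h(x) + (1 - \alpha) h(y), \]
and so $h$ is convex over $\Omega$. We also have
\[ -h(\alpha x + (1 - \alpha) y) = \alpha(-h(x)) + (1 - \alpha)(-h(y)), \]
which shows that $-h$ is convex, and thus $h$ is concave.
22.5
At $x = 0$, for $\xi \in [-1, 1]$, we have, for all $y \in \mathbb{R}$,
\[ |y| \ge |0| + \xi (y - 0) = \xi y. \]
Thus, in this case any $\xi$ in the interval $[-1, 1]$ is a subgradient of $f$ at $x = 0$. At $x = 1$, $\xi = 1$ is the only subgradient of $f$, because, for all $y \in \mathbb{R}$,
\[ |y| \ge |1| + \xi (y - 1) = y. \]
22.6
Let $x, y \in \Omega$ and $\alpha \in (0, 1)$. For convenience, write $f = \max\{f_1, \ldots, f_\ell\} = \max_i f_i$. We have
\[ f(\alpha x + (1 - \alpha) y) = \max_i f_i(\alpha x + (1 - \alpha) y) \]
\[ \le \max_i \left( \alpha f_i(x) + (1 - \alpha) f_i(y) \right) \quad \text{(by convexity of each } f_i\text{)} \]
\[ \le \alpha \max_i f_i(x) + (1 - \alpha) \max_i f_i(y) \quad \text{(by property of max)} \]
\[ = \alpha f(x) + (1 - \alpha) f(y), \]
which implies that $f$ is convex.
22.7
One direction is true by definition.
For the other direction, let $d \in \mathbb{R}^n$ be given. We want to show that $d^\top Q d \ge 0$. Now, fix some vector $x \in \Omega$. Since $\Omega$ is open, there exists $\alpha \ne 0$ such that $y = x + \alpha d \in \Omega$. By assumption,
\[ 0 \le (y - x)^\top Q (y - x) = \alpha^2 d^\top Q d, \]
which implies that $d^\top Q d \ge 0$.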
22.8
Yes, the problem is a convex optimization problem.
First we show that the objective function $f(x) = \frac{1}{2}\|Ax - b\|^2$ is convex. We write
\[ f(x) = \tfrac{1}{2} x^\top (A^\top A) x - (b^\top A) x + \text{constant}, \]
which is a quadratic function with Hessian $A^\top A$. Since the Hessian $A^\top A$ is positive semidefinite, the objective function $f$ is convex.
Next we show that the constraint set is convex. Consider two feasible points $x$ and $y$, and let $\alpha \in (0, 1)$. Then, $x$ and $y$ satisfy $e^\top x = 1$, $x \ge 0$ and $e^\top y = 1$, $y \ge 0$, respectively. We have
\[ e^\top(\alpha x + (1 - \alpha) y) = \alpha e^\top x + (1 - \alpha) e^\top y = \alpha + (1 - \alpha) = 1. \]
Moreover, each component of $\alpha x + (1 - \alpha) y$ is given by $\alpha x_i + (1 - \alpha) y_i$, which is nonnegative because every term here is nonnegative. Hence, $\alpha x + (1 - \alpha) y$ is a feasible point, which shows that the constraint set is convex.
22.9
We need to show that is a convex set, and f is a convex function on .
To show that is a convex set, we need to show that for any y, z 2 and 2 0, 1, we have y1z 2
. Lety,z2and20,1. Thus,y1 y2 0andz1 z2 0. Hence, xy1zy1 1z1
Now,
and since , 1 0,
y2 1 z2
x1 y1 1z1 y2 1z2 x2, x1 0.
Hence, x 2 and therefore is convex.
Toshowthatf isconvexon,weneedtoshowthatforanyy,z2and20,1,fy1z fy1fz. Lety,z2and20,1. Thus,y1 y2 0andz1 z2 0,sothatfyy13 and fzz13. Also,3 and13 1. Wehave
fy 1 z
Hence, f is convex.
22.10
y1 1 z1 3
3y13 1 3z13 32×211 y1 3×11 2y12
y13 1z13 maxy1,z13 13 1
321312 y13 1 z13
fy 1 fz.
Since the problem is a convex optimization problem, we know for sure that any point of the form $\alpha y + (1 - \alpha) z$, $\alpha \in [0, 1]$, is a global minimizer. However, any other point may or may not be a minimizer. Hence, the largest set of points $G$ for which we can be sure that every point in $G$ is a global minimizer is given by
\[ G = \{\alpha y + (1 - \alpha) z : 0 \le \alpha \le 1\}. \]
22.11
a. Let $f$ be the objective function and $\Omega$ the constraint set. Consider the set $\Gamma = \{x \in \Omega : f(x) \le 1\}$. This set contains all three of the given points. Moreover, by Lemma 22.1, $\Gamma$ is convex. Now, if we take the average of the first two points (which is a convex combination of them), the resulting point $\frac{1}{2}[1, 0, 0]^\top + \frac{1}{2}[0, 1, 0]^\top = \frac{1}{2}[1, 1, 0]^\top$ is in $\Gamma$, because $\Gamma$ is convex. Similarly, the point $\frac{2}{3}\left(\frac{1}{2}[1, 1, 0]^\top\right) + \frac{1}{3}[0, 0, 1]^\top = \frac{1}{3}[1, 1, 1]^\top$ is also in $\Gamma$, because $\Gamma$ is convex. Hence, the objective function value of $\frac{1}{3}[1, 1, 1]^\top$ must be $\le 1$.
b. If the three points are all global minimizers, then the point $\frac{1}{3}[1, 1, 1]^\top$, which by part a cannot have a higher objective function value than the given three points, must also be a global minimizer.
22.12
a. The Lagrange condition for the problem is given by:
\[ x^{*\top} Q + \lambda^{*\top} A = 0^\top, \qquad A x^* = b. \]
From the first equation above, we obtain
\[ x^* = -Q^{-1} A^\top \lambda^*. \]
Applying the second equation (constraint on $x^*$), we have
\[ -A Q^{-1} A^\top \lambda^* = b. \]
Since $\operatorname{rank} A = m$, the matrix $A Q^{-1} A^\top$ is invertible. Therefore, the only solution to the Lagrange condition is
\[ x^* = Q^{-1} A^\top (A Q^{-1} A^\top)^{-1} b. \]
b. The point in part a above is a global minimizer because the problem is a convex optimization problem (by problem 1, the constraint set is convex; the objective function is convex because its Hessian, $Q$, is positive definite).
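The closed-form solution from part a can be evaluated for a small instance. The sketch below is illustrative only: it restricts attention to a diagonal $Q$ and a single constraint row (so that no matrix inversion routine is needed), the data are made up, and the function name is hypothetical.

```python
# Sketch: x* = Q^{-1} A^T (A Q^{-1} A^T)^{-1} b for diagonal Q and one-row A.
# Example instance (assumed): minimize (1/2)(2*x1^2 + x2^2) subject to x1 + x2 = 3.

def constrained_quadratic_minimizer(q_diag, a_row, b):
    """Evaluate the Lagrange-condition solution for diagonal Q and a single-row A."""
    q_inv_at = [ai / qi for ai, qi in zip(a_row, q_diag)]    # Q^{-1} A^T
    s = sum(ai * v for ai, v in zip(a_row, q_inv_at))        # A Q^{-1} A^T (a scalar)
    return [v * b / s for v in q_inv_at]

def objective(q_diag, x):
    return 0.5 * sum(qi * xi * xi for qi, xi in zip(q_diag, x))

q_diag, a_row, b = [2.0, 1.0], [1.0, 1.0], 3.0
x_star = constrained_quadratic_minimizer(q_diag, a_row, b)

# Compare against a few other feasible points on the line x1 + x2 = 3.
others = [[0.0, 3.0], [3.0, 0.0], [1.5, 1.5]]
other_values = [objective(q_diag, x) for x in others]

print(x_star, objective(q_diag, x_star), other_values)
```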
22.13
By Theorem 22.4, for all x 2 , we have
fx fxP Dfxx x.
Substituting Dfx from the equation Dfx a 0 into the above inequality yields
Observe that for each j 2 Jx, and for each x 2 ,
Hence, for each j 2 Jx,
Since j 0, we get
and the proof is completed.
22.14
fxfx
j2Jx
Xj 2 J x j j
jaj xx.
a j x b j 0 ,
a j x b j 0 .
aj xx0.
X
fxfx
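a. Let $\Omega = \{x \in \mathbb{R}^n : a^\top x \ge b\}$, $x_1, x_2 \in \Omega$, and $\alpha \in [0, 1]$. Then, $a^\top x_1 \ge b$ and $a^\top x_2 \ge b$. Therefore,
\[ a^\top(\alpha x_1 + (1 - \alpha) x_2) = \alpha a^\top x_1 + (1 - \alpha) a^\top x_2 \ge \alpha b + (1 - \alpha) b = b, \]
which means that $\alpha x_1 + (1 - \alpha) x_2 \in \Omega$. Hence, $\Omega$ is a convex set.
b. Rewrite the problem as
\[ \text{minimize } f(x) \quad \text{subject to } g(x) \le 0, \]
where $f(x) = \|x\|^2$ and $g(x) = b - a^\top x$. Now, $\nabla g(x) = -a \ne 0$. Therefore, any feasible point is regular. By the Karush-Kuhn-Tucker theorem, there exists $\mu^* \ge 0$ such that
\[ 2x^* - \mu^* a = 0, \qquad \mu^*(b - a^\top x^*) = 0. \]
Since $x^*$ is a feasible point, then $x^* \ne 0$. Therefore, by the first equation, we see that $\mu^* \ne 0$. The second equation then implies that $b - a^\top x^* = 0$.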
j2Jx
jaj xxfx

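c. By the first Karush-Kuhn-Tucker equation, we have $x^* = \mu^* a / 2$. Since $a^\top x^* = b$, then $\mu^* a^\top a / 2 = a^\top x^* = b$, and therefore $\mu^* = 2b/\|a\|^2$. Since $x^* = \mu^* a / 2$, then $x^*$ is uniquely given by $x^* = b a / \|a\|^2$.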
22.15
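a. Let $f(x) = c^\top x$ and $\Omega = \{x : x \ge 0\}$. Suppose $x, y \in \Omega$, and $\alpha \in (0, 1)$. Then, $x, y \ge 0$. Hence, $\alpha x + (1 - \alpha) y \ge 0$, which means $\alpha x + (1 - \alpha) y \in \Omega$. Furthermore,
\[ c^\top(\alpha x + (1 - \alpha) y) = \alpha c^\top x + (1 - \alpha) c^\top y. \]
Therefore, $f$ is convex. Hence, the problem is a convex programming problem.
b. $\Rightarrow$: We use contraposition. Suppose $c_i < 0$ for some $i$. Let $d = [0, \ldots, 1, \ldots, 0]^\top$, where $1$ appears in the $i$th component. Clearly $d$ is a feasible direction for any point $x^* \ge 0$. However, $d^\top \nabla f(x^*) = d^\top c = c_i < 0$. Therefore, the FONC does not hold, and any point $x^* \ge 0$ cannot be a minimizer.
$\Leftarrow$: Suppose $c \ge 0$. Let $x^* = 0$, and $d$ a feasible direction at $x^*$. Then, $d \ge 0$. Hence, $d^\top \nabla f(x^*) \ge 0$. Therefore, by Theorem 22.7, $x^*$ is a solution.
The above also proves that if a solution exists, then $0$ is a solution.
c. Write $g(x) = -x$ so that the constraint can be expressed as $g(x) \le 0$.
$\Rightarrow$: We have $Dg(x) = -I$, which has full rank. Therefore, any point is regular. Suppose a solution $x^*$ exists. Then, by the KKT theorem, there exists $\mu^* \ge 0$ such that $c^\top - \mu^{*\top} = 0^\top$ and $\mu^{*\top} x^* = 0$. Hence, $c = \mu^* \ge 0$.
$\Leftarrow$: Suppose $c \ge 0$. Let $x^* = 0$ and $\mu^* = c$. Then, $\mu^* \ge 0$, $c^\top - \mu^{*\top} = 0^\top$, and $\mu^{*\top} x^* = 0$, i.e., the KKT condition is satisfied. By part a, $x^*$ is a solution to the problem.
The above also proves that if a solution exists, then $0$ is a solution.
22.16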
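a. The standard form problem is
\[ \text{minimize } c^\top x \quad \text{subject to } Ax = b,\ x \ge 0, \]
which can be written as
\[ \text{minimize } f(x) \quad \text{subject to } h(x) = 0,\ g(x) \le 0, \]
where $f(x) = c^\top x$, $h(x) = Ax - b$, and $g(x) = -x$. Thus, we have $Df(x) = c^\top$, $Dh(x) = A$, and $Dg(x) = -I$. The Karush-Kuhn-Tucker conditions for the above problem have the form:
\[ \mu \ge 0, \qquad c^\top + \lambda^\top A - \mu^\top = 0^\top, \qquad \mu^\top x = 0. \]
b. The Karush-Kuhn-Tucker conditions are sufficient for optimality in this case because the problem is a convex optimization problem, i.e., the objective function is a convex function, and the feasible set is a convex set.
c. The dual problem is
\[ \text{maximize } \lambda^\top b \quad \text{subject to } \lambda^\top A \le c^\top. \]
d. Let
\[ \mu = c - A^\top \lambda. \]
Since $\lambda$ is feasible for the dual, we have $\mu \ge 0$. Rewriting the above equation, we get
\[ c^\top - \lambda^\top A - \mu^\top = 0^\top. \]
The Complementary Slackness condition $(c^\top - \lambda^\top A) x = 0$ can be written as $\mu^\top x = 0$. Therefore, the Karush-Kuhn-Tucker conditions hold (with $-\lambda$ playing the role of the multiplier for the equality constraint in part a). By part b, $x$ is optimal.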
22.17
a. We can treat s1 and s2 as vectors in Rn. We have
Sa s:sx1s1 x2s2,x1,x2 2R,
Let a a, . . . , a . The optimization problem is: minimize
si a,i1…,n. x1s1 x2s2 a.
1×2 x2 212
b. The KKT conditions are:
subject to
x1s1 x2s2
0 0 0
x1s1 x2s2 a 0
x1s1 x2s2 a.
c. Yes, because the Hessian of the Lagrangian is I identity, which is positive definite.
d. Yes, this is a convex optimization problem. The objective function is quadratic, with identity Hessian hence positive definite. The constraint set is of the form Ax a, and hence is a linear variety.
22.18
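a. We first show that the set of probability vectors
\[ \Omega = \{q \in \mathbb{R}^n : q_1 + \cdots + q_n = 1,\ q_i > 0,\ i = 1, \ldots, n\} \]
is a convex set. Let $y, z \in \Omega$, so $y_1 + \cdots + y_n = 1$, $y_i > 0$, $z_1 + \cdots + z_n = 1$, and $z_i > 0$. Let $\alpha \in (0, 1)$ and $x = \alpha y + (1 - \alpha) z$. We have
\[ x_1 + \cdots + x_n = \alpha y_1 + (1 - \alpha) z_1 + \cdots + \alpha y_n + (1 - \alpha) z_n = \alpha(y_1 + \cdots + y_n) + (1 - \alpha)(z_1 + \cdots + z_n) = \alpha + (1 - \alpha) = 1. \]
Also, because $y_i > 0$, $z_i > 0$, $\alpha > 0$, and $1 - \alpha > 0$, we conclude that $x_i > 0$. Thus, $x \in \Omega$, which shows that $\Omega$ is convex.
b. We next show that the function $f$ is a convex function on $\Omega$. For this, we compute
\[ F(q) = \begin{bmatrix} \dfrac{p_1}{q_1^2} & & 0 \\ & \ddots & \\ 0 & & \dfrac{p_n}{q_n^2} \end{bmatrix}, \]
which shows that $F(q) > 0$ for all $q$ in the open set $\{q : q_i > 0,\ i = 1, \ldots, n\}$, which contains $\Omega$. Therefore, $f$ is convex on $\Omega$.
c. Fix a probability vector $p$. Consider the optimization problem
\[ \text{minimize } p_1 \log\frac{p_1}{x_1} + \cdots + p_n \log\frac{p_n}{x_n} \quad \text{subject to } x_1 + \cdots + x_n = 1,\ x_i > 0,\ i = 1, \ldots, n. \]
By parts a and b, the problem is a convex optimization problem. We ignore the constraint $x_i > 0$ and write down the Lagrange conditions for the equality-constraint problem:
\[ -\frac{p_i}{x_i^*} + \lambda^* = 0, \quad i = 1, \ldots, n, \qquad x_1^* + \cdots + x_n^* = 1. \]
Rewrite the first set of equations as $x_i^* = p_i / \lambda^*$. Combining this with the constraint and the fact that $p_1 + \cdots + p_n = 1$, we obtain $\lambda^* = 1$, which means that $x_i^* = p_i$. Therefore, the unique global minimizer is $x^* = p$.
Note that $f(x^*) = 0$. Hence, we conclude that $f(x) \ge 0$ for all $x \in \Omega$. Moreover, $f(x) = 0$ if and only if $x = p$. This proves the required result.
d. Given two probability vectors $p$ and $q$, the number
\[ D(p, q) = p_1 \log\frac{p_1}{q_1} + \cdots + p_n \log\frac{p_n}{q_n} \]
is called the relative entropy (or Kullback-Leibler divergence) between $p$ and $q$. It is used in information theory to measure the distance between two probability vectors. The result of part c justifies the use of $D$ as a measure of distance (although $D$ is not a metric because it is not symmetric).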
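The properties stated in parts c and d can be observed on a small example. The sketch below is illustrative: the probability vectors are arbitrary, the natural logarithm is assumed, and the function name is hypothetical.

```python
import math

# Sketch: relative entropy D(p, q) = sum_i p_i * log(p_i / q_i)
# for strictly positive probability vectors.

def relative_entropy(p, q):
    """Return D(p, q) using the natural logarithm."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

d_pq = relative_entropy(p, q)   # strictly positive because p != q
d_qp = relative_entropy(q, p)   # generally different from d_pq
d_pp = relative_entropy(p, p)   # zero

print(d_pq, d_qp, d_pp)
```

The two values `d_pq` and `d_qp` differ, which is the lack of symmetry that prevents $D$ from being a metric.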
22.19
We claim that a solution exists, and that it is unique. To prove the first claim, choose $\varepsilon > 0$ such that there exists $x \in \Omega$ satisfying $\|x - z\| \le \varepsilon$. Consider the modified problem
\[ \text{minimize } \|x - z\| \quad \text{subject to } x \in \Omega \cap \{y : \|y - z\| \le \varepsilon\}. \]
If this modified problem has a solution, then clearly so does the original problem. The objective function here is continuous, and the constraint set is closed and bounded. Hence, by Weierstrass's Theorem, a solution to the problem exists.
Let $f$ be the objective function. Next, we show that $f$ is convex and hence the problem is a convex optimization problem. Let $x, y \in \Omega$ and $\alpha \in (0, 1)$. Then,
\[ f(\alpha x + (1 - \alpha) y) = \|\alpha x + (1 - \alpha) y - z\| = \|\alpha(x - z) + (1 - \alpha)(y - z)\| \le \alpha\|x - z\| + (1 - \alpha)\|y - z\| = \alpha f(x) + (1 - \alpha) f(y), \]
which shows that $f$ is convex.
To prove uniqueness, let $x_1$ and $x_2$ be solutions to the problem. Then, by convexity, $x_3 = (x_1 + x_2)/2$ is also a solution. But
\[ \|x_3 - z\| = \left\| \frac{x_1 + x_2}{2} - z \right\| = \left\| \frac{x_1 - z}{2} + \frac{x_2 - z}{2} \right\| \le \frac{1}{2}\left(\|x_1 - z\| + \|x_2 - z\|\right) = \|x_3 - z\|, \]
from which we conclude that the triangle inequality above holds with equality, implying that $x_1 - z = \beta(x_2 - z)$ for some $\beta \ge 0$. Because $\|x_1 - z\| = \beta\|x_2 - z\| = \beta\|x_1 - z\|$, we have $\beta = 1$. From this, we obtain $x_1 = x_2$, which proves uniqueness.
22.20
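a. Let $A \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{n \times n}$ be symmetric and $A \ge 0$, $B \ge 0$. Fix $\alpha \in (0, 1)$, $x \in \mathbb{R}^n$, and let $C = \alpha A + (1 - \alpha) B$. Then,
\[ x^\top C x = x^\top [\alpha A + (1 - \alpha) B] x = \alpha x^\top A x + (1 - \alpha) x^\top B x. \]
Since $x^\top A x \ge 0$, $x^\top B x \ge 0$, and $\alpha, (1 - \alpha) > 0$ by assumption, then $x^\top C x \ge 0$, which proves the required result.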
b. Wefirstshowthattheconstraintsetx:F0Pnj1xjFj 0isconvex. So,letx,y2and 20,1. Letzx1y. Then,
F0
By assumption, we have
Xn j1
zjFj
Xn
F0 xj 1yjFj
j1
Xn j1
Xn j1
Xn
F0 xjFj 0
F0 F0
Xn xjFj 1 yjFj
j1
Xn
xjFj 1 F0
yjFj.
j1
Xn
F0 yjFj 0.
j1
Xn j1
fx1y cx1y
cx 1 cy
fx 1 fy
c. The objective function is already in the required form. To rewrite the constraint, let ai,j be the i,jth
entry of A, i 1,…,m, j 1,…,n. Then, the constraint Ax b can be written as ai,1×1 ai,2×2 ai,nxn bi, i1,…,m
Now form the diagonal matrices
F0 diagb1,…,bm
Fj diaga1,j,…,am,j, j 1,…,n.
By part c, we conclude that
F0
To show that the objective function fx cx is convex on , let x,y 2 and 2 0,1. Then,
which implies that z 2 .
which shows that f is convex.
zjFj 0,
j1

Note that a diagonal matrix is positive semidefinite if aPnd only if every diagonal element is nonnegative. Hence, the constraint Ax b can be written as F 0 nj1 xj F j 0. The left hand side is a diagonal matrix, and the ith diagonal element is simply bi ai,1×1 ai,2×2 ai,nxn.
22.21
a. We have
Soletx,y2and20,1. Considerzx1y. Wehave
x:x1 …xn 1; x1,…,xn 0; x1 2xi, i2,…,n.
z1zn
x1 1y1 xn 1yn x1 xn1y1 yn 1
1.
Moreover,foreachi,becausexi 0,yi 0,0and10,wehavezi 0. Finally,foreachi, z1 x1 1y1 2xi 12yi 2zi.
Hence, z 2 , which implies that is convex.
b. We first show that the negative of the objective function is convex. For this, we will compute its Hessian, which turns out to be a diagonal matrix with $i$th diagonal entry $1/x_i^2$, which is strictly positive. Hence, the Hessian is positive definite, which implies that the negative of the objective function is convex.
Combining the above with part a, we conclude that the problem is a convex optimization problem. Hence, the FONC (for set constraints) is necessary and sufficient. Let $x^*$ be a given allocation. The FONC at $x^*$ is $d^\top \nabla f(x^*) \ge 0$ for all feasible directions $d$ at $x^*$. But because $\Omega$ is convex, the FONC can be written as $(y - x^*)^\top \nabla f(x^*) \ge 0$ for all $y \in \Omega$. Computing $\nabla f(x^*)$ for $f(x) = -\sum_{i=1}^n \log(x_i)$, we get the proportional fairness condition.
22.22
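a. We rewrite the problem as a minimization problem by multiplying the objective function by $-1$. Thus, the new objective function is the sum of the functions $-U_i$. Because each $U_i$ is concave, $-U_i$ is convex, and hence their sum is convex.
To show that the constraint set $\Omega = \{x : e^\top x \le C\}$ (where $e = [1, \ldots, 1]^\top$) is convex, let $x_1, x_2 \in \Omega$, and $\alpha \in [0, 1]$. Then, $e^\top x_1 \le C$ and $e^\top x_2 \le C$. Therefore,
\[ e^\top(\alpha x_1 + (1 - \alpha) x_2) = \alpha e^\top x_1 + (1 - \alpha) e^\top x_2 \le \alpha C + (1 - \alpha) C = C, \]
which means that $\alpha x_1 + (1 - \alpha) x_2 \in \Omega$. Hence, $\Omega$ is a convex set.
b. Because the problem is a convex optimization problem, the following KKT condition is necessary and sufficient for $x^*$ to be a global minimizer:
\[ \mu^* \ge 0, \]
\[ -U_i'(x_i^*) + \mu^* = 0, \quad i = 1, \ldots, n, \]
\[ \mu^* \left( \sum_{i=1}^n x_i^* - C \right) = 0, \]
\[ \sum_{i=1}^n x_i^* \le C. \]
Note that because $-U_i(x_i) + \mu^* x_i$ is a convex function of $x_i$, the second line above can be written as $x_i^* = \arg\max_x \left(U_i(x) - \mu^* x\right)$.
c. Because each $U_i$ is concave and increasing, we conclude that $\sum_{i=1}^n x_i^* = C$; for otherwise we could increase some $x_i^*$ and hence $U_i(x_i^*)$ and also $\sum_{i=1}^n U_i(x_i^*)$, contradicting the optimality of $x^*$.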
22.23
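First note that the optimization problem that you construct cannot be a convex problem (for otherwise, the FONC implies that $x^*$ is a global minimizer, which then implies that the SONC holds). Let $f(x) = x_2$, $g(x) = -(x_2 + x_1^2)$, and $x^* = 0$. Then, $\nabla f(x^*) = [0, 1]^\top$. Any feasible direction $d$ at $x^*$ is of the form $d = [d_1, d_2]^\top$ with $d_2 \ge 0$. Hence, $d^\top \nabla f(x^*) \ge 0$, which shows that the FONC holds.
Because $\nabla g(x^*) = [0, -1]^\top$ (so $x^*$ is regular), we see that if $\mu^* = 1$, then $\nabla f(x^*) + \mu^* \nabla g(x^*) = 0$, and so the KKT condition holds.
Because $F(x^*) = O$, the SONC for set constraint holds. However,
\[ L(x^*, \mu^*) = O + \begin{bmatrix} -2 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} -2 & 0 \\ 0 & 0 \end{bmatrix} \]
and $T(x^*) = \{y : y_2 = 0\}$, which shows that the SONC for inequality constraint $g(x) \le 0$ does not hold.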
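The failure of the second-order condition can be made concrete. The sketch below is illustrative only: it assumes $f(x) = x_2$ and $g(x) = -(x_2 + x_1^2)$ as in the construction above, evaluates the Lagrangian quadratic form on the tangent space, and samples feasible boundary points with a lower objective value than $x^* = 0$. All names are hypothetical.

```python
# Sketch: at x* = 0 the KKT condition holds with mu* = 1, yet the quadratic
# form y^T L y is negative on T(x*) = {y : y2 = 0}, and nearby feasible points
# of the form (t, -t^2) have a smaller objective value than f(x*) = 0.

def f(x):
    return x[1]

def g(x):
    return -(x[1] + x[0] ** 2)

def lagrangian_quadratic_form(y, mu=1.0):
    # F = O and G = diag(-2, 0), so y^T (F + mu*G) y = -2 * mu * y1^2.
    return -2.0 * mu * y[0] ** 2

tangent_vector = (1.0, 0.0)                      # lies in T(x*)
form_value = lagrangian_quadratic_form(tangent_vector)

boundary_points = [(t, -t * t) for t in (0.5, 0.1, 0.01)]
constraint_values = [g(p) for p in boundary_points]   # zero: points are feasible
objective_values = [f(p) for p in boundary_points]    # all below f(x*) = 0

print(form_value, constraint_values, objective_values)
```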
22.24
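a. Let $x_0$ and $\mu_0$ be feasible points in the primal and dual, respectively. Then, $g(x_0) \le 0$ and $\mu_0 \ge 0$, and so $\mu_0^\top g(x_0) \le 0$. Hence,
\[ f(x_0) \ge f(x_0) + \mu_0^\top g(x_0) = l(x_0, \mu_0) \ge \min_{x \in \mathbb{R}^n} l(x, \mu_0) = q(\mu_0). \]
b. Suppose $f(x_0) = q(\mu_0)$ for feasible points $x_0$ and $\mu_0$. Let $x$ be any feasible point in the primal. Then, by part a, $f(x) \ge q(\mu_0) = f(x_0)$. Hence $x_0$ is optimal in the primal.
Similarly, let $\mu$ be any feasible point in the dual. Then, by part a, $q(\mu) \le f(x_0) = q(\mu_0)$. Hence $\mu_0$ is optimal in the dual.
c. Let $x^*$ be optimal in the primal. Then, by the KKT Theorem, there exists $\mu^* \in \mathbb{R}^m$ such that
\[ \nabla_x l(x^*, \mu^*)^\top = Df(x^*) + \mu^{*\top} Dg(x^*) = 0^\top, \qquad \mu^{*\top} g(x^*) = 0, \qquad \mu^* \ge 0. \]
Therefore, $\mu^*$ is feasible in the dual. Further, we note that $l(\cdot, \mu^*)$ is a convex function (because $f$ is convex, $\mu^{*\top} g$ is convex being the sum of convex functions, and hence $l$ is the sum of two convex functions). Hence, we have $l(x^*, \mu^*) = \min_{x \in \mathbb{R}^n} l(x, \mu^*)$. Therefore,
\[ q(\mu^*) = \min_{x \in \mathbb{R}^n} l(x, \mu^*) = l(x^*, \mu^*) = f(x^*) + \mu^{*\top} g(x^*) = f(x^*). \]
By part b, $\mu^*$ is optimal in the dual.
22.25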
a. The Schur complement of M1,1 is
11 M2: !3,2: !3M2:3,1M1,11M1,2: !3
1 2h 1i 25 1
12 2 24.

b. The Schur complement of M2: !3,2: !3 is
22.26
Let and let
x2 x3
P11 0,P20 1,andP30 0.
22

M1,1M1,2: !3M2: !3,2:31M2: !3,1 1h 1i1 21
25 1 1h 1i5 2
54. 2 1 1 P x1 x2
001001 Then, we can represent the Lyapunov inequality AP P A 0 as
where Equivalently, if and only if
22.27
AP PA x1 AP1 P1Ax2 AP2 P2A x3 AP3 P3A
x1F1 x2F2 x3F3 0,
Fi APi PiA, i1,2,3. PP andAPPA0 Fxx1F1 x2F2 x3F3 0.
The quadratic inequality
can be equivalently represented by the following LMI:
or as the following LMI:
AP PAPBR1BP 0 R BP 0,
PB APPA APPA PB0.
BP R
It is easy to verify, using Schur complements, that the above two LMIs are equivalent to the following
quadratic inequality:
AP PAPBR1BP O 0. O R

22.28
The MATLAB code is as follows:
A 0.9501 0.4860 0.4565
0.2311 0.8913 0.0185
0.6068 0.7621 0.8214;
setlmis;
Plmivar1,3 1;
lmiterm1 1 1 P,1,A,s
lmiterm2 1 1 0,0.1
lmiterm2 1 1 P,1,1
lmiterm3 1 1 P,1,1
lmiterm3 1 1 0,1
lmisgetlmis;
tmin,xfeasfeasplmis;
Pdec2matlmis,xfeas,P
23. Algorithms for Constrained Optimization
23.1
a. By drawing a simple picture, it is easy to see that x xkxk, provided x 6 0.
b. By inspection, we see that the solutions are 0,1 and 0,1. Or use Rayleighs inequality.
Hence,
12
d. Assuming x0 6 0, y0 is well defined. Hence, by part c, we can write
c. Now,
where k 1kI Qxkk. For the particular given form of Q, we have
2
Because 0,
which implies that yk ! 0. But
1 k 12
1 1, 12
xk1 xk rfxk kxk Qxk kI Qxk, xk1 1 xk
1k1 xk1 1 2xk.
2k2 yk1 1yk.
yk
y0.
1
vxk2 xk2
kqx k k 12
k ! u ut q 2 x 2
xk 1 1 2 xk2
2
xk yk2 1, 2

which implies that
xk p 1 . 2 yk2 1
Because yk ! 0, we have xk ! 1. By the expression for xk1 in part c, we see that the sign of xk does 222
not change with k. Hence, we deduce that either xk ! 1 or xk ! 1. This also implies that xk ! 0. Hence, xk converges to a solution to the problem.2 2 1
e. If x0 0, then xk 0 for all k, which means that xk 1 or 1 for all k. In this case, the algorithm 221
is stuck at the initial condition 1, 0 or 1, 0 which are in fact the minimizers.
23.2
a. Yes. To show: Suppose that xk is a global minimizer of the given problem. Then, for all x 2 , x 6 xk, we have cx cxk. Rewriting, we obtain cx xk 0. Recall that
xk rfxk arg min kx xk rfxkk2 x2
But, for any x 2 , x 6 xk,
kxxk ck2
argminkxxk ck2. x2
kxxkk2 kck2 2cxxk kck2,
where we used the facts that kxxkk2 0 and cxxk 0. On the other hand, kxk xk ck2
kck2. Hence,
b. No. Counterexample:
23.3
xk1 xk rfxk xk.
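a. Suppose $x^{(k)}$ satisfies the FONC. Then, $\nabla f(x^{(k)}) = 0$. Hence, $x^{(k+1)} = x^{(k)}$. Conversely, suppose $x^{(k)}$ does not satisfy the FONC. Then, $\nabla f(x^{(k)}) \ne 0$. Hence, $\alpha_k > 0$, and so $x^{(k+1)} \ne x^{(k)}$.
b. Case (i): Suppose $x^{(k)}$ is a corner point. Without loss of generality, take $x^{(k)} = [1, 1]^\top$. (We can do this because any other corner point can be mapped to this point by changing variables $x_i$ to $-x_i$ as appropriate.) Note that any feasible direction $d$ at $x^{(k)} = [1, 1]^\top$ satisfies $d \le 0$. Therefore,
\[ x^{(k+1)} = x^{(k)} \iff \nabla f(x^{(k)}) \le 0 \]
\[ \iff d^\top \nabla f(x^{(k)}) \ge 0 \text{ for all feasible } d \text{ at } x^{(k)} \]
\[ \iff x^{(k)} \text{ satisfies the FONC.} \]
Case (ii): Suppose $x^{(k)}$ is not a corner point (i.e., is an edge point). Without loss of generality, take $x^{(k)} \in \{x : x_1 = 1,\ -1 < x_2 < 1\}$. (We can do this because any other edge point can be mapped to this point by changing variables $x_i$ to $-x_i$ as appropriate.) Note that any feasible direction $d$ at $x^{(k)} \in \{x : x_1 = 1,\ -1 < x_2 < 1\}$ satisfies $d_1 \le 0$. Therefore,
\[ x^{(k+1)} = x^{(k)} \iff \nabla f(x^{(k)}) = [a, 0]^\top,\ a \le 0 \]
\[ \iff d^\top \nabla f(x^{(k)}) \ge 0 \text{ for all feasible } d \text{ at } x^{(k)} \]
\[ \iff x^{(k)} \text{ satisfies the FONC.} \]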

23.4
By definition of , we have
By Exercise 6.7, we can write
x0 y
argminkxx0 yk x2
argminkxx0yk. x2
have
where P I AAA1A. Hence,
23.5
arg min kz yk P y, z2N A
x0 y x0 P y.
argminkxx0ykx0 argminkzyk. x2 z2N A
The term arg minz2N A kz yk is simply the orthogonal projection of y onto N A. By Exercise 6.7, we
Sincek 0isaminimizerofkfxkPgk,weapplytheFONCtoktoobtain 0kxk PgkQPgkbPgk.
Therefore,0k0ifgkPQPgk xkQbPgk. But xkQ b gk.
Hence
23.6
gkP gk
k gkP QP gk .
By Exercise 23.5, the projected steepest descent algorithm applied to this problem takes the form
xk1
Ifx0 2x:Axb,thenAx0 b,andhence
x1 AAA1b which solves the problem see Section 12.3.
23.7
a. Define
By the Chain Rule,
Since k minimizes k, 0kk 0, and thus gk1Pgk 0.
kfxk Prfxk 0kk dkk
xk P xk InPxk
AAA1Axk.
d
rfxk kPrfxkPrfxk rfxk1Prfxk.

b.Wehavexk1xkkPgk andxk2xk1k1Pgk1.Therefore,
xk2 xk1xk1 xk byparta,andthefactthatP P P2.
23.8
k1kgk1P P gk k1kgk1P gk 0
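a. minimize $f(x) + \gamma P(x)$.
b. Suppose $x_\gamma \notin \Omega$. Then, $P(x_\gamma) > 0$ by definition of $P$. Because $x_\gamma$ is a global minimizer of the unconstrained problem, we have
\[ f(x_\gamma) + \gamma P(x_\gamma) \le f(x^*) + \gamma P(x^*) = f(x^*), \]
which implies that
\[ f(x_\gamma) \le f(x^*) - \gamma P(x_\gamma) < f(x^*). \]
23.9
We use the penalty method. First, we construct the unconstrained objective function with penalty parameter $\gamma$:
\[ f_\gamma(x) = x_1^2 + 2x_2^2 + \gamma(x_1 + x_2 - 3)^2. \]
Because $f_\gamma$ is a quadratic with positive definite quadratic term, it is easy to find its minimizer:
\[ x_\gamma = \frac{1}{1 + \frac{2}{3\gamma}} \begin{bmatrix} 2 \\ 1 \end{bmatrix}. \]
For example, we can obtain the above by solving the FONC:
\[ 2(1 + \gamma) x_1 + 2\gamma x_2 - 6\gamma = 0, \]
\[ 2\gamma x_1 + 2(2 + \gamma) x_2 - 6\gamma = 0. \]
Now letting $\gamma \to \infty$, we obtain
\[ x^* = \begin{bmatrix} 2 \\ 1 \end{bmatrix}. \]
It is easy to verify, using other means, that this is indeed the correct solution.
23.10
Using the penalty method, we construct the unconstrained problem
\[ \text{minimize } x + \gamma \left(\max(a - x, 0)\right)^2. \]
To find the solution to the above problem, we use the FONC. It is easy to see that the solution $x^*$ satisfies $x^* < a$. The derivative of the above objective function in the region $x < a$ is $1 + 2\gamma(x - a)$. Thus, by the FONC, we have $x^* = a - 1/(2\gamma)$. Since the true solution is at $a$, the difference is $1/(2\gamma)$. Therefore, for $1/(2\gamma) \le \varepsilon$, we need $\gamma \ge 1/(2\varepsilon)$. The smallest such $\gamma$ is $1/(2\varepsilon)$.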
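The relationship between the penalty parameter and the accuracy of the penalized solution in the one-dimensional problem can be tabulated directly. The sketch below is illustrative: it assumes the penalized objective $x + \gamma(\max(a - x, 0))^2$, the sample values of `a` and `eps` are arbitrary, and the function names are hypothetical.

```python
# Sketch: minimizer of the penalized objective and the smallest penalty
# parameter that brings it within eps of the true solution x = a.

def penalized_objective(x, a, gamma):
    return x + gamma * max(a - x, 0.0) ** 2

def penalized_minimizer(a, gamma):
    """Solution of the FONC 1 + 2*gamma*(x - a) = 0 in the region x < a."""
    return a - 1.0 / (2.0 * gamma)

def smallest_gamma(eps):
    """Smallest gamma with |x_gamma - a| <= eps."""
    return 1.0 / (2.0 * eps)

a, eps = 1.0, 0.01
gamma = smallest_gamma(eps)
x_gamma = penalized_minimizer(a, gamma)
gap = a - x_gamma

# Crude sanity check: the objective is no smaller at nearby points.
neighbors = [x_gamma - 1e-3, x_gamma + 1e-3]
is_local_min = all(
    penalized_objective(x_gamma, a, gamma) <= penalized_objective(x, a, gamma)
    for x in neighbors
)

print(gamma, x_gamma, gap, is_local_min)
```

Doubling `gamma` halves the gap, which matches the $1/(2\gamma)$ expression in the solution.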
23.11
a. We have
1kxk2 kAxbk2 1x 12 2 xx 2. 2 22122

The above is a quadratic with positive definite Hessian. Therefore, the minimizer is x 12 2 12
2 12 2 1 1.
212 1
lim x 1 1.
Hence,
The solution to the original constrained problem is see Section 12.3
!1 21
x AAA1b 1 1 .
21
b. We represent the objective function of the associated unconstrained problem as
1kxk2 kAxbk2 1x In 2AAxx 2Abbb. 22
The above is a quadratic with positive definite Hessian. Therefore, the minimizer is x In 2AA1 2Ab
Let A U S
1 1
2In AA Ab.
O V be the singular value decomposition of A. For simplicity, denote 12. We have
x InAA1Ab
S !1
InV O UUS OV Ab S2 O !1
Note that Also,
V A OS U .
Im S2 O 1 Im S21 O O I O 1I ,
InVOOV Ab I S2 O 1
V m VAb. O Inm
nm
nm

where Im S21 is diagonal. Hence,
InAA1Ab
V Im S21 O SU
O 1Inm O S
V O ImS21U
V OSUUIm S21U
AUIm S21U.
UIm S21U ! US21U.
But, Therefore,
US21U US2U1 AA1. x ! AAA1b x.
x
Note that as ! 1, ! 0, and
24. MultiObjective Optimization
24.1
The MATLAB code is as follows:
function multiop
MULTIOP, illustrates multiobjective optimization.
clear
clc
disp
disp This is a demo illustrating multiobjective optimization.
disp The numerical example is a modification of the example
disp from the 2002 book by A. Osyczka,
disp Example 5.1 on pages 101105
disp
disp Select the population size denoted POPSIZE, for example, 50.
disp
POPSIZEinputPopulation size POPSIZE ;
disp
disp Select the number of iterations denoted NUMITER; e.g., 10.
disp
NUMITERinputNumber of iterations NUMITER ;
disp
disp
Main
for i 1:NUMITER
fprintfWorking on Iteration .0f…n,i
xmat genxmatPOPSIZE;
if i1
for j 1:lengthxR

xmat xmat;xRj;
end
end
xR,fR SelectPxmat;
fprintfNumber of Pareto solutions: .0fn,lengthfR
end
disp
disp
fprintf Pareto solutions n
celldispxR
disp
disp
fprintf Objective vector values n
celldispfR
xlabelf1,Fontsize,16
ylabelf2,Fontsize,16
titlePareto optimal front,Fontsize,16
setgca,Fontsize,16
grid
for i1:lengthxR
xxixRi1;
yyixRi2;
end
XXxx; yy;
figure
axis1 7 5 10
hold on
for i1:sizeXX,2
plotXX1,i,XX2,i,marker,o,markersize,6
end
xlabelx1,Fontsize,16
ylabelx2,Fontsize,16
titlePareto optimal solutions,Fontsize,16
setgca,Fontsize,16
grid
hold off
figure
axis2 10 2 13
hold on
plot2 6,5 5,marker,o,markersize,6
plot6 6,5 9,marker,o,markersize,6
plot2 6,9 9,marker,o,markersize,6
plot2 2,5 9,marker,o,markersize,6
for i1:sizeXX,2
plotXX1,i,XX2,i,marker,x,markersize,10
end
x12:.2:10;
x22:.2:13;
X1, X2meshgridx1,x2;
Z1X1.2 X2;
v0 5 7 10 15 20 30 40 60;
cs1contourX1,X2,Z1,v;
clabelcs1
Z2X1X2.2;
v220 25 35 40 60 80 100 120;
cs2contourX1,X2,Z2,v2;
213

clabelcs2
xlabelx1,Fontsize,16
ylabelx2,Fontsize,16
titleLevel sets of f1 and f2, and Pareto optimal
points,Fontsize,16
setgca,Fontsize,16
grid
hold off

function xmat0 = genxmat(POPSIZE)
% Random population in the box 2 <= x1 <= 6, 5 <= x2 <= 9.
xmat0 = rand(POPSIZE, 2);
xmat0(:,1) = xmat0(:,1)*4 + 2;
xmat0(:,2) = xmat0(:,2)*4 + 5;
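
function [xR, fR] = SelectP(xmat)
% Declaration
J = size(xmat, 1);
% Init
Rset = [1];
j = 1;
isstep7 = 0;
% Step 1
x{1} = xmat(1,:);
f{1} = evalfcn(x{1});
% Step 2
while j < J
    j = j + 1;
    % Step 3
    r = 1;
    rdel = [];
    q = 0;
    R = length(Rset);
    for k = 1:size(xmat, 1)
        x{k} = xmat(k,:);
        f{k} = evalfcn(x{k});
    end
    % Step 4
    while 1
        % for r = 1:R
        if all(f{j} < f{Rset(r)})
            % Candidate j dominates member r: mark r for deletion.
            q = q + 1;
            rdel = [rdel r];
        else
            % Step 5
            if all(f{j} >= f{Rset(r)})
                % Candidate j is dominated: discard it.
                break
            end
        end
        % Step 6
        r = r + 1;
        if r > R
            isstep7 = 1;
            break
        end
    end
    % Step 7
    if isstep7 == 1
        isstep7 = 0;
        if q ~= 0
            Rset(rdel) = [];
            Rset = [Rset j];
        else
            % Step 8
            Rset = [Rset j];
        end
    end
    for k = 1:size(xmat, 1)
        x{k} = xmat(k,:);
        f{k} = evalfcn(x{k});
    end
    R = length(Rset);
end
% Return the Pareto solution.
for i = 1:length(Rset)
    xR{i} = x{Rset(i)};
    fR{i} = f{Rset(i)};
end
% Plot Pareto points (red crosses) and dominated points (blue dots).
x1 = [];
y1 = [];
x2 = [];
y2 = [];
for k = 1:size(xmat, 1)
    if ismember(k, Rset)
        x1 = [x1 f{k}(1)];
        y1 = [y1 f{k}(2)];
    else
        x2 = [x2 f{k}(1)];
        y2 = [y2 f{k}(2)];
    end
end
newplot
plot(x1, y1, 'xr', x2, y2, '.b')
drawnow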

function y = f1(x)
% y = x(1)^2 + x(2);
% The above function is the original function in Osyczka's 2002 book,
% Example 5.1, page 101.
% Its negative makes a much more interesting example.
y = -(x(1)^2 + x(2));

function y = f2(x)
y = x(1) + x(2)^2;

function y = evalfcn(x)
% Objective vector [f1(x), f2(x)].
y(1) = f1(x);
y(2) = f2(x);
24.2
a. We proceed using contraposition. Assume that $x^*$ is not Pareto optimal. Therefore, there exists a point $\hat{x} \in \Omega$ such that
$$f_i(\hat{x}) \le f_i(x^*) \quad \text{for all } i = 1, 2, \ldots, \ell,$$
and for some $j$, $f_j(\hat{x}) < f_j(x^*)$. Since $c > 0$,
$$c^\top f(\hat{x}) < c^\top f(x^*),$$
which implies that $x^*$ is not a global minimizer for the weighted-sum problem.
For the converse, consider the following counterexample: $\Omega = \{x \in \mathbb{R}^2 : \|x\| \ge 1,\ x \ge 0\}$ and $f(x) = [x_1, x_2]^\top$. It is easy to see that the Pareto front is $\{x : \|x\| = 1,\ x \ge 0\}$ (i.e., the part of the unit circle in the nonnegative quadrant). So $x^* = \frac{1}{\sqrt{2}}[1, 1]^\top$ is a Pareto minimizer. However, there is no $c > 0$ such that $x^*$ is a global minimizer of the weighted-sum problem. To see this, fix $c > 0$ (assuming $c_1 \le c_2$ without loss of generality) and consider the objective function value $c^\top f(x^*) = (c_1 + c_2)/\sqrt{2}$ for the weighted-sum problem. Now, the point $x^0 = [1, 0]^\top$ is also a feasible point. Moreover,
$$c^\top f(x^0) = c_1 \le \frac{c_1 + c_2}{2} < \frac{c_1 + c_2}{\sqrt{2}} = c^\top f(x^*).$$
So $x^*$ is not a global minimizer of the weighted-sum problem.
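To make the counterexample concrete, the following worked instance uses the weight vector $c = [1, 2]^\top$, which is chosen here only for illustration and satisfies $c_1 \le c_2$:

```latex
\begin{align*}
c^\top f(x^*) &= \frac{c_1 + c_2}{\sqrt{2}} = \frac{3}{\sqrt{2}} \approx 2.121, \\
c^\top f(x^0) &= c_1 \cdot 1 + c_2 \cdot 0 = 1, \\
c^\top f(x^0) &= 1 \;\le\; \frac{3}{2} \;<\; \frac{3}{\sqrt{2}} = c^\top f(x^*).
\end{align*}
```

The feasible corner $x^0 = [1, 0]^\top$ beats the Pareto point on the arc. This reflects the fact that the feasible set here is not convex, so a linear weighted sum is minimized at an end of the arc rather than in its interior.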
b. We proceed using contraposition. Assume that $x^*$ is not Pareto optimal. Therefore, there exists a point $\hat{x} \in \Omega$ such that
$$f_i(\hat{x}) \le f_i(x^*) \quad \text{for all } i = 1, 2, \ldots, \ell,$$
and for some $j$, $f_j(\hat{x}) < f_j(x^*)$. By assumption, for all $i = 1, \ldots, \ell$, $f_i(x) \ge 0$, which implies that
$$\sum_{i=1}^{\ell} f_i(\hat{x})^p < \sum_{i=1}^{\ell} f_i(x^*)^p$$
because $p > 0$. Hence, $x^*$ is not a global minimizer for the minimum-norm problem.
For the converse, consider the following counterexample: $\Omega = \{x \in \mathbb{R}^2 : x_1 + 2x_2 \ge 2,\ x \ge 0\}$ and $f(x) = [x_1, x_2]^\top$. It is easy to see that the Pareto front is $\{x : x_1 + 2x_2 = 2,\ x \ge 0\}$. So $x^* = [1, 1/2]^\top$ is a Pareto minimizer. However, there is no $p > 0$ such that $x^*$ is a global minimizer of the minimum-norm problem. To see this, fix $p > 0$ and consider the objective function value $\sum_i f_i(x^*)^p = 1 + (1/2)^p$ for the minimum-norm problem. Now, the point $x^0 = [0, 1]^\top$ is also a feasible point. Moreover,
$$\sum_i f_i(x^0)^p = 1 < 1 + (1/2)^p = \sum_i f_i(x^*)^p.$$
So $x^*$ is not a global minimizer of the minimum-norm problem.
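For a concrete feel, the comparison can be evaluated at two specific exponents. The values $p = 1$ and $p = 2$ below are picked only as examples; the inequality holds for every $p > 0$ because $(1/2)^p > 0$:

```latex
\begin{align*}
p = 1: \quad & \textstyle\sum_i f_i(x^*)^p = 1 + \tfrac{1}{2} = \tfrac{3}{2},
  & \textstyle\sum_i f_i(x^0)^p &= 0 + 1 = 1, \\
p = 2: \quad & \textstyle\sum_i f_i(x^*)^p = 1 + \tfrac{1}{4} = \tfrac{5}{4},
  & \textstyle\sum_i f_i(x^0)^p &= 0 + 1 = 1.
\end{align*}
```

In both cases the endpoint $x^0 = [0, 1]^\top$ of the Pareto front has a strictly smaller objective value than $x^* = [1, 1/2]^\top$.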
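c. For the first part, consider the following counterexample: $\Omega = \{x \in \mathbb{R}^2 : x_1 + x_2 \ge 2,\ x \ge 0\}$ and $f(x) = [x_1, x_2]^\top$. The Pareto front is $\{x : x_1 + x_2 = 2,\ x \ge 0\}$, and $x^* = [1/2, 3/2]^\top$ is a Pareto minimizer. But
$$\max\{f_1(x^*), f_2(x^*)\} = \max\{1/2, 3/2\} = 3/2.$$
However, $x^0 = [1, 1]^\top$ is also a feasible point, and $\max\{f_1(x^0), f_2(x^0)\} = 1 < 3/2$. Hence, $x^*$ is not a global minimizer of the minimax problem.
For the second part, suppose $\Omega = \{x \in \mathbb{R}^2 : x_1 \le 2\}$ and $f(x) = [x_1, 2]^\top$. Then, for any $x \in \Omega$, $\max\{f_1(x), f_2(x)\} = 2$. So any $x^* \in \Omega$ is a global minimizer of the minimax single-objective problem.
However, consider another point $\hat{x}$ such that $\hat{x}_1 < x_1^*$. Then, $f_1(\hat{x}) < f_1(x^*)$ and $f_2(\hat{x}) = f_2(x^*)$. Hence, $x^*$ is not a Pareto minimizer.
In fact, in the above example, no Pareto minimizer exists. However, if we set $\Omega = \{x \in \mathbb{R}^2 : 1 \le x_1 \le 2\}$, then the counterexample is still valid, but in this case any point of the form $[1, x_2]^\top$ is a Pareto minimizer.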
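24.3
Let
$$f(x) = c^\top \boldsymbol{f}(x),$$
where $\boldsymbol{f} = [f_1, \ldots, f_\ell]^\top$ and $x \in \Omega = \{x : h(x) = 0\}$. The function $f$ is convex because all the functions $f_i$ are convex and $c_i > 0$, $i = 1, 2, \ldots, \ell$. We can represent the given first-order condition in the following form: for any feasible direction $d$ at $x^*$, we have
$$d^\top \nabla f(x^*) \ge 0.$$
By Theorem 22.7, the point $x^*$ is a global minimizer of $f$ over $\Omega$. Therefore,
$$f(x^*) \le f(x) \quad \text{for all } x \in \Omega.$$
That is,
$$\sum_{i=1}^{\ell} c_i f_i(x^*) \le \sum_{i=1}^{\ell} c_i f_i(x) \quad \text{for all } x \in \Omega.$$
To finish the proof, we now assume that $x^*$ is not Pareto optimal and the above condition holds. We then proceed using the proof by contradiction. Because, by assumption, $x^*$ is not Pareto optimal, there exists a point $\hat{x} \in \Omega$ such that
$$f_i(\hat{x}) \le f_i(x^*) \quad \text{for all } i = 1, 2, \ldots, \ell,$$
and for some $j$, $f_j(\hat{x}) < f_j(x^*)$. Since for all $i = 1, 2, \ldots, \ell$, $c_i > 0$, we must have
$$\sum_{i=1}^{\ell} c_i f_i(x^*) > \sum_{i=1}^{\ell} c_i f_i(\hat{x}),$$
which contradicts the above condition, $\sum_{i=1}^{\ell} c_i f_i(x^*) \le \sum_{i=1}^{\ell} c_i f_i(x)$ for all $x \in \Omega$. This completes the proof. See also Exercise 24.2, part a.
24.4
Let
$$f(x) = c^\top \boldsymbol{f}(x),$$
where $\boldsymbol{f} = [f_1, \ldots, f_\ell]^\top$ and $x \in \Omega = \{x : h(x) = 0\}$. The function $f$ is convex because all the functions $f_i$ are convex and $c_i > 0$, $i = 1, 2, \ldots, \ell$. We can represent the given Lagrange condition in the form
$$Df(x^*) + \lambda^{*\top} Dh(x^*) = 0^\top,$$
$$h(x^*) = 0.$$
By Theorem 22.8, the point $x^*$ is a global minimizer of $f$ over $\Omega$. Therefore,
$$f(x^*) \le f(x) \quad \text{for all } x \in \Omega.$$
That is,
$$\sum_{i=1}^{\ell} c_i f_i(x^*) \le \sum_{i=1}^{\ell} c_i f_i(x) \quad \text{for all } x \in \Omega.$$
To finish the proof, we now assume that $x^*$ is not Pareto optimal and the above condition holds. We then proceed using the proof by contradiction. Because, by assumption, $x^*$ is not Pareto optimal, there exists a point $\hat{x} \in \Omega$ such that
$$f_i(\hat{x}) \le f_i(x^*) \quad \text{for all } i = 1, 2, \ldots, \ell,$$
and for some $j$, $f_j(\hat{x}) < f_j(x^*)$. Since for all $i = 1, 2, \ldots, \ell$, $c_i > 0$, we must have
$$\sum_{i=1}^{\ell} c_i f_i(x^*) > \sum_{i=1}^{\ell} c_i f_i(\hat{x}),$$
which contradicts the above condition, $\sum_{i=1}^{\ell} c_i f_i(x^*) \le \sum_{i=1}^{\ell} c_i f_i(x)$ for all $x \in \Omega$. This completes the proof. See also Exercise 24.2, part a.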
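24.5
Let
$$f(x) = c^\top \boldsymbol{f}(x),$$
where $\boldsymbol{f} = [f_1, \ldots, f_\ell]^\top$ and $x \in \Omega = \{x : g(x) \le 0\}$. The function $f$ is convex because all the functions $f_i$ are convex and $c_i > 0$, $i = 1, 2, \ldots, \ell$. We can represent the given KKT condition in the form
$$\mu^* \ge 0,$$
$$Df(x^*) + \mu^{*\top} Dg(x^*) = 0^\top,$$
$$\mu^{*\top} g(x^*) = 0,$$
$$g(x^*) \le 0.$$
By Theorem 22.9, the point $x^*$ is a global minimizer of $f$ over $\Omega$. Therefore,
$$f(x^*) \le f(x) \quad \text{for all } x \in \Omega.$$
That is,
$$\sum_{i=1}^{\ell} c_i f_i(x^*) \le \sum_{i=1}^{\ell} c_i f_i(x) \quad \text{for all } x \in \Omega.$$
To finish the proof, we now assume that $x^*$ is not Pareto optimal and the above condition holds. We then proceed using the proof by contradiction. Because, by assumption, $x^*$ is not Pareto optimal, there exists a point $\hat{x} \in \Omega$ such that
$$f_i(\hat{x}) \le f_i(x^*) \quad \text{for all } i = 1, 2, \ldots, \ell,$$
and for some $j$, $f_j(\hat{x}) < f_j(x^*)$. Since for all $i = 1, 2, \ldots, \ell$, $c_i > 0$, we must have
$$\sum_{i=1}^{\ell} c_i f_i(x^*) > \sum_{i=1}^{\ell} c_i f_i(\hat{x}),$$
which contradicts the above condition, $\sum_{i=1}^{\ell} c_i f_i(x^*) \le \sum_{i=1}^{\ell} c_i f_i(x)$ for all $x \in \Omega$. This completes the proof. See also Exercise 24.2, part a.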
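24.6
Let
$$f(x) = c^\top \boldsymbol{f}(x),$$
where $\boldsymbol{f} = [f_1, \ldots, f_\ell]^\top$ and $x \in \Omega = \{x : h(x) = 0,\ g(x) \le 0\}$. The function $f$ is convex because all the functions $f_i$ are convex and $c_i > 0$, $i = 1, 2, \ldots, \ell$. We can represent the given KKT-type condition in the form
$$\mu^* \ge 0,$$
$$Df(x^*) + \lambda^{*\top} Dh(x^*) + \mu^{*\top} Dg(x^*) = 0^\top,$$
$$\mu^{*\top} g(x^*) = 0,$$
$$h(x^*) = 0,$$
$$g(x^*) \le 0.$$
By Theorem 22.9, the point $x^*$ is a global minimizer of $f$ over $\Omega$. Therefore,
$$f(x^*) \le f(x) \quad \text{for all } x \in \Omega.$$
That is,
$$\sum_{i=1}^{\ell} c_i f_i(x^*) \le \sum_{i=1}^{\ell} c_i f_i(x) \quad \text{for all } x \in \Omega.$$
To finish the proof, we now assume that $x^*$ is not Pareto optimal and the above condition holds. We then proceed using the proof by contradiction. Because, by assumption, $x^*$ is not Pareto optimal, there exists a point $\hat{x} \in \Omega$ such that
$$f_i(\hat{x}) \le f_i(x^*) \quad \text{for all } i = 1, 2, \ldots, \ell,$$
and for some $j$, $f_j(\hat{x}) < f_j(x^*)$. Since for all $i = 1, 2, \ldots, \ell$, $c_i > 0$, we must have
$$\sum_{i=1}^{\ell} c_i f_i(x^*) > \sum_{i=1}^{\ell} c_i f_i(\hat{x}),$$
which contradicts the above condition, $\sum_{i=1}^{\ell} c_i f_i(x^*) \le \sum_{i=1}^{\ell} c_i f_i(x)$ for all $x \in \Omega$. This completes the proof. See also Exercise 24.2, part a.
24.7
The given minimax problem is equivalent to the problem given in the hint:
$$\text{minimize } z \quad \text{subject to } f_i(x) - z \le 0, \quad i = 1, 2.$$
Suppose $[x^{*\top}, z^*]^\top$ is a local minimizer for the above problem (which is equivalent to $x^*$ being a local minimizer for the original problem). Then, by the KKT Theorem, there exists $\mu^* \ge 0$, where $\mu^* \in \mathbb{R}^2$, such that
$$[0^\top, 1] + \mu^{*\top} \begin{bmatrix} \nabla f_1(x^*)^\top & -1 \\ \nabla f_2(x^*)^\top & -1 \end{bmatrix} = 0^\top,$$
$$\mu^{*\top} \begin{bmatrix} f_1(x^*) - z^* \\ f_2(x^*) - z^* \end{bmatrix} = 0.$$
Rewriting the first equation above, we get
$$\mu_1^* \nabla f_1(x^*) + \mu_2^* \nabla f_2(x^*) = 0, \qquad \mu_1^* + \mu_2^* = 1.$$
Rewriting the second equation, we get
$$\mu_i^* (f_i(x^*) - z^*) = 0, \quad i = 1, 2.$$
Suppose $f_i(x^*) < \max\{f_1(x^*), f_2(x^*)\}$, where $i \in \{1, 2\}$. Then, $z^* > f_i(x^*)$. Hence, by the above equation we conclude that $\mu_i^* = 0$.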
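The conditions above can be checked on a small scalar instance. The functions $f_1(x) = x^2$ and $f_2(x) = (x - 2)^2$ with $x \in \mathbb{R}$ are assumed here only for illustration and are not taken from the exercise. The minimax point is where the two parabolas cross, and the multipliers follow from the two rewritten equations:

```latex
\begin{align*}
&\max\{x^2, (x-2)^2\} \text{ is minimized at } x^* = 1, \qquad z^* = f_1(1) = f_2(1) = 1, \\
&\nabla f_1(x^*) = 2x^* = 2, \qquad \nabla f_2(x^*) = 2(x^* - 2) = -2, \\
&2\mu_1^* - 2\mu_2^* = 0, \quad \mu_1^* + \mu_2^* = 1
  \;\Longrightarrow\; \mu_1^* = \mu_2^* = \tfrac{1}{2} \ge 0, \\
&\mu_i^* \,(f_i(x^*) - z^*) = \tfrac{1}{2}\,(1 - 1) = 0, \quad i = 1, 2.
\end{align*}
```

Both constraints are active in this instance, so both multipliers may be positive. If one function were strictly below the maximum at $x^*$, its multiplier would have to vanish, as stated in the last step of the solution.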