Design optimization algorithms and tools
Population-based algorithms
ME 564/SYS 564
Wed Oct 17, 2018
Steven Hoffenson
Goal of Week 8: To do another KKT example, and then learn some increasingly-common algorithms that use multiple points in each iteration
Recap: How to optimize
1. Formulate the problem
a) Define system boundaries
b) Develop analytical models
c) Explore/reduce the problem space
d) Formalize optimization problem
2. Solve the problem
a) Choose the right approach/algorithm
b) Solve (by hand, code, or software)
c) Interpret the results
d) Iterate if needed
$$\mathbf{x}_{k+1} = \mathbf{x}_k - \mathbf{H}(\mathbf{x}_k)^{-1}\nabla f(\mathbf{x}_k)$$
(Step 1: Weeks 1-2, 4, 9-12. Step 2, TODAY's focus: Weeks 3, 5-8, 12.)
Recap: Gradient-based methods
• Unconstrained
• FONC and SOSC (when math is simple enough for algebra)
• Gradient descent (algorithm with linear convergence)
• Newton method (algorithm with quadratic convergence)
• Constrained
• Reduced gradient (with known active constraints)
• Generalized Reduced Gradient (algorithm with active constraints)
• Active set strategy (algorithm with an updating set of active constraints)
• Lagrangian (equality or active inequality constr)
• KKT conditions (with any inequality and equality constr)
• Quasi-Newton methods (2nd-derivative-free)
• SQP (efficiently handles constraints)
Recap: Quasi-Newton and SQP
Quasi-Newton methods approximate $\mathbf{H}^{-1}$ to simplify the math.
Sequential Quadratic Programming (SQP) solves a subproblem for the step size and direction $\mathbf{s}_k$, then moves as
$$\mathbf{x}_{k+1} = \mathbf{x}_k + \mathbf{s}_k$$
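As a quick illustration, MATLAB's fminunc uses a quasi-Newton (BFGS) update by default, so it never forms second derivatives. A minimal sketch (the quadratic test function here is our own placeholder; requires the Optimization Toolbox):

% fminunc's default 'quasi-newton' algorithm builds an approximation of
% the inverse Hessian from successive gradients; no Hessian is computed.
fun = @(x) x(1)^2 + x(2)^2 + 40*x(1) + 20*x(2);   % placeholder quadratic
x0 = [0 0];
opts = optimoptions('fminunc', 'Algorithm', 'quasi-newton');
[xopt, fopt] = fminunc(fun, x0, opts)             % expect xopt near [-20 -10]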
Example: KKT conditions (5.14)

$$\min\ f(\mathbf{x}) = x_1^2 + x_2^2 + x_3^2 + 40x_1 + 20x_2 - 3000$$
$$\text{s.t.}\quad g_1(\mathbf{x}) = x_1 - 50 \ge 0$$
$$g_2(\mathbf{x}) = x_1 + x_2 - 100 \ge 0$$
$$g_3(\mathbf{x}) = x_1 + x_2 + x_3 - 150 \ge 0$$

Recall the KKT conditions:
1. $\nabla_{\mathbf{x}}\mathcal{L} = \nabla f(\mathbf{x}^*) + \boldsymbol{\lambda}^T\nabla\mathbf{h}(\mathbf{x}^*) + \boldsymbol{\mu}^T\nabla\mathbf{g}(\mathbf{x}^*) = \mathbf{0}^T$
2. $\mathbf{h}(\mathbf{x}^*) = \mathbf{0}$, $\mathbf{g}(\mathbf{x}^*) \le \mathbf{0}$
3. $\boldsymbol{\lambda}$ unrestricted in sign, $\boldsymbol{\mu} \ge \mathbf{0}$
4. $\boldsymbol{\mu}^T\mathbf{g}(\mathbf{x}^*) = 0$
Example: KKT conditions (5.14)

$$\min\ f(\mathbf{x}) = x_1^2 + x_2^2 + x_3^2 + 40x_1 + 20x_2 - 3000$$
$$\text{s.t.}\quad g_1 = x_1 - 50 \ge 0,\quad g_2 = x_1 + x_2 - 100 \ge 0,\quad g_3 = x_1 + x_2 + x_3 - 150 \ge 0$$

The constraints all need to be multiplied by $-1$ to become negative null ($\mathbf{g} \le \mathbf{0}$)!

$$\nabla f(\mathbf{x}^*) = \begin{bmatrix} \partial f/\partial x_1 & \partial f/\partial x_2 & \partial f/\partial x_3 \end{bmatrix} = \begin{bmatrix} 2x_1 + 40 & 2x_2 + 20 & 2x_3 \end{bmatrix}$$

$$\nabla\mathbf{g}(\mathbf{x}^*) = \begin{bmatrix} \partial g_1/\partial x_1 & \partial g_1/\partial x_2 & \partial g_1/\partial x_3 \\ \partial g_2/\partial x_1 & \partial g_2/\partial x_2 & \partial g_2/\partial x_3 \\ \partial g_3/\partial x_1 & \partial g_3/\partial x_2 & \partial g_3/\partial x_3 \end{bmatrix} = \begin{bmatrix} -1 & 0 & 0 \\ -1 & -1 & 0 \\ -1 & -1 & -1 \end{bmatrix}$$

$$\nabla_{\mathbf{x}}\mathcal{L} = \begin{bmatrix} 2x_1 + 40 & 2x_2 + 20 & 2x_3 \end{bmatrix} + \mathbf{0}^T + \begin{bmatrix} \mu_1 & \mu_2 & \mu_3 \end{bmatrix} \begin{bmatrix} -1 & 0 & 0 \\ -1 & -1 & 0 \\ -1 & -1 & -1 \end{bmatrix} = \mathbf{0}^T$$
Example: KKT conditions (5.14)

Expanding $\nabla_{\mathbf{x}}\mathcal{L} = \mathbf{0}^T$ gives 3 equations in 6 unknowns:
$$2x_1 + 40 - \mu_1 - \mu_2 - \mu_3 = 0$$
$$2x_2 + 20 - \mu_2 - \mu_3 = 0$$
$$2x_3 - \mu_3 = 0$$

Complementary slackness, $\boldsymbol{\mu}^T\mathbf{g}(\mathbf{x}^*) = 0$, gives 3 more equations (with $\boldsymbol{\mu} \ge \mathbf{0}$ and $\mathbf{g} \le \mathbf{0}$, every term $\mu_i g_i$ is non-positive, so the sum is zero only if each term is zero):
$$\mu_1(-x_1 + 50) = 0$$
$$\mu_2(-x_1 - x_2 + 100) = 0$$
$$\mu_3(-x_1 - x_2 - x_3 + 150) = 0$$

(Recall the remaining KKT conditions: $\mathbf{h}(\mathbf{x}^*) = \mathbf{0}$, $\mathbf{g}(\mathbf{x}^*) \le \mathbf{0}$, $\boldsymbol{\lambda}$ unrestricted, $\boldsymbol{\mu} \ge \mathbf{0}$.)
Example: KKT conditions (5.14)

$$2x_1 + 40 - \mu_1 - \mu_2 - \mu_3 = 0$$
$$2x_2 + 20 - \mu_2 - \mu_3 = 0$$
$$2x_3 - \mu_3 = 0$$
$$\mu_1(-x_1 + 50) = 0$$
$$\mu_2(-x_1 - x_2 + 100) = 0$$
$$\mu_3(-x_1 - x_2 - x_3 + 150) = 0$$

6 equations, 6 unknowns. How do we solve?

Each of the lower 3 equations offers two possibilities: either $\mu_i = 0$, or the expression in parentheses equals 0. So we will have to examine $2^3 = 8$ scenarios for this system of equations, and check each solution against the remaining KKT conditions: $\mathbf{g}(\mathbf{x}^*) \le \mathbf{0}$, $\boldsymbol{\mu} \ge \mathbf{0}$. (We don't need the $\mathbf{h}(\mathbf{x}^*) = \mathbf{0}$ and $\boldsymbol{\lambda}$ conditions, since there are no equality constraints in this problem.)
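This enumeration is easy to automate. Below is a minimal MATLAB sketch (our own illustration, not from the textbook): for each of the 8 scenarios it assembles the 6 linear equations, solves them, and reports any solution that passes the remaining feasibility and sign checks. The 1e-9 tolerance and variable names are arbitrary choices.

% Unknowns z = [x1 x2 x3 mu1 mu2 mu3]'. Stationarity gives 3 fixed rows;
% each scenario adds 3 rows: either mu_i = 0 or constraint i is active.
Astat = [2 0 0 -1 -1 -1;
         0 2 0  0 -1 -1;
         0 0 2  0  0 -1];
bstat = [-40; -20; 0];
G = [1 0 0; 1 1 0; 1 1 1];              % g_i(x) = G(i,:)*x - c(i) >= 0
c = [50; 100; 150];
for s = 0:7
    act = logical(bitget(s, 1:3));      % which constraints are active
    A = Astat; b = bstat;
    for i = 1:3
        row = zeros(1, 6);
        if act(i)
            row(1:3) = G(i,:);          % g_i(x) = 0
            A = [A; row]; b = [b; c(i)];
        else
            row(3+i) = 1;               % mu_i = 0
            A = [A; row]; b = [b; 0];
        end
    end
    z = A\b; x = z(1:3); mu = z(4:6);
    if all(G*x - c >= -1e-9) && all(mu >= -1e-9)
        fprintf('KKT point: x = [%g %g %g], mu = [%g %g %g]\n', x, mu);
    end
end

Running this should print only the Scenario 8 solution found below.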
Example: KKT conditions (5.14)

Scenario 1: $\mu_1 = 0$, $\mu_2 = 0$, $\mu_3 = 0$

The system reduces to:
$$2x_1 + 40 = 0, \quad 2x_2 + 20 = 0, \quad 2x_3 = 0$$
Solution: $x_1 = -20$, $x_2 = -10$, $x_3 = 0$, $\mu_1 = \mu_2 = \mu_3 = 0$.

Check $\mathbf{g}(\mathbf{x}^*) \le \mathbf{0}$, $\boldsymbol{\mu} \ge \mathbf{0}$: $g_1 = 20 + 50 \nleq 0$. Since $g_1$ is violated, this is not a KKT point.
Example: KKT conditions (5.14)

Scenario 2: $\mu_1 \neq 0$, $\mu_2 = 0$, $\mu_3 = 0$

The system reduces to:
$$2x_1 + 40 - \mu_1 = 0, \quad 2x_2 + 20 = 0, \quad 2x_3 = 0, \quad -x_1 + 50 = 0$$
Solution: $x_1 = 50$, $x_2 = -10$, $x_3 = 0$, $\mu_1 = 140$, $\mu_2 = \mu_3 = 0$.

Check $\mathbf{g}(\mathbf{x}^*) \le \mathbf{0}$, $\boldsymbol{\mu} \ge \mathbf{0}$: $g_2 = -50 + 10 + 100 \nleq 0$. Since $g_2$ is violated, this is not a KKT point.
Example: KKT conditions (5.14)

Scenario 3: $\mu_1 = 0$, $\mu_2 \neq 0$, $\mu_3 = 0$

The system reduces to:
$$2x_1 + 40 - \mu_2 = 0, \quad 2x_2 + 20 - \mu_2 = 0, \quad 2x_3 = 0, \quad -x_1 - x_2 + 100 = 0$$
Solution: $x_1 = 45$, $x_2 = 55$, $x_3 = 0$, $\mu_2 = 130$, $\mu_1 = \mu_3 = 0$.

Check $\mathbf{g}(\mathbf{x}^*) \le \mathbf{0}$, $\boldsymbol{\mu} \ge \mathbf{0}$: $g_1 = -45 + 50 \nleq 0$. Since $g_1$ is violated, this is not a KKT point.
Example: KKT conditions (5.14)

Scenario 4: $\mu_1 = 0$, $\mu_2 = 0$, $\mu_3 \neq 0$

The system reduces to:
$$2x_1 + 40 - \mu_3 = 0, \quad 2x_2 + 20 - \mu_3 = 0, \quad 2x_3 - \mu_3 = 0, \quad -x_1 - x_2 - x_3 + 150 = 0$$
Solution: $x_1 = 40$, $x_2 = 50$, $x_3 = 60$, $\mu_3 = 120$, $\mu_1 = \mu_2 = 0$.

Check $\mathbf{g}(\mathbf{x}^*) \le \mathbf{0}$, $\boldsymbol{\mu} \ge \mathbf{0}$: $g_1 = -40 + 50 \nleq 0$. Since $g_1$ is violated, this is not a KKT point.
Example: KKT conditions (5.14)

Scenario 5: $\mu_1 \neq 0$, $\mu_2 \neq 0$, $\mu_3 = 0$

The system reduces to:
$$2x_1 + 40 - \mu_1 - \mu_2 = 0, \quad 2x_2 + 20 - \mu_2 = 0, \quad 2x_3 = 0, \quad -x_1 + 50 = 0, \quad -x_1 - x_2 + 100 = 0$$
Solution: $x_1 = 50$, $x_2 = 50$, $x_3 = 0$, $\mu_1 = 20$, $\mu_2 = 120$, $\mu_3 = 0$.

Check $\mathbf{g}(\mathbf{x}^*) \le \mathbf{0}$, $\boldsymbol{\mu} \ge \mathbf{0}$: $g_3 = -50 - 50 - 0 + 150 \nleq 0$. Since $g_3$ is violated, this is not a KKT point.
Example: KKT conditions (5.14)

Scenario 6: $\mu_1 \neq 0$, $\mu_2 = 0$, $\mu_3 \neq 0$

The system reduces to:
$$2x_1 + 40 - \mu_1 - \mu_3 = 0, \quad 2x_2 + 20 - \mu_3 = 0, \quad 2x_3 - \mu_3 = 0, \quad -x_1 + 50 = 0, \quad -x_1 - x_2 - x_3 + 150 = 0$$
Solution: $x_1 = 50$, $x_2 = 45$, $x_3 = 55$, $\mu_1 = 30$, $\mu_2 = 0$, $\mu_3 = 110$.

Check $\mathbf{g}(\mathbf{x}^*) \le \mathbf{0}$, $\boldsymbol{\mu} \ge \mathbf{0}$: $g_2 = -50 - 45 + 100 \nleq 0$. Since $g_2$ is violated, this is not a KKT point.
Example: KKT conditions (5.14)

Scenario 7: $\mu_1 = 0$, $\mu_2 \neq 0$, $\mu_3 \neq 0$

The system reduces to:
$$2x_1 + 40 - \mu_2 - \mu_3 = 0, \quad 2x_2 + 20 - \mu_2 - \mu_3 = 0, \quad 2x_3 - \mu_3 = 0, \quad -x_1 - x_2 + 100 = 0, \quad -x_1 - x_2 - x_3 + 150 = 0$$
Solution: $x_1 = 45$, $x_2 = 55$, $x_3 = 50$, $\mu_1 = 0$, $\mu_2 = 30$, $\mu_3 = 100$.

Check $\mathbf{g}(\mathbf{x}^*) \le \mathbf{0}$, $\boldsymbol{\mu} \ge \mathbf{0}$: $g_1 = -45 + 50 \nleq 0$. Since $g_1$ is violated, this is not a KKT point.
Example: KKT conditions (5.14)

Scenario 8: $\mu_1 \neq 0$, $\mu_2 \neq 0$, $\mu_3 \neq 0$

The full system applies:
$$2x_1 + 40 - \mu_1 - \mu_2 - \mu_3 = 0, \quad 2x_2 + 20 - \mu_2 - \mu_3 = 0, \quad 2x_3 - \mu_3 = 0$$
$$-x_1 + 50 = 0, \quad -x_1 - x_2 + 100 = 0, \quad -x_1 - x_2 - x_3 + 150 = 0$$
Solution: $x_1 = 50$, $x_2 = 50$, $x_3 = 50$, $\mu_1 = 20$, $\mu_2 = 20$, $\mu_3 = 100$.

Check $\mathbf{g}(\mathbf{x}^*) \le \mathbf{0}$, $\boldsymbol{\mu} \ge \mathbf{0}$: all of these conditions hold, so $[50, 50, 50]^T$ is a KKT point!
Example: KKT conditions (5.14)

$$\min\ f(\mathbf{x}) = x_1^2 + x_2^2 + x_3^2 + 40x_1 + 20x_2 - 3000$$
$$\text{s.t.}\quad g_1 = x_1 - 50 \ge 0,\quad g_2 = x_1 + x_2 - 100 \ge 0,\quad g_3 = x_1 + x_2 + x_3 - 150 \ge 0$$

Across the 8 scenarios, the only KKT point is $\mathbf{x}^* = [50, 50, 50]^T$, with $f^* = 7500$.

Now we can test the SOSC:
$$\nabla_{\mathbf{x}}\mathcal{L} = \begin{bmatrix} 2x_1 + 40 - \mu_1 - \mu_2 - \mu_3 & 2x_2 + 20 - \mu_2 - \mu_3 & 2x_3 - \mu_3 \end{bmatrix}$$
$$\mathcal{L}_{xx} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$
This is positive definite everywhere, so $\mathbf{x}^*$ is a global minimizer.

This is the analytical approach to solving constrained problems. We also have algorithmic approaches like SQP.
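For instance, a minimal fmincon sketch with the 'sqp' algorithm reproduces this result numerically (our own illustration; since the constraints are linear, the greater-than-or-equal constraints can be passed as A*x <= b after multiplying by -1):

% Numerical check of the KKT example with SQP.
f = @(x) x(1)^2 + x(2)^2 + x(3)^2 + 40*x(1) + 20*x(2) - 3000;
A = [-1  0  0;                      % -g(x) <= 0 form of g(x) >= 0
     -1 -1  0;
     -1 -1 -1];
b = [-50; -100; -150];
opts = optimoptions('fmincon', 'Algorithm', 'sqp');
[xopt, fopt] = fmincon(f, [0 0 0], A, b, [], [], [], [], [], opts)
% expect xopt near [50 50 50] and fopt near 7500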
Summary: Gradient-based algorithms
• Gradient descent
• Newton method
• Generalized Reduced Gradient (GRG)
• Active set strategy
• Quasi-Newton methods
• Sequential Quadratic Programming (SQP)
Gradient-free approaches
• Approximation models
• Pattern search (e.g., Hooke-Jeeves, Nelder-Mead)
• Space-filling search (e.g., DIRECT)
• Random search (e.g., Simulated Annealing)
• Linear Programming (e.g., Simplex)
• Genetic/evolutionary algorithms
• Particle swarm
• Ant colony
(We covered some of these in Week 3. Genetic/evolutionary algorithms, particle swarm, and ant colony are the population-based algorithms.)
Approximation models
Use gradient-based methods even though your
functions are not differentiable
1. Approximate derivatives with finite differences:
$$f'(x) \approx \frac{f(x+h) - f(x)}{h} \quad \text{for some small } h$$
(see the sketch below)
2. Metamodels: sample points & fit a function
Figures from:
Zamani, A., Mirabadi, A., and Schmid, F. (2012), “Applying metamodeling techniques in
the design and optimization of train wheel detector”, Sensor Review, 32 (4), 327 – 336.
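A minimal sketch of the forward-difference idea (the black-box objective, point, and step size below are our own placeholders):

% Forward-difference gradient of a black-box function. The step h is a
% tuning choice: too large is inaccurate, too small amplifies round-off.
f = @(x) x(1)^2 + abs(x(2));        % placeholder non-smooth objective
x = [1; 2];
h = 1e-6;
g = zeros(size(x));
f0 = f(x);                          % reuse the base-point evaluation
for i = 1:numel(x)
    e = zeros(size(x)); e(i) = h;
    g(i) = (f(x + e) - f0) / h;
end
disp(g')                            % approximate gradient at x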
Population-based algorithms

Goal of Week 8 (continued): To learn some increasingly-common algorithms that use multiple points in each iteration, and practice using them
Genetic algorithms – overview
• A.k.a., evolutionary algorithms
• Mimic gene selection and “survival of the fittest”
Genetic algorithms – steps
1. Start with a random population (set) of inputs
2. Select the best among population to be “parents”
3. Use parents to spawn a new generation through
crossovers and mutations
4. Repeat 2-3 with more “generations” until satisfied
[Figure (from John Whitefoot, 2010 lecture): an initial population, ranked from worse to better designs; the best designs are selected and produce a new generation of designs.]
Genetic algorithms – parent selection
There are different ways to select
parents from a population:
• Elitism: pick only the absolute best
points
• Roulette wheel: better designs have a better chance of being chosen
• Tournament: segment the
population randomly, and choose
the best in each segment
Different algorithms use different
strategies.
Genetic algorithms – spawning
Spawning new generations may be done using a
combination of:
1. Survivors: the best simply join the new generation
2. Crossovers: combinations of traits from 2 parents
create a child
3. Mutations: traits from a parent randomly change
Image credit: http://www.klopfenstein.net/lorenz.aspx/genetic-algorithms (accessed 2013)
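Putting selection, crossover, and mutation together, here is a minimal GA sketch (our own toy illustration, not MATLAB's ga implementation): tournament selection, arithmetic crossover, Gaussian mutation, and one elitist survivor. The population size, mutation rate, and step sizes are arbitrary choices.

% Toy GA for unconstrained minimization; rows of P are candidate designs.
f = @(X) sum(X.^2, 2);              % placeholder objective, minimum at origin
n = 30; d = 2; gens = 100;
P = 10*rand(n, d) - 5;              % random initial population in [-5, 5]^d
for gen = 1:gens
    fit = f(P);
    % tournament selection: each parent is the better of 2 random picks
    idx = randi(n, n, 2);
    [~, k] = min([fit(idx(:,1)), fit(idx(:,2))], [], 2);
    parents = P(idx(sub2ind([n 2], (1:n)', k)), :);
    % crossover: blend each parent with the next one in the list
    a = rand(n, 1);
    kids = a.*parents + (1 - a).*parents([2:n 1], :);
    % mutation: randomly perturb about 10% of the genes
    m = rand(n, d) < 0.1;
    kids(m) = kids(m) + 0.3*randn(nnz(m), 1);
    % survivor (elitism): the best current design joins unchanged
    [~, best] = min(fit);
    kids(1, :) = P(best, :);
    P = kids;
end
[~, best] = min(f(P));
disp(P(best, :))                    % should be near [0 0]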
Example: Interactive Genetic Algorithm

Which bottle shape do you prefer? The interactive part means that humans do the "parent selection" portion in each iteration of the GA. In this case, it usually converged to a Coca-Cola bottle shape.
(Example from Dr. Jarod Kelly, University of Michigan)
MATLAB – ga
There is a genetic algorithm function ‘ga’ that can
handle objectives and constraints in a similar way to
fmincon
[xopt,fopt]=ga(OBJFUN,NVARS,A,b,Aeq,beq,lb,ub,NONLCON)
Note the difference:
• fmincon needs a start point
• ga just needs the # of variables
Note: This function requires the Global Optimization Toolbox to be installed.
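For instance, the KKT example from earlier could be handed to ga like this (a sketch; ga is stochastic, so results vary from run to run):

% Same problem as the KKT example, with linear constraints as A*x <= b.
fun = @(x) x(1)^2 + x(2)^2 + x(3)^2 + 40*x(1) + 20*x(2) - 3000;
A = [-1 0 0; -1 -1 0; -1 -1 -1];
b = [-50; -100; -150];
[xopt, fopt] = ga(fun, 3, A, b)     % 3 = number of variables, no start point
% expect xopt near [50 50 50]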
MATLAB – optimtool
MATLAB currently
has an interactive
tool ‘optimtool’
that can guide
you through
optimizing with
different options
Note: This may
disappear in
future MATLAB
releases
Particle swarm optimization
1. Start with some random set of points (particles)
with directions/velocities within the input space
2. At each iteration, update each particle’s position
and velocity based on the particle’s best previous
position and the best positions of its near
neighbors
Figure from a presentation by Maurice Clerc (accessed 2013): http://www.docstoc.com/docs/76778278/Particle-Swarm-Optimization-mini-tutorial-(PowerPoint)
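A minimal global-best PSO sketch (note: the steps above describe a near-neighbor, local-best variant; this sketch uses the whole swarm's best for brevity, and the coefficients w, c1, c2 are common but arbitrary choices):

% Global-best particle swarm on a placeholder objective (rows = particles).
f = @(X) sum(X.^2, 2);
n = 20; d = 2; iters = 100;
X = 10*rand(n, d) - 5;              % positions
V = zeros(n, d);                    % velocities
pbest = X; pbestf = f(X);           % each particle's best-so-far
[gbestf, gi] = min(pbestf); gbest = pbest(gi, :);
w = 0.7; c1 = 1.5; c2 = 1.5;
for t = 1:iters
    % pull each particle toward its own best and the swarm's best
    V = w*V + c1*rand(n,d).*(pbest - X) + c2*rand(n,d).*(gbest - X);
    X = X + V;
    fx = f(X);
    better = fx < pbestf;           % update personal bests
    pbest(better, :) = X(better, :); pbestf(better) = fx(better);
    [m, gi] = min(pbestf);          % update the swarm's best
    if m < gbestf, gbestf = m; gbest = pbest(gi, :); end
end
disp(gbest)                         % should be near [0 0]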
Ant colony optimization
Adjustments depend on a combination of randomness and pheromones (which record what has been tried and found to work), and the pheromones evaporate over time.
Figures from http://masters.donntu.edu.ua/2011/fknt/el_khatib/diss/indexe.htm (accessed 2013)
Non-continuous/discrete variables
What to do when we have non-continuous variables?
Examples:
• Number of rotors on a drone (must be integer)
• Number of cylinders in an engine (must be integer)
• Material choice (there are a few options with different
specific properties)
Strategies:
1. Treat them as continuous, and then pick nearby point
2. Parametric optimization
3. Integer programming
Treat discrete variables as continuous
The easiest way to handle non-continuous variables (material
properties, countable things) is to treat them as continuous:
1. Optimize pretending that they are continuous
2. Choose the discrete value closest to the optimizer (or
evaluate all surrounding points and pick the best)
This often does not work well with non-convex functions!
[Figure: a non-convex function, marking the optimizer found with the continuous assumption.]
Parametric optimization
When the number of discrete choices is small, you
can do parametric optimization:
1. List out all discrete variable combinations
2. Optimize to find the best solution under each of
the scenarios in (1)
3. Choose the best from all the solutions in (2)
This works well when there is a small number of
discrete choices and the functions are quick to
evaluate
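A minimal sketch of that loop (the objective and the discrete options p are hypothetical placeholders):

% Parametric optimization: optimize the continuous variables once per
% discrete choice p, then keep the overall best of the sub-solutions.
bestf = inf;
opts = optimoptions('fmincon', 'Display', 'off');
for p = [1 2 3]                     % hypothetical discrete options
    fun = @(x) (x(1) - p)^2 + p*x(2)^2;   % hypothetical objective
    [x, fval] = fmincon(fun, [0 0], [], [], [], [], [], [], [], opts);
    if fval < bestf
        bestf = fval; bestx = x; bestp = p;
    end
end
fprintf('best choice: p = %d, x = [%g %g], f = %g\n', bestp, bestx, bestf)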
Discrete/integer programming

When you have a large number of non-continuous variables, use the branch-and-bound algorithm:
1. Branch: Split the space of candidate solutions
2. Bound: Compute upper and lower bounds on each subspace (i.e., figure out the highest and lowest possible objective function values for each group)
3. If the lower bound for one group is higher than the upper bound for another, eliminate it
4. Repeat 1-3 until a single solution remains (a code sketch follows the figure below)

[Figure: a branch-and-bound tree in which the space S splits into S1 and S2, then into S11, S12, S13 and S21, S22, with an upper and lower bound (e.g., ub: f = 12, lb: f = -2) computed for each subspace.]
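Here is a tiny branch-and-bound sketch on a one-dimensional integer problem (the objective is a hypothetical placeholder; its continuous relaxation has its minimum at x = 2.6, which supplies the lower bounds):

% Minimize f over the integers 0..5 by branch and bound.
f = @(x) (x - 2.6).^2;
best = inf; bestx = NaN;            % incumbent (best integer found so far)
stack = [0 5];                      % subspaces [lo hi] still to explore
while ~isempty(stack)
    lo = stack(end, 1); hi = stack(end, 2);
    stack(end, :) = [];
    xr = min(max(2.6, lo), hi);     % relaxation minimizer clipped to [lo, hi]
    if f(xr) >= best, continue, end % lower bound can't beat incumbent: prune
    xi = min(max(round(xr), lo), hi);   % a feasible integer: an upper bound
    if f(xi) < best, best = f(xi); bestx = xi; end
    if lo < hi                      % branch: split the interval in two
        mid = floor((lo + hi)/2);
        stack = [stack; lo mid; mid+1 hi];
    end
end
fprintf('integer optimum: x = %d, f = %g\n', bestx, best)   % x = 3 here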
Recap: Gradient-free algorithms
• Approximation models
• Pattern search (e.g., Hooke-Jeeves)
• Space-filling search (e.g., DIRECT)
• Random search (e.g., Simulated Annealing)
• Linear Programming (e.g., Simplex)
• Genetic/evolutionary algorithms
• Particle swarm
• Ant colony
(Some of these were covered in Week 3.)
When to use gradient-free methods
Cons of gradient-free methods
• Usually slower to converge
• These require many more function evaluations
• The search directions are not always efficient
• No optimality guarantee
• There is no analog of the FONC (gradient = 0) or SOSC (Hessian is positive definite)
• Convergence must be measured by changes to f and x
• Many parameters to tune
• Constraint handling is imperfect
• Cannot use the Lagrangian
• Must use a penalty or barrier approach (see the sketch below)
• Stochastic methods are not repeatable
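For example, the greater-than-or-equal constraints from the earlier KKT example can be folded into the objective with a quadratic penalty (a sketch; the weight rho is an assumed tuning parameter):

% Quadratic penalty: squared violations of g(x) >= 0 are added to f, so
% a gradient-free method can minimize pen(x) without constraint logic.
rho = 1e3;                          % penalty weight, must be tuned
f = @(x) x(1)^2 + x(2)^2 + x(3)^2 + 40*x(1) + 20*x(2) - 3000;
g = @(x) [x(1) - 50; x(1) + x(2) - 100; x(1) + x(2) + x(3) - 150];
pen = @(x) f(x) + rho*sum(min(g(x), 0).^2);
% pen can now be passed to any unconstrained gradient-free optimizer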
A note on global optimization
Most algorithms seek local optima
To find global solutions, try:
1. Performing optimization with multiple start points (sketched below)
2. Using global algorithms (e.g., genetic algorithms &
particle swarm)
[Figure: a multimodal function on the interval [0, 3.25], with its global maximum, global minimum, and a local minimum marked.]
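A sketch of the multistart idea on a one-dimensional multimodal function (the function, interval, and number of starts are our own stand-ins):

% Multistart: run a local solver from random start points on [0, 3.25]
% and keep the best local minimum found across all runs.
fun = @(x) x.^2.*(x - 2).^2 + 0.5*x;    % stand-in multimodal function
best = inf;
opts = optimoptions('fmincon', 'Display', 'off');
for k = 1:10
    x0 = 3.25*rand;                     % random start point
    [x, fval] = fmincon(fun, x0, [], [], [], [], 0, 3.25, [], opts);
    if fval < best, best = fval; xbest = x; end
end
fprintf('best local minimum found: x = %g, f = %g\n', xbest, best)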
Recap: How to optimize
1. Formulate the problem
a) Define system boundaries
b) Develop analytical models
c) Explore/reduce the problem space
d) Formalize optimization problem
2. Solve the problem
a) Choose the right approach/algorithm
b) Solve (by hand, code, or software)
c) Interpret the results
d) Iterate if needed
$$\mathbf{x}_{k+1} = \mathbf{x}_k - \mathbf{H}(\mathbf{x}_k)^{-1}\nabla f(\mathbf{x}_k)$$
(Step 1: Weeks 1-2, 4, 9-12. Step 2, TODAY's focus: Weeks 3, 5-8, 12.)
Summary of optimization approaches
• Mathematical/analytical
• Elimination of constraints using monotonicity analysis
• Finding stationary points and their nature with optimality conditions
• Linear programming: Simplex
• Nonlinear gradient-based methods
• Gradient descent
• Newton method
• Generalized Reduced Gradient (GRG)
• Active set strategies
• Quasi-Newton methods
• Sequential quadratic programming (SQP)
• Nonlinear gradient-free methods
• Approximation
• Pattern search
• Space-filling search
• Random search
• Genetic/evolutionary algorithms (GAs/EAs)
• Particle swarm & Ant colony
• Integer programming: Branch-and-bound
When to use what?
• If the math is simple enough, try solving by hand (monotonicity
analysis, optimality conditions: FONC/SOSC or KKT)
• If you have a linear problem, Simplex LP is efficient
• If the functions are way too slow to evaluate, create a meta-model
• If you have a convex and differentiable problem, you can’t beat the
efficiency and accuracy of a gradient-based algorithm (e.g., SQP, GRG)
• If you have a convex, non-differentiable problem, pattern-search or
random search algorithms are usually effective
• If you have non-continuous variables, you should either:
• Solve parametrically (i.e., solve separately for each discrete value)
• Use branch-and-bound techniques
• For tricky (non-linear, non-convex) problems with fast functions, try:
• Convex search methods with multiple start points
• Gradient-free algorithms like GAs and other bio-inspired searches
Acknowledgements
• Some of this material came from Chapter 7 of the
textbook, Principles of Optimal Design
• Some of these slides and examples came from Dr.
John Whitefoot, Dr. Alex Burnap, Dr. Yi Ren, and Dr.
Michael Kokkolaras at the University of Michigan
Announcements
• HW4 is posted, due in 13 days, Tuesday at noon
• Project progress reports are due a week from
Sunday
• Please do the mid-semester survey! It’s totally
anonymous and will help me improve the course
https://goo.gl/forms/hR7fN7luoFUtJOKU2