
Linear Optimal Control (LQR)
Robert Platt Northeastern University

The linear control problem
Given:
System: x_{t+1} = A x_t + B u_t
Cost function: J(X, U) = Σ_{t=1}^{T} x_t^T Q x_t + Σ_{t=1}^{T-1} u_t^T R u_t
where: X = (x_1, …, x_T), U = (u_1, …, u_{T-1}), and Q, R are symmetric positive (semi)definite cost matrices
Initial state: x_1
Calculate:
U that minimizes J(X, U)
Important problem! How do we solve it?

One solution: least squares
Given: the system x_{t+1} = A x_t + B u_t, the cost function J(X, U), and the initial state x_1.
Calculate: U that minimizes J(X, U).
Idea: unroll the dynamics so that X is an explicit linear function of U and x_1 (a tall block matrix of powers of A multiplying x_1, plus a block lower-triangular matrix of terms of the form A^k B multiplying U).
Substitute X into J: J becomes a quadratic function of U alone.
Minimize by setting dJ/dU = 0.
Solve for U: a single linear system – an ordinary least-squares problem.
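A minimal sketch of this batch solution in NumPy, assuming the standard finite-horizon setup above; the function name, the stacking convention (G for the powers of A, H for the A^k B blocks), and the toy matrices at the bottom are illustrative assumptions, not from the slides:

```python
import numpy as np

def lqr_least_squares(A, B, Q, R, x1, T):
    """Return U = [u_1; ...; u_{T-1}] minimizing sum_t x_t^T Q x_t + u_t^T R u_t
    subject to x_{t+1} = A x_t + B u_t, by writing X = G x1 + H U."""
    n, m = B.shape
    N = T - 1                                   # number of control steps
    # X = [x_2; ...; x_T] = G x1 + H U
    G = np.vstack([np.linalg.matrix_power(A, k) for k in range(1, N + 1)])
    H = np.zeros((n * N, m * N))
    for k in range(1, N + 1):                   # row block: x_{k+1}
        for j in range(1, k + 1):               # column block: u_j
            H[(k - 1) * n:k * n, (j - 1) * m:j * m] = \
                np.linalg.matrix_power(A, k - j) @ B
    Qbar = np.kron(np.eye(N), Q)                # block-diagonal state cost
    Rbar = np.kron(np.eye(N), R)                # block-diagonal control cost
    # dJ/dU = 0  =>  (H^T Qbar H + Rbar) U = -H^T Qbar G x1
    U = np.linalg.solve(H.T @ Qbar @ H + Rbar, -H.T @ Qbar @ G @ x1)
    return U.reshape(N, m)

# Tiny usage example (illustrative numbers).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = 0.1 * np.eye(1)
U = lqr_least_squares(A, B, Q, R, x1=np.array([1.0, 0.0]), T=20)
```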

What can this do?
Solve for the optimal trajectory: start at the given initial state and end at the goal at time = T.
[Figure: optimal trajectory from start to goal. Image: van den Berg, 2015]

What can this do?
This is cool, but…
– only works for finite horizon problems
– doesn’t account for noise
– requires you to invert a big matrix

Bellman solution
Cost-to-go function: V(x)
– the cost that we have yet to experience if we travel along the minimum-cost path.
– given the cost-to-go function, you can calculate the optimal path/policy.
Example: the number in each cell describes the number of steps “to-go” before reaching the goal state.

Bellman solution
Bellman optimality principle:
V_t(x) = min_u [ x^T Q x + u^T R u + V_{t+1}(A x + B u) ]
– x^T Q x + u^T R u: the cost incurred on this time step
– V_{t+1}(A x + B u): the cost-to-go from state (Ax + Bu) at time t+1, i.e. the cost incurred after this time step
– V_t(x): the cost-to-go from state x at time t

Bellman solution
For the sake of argument, suppose that the cost-to-go is always a quadratic function like this:
V_t(x) = x^T P_t x
where: P_t is a symmetric positive semidefinite matrix.
Then:
V_t(x) = min_u [ x^T Q x + u^T R u + (A x + B u)^T P_{t+1} (A x + B u) ]
How do we minimize this term?
– take the derivative with respect to u and set it to zero.

Bellman solution
How do we minimize this term?
– take the derivative with respect to u and set it to zero:
u = −(R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A x
This is the optimal control as a function of state – but it depends on P_{t+1}…
How do we solve for P_{t+1}?
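Spelling out that derivative step (a sketch, assuming the quadratic cost-to-go V_{t+1}(x) = x^T P_{t+1} x from the previous slide, with R and P_{t+1} symmetric):

```latex
\[
\begin{aligned}
V_t(x) &= \min_u \; x^\top Q x + u^\top R u + (Ax+Bu)^\top P_{t+1}(Ax+Bu) \\
0 &= \frac{\partial}{\partial u}\Bigl[\, u^\top R u + (Ax+Bu)^\top P_{t+1}(Ax+Bu) \Bigr]
   = 2 R u + 2 B^\top P_{t+1}(A x + B u) \\
\Rightarrow \quad u^* &= -\,(R + B^\top P_{t+1} B)^{-1} B^\top P_{t+1} A \, x
\end{aligned}
\]
```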

Bellman solution
Substitute u into V_t(x) and collect the quadratic terms in x:
P_t = Q + A^T P_{t+1} A − A^T P_{t+1} B (R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A
This is the Dynamic Riccati Equation: it lets us compute P_t backward in time from the terminal condition at t = T.
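A minimal sketch of the backward recursion in NumPy; the terminal condition P_T = Q is an assumption (any terminal cost matrix could be used), and the function name is illustrative:

```python
import numpy as np

def riccati_recursion(A, B, Q, R, T):
    """Return P_1..P_T and gains K_1..K_{T-1} so that u_t = -K_t x_t."""
    P = [None] * (T + 1)          # 1-indexed: P[t] is P_t
    K = [None] * T                # K[t] is K_t for t = 1..T-1
    P[T] = Q                      # terminal condition (assumed)
    for t in range(T - 1, 0, -1):
        Pn = P[t + 1]
        # K_t = (R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A
        K[t] = np.linalg.solve(R + B.T @ Pn @ B, B.T @ Pn @ A)
        # Dynamic Riccati equation
        P[t] = Q + A.T @ Pn @ A - A.T @ Pn @ B @ K[t]
    return P, K
```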

Example: planar double integrator
A puck sliding on an air hockey table: mass m = 1, damping b = 0.1, u = applied force.
Build the LQR controller for:
– Initial state: the puck’s initial position and initial velocity
– Goal position: the origin of the table
– Time horizon: T = 100 steps
– Cost fn: quadratic in state and control (Q, R)
[Figure: air hockey table showing the puck’s initial position/velocity and the goal position]

Example: planar double integrator
Step 1:
Calculate P backward from T: P_100, P_99, P_98, …, P_1
How? By iterating the dynamic Riccati equation, stepping backward one time step at a time from the terminal P.

Example: planar double integrator
Step 2:
Calculate u starting at t = 1 and going forward to t = T−1: at each step apply u_t = −(R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A x_t and roll the state forward with x_{t+1} = A x_t + B u_t.
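A worked sketch of both steps in NumPy. The discretization step dt, the cost weights Q and R, and the initial state are illustrative assumptions; m = 1, b = 0.1, and T = 100 come from the slides:

```python
import numpy as np

m, b, dt, T = 1.0, 0.1, 0.1, 100   # dt is an assumed discretization step
# State x = [px, py, vx, vy], control u = [ux, uy] (forces on the puck).
A = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1 - b * dt / m, 0],
              [0, 0, 0, 1 - b * dt / m]])
B = np.array([[0, 0],
              [0, 0],
              [dt / m, 0],
              [0, dt / m]])
Q = np.eye(4)           # penalize distance from the goal (origin) and velocity
R = 0.1 * np.eye(2)     # penalize control effort (weights are assumptions)

# Step 1: calculate P backward from T via the dynamic Riccati equation.
P = [None] * (T + 1)
K = [None] * T
P[T] = Q
for t in range(T - 1, 0, -1):
    Pn = P[t + 1]
    K[t] = np.linalg.solve(R + B.T @ Pn @ B, B.T @ Pn @ A)
    P[t] = Q + A.T @ Pn @ A - A.T @ Pn @ B @ K[t]

# Step 2: calculate u forward from t = 1 to T-1 and roll out the state.
x = np.array([1.0, 0.5, 0.0, -0.2])   # assumed initial position and velocity
traj, controls = [x], []
for t in range(1, T):
    u = -K[t] @ x
    x = A @ x + B @ u
    traj.append(x)
    controls.append(u)
# traj converges to the origin; controls give u_x, u_y over time.
```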

Example: planar double integrator
[Plots: the resulting puck trajectory converging to the origin, and the control inputs u_x, u_y over time t]

The infinite horizon case
So far: we have optimized cost over a fixed horizon, T.
– optimal if you only have T time steps to do the job
But what if time doesn’t end in T steps?
One idea:
– at each time step, assume that you always have T more time steps to go
– this is called a receding horizon controller (sketched below)
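A minimal sketch of that idea, reusing the riccati_recursion helper sketched earlier (the simulation length n_steps is an assumption). Because the LQR gains do not depend on the state, the T-step plan only needs to be computed once and its first gain reapplied at every step:

```python
import numpy as np

def receding_horizon(A, B, Q, R, x0, T, n_steps):
    """At every step, act as if T time steps remain: apply the first control
    of a T-step LQR plan from the current state, then repeat."""
    _, K = riccati_recursion(A, B, Q, R, T)   # gains for a T-step horizon
    x = x0
    for _ in range(n_steps):
        u = -K[1] @ x        # first gain of the T-step plan, applied to the current state
        x = A @ x + B @ u
    return x
```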

The infinite horizon case
[Plot: elements of the P matrix vs. time step, converging toward a fixed P]
Notice that the elements of P stop changing (much) more than 20 or 30 time steps prior to the horizon.
– what does this imply about the infinite horizon case?

The infinite horizon case
We can solve for the infinite horizon P exactly:
P = Q + A^T P A − A^T P B (R + B^T P B)^{-1} B^T P A
This is the Discrete Time Algebraic Riccati Equation – the dynamic Riccati equation at its fixed point P_t = P_{t+1} = P.
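A minimal sketch of solving it numerically with SciPy, using the A, B, Q, R matrices from the example above:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

P = solve_discrete_are(A, B, Q, R)                    # fixed-point P of the DARE
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)     # stationary gain
# The infinite-horizon controller is simply u_t = -K x_t for all t.
```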

So, what are we optimizing for now?
Given:
System: x_{t+1} = A x_t + B u_t
Cost function: J(X, U) = Σ_{t=1}^{∞} ( x_t^T Q x_t + u_t^T R u_t )
where: Q and R are symmetric positive (semi)definite cost matrices
Initial state: x_1
Calculate:
U that minimizes J(X, U)

Controllability
A system is controllable if it is possible to reach any goal state from any other start state in a finite period of time.
When is a linear system controllable?
It’s a property of the system dynamics (of A and B)…
Remember the big block matrix from the least-squares solution, built from terms of the form A^k B? What property must that matrix have?
The submatrix [B  AB  A^2 B  …  A^{n-1} B] – the controllability matrix – must be full rank.
– i.e. the rank must equal the dimension n of the state space.
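A minimal sketch of this test in NumPy: build the controllability matrix and compare its rank to the state dimension (A and B are assumed to be NumPy arrays as in the examples above):

```python
import numpy as np

def is_controllable(A, B):
    """Return True if rank([B, AB, A^2 B, ..., A^{n-1} B]) equals the state dimension n."""
    n = A.shape[0]
    blocks = [np.linalg.matrix_power(A, k) @ B for k in range(n)]
    C = np.hstack(blocks)                     # controllability matrix
    return np.linalg.matrix_rank(C) == n

# The air-hockey puck is controllable: applied forces can drive it to any
# position and velocity on the table.
```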