This article covers Chapters 3 to 5 of the book: the basic concepts of optimal control, dynamic programming, the linear quadratic regulator, and model predictive control. Notes and code simulation practice are recorded here.
1. Optimal Control
1. Composition of the optimal control problem
(1) Mathematical model of the system
In optimal control, the mathematical model of the system is usually expressed with state-space equations. As the system runs, its state variables evolve, and this evolution traces a trajectory through the state space; the job of the control algorithm is to regulate this trajectory.
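For reference, a discrete-time linear state-space model in standard notation (a generic form, not a specific example from the book) is

$$x_{k+1} = A x_k + B u_k, \qquad y_k = C x_k + D u_k$$

where $x_k$ is the state vector, $u_k$ the control input, and $y_k$ the output; the matrices $A$, $B$, $C$, $D$ encode the system dynamics.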
(2) Target value (reference value)
Depending on the application scenario, the target value may be a fixed value, a set, or a dynamically changing trajectory.
(3) Performance index (cost function)
The performance index (also called the cost function) is the key to optimal control: it measures how well the system behaves. It is usually a scalar function that quantifies the system's behavior in terms of performance, efficiency, or quality.
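A concrete and widely used example, and the form that reappears in the LQR and MPC chapters, is the quadratic performance index (standard notation; the weights $S$, $Q$, $R$ are design choices):

$$J = x_N^T S x_N + \sum_{k=0}^{N-1} \left( x_k^T Q x_k + u_k^T R u_k \right)$$

where a larger $Q$ penalizes state error more heavily and a larger $R$ penalizes control effort.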
(4) Constraints
In optimal control, constraints can be imposed on the state variables or on the control variables. Note that when adding constraints, the algorithm must still be able to find a feasible solution inside them; constraints that are too strict can leave the problem without a solution.
2. Common optimization problems
Common optimal control problems include the minimum-time problem, the terminal control problem, the minimum control effort problem, the trajectory tracking problem, and combinations of the above. When analyzing and solving optimization problems, the following points deserve attention.
(1) Existence of optimal control strategy
(2) Diversity of optimal strategies
(3) When modeling the optimization problem, try to formulate the performance index and constraints as convex functions, since convexity guarantees that a local optimum is also the global one.
2. Dynamic Programming and Linear Quadratic Regulator
1. Dynamic programming is an optimization method for solving optimal control problems: it decomposes the problem into a series of sub-problems and solves them recursively to obtain the optimal solution. Bellman's principle of optimality, in its original English wording, reads:
An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.
This statement contains two key parts: first, it holds no matter what the initial state is and no matter what the initial decision is; second, the remaining decisions must themselves form an optimal policy with respect to the state that results.
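Written as an equation (standard dynamic-programming notation, not a quotation from the book), the principle gives the backward recursion

$$J_k^*(x) = \min_{u} \left[ g(x, u) + J_{k+1}^*\big( f(x, u) \big) \right]$$

where $g$ is the stage cost, $f$ the system dynamics, and $J_k^*$ the optimal cost-to-go from stage $k$.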
The book uses numerical methods to explain the UAV altitude control problem, solving it with a brute-force search, a backward stage-by-stage solution, and a dynamic programming lookup table (a sketch of the table-based recursion follows the figure below).
Figure 1: Simulation results for the drone minimum-time problem
The figure shows the relationships between (a) speed and altitude, (b) altitude and system input, (c) speed and time, (d) altitude and time, and (e) system input and time.
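As an illustration of the table-lookup idea, here is a minimal Octave sketch of the backward recursion for a minimum-time climb. The altitude step, the speed grid, and the omission of acceleration limits are simplifications for illustration, not the book's exact setup.

```octave
% Backward dynamic programming for a minimum-time climb (illustrative grid).
h_step = 10;                 % altitude gained per stage [m]
N      = 10;                 % number of stages (climb 0 to 100 m)
v_grid = 0:0.5:3;            % admissible speed levels [m/s]
nv     = numel(v_grid);

J = inf(nv, N+1);            % cost-to-go table: J(i,k) = min time from stage k
J(1, N+1) = 0;               % terminal condition: must end at rest (v = 0)
u = zeros(nv, N);            % index of the optimal next speed (the "policy")

for k = N:-1:1                         % Bellman backward recursion
  for i = 1:nv                         % current speed
    for j = 1:nv                       % candidate next speed
      v_avg = (v_grid(i) + v_grid(j)) / 2;
      if v_avg > 0                     % segment time = distance / average speed
        cost = h_step / v_avg + J(j, k+1);
        if cost < J(i, k)
          J(i, k) = cost;              % keep the best sub-problem solution
          u(i, k) = j;
        end
      end
    end
  end
end

printf("minimum climb time starting from rest: %.2f s\n", J(1, 1));
```

Each entry J(i,k) stores the answer to one sub-problem, which is exactly the decomposition Bellman's principle describes; reading the u table forward recovers the optimal speed profile.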
2. When the system is linear and the performance index takes quadratic form, and the control objective is to stabilize the state variables at 0 (a regulation problem), a controller that meets this requirement is called a linear quadratic regulator (LQR). It comes in discrete-time and continuous-time variants. The book works through a typical application: the spring-mass-damper system.
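As a minimal sketch of the discrete-time case, the following Octave code computes an LQR gain for a spring-mass-damper by iterating the Riccati equation backward. The parameters m, c, k, the weights, and the forward-Euler discretization are assumed values for illustration, not necessarily the book's setup.

```octave
% A minimal discrete-time LQR sketch for a spring-mass-damper system.
m = 1.0; c = 0.5; k = 2.0; dt = 0.01;
Ac = [0 1; -k/m -c/m];   Bc = [0; 1/m];   % continuous-time model
A  = eye(2) + dt*Ac;     B  = dt*Bc;      % forward-Euler discretization

Q = diag([10 1]);  R = 0.1;               % state and input weights
P = Q;                                     % iterate the Riccati equation backward
for i = 1:500
  K = (R + B'*P*B) \ (B'*P*A);            % optimal gain, u = -K*x
  P = Q + A'*P*(A - B*K);                 % discrete Riccati update
end

x = [1; 0];                                % initial displacement of 1 m
for t = 1:1000                             % simulate the closed loop for 10 s
  x = (A - B*K)*x;
end
printf("state after 10 s: [%.4f  %.4f]\n", x(1), x(2));
```

The gain K is constant, so once the Riccati iteration converges the controller is just a static state feedback, which is what makes LQR so cheap to run online.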
3. Model Predictive Control
Model Predictive Control (MPC) is an advanced control strategy widely used in industrial control. Its basic idea is to use the system's current state and its constraints to predict the future states and inputs and to solve for an optimal sequence of control inputs. Only the first element of that sequence is applied to the system; at the next time step the same procedure is repeated to obtain a new optimal sequence, until the system reaches the desired state.
MPC is a receding-horizon control method. At each sampling instant, it computes the optimal control sequence by solving an optimization problem over a finite time window, called the prediction horizon. The quadratic programming (QP) problem is closely tied to linear MPC: at each step the prediction model is used to compute the future optimal control inputs, and this process can be cast as a constrained quadratic program.
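The following Octave sketch shows the receding-horizon idea on a double integrator (the model, horizon, and weights are assumed values for illustration, not the book's). The horizon cost is condensed into a QP in the stacked input vector U; since no constraints are included here, the QP has a closed-form solution, and only the first input of each solution is applied.

```octave
% A minimal receding-horizon MPC sketch for a double integrator.
dt = 0.1;
A = [1 dt; 0 1];  B = [dt^2/2; dt];   % double-integrator plant
N = 10;                                % prediction horizon

% Prediction matrices: stacked states X = F*x0 + G*U over the horizon
F = zeros(2*N, 2);  G = zeros(2*N, N);
for i = 1:N
  F(2*i-1:2*i, :) = A^i;
  for j = 1:i
    G(2*i-1:2*i, j) = A^(i-j) * B;
  end
end

Qbar = kron(eye(N), diag([10 1]));     % stacked state weights
Rbar = 0.1 * eye(N);                   % stacked input weights
H = G'*Qbar*G + Rbar;                  % condensed QP Hessian

x = [1; 0];                            % start 1 m from the origin
for k = 1:50
  f = G'*Qbar*F*x;                     % QP linear term depends on current state
  U = -(H \ f);                        % minimizer of U'*H*U + 2*f'*U; with
                                       % constraints this would call a QP solver
  u = U(1);                            % apply only the first input
  x = A*x + B*u;                       % plant moves one step; then repeat
end
printf("state after 50 steps: [%.3f  %.3f]\n", x(1), x(2));
```

Rebuilding f from the measured state at every step is what gives MPC its feedback character, even though each individual solve is an open-loop plan.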
The standard form of the quadratic programming problem can be expressed as

$$\min_{u} \; J = \frac{1}{2} u^T H u + f^T u \quad \text{subject to} \quad A u \le b, \quad A_{eq} u = b_{eq}, \quad LB \le u \le UB$$

that is, find the u that minimizes the performance index J while satisfying the constraints. Here u is an n×1 vector and H is an n×n symmetric positive definite matrix. The objective function has two parts: the first is a quadratic form and the second a linear term, where f is an n×1 vector. The constraints can include equality constraints, inequality constraints, and bounds on the values of u, where LB is the lower bound and UB the upper bound, both n×1 vectors. The book explains an unconstrained quadratic programming example; the Octave simulation is as follows.
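A minimal Octave sketch of such an unconstrained QP (the particular H and f are assumed values chosen so the bowl's minimum lies away from the origin, not necessarily the book's exact numbers):

```octave
% Unconstrained QP: minimize 0.5*u'*H*u + f'*u
H = [2 0; 0 2];                        % symmetric positive definite
f = [-2; -4];
u_star = -(H \ f)                      % minimizer solves H*u + f = 0
J_min  = 0.5*u_star'*H*u_star + f'*u_star

% Visualize the "bowl" over a grid of (u1, u2), cf. Figure 2
[u1, u2] = meshgrid(linspace(-1, 3, 60), linspace(0, 4, 60));
J = 0.5*(H(1,1)*u1.^2 + H(2,2)*u2.^2) + f(1)*u1 + f(2)*u2;  % valid since H is diagonal
figure; surf(u1, u2, J);               % 3-D paraboloid, as in Figure 2(a)
figure; contour(u1, u2, J, 30);        % contour rings, as in Figure 2(b)
```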
Figure 2: Unconstrained quadratic programming
Figure (a) shows the three-dimensional surface of the performance index in this example: a bowl-shaped paraboloid whose minimum appears at the bottom of the bowl. Figure (b) is a contour map with u1 on the horizontal axis and u2 on the vertical axis; along each ring the "height" is constant (the same J), and the minimum lies at the center of the plot.
These chapters combine mathematical derivations with code examples to work through the MPC formulation: unconstrained regulation, trajectory tracking, soft and hard constraints, drone altitude control, and more. The amount of material is substantial and takes careful digesting. Finally, the book discusses directions in which MPC is developing: nonlinear MPC, large-scale MPC, data-driven MPC, and multi-objective MPC. Model predictive control clearly has broad prospects.
Through this study I have gained a deeper understanding of control theory methods. I can really feel the masters' deep theoretical foundations and their ability to combine mathematical theory and tools to solve real-world problems; that is something to learn from and to spur myself on.