Reinforcement learning-based model predictive control for uncertain systems with application to robot manipulators

Date

2026

Authors

Lu, Tianxiang

Abstract

Model-based control strategies, including model predictive control (MPC), often suffer performance degradation or even failure in practice, since practical systems are subject to external disturbances and model uncertainties. Nevertheless, MPC is capable of providing optimal control performance for constrained systems by solving an associated optimization problem. An essential component of the MPC optimization problem is the prediction model, which in many cases is the a priori known physical model of the controlled system. To make MPC effective for practical systems with external disturbances and model uncertainties, adaptive MPC methods aim to obtain a satisfactory prediction model, and thus satisfactory control performance, by performing system identification during the online control process. Despite the effectiveness of adaptive MPC, its intrinsic conservativeness has motivated the development of other MPC frameworks that apply various techniques to provide robustness for disturbed systems with uncertainties. Reinforcement learning (RL) is one of the representative schemes that can be combined with MPC for safety guarantees and performance enhancement. In general, there are three ways to combine RL and MPC. The first uses RL to learn the MPC parameters, such as weighting matrices, constraint margins, and disturbance model parameters, that maximize the long-term reward. The second applies MPC to check the safety of the actions generated by RL during the online control process. The third adopts MPC to generate data for the offline training of RL. This dissertation mainly focuses on the first way of combining RL and MPC to design novel reinforcement learning-based MPC (RLMPC) methods. Chapter 1 presents an overview of conventional MPC, adaptive learning-based MPC, and RLMPC schemes, along with classical control frameworks for robot manipulators. Chapter 2 introduces preliminary results on MPC implementation for linear and nonlinear systems as well as robot manipulators. Chapter 3 proposes a robust data-driven MPC scheme for linear systems with mixed uncertainties, developed via an on-policy RL method with theoretical guarantees. Based on the proposed closed-loop MPC scheme, the on-policy SARSA algorithm is chosen among various RL algorithms to construct a safe RLMPC framework with guaranteed recursive feasibility and closed-loop stability. Comprehensive comparative studies further demonstrate better control performance and a larger region of attraction than several well-established robust MPC and RLMPC methods. Building on Chapter 3, Chapter 4 develops an off-policy RLMPC approach for nonlinear systems with an integrated triggering mechanism. A quasi-dynamic event-triggered mechanism is incorporated into the proposed RLMPC framework to effectively reduce the overall computational load resulting from the parameter updates performed by the off-policy RL. The resulting event-triggered RLMPC method enhances control performance while maintaining closed-loop stability, with guaranteed recursive feasibility and only marginal conservativeness. Chapters 5 and 6 investigate the application of RLMPC schemes to robot manipulators with relatively simple structures and with complex multi-joint structures, respectively.
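To make the first combination concrete, the minimal sketch below (not the formulation developed in this dissertation) shows a SARSA-style update that tunes a single MPC weighting parameter online; the double-integrator prediction model, candidate weights, reward, plant mismatch, and learning rates are all hypothetical placeholders.

```python
import numpy as np

# Hypothetical double-integrator used as the MPC prediction model.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
HORIZON = 10

def mpc_action(x, q_weight, r_weight=0.1):
    """Unconstrained finite-horizon MPC: minimize sum of q*||x_k||^2 + r*u_k^2
    over the horizon using batch prediction matrices, return the first input."""
    n, m = A.shape[0], B.shape[1]
    # Stacked prediction: x_stack = Phi @ x0 + Gamma @ u_seq.
    Phi = np.vstack([np.linalg.matrix_power(A, k + 1) for k in range(HORIZON)])
    Gamma = np.zeros((n * HORIZON, m * HORIZON))
    for k in range(HORIZON):
        for j in range(k + 1):
            Gamma[k*n:(k+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, k - j) @ B
    Qbar = q_weight * np.eye(n * HORIZON)
    Rbar = r_weight * np.eye(m * HORIZON)
    # Closed-form minimizer of the quadratic cost.
    H = Gamma.T @ Qbar @ Gamma + Rbar
    f = Gamma.T @ Qbar @ Phi @ x
    u_seq = -np.linalg.solve(H, f)
    return u_seq[:m]

# SARSA-style tuning of the state weight among a few hypothetical candidates.
CANDIDATE_Q = [0.5, 1.0, 5.0]
q_table = np.zeros(len(CANDIDATE_Q))   # estimated value of each candidate weight
alpha, gamma, eps = 0.1, 0.95, 0.2
rng = np.random.default_rng(0)

def pick(values):
    """Epsilon-greedy choice of a candidate weight index."""
    return rng.integers(len(values)) if rng.random() < eps else int(np.argmax(values))

x = np.array([1.0, 0.0])
a = pick(q_table)
for step in range(200):
    u = mpc_action(x, CANDIDATE_Q[a])
    # The "true" plant differs from the prediction model (hypothetical mismatch + noise).
    x_next = A @ x + 1.2 * (B @ u) + rng.normal(0.0, 0.01, size=2)
    reward = -(x_next @ x_next + 0.1 * float(u @ u))
    a_next = pick(q_table)
    # On-policy SARSA update of the value of the chosen weighting parameter.
    q_table[a] += alpha * (reward + gamma * q_table[a_next] - q_table[a])
    x, a = x_next, a_next

print("learned preference over candidate weights:", q_table)
```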
In Chapter 5, a variant of the RLMPC method proposed in Chapter 3 is further developed for manipulators whose dynamic model is accessible but subject to both model uncertainties and external disturbances. To this end, the on-policy SARSA algorithm is again utilized to construct the policy for updating the uncertain Coriolis and centrifugal matrix, as well as the potentially imperfect feedback control gain. A case study on a planar robot manipulator with two revolute joints demonstrates the improved closed-loop performance achieved by the proposed method. Chapter 6 presents experimental results of applying an RLMPC scheme to the trajectory tracking problem of a practical robot manipulator, the UR-10e. To provide real-time control actions, the trajectory tracking task is converted into a consecutive waypoint-reaching task, which further facilitates the design of a lightweight RLMPC framework. Two trajectories are adopted in the experimental validation to demonstrate the superior performance of the proposed method over conventional MPC and the classical soft actor-critic method, especially in the presence of relatively large model mismatch. Lastly, Chapter 7 concludes this dissertation and outlines potential future research directions.
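As a purely illustrative example of the trajectory-to-waypoint conversion described for Chapter 6 (the spacing rule, reference trajectory, and function names below are hypothetical, not the dissertation's implementation), a dense reference can be down-sampled into consecutive waypoints that a lightweight controller then reaches one by one:

```python
import numpy as np

def to_waypoints(trajectory, spacing):
    """Down-sample a dense reference trajectory (N x d array) into consecutive
    waypoints separated by at least `spacing` in Euclidean distance."""
    waypoints = [trajectory[0]]
    for p in trajectory[1:]:
        if np.linalg.norm(p - waypoints[-1]) >= spacing:
            waypoints.append(p)
    waypoints.append(trajectory[-1])  # always keep the final point
    return np.array(waypoints)

# Hypothetical planar reference: a quarter circle sampled at 500 points.
theta = np.linspace(0.0, np.pi / 2, 500)
reference = np.column_stack([np.cos(theta), np.sin(theta)])
waypoints = to_waypoints(reference, spacing=0.1)

for wp in waypoints:
    # Placeholder for the per-waypoint controller: in the setting described above,
    # a lightweight RLMPC problem would be solved to reach each waypoint in turn.
    pass
```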

Keywords

model predictive control, reinforcement learning, robot manipulators