Proximal policy optimization with an integral compensator for quadrotor control

Frontiers of Information Technology & Electronic Engineering

Abstract

We use the proximal policy optimization (PPO) reinforcement learning algorithm to optimize a stochastic control policy for speed control of a "model-free" quadrotor. The quadrotor is controlled by four learned neural networks, which map the system states directly to control commands in an end-to-end manner. Introducing an integral compensator into the actor-critic framework greatly improves speed tracking accuracy and robustness. In addition, a two-phase learning scheme comprising offline and online learning is developed for practical use: a model with strong generalization ability is learned in the offline phase, and its flight policy is then continuously optimized in the online phase. Finally, the performance of the proposed algorithm is compared with that of the traditional PID algorithm.
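
The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the integral compensator enters the actor-critic framework, so the sketch assumes the accumulated speed tracking error is simply appended to the observation. The class `IntegralAugmentedState`, its `limit` clamp, and the default `clip_eps=0.2` are hypothetical choices made for illustration. The function `ppo_clip_objective` is the standard PPO clipped surrogate objective.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate objective (a quantity to maximize).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


class IntegralAugmentedState:
    """Hypothetical helper: appends the integral of the speed tracking
    error to the observation (an assumption, not the paper's design)."""

    def __init__(self, dt, limit=1.0):
        self.dt = dt          # control period in seconds
        self.limit = limit    # assumed anti-windup clamp on the integral
        self.integral = [0.0, 0.0, 0.0]

    def reset(self):
        self.integral = [0.0, 0.0, 0.0]

    def step(self, velocity, target_velocity, other_state=()):
        # Tracking error per axis, then a clamped running integral.
        error = [t - v for v, t in zip(velocity, target_velocity)]
        self.integral = [
            max(-self.limit, min(self.limit, i + e * self.dt))
            for i, e in zip(self.integral, error)
        ]
        # Observation = remaining state + error + integral of the error.
        return list(other_state) + error + self.integral
```

In such a setup the policy network would receive the augmented observation at every control step, so a persistent tracking offset shows up as a growing input that the learned policy can react to, in the same spirit as the integral term of a PID controller.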

Author information

Contributions

Qing-ling WANG guided the research. Huan HU performed the experiments and drafted, revised, and finalized the paper.

Corresponding author

Correspondence to Qing-ling Wang.

Ethics declarations

Huan HU and Qing-ling WANG declare that they have no conflict of interest.

Additional information

Project supported by the National Key R&D Program of China (No. 2018AAA0101400), the National Natural Science Foundation of China (Nos. 61973074, U1713209, 61520106009, and 61533008), the Science and Technology on Information System Engineering Laboratory (No. 05201902), and the Fundamental Research Funds for the Central Universities, China.

About this article

Cite this article

Hu, H., Wang, Ql. Proximal policy optimization with an integral compensator for quadrotor control. Front Inform Technol Electron Eng 21, 777–795 (2020). https://doi.org/10.1631/FITEE.1900641

  • DOI: https://doi.org/10.1631/FITEE.1900641
