Control Policy Learning Design for Vehicle Urban Positioning via BeiDou Navigation

Qin, Yahang; Zhang, Chengye; Chen, Ci; Xie, Shengli; Lewis, Frank L.

doi:10.1007/s11424-024-3357-z

Published: 27 February 2024

Volume 37, pages 114–135 (2024)

Journal of Systems Science and Complexity

Yahang Qin^1,2, Chengye Zhang^1,3, Ci Chen^1,4, Shengli Xie^5,6 & Frank L. Lewis^7

Abstract

This paper presents a learning-based control policy design for point-to-point vehicle positioning in the urban environment via BeiDou navigation. In urban canyons, reflection and diffraction of the BeiDou signal give rise to the multipath effect, a form of interference that causes the navigation signal to drift and severely degrades vehicle localization. The authors formulate the navigation control system with unknown vehicle dynamics as an optimal control problem for a linear discrete-time system, and solve the point-to-point localization control problem with off-policy reinforcement learning for feedback control. The proposed learning-based design guarantees optimality with prescribed performance and stabilizes the closed-loop navigation system without full knowledge of the vehicle dynamics. The proposed method withstands the impact of the multipath effect while satisfying the prescribed convergence rate. A case study demonstrates that the proposed algorithms effectively drive the vehicle to a desired setpoint under the multipath effect recorded in actual BeiDou navigation experiments in the urban environment.
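The abstract does not give the algorithm itself, so the following is only a minimal, model-based sketch of the kind of computation that a prescribed convergence rate design for a linear discrete-time system builds on. It is not the authors' off-policy, data-driven method. It uses the classical policy iteration of Hewer for the discrete-time regulator, applied to the scaled pair (A/gamma, B/gamma): if the scaled closed loop is stable, the state of the original closed loop decays at least as fast as gamma^k. The double-integrator model, the weights Q and R, the rate gamma, the initial gain K0, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def lyap_discrete(Ac, M):
    """Solve P = Ac' P Ac + M for P by vectorization (Ac must be Schur stable)."""
    n = Ac.shape[0]
    lhs = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    return np.linalg.solve(lhs, M.reshape(-1)).reshape(n, n)

def hewer_prescribed_rate(A, B, Q, R, K0, gamma, iters=100, tol=1e-10):
    """Policy iteration on the scaled system (A/gamma, B/gamma), with u = -K x.

    K0 must stabilize the scaled system, i.e. the spectral radius of
    A - B K0 must be below gamma.
    """
    As, Bs = A / gamma, B / gamma
    K = K0.copy()
    for _ in range(iters):
        # Policy evaluation: cost matrix of the current gain.
        P = lyap_discrete(As - Bs @ K, Q + K.T @ R @ K)
        # Policy improvement: greedy gain with respect to P.
        K_next = np.linalg.solve(R + Bs.T @ P @ Bs, Bs.T @ P @ As)
        converged = np.linalg.norm(K_next - K) < tol
        K = K_next
        if converged:
            break
    return K, P

# Hypothetical vehicle axis: double integrator sampled at 0.1 s.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
gamma = 0.9                      # assumed prescribed decay rate
K0 = np.array([[25.0, 8.75]])    # places both poles of A - B K0 at 0.5

K, P = hewer_prescribed_rate(A, B, Q, R, K0, gamma)
rho = max(abs(np.linalg.eigvals(A - B @ K)))
print("gain:", K, "closed-loop spectral radius:", rho)
```

The scaling works because z_k = gamma^(-k) x_k evolves under (A - B K)/gamma, so stability of the scaled loop bounds x_k by a constant times gamma^k. A data-driven variant would replace the two model-based steps with least-squares estimates from measured trajectories.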

Author information

These authors contributed equally: QIN Yahang and ZHANG Chengye.

Authors and Affiliations

School of Automation, Guangdong University of Technology, Guangzhou, 510006, China
Yahang Qin, Chengye Zhang & Ci Chen
Guangdong Key Laboratory of IoT Information Technology, Guangzhou, 510006, China
Yahang Qin
Center for Intelligent Batch Manufacturing Based on IoT Technology, Guangzhou, China
Chengye Zhang
Key Laboratory of Intelligent Detection and The Internet of Things in Manufacturing, Ministry of Education, Guangzhou, 510006, China
Ci Chen
Guangdong-HongKong-Macao Joint Laboratory for Smart Discrete Manufacturing, Guangzhou, 510006, China
Shengli Xie
Key Laboratory of Intelligent Information Processing and System Integration of IoT, Ministry of Education, Guangzhou, 510006, China
Shengli Xie
UTA Research Institute, The University of Texas at Arlington, Fort Worth, TX, 76019, USA
Frank L. Lewis

Corresponding author

Correspondence to Ci Chen.

Ethics declarations

The authors declare no conflict of interest.

Additional information

This research was supported in part by the National Natural Science Foundation of China under Grant Nos. 62320106008 and 62373114, and in part by the Collaborative Innovation Center for Transportation Science and Technology of Guangzhou under Grant No. 202206010056.

About this article

Cite this article

Qin, Y., Zhang, C., Chen, C. et al. Control Policy Learning Design for Vehicle Urban Positioning via BeiDou Navigation. J Syst Sci Complex 37, 114–135 (2024). https://doi.org/10.1007/s11424-024-3357-z

Received: 05 September 2023
Revised: 27 September 2023
Published: 27 February 2024
Issue Date: February 2024
DOI: https://doi.org/10.1007/s11424-024-3357-z
