
Control Policy Learning Design for Vehicle Urban Positioning via BeiDou Navigation

  • Published in: Journal of Systems Science and Complexity

Abstract

This paper presents a learning-based control policy design for point-to-point vehicle positioning in urban environments via BeiDou navigation. When navigating in urban canyons, the multipath effect, an interference caused by reflection and diffraction of the BeiDou signal, makes the navigation solution drift and thus severely degrades vehicle localization. The authors formulate the navigation control system with unknown vehicle dynamics as an optimal control problem over a linear discrete-time system, and the point-to-point localization control is modeled and handled by off-policy reinforcement learning for feedback control. The proposed learning-based design guarantees optimality with prescribed performance and stabilizes the closed-loop navigation system without full knowledge of the vehicle dynamics; it also withstands the impact of the multipath effect while satisfying a prescribed convergence rate. A case study demonstrates that the proposed algorithms effectively drive the vehicle to a desired setpoint under multipath effects recorded in actual BeiDou navigation experiments in an urban environment.
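The abstract only sketches the method, so the following is a minimal, self-contained illustration rather than the authors' algorithm: Q-function policy iteration for a linear discrete-time LQR problem, evaluated off-policy from a single batch of logged data, with a prescribed convergence rate gamma imposed by weighting the Bellman equation with gamma^{-2} (equivalent to designing for the scaled pair (A/gamma, B/gamma)). The plant matrices, weights, exploration-noise level, rate gamma = 0.95, and the initial gain below are all illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Unknown" plant used only to generate data; the learner never reads A or B.
A = np.array([[0.9, 0.2],
              [0.0, 0.8]])
B = np.array([[0.1],
              [1.0]])
n, m = 2, 1

Q = np.eye(n)             # state weight (assumed)
R = np.array([[1.0]])     # input weight (assumed)
gamma = 0.95              # prescribed closed-loop convergence rate (assumed)

def quad_features(z):
    """phi(z) with phi(z) @ h == z' H z, h = upper triangle of symmetric H."""
    W = 2.0 * np.outer(z, z)
    W[np.diag_indices_from(W)] /= 2.0   # diagonal terms are not doubled
    return W[np.triu_indices(len(z))]

def collect_data(K_behavior, N=200):
    """Roll out a noisy behavior policy; the batch is then reused off-policy."""
    data, x = [], rng.standard_normal(n)
    for _ in range(N):
        u = -K_behavior @ x + 0.5 * rng.standard_normal(m)  # exploration noise
        x_next = A @ x + B @ u
        data.append((x, u, x_next))
        x = x_next
    return data

def evaluate_policy(K, data):
    """Least-squares fit of the rate-weighted Q-function Bellman equation:
    z_k' H z_k - gamma^{-2} z_{k+1}' H z_{k+1} = x_k' Q x_k + u_k' R u_k,
    where z_{k+1} uses the TARGET-policy action -K x_{k+1} (off-policy)."""
    Phi, c = [], []
    for x, u, x_next in data:
        z = np.concatenate([x, u])
        z_next = np.concatenate([x_next, -K @ x_next])
        Phi.append(quad_features(z) - quad_features(z_next) / gamma**2)
        c.append(x @ Q @ x + u @ R @ u)
    h, *_ = np.linalg.lstsq(np.array(Phi), np.array(c), rcond=None)
    H = np.zeros((n + m, n + m))
    H[np.triu_indices(n + m)] = h
    return H + np.triu(H, 1).T          # symmetrize

K = np.zeros((m, n))        # initial gain; must satisfy rho(A - B K) < gamma
data = collect_data(K)      # one batch of logged transitions
for _ in range(10):
    H = evaluate_policy(K, data)                 # policy evaluation
    K = np.linalg.solve(H[n:, n:], H[n:, :n])    # policy improvement
print("learned gain K:", K)
print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))
```

The printed closed-loop eigenvalues should lie inside the disk of radius gamma, which is what a prescribed convergence rate means for a discrete-time system. For point-to-point positioning with a nonzero setpoint, the same gain would act on the tracking error e_k = x_k - x_ref (assuming the setpoint is an admissible equilibrium); the paper's actual point-to-point formulation and its multipath experiments cannot be reconstructed from the abstract alone.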



Author information

Correspondence to Ci Chen.

Ethics declarations

The authors declare no conflict of interest.

Additional information

This research was supported in part by the National Natural Science Foundation of China under Grant Nos. 62320106008 and 62373114, and in part by the Collaborative Innovation Center for Transportation Science and Technology of Guangzhou under Grant No. 202206010056.


Cite this article

Qin, Y., Zhang, C., Chen, C. et al. Control Policy Learning Design for Vehicle Urban Positioning via BeiDou Navigation. J Syst Sci Complex 37, 114–135 (2024). https://doi.org/10.1007/s11424-024-3357-z


Keywords

Navigation