Abstract
This paper presents a novel dynamic vehicle tracking framework that achieves accurate pose estimation and tracking in urban environments. For vehicle tracking with laser scanners, pose estimation extracts the geometric information of a target from a point cloud clustering unit, which plays an essential role in tracking tasks. However, the point cloud acquired from laser scanners only provides distance measurements to the object surface facing the sensor, leading to non-negligible pose estimation errors. To address this issue, we take the motion information of targets as feedback to assist vehicle detection and pose estimation. In addition, a heading-normalized vehicle model and a robust target size estimation method are introduced to deduce the pose of a vehicle with 2D matched filtering. Furthermore, considering the mobility of vehicles, we utilize the interacting multiple model (IMM) to capture multiple motion patterns. Compared with existing methods in the literature, our method can be applied to spatially sparse or incomplete point cloud observations. Experimental results demonstrate that our vehicle tracking framework achieves promising performance, and its real-time capability is also validated in real traffic scenarios.
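To make the IMM idea above concrete, the following is a minimal sketch of the mode-probability update an interacting multiple model estimator relies on. The transition matrix, likelihood values, and function name are illustrative stand-ins, not the paper's actual parameters.

```python
import numpy as np

# Hedged sketch: the mode-probability update of an IMM estimator with three
# motion models (here labelled CV, CTRV, RM after the paper). The numeric
# values below are illustrative placeholders, not the paper's settings.

def imm_mode_update(mu, trans, likelihood):
    """One IMM cycle of mode mixing and probability update.

    mu         -- prior mode probabilities, shape (M,)
    trans      -- Markov mode-transition matrix, trans[i, j] = P(j | i)
    likelihood -- measurement likelihood under each mode's filter, shape (M,)
    """
    c = trans.T @ mu            # predicted (mixed) mode probabilities
    mu_post = likelihood * c    # Bayes update with each filter's likelihood
    return mu_post / mu_post.sum()

mu = np.array([0.6, 0.3, 0.1])              # CV, CTRV, RM
trans = np.array([[0.90, 0.05, 0.05],
                  [0.05, 0.90, 0.05],
                  [0.05, 0.05, 0.90]])
lik = np.array([0.2, 1.5, 0.1])             # the CTRV filter fits the data best
print(imm_mode_update(mu, trans, lik))      # probability mass shifts to CTRV
```

After one update, the turning model dominates because its filter explains the measurement best, which is exactly how an IMM adapts to changing motion patterns.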
Code Availability
The custom code is currently not available.
Acknowledgements
This work was supported by the National Key Research and Development Program of China (No. 2020AAA0108103), the Key Science and Technology Project of Anhui (Grant No. 202103a05020007), and the Technological Innovation Project for New Energy and Intelligent Networked Automobile Industry of Anhui Province.
Author information
Contributions
Conceptualization: Fengyu Xu, Zhiling Wang; Methodology: Fengyu Xu, Linglong Lin; Formal analysis and investigation: Fengyu Xu, Hangqi Wang, Linglong Lin; Writing - original draft preparation: Fengyu Xu, Hangqi Wang; Writing - review and editing: Linglong Lin, Zhiling Wang; Funding acquisition: Huawei Liang, Zhiling Wang; Resources: Huawei Liang; Supervision: Huawei Liang, Zhiling Wang.
Ethics declarations
Conflict of Interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Additional information
Availability of Data and Material
The authors declare that all data and materials support the claims in the manuscript and comply with field standards. The original data involved in this research are available.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1: Derivation of the double integral \(u(p_i, p_r)\)
For a pair of points \(p_i\) and \(p_r\)
where
the derivative of (32) is as follows:
Given the Gaussian error function, we have
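As a reminder, the Gaussian error function invoked here has the standard textbook definition (this is general material, not a result of the paper):

```latex
\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_{0}^{z} e^{-t^{2}}\, dt,
\qquad \operatorname{erf}(-z) = -\operatorname{erf}(z)
```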
Similarly,
As a result, \(u(p_i, p_r)\) can be calculated by:
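Since the double integral of a separable 2D Gaussian over an axis-aligned rectangle factors into two 1D integrals, each expressible with erf, it can be checked numerically. The sketch below is illustrative: the function names and the rectangle parameterization are stand-ins for the paper's notation, with the noise standard deviation loosely based on the parameter table.

```python
import math

# Hedged numerical sketch: integrate a separable 2D Gaussian centred at a
# point over an axis-aligned rectangle by factoring into two erf terms.
# Symbol names (sigma_x, sigma_y, rectangle bounds) are illustrative.

def gauss_cdf_interval(a, b, mu, sigma):
    """Integral of N(mu, sigma^2) over [a, b], via the error function."""
    s = sigma * math.sqrt(2.0)
    return 0.5 * (math.erf((b - mu) / s) - math.erf((a - mu) / s))

def u_rect(px, py, x0, x1, y0, y1, sigma_x, sigma_y):
    """Gaussian mass of a point (px, py) falling inside the rectangle."""
    return (gauss_cdf_interval(x0, x1, px, sigma_x) *
            gauss_cdf_interval(y0, y1, py, sigma_y))

# A Gaussian centred well inside a large rectangle integrates to nearly 1.
print(u_rect(0.0, 0.0, -5, 5, -5, 5, 0.21, 0.21))
```

This factorization is what makes the matched-filter score cheap to evaluate: each rectangle contributes two erf differences per point rather than a numerical 2D quadrature.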
Appendix 2: The gradient and Hessian of the cost function
where \(I\) is the total number of points in a clustering unit, \(c_p = (x, y)\) denotes the center point, and \(J\) represents the total number of rectangular areas; in this paper, \(J = 5\).
(1) The gradient of \(F_{cost}(x, y)\) is:
where
(2) The Hessian of \(F_{cost}(x, y)\) is:
where
From the definition of \(I(p_{i}, R_{c_{p},j})\) in (16), \(I(p_{i}, R_{c_{p},j})\) is a linear combination of the double integral \(u(p_i, p_r)\). Therefore, it is necessary to compute the first and second derivatives of \(u(p_i, p_r)\). For notational simplicity, let
where \(x_{offset}\) and \(y_{offset}\) represent the offset from the center to a rectangular vertex in the X and Y directions, respectively, as shown in Table 3. Hence, (36) can be rewritten as:
Furthermore, the Gaussian error function can be differentiated to the first- and second-order as follows:
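These derivatives are standard (and easily rechecked from the definition of erf); they are reproduced here for reference:

```latex
\frac{d}{dz}\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}}\, e^{-z^{2}},
\qquad
\frac{d^{2}}{dz^{2}}\operatorname{erf}(z) = -\frac{4z}{\sqrt{\pi}}\, e^{-z^{2}}
```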
By combining (42), (43) and (44), we have:
Then, the gradient and Hessian of the cost function can be calculated using (45).
Appendix 3: Motion model equation
The state vector X of a moving target contains its pose information and motion information as follows:
where \((x, y)\) is the center position in a global coordinate system, \(\theta\) denotes the target heading, and \(v\) and \(\omega\) represent the velocity and yaw rate of the target, respectively. Assuming the time interval is \(\delta t\), the system functions of the three motion models are given as follows:
a) CV Model
b) CTRV Model
c) RM Model
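As an illustration of the models named above, the textbook CTRV transition for the state \((x, y, \theta, v, \omega)\) can be sketched as follows; the CV case falls out when the yaw rate is near zero. This is the standard CTRV form, given as a hedged example rather than the paper's exact equations (the RM model is omitted).

```python
import math

# Hedged sketch of the standard CTRV (constant turn rate and velocity)
# transition over time step dt, degenerating to constant velocity (CV)
# when omega is near zero. Illustrative, not the paper's exact system model.

def ctrv_predict(x, y, theta, v, omega, dt):
    if abs(omega) < 1e-6:                       # CV limit: straight-line motion
        return (x + v * math.cos(theta) * dt,
                y + v * math.sin(theta) * dt,
                theta, v, omega)
    r = v / omega                               # turn radius
    return (x + r * (math.sin(theta + omega * dt) - math.sin(theta)),
            y + r * (math.cos(theta) - math.cos(theta + omega * dt)),
            theta + omega * dt, v, omega)

# A quarter turn at unit speed: position moves along an arc, speed unchanged.
print(ctrv_predict(0.0, 0.0, 0.0, 1.0, math.pi / 2, 1.0))
```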
Appendix 4: Parameter table
| Parameter | Description | Value |
| --- | --- | --- |
| \(X_{size}/Y_{size}\) | size of the elevation map | 400/600 |
| \(d_{grid}\) | resolution of the elevation map | 20 cm |
| \(x_{car}/y_{car}\) | location of the ego-car in the elevation map | 200/200 |
| \(D_{th}\) | segmentation distance threshold | 0.7 m |
| \(d_{th}\) | merging distance threshold | 0.3 m |
| \(N_{th}\) | frame count threshold for dynamic targets | 10 |
| \(d_{cell}\) | length of cells | 0.3 m |
| \(X_{th}\) | X-axis span threshold | 2.5 m |
| \(v\) | width of the rectangle edges | 0.5 m |
| \(\sigma_x/\sigma_y\) | variances of measurement noise | 0.212/0.212 |
| \(N_{itmax}\) | maximum number of iterations | 10 |
| \(\delta_{th}\) | permissible error | 0.05 m |
| \(L_{ini}/W_{ini}\) | initial values of length and width | 4 m/2 m |
| \(S_{\max}\) | maximum size | 15 m |
| \(n_{th}\) | element count threshold for size estimation | 15 |
Cite this article
Xu, F., Wang, Z., Wang, H. et al. Dynamic vehicle pose estimation and tracking based on motion feedback for LiDARs. Appl Intell 53, 2362–2390 (2023). https://doi.org/10.1007/s10489-022-03576-3