Abstract
The high temporal variation of point clouds is the key challenge in 3D single-object tracking (3D SOT). Existing approaches rely on the assumption that the shape variation of the point clouds and the motion of the objects across neighboring frames are smooth, and thus fail to cope with data exhibiting high temporal variation. In this paper, we present HVTrack, a novel framework for 3D SOT in point clouds with high temporal variation. HVTrack introduces three novel components to tackle the challenges of the high-temporal-variation scenario: 1) a Relative-Pose-Aware Memory module to handle temporal point cloud shape variations; 2) a Base-Expansion Feature Cross-Attention module to deal with distractions from similar objects in the expanded search area; 3) a Contextual Point Guided Self-Attention module to suppress heavy background noise. We construct a high-temporal-variation dataset (KITTI-HV) by sampling the KITTI dataset at different frame intervals. On KITTI-HV with a frame interval of 5, our HVTrack surpasses the state-of-the-art tracker CXTrack by 11.3%/15.7% in Success/Precision.
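The KITTI-HV construction described above amounts to subsampling each KITTI tracklet at a fixed frame interval, so that consecutive "frames" seen by the tracker are several real frames apart. A minimal sketch of this sampling step, assuming a tracklet is simply a list of frames (the function name and data layout are ours, not from the paper):

```python
def subsample_tracklet(frames, interval):
    """Keep every `interval`-th frame of a tracklet, starting from the
    first frame (which typically provides the template initialization).
    With interval=1 this reduces to the original KITTI sequence."""
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return frames[::interval]

# Example: a tracklet of 11 frame indices sampled at interval 5
frames = list(range(11))
print(subsample_tracklet(frames, 5))  # -> [0, 5, 10]
print(subsample_tracklet(frames, 1))  # -> the original 11 frames
```

Larger intervals increase both the apparent motion and the shape change between neighboring frames, which is precisely the regime the smoothness assumption of prior trackers breaks down in.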
References
Chen, X., et al.: TrajectoryFormer: 3D object tracking transformer with predictive trajectory hypotheses. arXiv preprint arXiv:2306.05888 (2023)
Cheng, R., Wang, X., Sohel, F., Lei, H.: Topology-aware universal adversarial attack on 3D object tracking. Vis. Intell. 1(1), 31 (2023)
Chiu, H.K., Prioletti, A., Li, J., Bohg, J.: Probabilistic 3D multi-object tracking for autonomous driving. arXiv preprint arXiv:2001.05673 (2020)
Chung, J., Gulcehre, C., Cho, K., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555 (2014)
Cui, Y., Fang, Z., Shan, J., Gu, Z., Zhou, S.: 3D object tracking with transformer. arXiv preprint arXiv:2110.14921 (2021)
Ding, S., Rehder, E., Schneider, L., Cordts, M., Gall, J.: 3DMOTFormer: graph transformer for online 3D multi-object tracking. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9784–9794 (2023)
Fang, Z., Zhou, S., Cui, Y., Scherer, S.: 3D-SiamRPN: an end-to-end learning method for real-time 3D single object tracking using raw point cloud. IEEE Sens. J. 21(4), 4995–5011 (2020)
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3354–3361 (2012)
Giancola, S., Zarzar, J., Ghanem, B.: Leveraging shape completion for 3D Siamese tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1359–1368 (2019)
Guo, Z., Mao, Y., Zhou, W., Wang, M., Li, H.: CMT: context-matching-guided transformer for 3D tracking in point clouds. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022. LNCS, vol. 13682, pp. 95–111. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-20047-2_6
Hui, L., Wang, L., Cheng, M., Xie, J., Yang, J.: 3D Siamese voxel-to-BEV tracker for sparse point clouds. In: Advances in Neural Information Processing Systems, vol. 34, pp. 28714–28727 (2021)
Hui, L., Wang, L., Tang, L., Lan, K., Xie, J., Yang, J.: 3D Siamese transformer network for single object tracking on point clouds. arXiv preprint arXiv:2207.11995 (2022)
Jiao, L., Wang, D., Bai, Y., Chen, P., Liu, F.: Deep learning in visual tracking: a review. IEEE Trans. Neural Netw. Learn. Syst. 34(9), 5497–5516 (2021)
Jiayao, S., Zhou, S., Cui, Y., Fang, Z.: Real-time 3D single object tracking with transformer. IEEE Trans. Multimedia 25, 2339–2353 (2022)
Kapania, S., Saini, D., Goyal, S., Thakur, N., Jain, R., Nagrath, P.: Multi object tracking with UAVs using deep SORT and YOLOv3 RetinaNet detection framework. In: Proceedings of the 1st ACM Workshop on Autonomous and Intelligent Mobile Systems, pp. 1–6 (2020)
Kart, U., Lukezic, A., Kristan, M., Kamarainen, J.K., Matas, J.: Object tracking by reconstruction with view-specific discriminative correlation filters. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1339–1348 (2019)
Lan, K., Jiang, H., Xie, J.: Temporal-aware Siamese tracker: integrate temporal context for 3D object tracking. In: Proceedings of the Asian Conference on Computer Vision, pp. 399–414 (2022)
Luo, C., Yang, X., Yuille, A.: Exploring simple 3D multi-object tracking for autonomous driving. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10488–10497 (2021)
Machida, E., Cao, M., Murao, T., Hashimoto, H.: Human motion tracking of mobile robot with Kinect 3D sensor. In: Proceedings of SICE Annual Conference (SICE), pp. 2207–2211. IEEE (2012)
Nishimura, H., Komorita, S., Kawanishi, Y., Murase, H.: SDOF-tracker: fast and accurate multiple human tracking by skipped-detection and optical-flow. IEICE Trans. Inf. Syst. 105(11), 1938–1946 (2022)
Pang, Z., Li, Z., Wang, N.: Model-free vehicle tracking and state estimation in point cloud sequences. In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 8075–8082. IEEE (2021)
Qi, C.R., Litany, O., He, K., Guibas, L.J.: Deep hough voting for 3D object detection in point clouds. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9277–9286 (2019)
Qi, C.R., Yi, L., Su, H., Guibas, L.J.: PointNet++: deep hierarchical feature learning on point sets in a metric space. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Qi, H., Feng, C., Cao, Z., Zhao, F., Xiao, Y.: P2B: point-to-box network for 3D object tracking in point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6329–6338 (2020)
Ren, C., Xu, Q., Zhang, S., Yang, J.: Hierarchical prior mining for non-local multi-view stereo. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3611–3620 (2023)
Ren, S., Yang, X., Liu, S., Wang, X.: SG-Former: self-guided transformer with evolving token reallocation. arXiv preprint arXiv:2308.12216 (2023)
Sadjadpour, T., Li, J., Ambrus, R., Bohg, J.: ShaSTA: modeling shape and spatio-temporal affinities for 3D multi-object tracking. IEEE Robot. Autom. Lett. (2023)
Shan, J., Zhou, S., Fang, Z., Cui, Y.: PTT: point-track-transformer module for 3D single object tracking in point clouds. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1310–1316 (2021)
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
Sun, P., et al.: Scalability in perception for autonomous driving: waymo open dataset. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2446–2454 (2020)
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Wang, Q., Chen, Y., Pang, Z., Wang, N., Zhang, Z.: Immortal tracker: tracklet never dies. arXiv preprint arXiv:2111.13672 (2021)
Wang, Y., Sun, Y., Liu, Z., Sarma, S.E., Bronstein, M.M., Solomon, J.M.: Dynamic graph CNN for learning on point clouds. ACM Trans. Graph. (TOG) 38(5), 1–12 (2019)
Wang, Z., Xie, Q., Lai, Y.K., Wu, J., Long, K., Wang, J.: MLVSNet: multi-level voting Siamese network for 3D visual tracking. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3101–3110 (2021)
Weng, X., Wang, J., Held, D., Kitani, K.: 3D multi-object tracking: a baseline and new evaluation metrics. In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 10359–10366. IEEE (2020)
Weng, X., Wang, Y., Man, Y., Kitani, K.M.: GNN3DMOT: graph neural network for 3D multi-object tracking with 2D-3D multi-feature learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6499–6508 (2020)
Wu, Q., Yang, J., Sun, K., Zhang, C., Zhang, Y., Salzmann, M.: MixCycle: mixup assisted semi-supervised 3D single object tracking with cycle consistency. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 13956–13966 (2023)
Wu, Y., Lim, J., Yang, M.H.: Online object tracking: a benchmark. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2411–2418 (2013)
Xu, T.X., Guo, Y.C., Lai, Y.K., Zhang, S.H.: CXTrack: improving 3D point cloud tracking with contextual information. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1084–1093 (2023)
Yin, T., Zhou, X., Krahenbuhl, P.: Center-based 3D object detection and tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11784–11793 (2021)
Yoo, J.S., Lee, H., Jung, S.W.: Video object segmentation-aware video frame interpolation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 12322–12333 (2023)
Zarzar, J., Giancola, S., Ghanem, B.: Efficient bird eye view proposals for 3D Siamese tracking. arXiv preprint arXiv:1903.10168 (2019)
Zhang, X., Yang, J., Zhang, S., Zhang, Y.: 3D registration with maximal cliques. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 17745–17754 (2023)
Zheng, C., et al.: Box-aware feature enhancement for single object tracking on point clouds. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 13199–13208 (2021)
Zheng, C., et al.: Beyond 3D Siamese tracking: a motion-centric paradigm for 3D single object tracking in point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8111–8120 (2022)
Zhou, C., et al.: PTTR: relational 3D point cloud object tracking with transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8531–8540 (2022)
Zhou, X., Koltun, V., Krähenbühl, P.: Tracking objects as points. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12349, pp. 474–490. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58548-8_28
Acknowledgements
This work is supported in part by the National Natural Science Foundation of China (NSFC) under Grants 62372377 and 62176242.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Wu, Q., Sun, K., An, P., Salzmann, M., Zhang, Y., Yang, J. (2025). 3D Single-Object Tracking in Point Clouds with High Temporal Variation. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15065. Springer, Cham. https://doi.org/10.1007/978-3-031-72667-5_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72666-8
Online ISBN: 978-3-031-72667-5