P-IoU: Accurate Motion Prediction Based Data Association for Multi-object Tracking

  • Conference paper
Neural Information Processing (ICONIP 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14451)


Abstract

Multi-object tracking in complex scenarios remains a challenging task due to objects’ irregular motions and indistinguishable appearances. Traditional methods often approximate the motion direction of objects solely based on their bounding box information, leading to cumulative noise and incorrect association. Furthermore, the lack of depth information in these methods can result in failed discrimination between foreground and background objects due to the perspective projection of the camera. To address these limitations, we propose a Pose Intersection over Union (P-IoU) method to predict the true motion direction of objects by incorporating body pose information, specifically the motion of the human torso. Based on P-IoU, we propose PoseTracker, a novel approach that combines bounding box IoU and P-IoU effectively during association to improve tracking performance. Exploiting the relative stability of the human torso and the confidence of keypoints, our method effectively captures the genuine motion cues, reducing identity switches caused by irregular movements. Experiments on the DanceTrack and MOT17 datasets demonstrate that the proposed PoseTracker outperforms existing methods. Our method highlights the importance of accurate motion prediction of objects for data association in MOT and provides a new perspective for addressing the challenges posed by irregular object motion.
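
To make the association idea concrete, the Python sketch below shows one plausible way to blend bounding-box IoU with a torso-based P-IoU term. It is not the paper's implementation: the COCO keypoint indices used for the torso, the `iou`, `torso_box`, and `association_cost` helpers, and the confidence-based weight `alpha` are all illustrative assumptions.

```python
# Illustrative sketch only, not the authors' code: assumes COCO-style
# 17-keypoint poses with shoulders at indices 5, 6 and hips at 11, 12.
import numpy as np

TORSO_IDX = [5, 6, 11, 12]  # left/right shoulder, left/right hip

def iou(a, b):
    """IoU of two axis-aligned boxes given as [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def torso_box(keypoints):
    """Axis-aligned box around the torso keypoints.

    keypoints: (17, 3) array of (x, y, confidence).
    Returns (box, mean torso-keypoint confidence).
    """
    torso = keypoints[TORSO_IDX]
    xs, ys, conf = torso[:, 0], torso[:, 1], torso[:, 2]
    return np.array([xs.min(), ys.min(), xs.max(), ys.max()]), float(conf.mean())

def association_cost(track_box, det_box, track_kpts, det_kpts, alpha=0.5):
    """Blend box IoU with torso P-IoU into a single matching cost.

    The pose term is down-weighted by keypoint confidence, so unreliable
    poses gracefully fall back to plain bounding-box IoU.
    """
    box_iou = iou(track_box, det_box)
    t_box, t_conf = torso_box(track_kpts)
    d_box, d_conf = torso_box(det_kpts)
    p_iou = iou(t_box, d_box)
    w = alpha * min(t_conf, d_conf)  # trust P-IoU only when both poses are confident
    similarity = (1.0 - w) * box_iou + w * p_iou
    return 1.0 - similarity          # lower cost = better match
```

In a full tracker, costs like this would feed a Hungarian assignment between predicted tracks and current detections, with plain box IoU taking over whenever pose confidence is low.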



Author information

Corresponding author

Correspondence to Jinhua Xu.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wu, X., Xu, J. (2024). P-IoU: Accurate Motion Prediction Based Data Association for Multi-object Tracking. In: Luo, B., Cheng, L., Wu, Z.-G., Li, H., Li, C. (eds.) Neural Information Processing. ICONIP 2023. Lecture Notes in Computer Science, vol. 14451. Springer, Singapore. https://doi.org/10.1007/978-981-99-8073-4_37

  • DOI: https://doi.org/10.1007/978-981-99-8073-4_37

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8072-7

  • Online ISBN: 978-981-99-8073-4

  • eBook Packages: Computer Science, Computer Science (R0)
