WDTtrack: tracking multiple objects with indistinguishable appearance and irregular motion

Applied Intelligence

Abstract

Driven by rapid progress in object detection, Multi-Object Tracking (MOT) research has advanced significantly in recent years. However, most previous studies have focused on benchmarks in which objects have distinctive appearances and move linearly. In scenarios involving non-linear motion and similar appearances, the performance of these methods drops sharply. To address this issue, we propose WDTtrack, which simultaneously incorporates spatiotemporal proximity, velocity orientation, and appearance similarity. First, we employ the Centroid Triplet Loss (CTL) ReID model to extract high-quality appearance embeddings. Second, we introduce the Wider Bounding Box (W-BBox) and the Direction Bank (DB) to capture abundant, credible, and discriminative motion cues. Finally, we devise the Tracklet Recovery Mechanism (TRM) to maintain long-term tracking. Extensive empirical results demonstrate that WDTtrack outperforms other trackers on the DanceTrack and SportsMOT datasets, highlighting its effectiveness and potential for further development. Specifically, WDTtrack achieves 66.8 HOTA, 72.8 IDF1, and 55.9 AssA on DanceTrack, and 73.8 HOTA, 80.5 IDF1, and 64.3 AssA on SportsMOT, substantially surpassing other non-Transformer algorithms.
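
To make the idea of combining the three cues concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a plain weighted sum of (a) IoU between horizontally widened boxes, (b) the cosine between a track's past velocity and the displacement towards a candidate detection, and (c) the cosine similarity of appearance embeddings. The weights, the widening factor, the dictionary layout, and all function names are hypothetical choices made for illustration; the abstract does not specify how W-BBox, DB, or the fusion are actually defined.

```python
import math
from itertools import permutations

# Hypothetical weights and widening factor, chosen only for illustration.
W_IOU, W_DIR, W_APP = 0.4, 0.2, 0.4
WIDEN = 0.5


def widen(box, scale=WIDEN):
    """Pad an (x1, y1, x2, y2) box on the left and right by scale * width."""
    x1, y1, x2, y2 = box
    pad = scale * (x2 - x1)
    return (x1 - pad, y1, x2 + pad, y2)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)


def center(box):
    """Center point of an (x1, y1, x2, y2) box."""
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def similarity(track, det):
    """Weighted sum of the three cues, each scaled to [0, 1]."""
    proximity = iou(widen(track["box"]), widen(det["box"]))
    (tx, ty), (dx, dy) = center(track["box"]), center(det["box"])
    direction = (cosine(track["velocity"], (dx - tx, dy - ty)) + 1) / 2
    appearance = max(0.0, cosine(track["embedding"], det["embedding"]))
    return W_IOU * proximity + W_DIR * direction + W_APP * appearance


def match(tracks, dets):
    """Return the detection index assigned to each track.

    Assumes len(tracks) <= len(dets). Uses exhaustive search over
    assignments, which is only practical for a handful of objects.
    """
    best = max(
        permutations(range(len(dets)), len(tracks)),
        key=lambda p: sum(similarity(t, dets[j]) for t, j in zip(tracks, p)),
    )
    return list(best)


tracks = [
    {"box": (0, 0, 10, 20), "velocity": (1, 0), "embedding": (1.0, 0.0)},
    {"box": (30, 0, 40, 20), "velocity": (-1, 0), "embedding": (0.0, 1.0)},
]
dets = [
    {"box": (29, 0, 39, 20), "embedding": (0.0, 1.0)},
    {"box": (1, 0, 11, 20), "embedding": (1.0, 0.0)},
]
print(match(tracks, dets))  # [1, 0]
```

The exhaustive search keeps the sketch dependency-free. A practical tracker would more likely solve the assignment with the Hungarian algorithm, for example via scipy.optimize.linear_sum_assignment, and would gate matches with a similarity threshold.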

Availability of Data and Materials

The DanceTrack [4] dataset is available at https://github.com/DanceTrack/DanceTrack. The SportsMOT [36] dataset is available at https://github.com/MCG-NJU/SportsMOT.

References

  1. Xia X, Meng Z, Han X et al (2023) An automated driving systems data acquisition and analytics platform. Transp Res C Emerg Technol 151:104120. https://doi.org/10.1016/j.trc.2023.104120

  2. Gloudemans D, Work DB (2021) Fast vehicle turning-movement counting using localization-based tracking. In: 2021 IEEE/CVF Conference on computer vision and pattern recognition workshops (CVPRW), pp 4150–4159. https://doi.org/10.1109/cvprw53098.2021.00469

  3. Cioppa A, Giancola S, Deliege A et al (2022) Soccernet-tracking: multiple object tracking dataset and benchmark in soccer videos. In: 2022 IEEE/CVF Conference on computer vision and pattern recognition workshops (CVPRW), pp 3490–3501. https://doi.org/10.1109/cvprw56347.2022.00393

  4. Sun P, Cao J, Jiang Y et al (2022) Dancetrack: multi-object tracking in uniform appearance and diverse motion. In: 2022 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 20961–20970. https://doi.org/10.1109/cvpr52688.2022.02032

  5. Li X, Zhao Z, Wu J et al (2022) Y-bgd: broiler counting based on multi-object tracking. Comput Electron Agric 202:107347. https://doi.org/10.1016/j.compag.2022.107347

  6. Du Y, Zhao Z, Song Y et al (2023) Strongsort: make deepsort great again. IEEE Trans Multimed 25:8725–8737. https://doi.org/10.1109/tmm.2023.3240881

  7. Maggiolino G, Ahmad A, Cao J et al (2023) Deep oc-sort: multi-pedestrian tracking by adaptive re-identification. In: 2023 IEEE International conference on image processing (ICIP), pp 3025–3029. https://doi.org/10.1109/ICIP49359.2023.10222576

  8. Yang F, Odashima S, Masui S et al (2023) Hard to track objects with irregular motions and similar appearances? make it easier by buffering the matching space. In: 2023 IEEE/CVF Winter conference on applications of computer vision (WACV), pp 4788–4797. https://doi.org/10.1109/WACV56688.2023.00478

  9. Bergmann P, Meinhardt T, Leal-Taixé L (2019) Tracking without bells and whistles. In: 2019 IEEE/CVF International conference on computer vision (ICCV), pp 941–951. https://doi.org/10.1109/ICCV.2019.00103

  10. Pang J, Qiu L, Li X et al (2021) Quasi-dense similarity learning for multiple object tracking. In: 2021 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 164–173. https://doi.org/10.1109/CVPR46437.2021.00023

  11. Cao J, Pang J, Weng X et al (2023) Observation-centric sort: rethinking sort for robust multi-object tracking. In: 2023 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 9686–9696. https://doi.org/10.1109/CVPR52729.2023.00934

  12. Liu Z, Wang X, Wang C et al (2023) Sparsetrack: multi-object tracking by performing scene decomposition based on pseudo-depth. arXiv:2306.05238

  13. He K, Gkioxari G, Dollár P et al (2020) Mask r-cnn. IEEE Trans Pattern Anal Mach Intell 42(2):386–397. https://doi.org/10.1109/TPAMI.2018.2844175

  14. Carion N, Massa F, Synnaeve G et al (2020) End-to-end object detection with transformers. In: Computer vision – ECCV 2020. Springer International Publishing, Cham, pp 213–229. https://doi.org/10.1007/978-3-030-58452-8_13

  15. Zhu X, Su W, Lu L et al (2021) Deformable detr: deformable transformers for end-to-end object detection. In: 2021 International conference on learning representations (ICLR). OpenReview.net, https://openreview.net/forum?id=gZ9hCDWe6ke

  16. Wang CY, Bochkovskiy A, Liao HYM (2023) Yolov7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: 2023 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 7464–7475. https://doi.org/10.1109/CVPR52729.2023.00721

  17. Redmon J, Divvala S, Girshick R et al (2016) You only look once: unified, real-time object detection. In: 2016 IEEE Conference on computer vision and pattern recognition (CVPR), pp 779–788. https://doi.org/10.1109/CVPR.2016.91

  18. Wojke N, Bewley A, Paulus D (2017) Simple online and realtime tracking with a deep association metric. In: 2017 IEEE International conference on image processing (ICIP), pp 3645–3649. https://doi.org/10.1109/ICIP.2017.8296962

  19. Bewley A, Ge Z, Ott L et al (2016) Simple online and realtime tracking. In: 2016 IEEE International conference on image processing (ICIP), pp 3464–3468. https://doi.org/10.1109/ICIP.2016.7533003

  20. Xiao C, Cao Q, Zhong Y et al (2023) Motiontrack: learning motion predictor for multiple object tracking. arXiv:2306.02585

  21. Welch G, Bishop G (1995) An introduction to the kalman filter. Tech. rep, USA

  22. Zhang Y, Sun P, Jiang Y et al (2022) Bytetrack: multi-object tracking by associating every detection box. In: Computer vision – ECCV 2022, Cham, pp 1–21. https://doi.org/10.1007/978-3-031-20047-2_1

  23. Aharon N, Orfaig R, Bobrovsky BZ (2022) Bot-sort: robust associations multi-pedestrian tracking. arXiv:2206.14651

  24. Wang Z, Zheng L, Liu Y et al (2020) Towards real-time multi-object tracking. In: Computer vision – ECCV 2020. Springer International Publishing, Cham, pp 107–122. https://doi.org/10.1007/978-3-030-58621-8_7

  25. Zhou X, Koltun V, Krähenbühl P (2020) Tracking objects as points. In: Computer vision – ECCV 2020. Springer International Publishing, Cham, pp 474–490. https://doi.org/10.1007/978-3-030-58548-8_28

  26. Zhang Y, Wang C, Wang X et al (2021) Fairmot: on the fairness of detection and re-identification in multiple object tracking. Int J Comput Vis 129(11):3069–3087. https://doi.org/10.1007/s11263-021-01513-4

  27. Yan B, Jiang Y, Sun P et al (2022) Towards grand unification of object tracking. In: Computer vision – ECCV 2022, Cham, pp 733–751. https://doi.org/10.1007/978-3-031-19803-8_43

  28. Vaswani A, Shazeer N, Parmar N et al (2017) Attention is all you need. In: Guyon I, Luxburg UV, Bengio S et al (eds) Advances in neural information processing systems, vol 30. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf

  29. Sun P, Zhang R, Jiang Y et al (2021) Sparse r-cnn: end-to-end object detection with learnable proposals. In: 2021 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 14449–14458. https://doi.org/10.1109/CVPR46437.2021.01422

  30. Sun P, Cao J, Jiang Y et al (2020) Transtrack: multiple object tracking with transformer. arXiv:2012.15460

  31. Zeng F, Dong B, Zhang Y et al (2022) Motr: end-to-end multiple-object tracking with transformer. In: Computer vision – ECCV 2022. Springer Nature Switzerland, Cham, pp 659–675. https://doi.org/10.1007/978-3-031-19812-0_38

  32. Meinhardt T, Kirillov A, Leal-Taixe L et al (2022) Trackformer: multi-object tracking with transformers. In: 2022 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 8834–8844. https://doi.org/10.1109/cvpr52688.2022.00864

  33. Gao R, Wang L (2023) Memotr: long-term memory-augmented transformer for multi-object tracking. In: 2023 IEEE/CVF International conference on computer vision (ICCV), pp 9867–9876. https://doi.org/10.1109/ICCV51070.2023.00908

  34. Zhang Y, Wang T, Zhang X (2023) Motrv2: bootstrapping end-to-end multi-object tracking by pretrained object detectors. In: 2023 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 22056–22065. https://doi.org/10.1109/CVPR52729.2023.02112

  35. Wieczorek M, Rychalska B, Dąbrowski J (2021) On the unreasonable effectiveness of centroids in image retrieval. In: Neural information processing. Springer International Publishing, Cham, pp 212–223. https://doi.org/10.1007/978-3-030-92273-3_18

  36. Cui Y, Zeng C, Zhao X et al (2023) Sportsmot: a large multi-object tracking dataset in multiple sports scenes. In: 2023 IEEE/CVF International conference on computer vision (ICCV), pp 9887–9897. https://doi.org/10.1109/ICCV51070.2023.00910

  37. Luiten J, Ošep A, Dendorfer P et al (2021) Hota: a higher order metric for evaluating multi-object tracking. Int J Comput Vis 129(2):548–578. https://doi.org/10.1007/s11263-020-01375-2

  38. Ristani E, Solera F, Zou R et al (2016) Performance measures and a data set for multi-target, multi-camera tracking. In: Computer vision – ECCV 2016 Workshops. Springer International Publishing, Cham, pp 17–35. https://doi.org/10.1007/978-3-319-48881-3_2

  39. Bernardin K, Stiefelhagen R (2008) Evaluating multiple object tracking performance: the clear mot metrics. EURASIP J Image Vid Process 1–10. https://doi.org/10.1155/2008/246309

  40. Yan F, Luo W, Zhong Y et al (2023) Bridging the gap between end-to-end and non-end-to-end multi-object tracking. arXiv:2305.12724

  41. Luo R, Song Z, Ma L et al (2024) Diffusiontrack: diffusion model for multi-object tracking. In: 2024 Proceedings of the AAAI conference on artificial intelligence, pp 3991–3999. https://doi.org/10.1609/AAAI.V38I5.28192

  42. Girbau A, Marqués F, Satoh S (2022) Multiple object tracking from appearance by hierarchically clustering tracklets. In: 2022 British machine vision conference (BMVC), p 362. https://bmvc2022.mpi-inf.mpg.de/362/

  43. Wu J, Cao J, Song L et al (2021) Track to detect and segment: an online multi-object tracker. In: 2021 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 12347–12356. https://doi.org/10.1109/CVPR46437.2021.01217

  44. Zhou X, Yin T, Koltun V et al (2022) Global tracking transformers. In: 2022 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 8761–8770. https://doi.org/10.1109/CVPR52688.2022.00857

  45. Rezatofighi H, Tsoi N, Gwak J et al (2019) Generalized intersection over union: a metric and a loss for bounding box regression. In: 2019 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 658–666. https://doi.org/10.1109/CVPR.2019.00075

  46. Zheng Z, Wang P, Liu W et al (2020) Distance-iou loss: faster and better learning for bounding box regression. Proc AAAI Conf Artif Intell 34:12993–13000. https://doi.org/10.1609/aaai.v34i07.6999

Acknowledgements

This research was funded by the National Key Research and Development Program of China, grant number 2018YFC0823002, and the Fundamental Research Fund for the Central Universities of China, grant numbers FRF-TP-20-10B and FRF-GF-19-010A.

Author information

Contributions

Zeyong Zhao: Conceptualization, Methodology, Software, Visualization, Writing - original draft. Jingyi Wu: Methodology, Writing - review. Ruicong Zhi: Supervision, Conceptualization, Writing - review.

Corresponding author

Correspondence to Ruicong Zhi.

Ethics declarations

Competing Interests

The authors declare that there are no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical and Informed Consent for Data Used

The DanceTrack and SportsMOT datasets are both open-source and are used only for non-commercial research purposes.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhao, Z., Wu, J. & Zhi, R. WDTtrack: tracking multiple objects with indistinguishable appearance and irregular motion. Appl Intell 54, 10018–10038 (2024). https://doi.org/10.1007/s10489-024-05682-w

  • DOI: https://doi.org/10.1007/s10489-024-05682-w
