WDTtrack: tracking multiple objects with indistinguishable appearance and irregular motion

Zhao, Zeyong; Wu, Jingyi; Zhi, Ruicong

doi:10.1007/s10489-024-05682-w

Published: 01 August 2024

Applied Intelligence, volume 54, pages 10018–10038 (2024)

Abstract

With the surge in object detection, Multi-Object Tracking (MOT) research has recently witnessed significant advancements. However, most previous studies have primarily focused on benchmarks involving distinctive appearances and linear motion. In scenarios involving non-linear motion and similar appearances, these methods exhibit a drastic drop in performance. To address this issue, we propose WDTtrack, which incorporates spatiotemporal proximity, velocity orientation, and appearance similarity simultaneously. First, we employ the Centroid Triplet Loss ReID (CTL) model to extract high-quality appearance embeddings. Second, we introduce the Wider Bounding Box (W-BBox) and the Direction Bank (DB) to capture abundant, credible, and discriminative motion cues. Finally, we devise the Tracklet Recovery Mechanism (TRM) to facilitate long-term tracking maintenance. Extensive empirical results demonstrate that WDTtrack outperforms other trackers on the DanceTrack and SportsMOT datasets, highlighting its effectiveness and potential for further development. Specifically, WDTtrack achieves a 66.8 HOTA score, a 72.8 IDF1 score and a 55.9 AssA score on DanceTrack, and a 73.8 HOTA score, an 80.5 IDF1 score and a 64.3 AssA score on SportsMOT, substantially surpassing other non-Transformer algorithms.
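The abstract names three cues (spatiotemporal proximity, velocity orientation, and appearance similarity) but does not give their formulas. The following is only an illustrative sketch of how such cues might be fused into a single association cost; it is not the authors' implementation. The horizontal widening rule, the single direction vector standing in for a Direction Bank, the weights, and all function names (`widen_box`, `association_cost`, and so on) are assumptions made for this example.

```python
import math

def widen_box(box, scale=0.3):
    """Pad an (x1, y1, x2, y2) box on the left and right by scale * width.
    Assumption: widening is horizontal only; the paper may define it differently."""
    x1, y1, x2, y2 = box
    pad = scale * (x2 - x1)
    return (x1 - pad, y1, x2 + pad, y2)

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def center(box):
    """Center point of an (x1, y1, x2, y2) box."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)

def association_cost(track_box, track_dir, track_emb, det_box, det_emb,
                     w_iou=1.0, w_dir=0.2, w_app=0.5, scale=0.3):
    """Lower cost means a better track-detection match.
    The weights are hypothetical, not values from the paper."""
    # Spatial cue: IoU computed on widened boxes.
    spatial = iou(widen_box(track_box, scale), widen_box(det_box, scale))
    # Direction cue: agreement between the track's stored direction and
    # the displacement from the track center to the detection center.
    tc, dc = center(track_box), center(det_box)
    direction = cosine(track_dir, (dc[0] - tc[0], dc[1] - tc[1]))
    # Appearance cue: cosine similarity of ReID embeddings.
    appearance = cosine(track_emb, det_emb)
    return -(w_iou * spatial + w_dir * direction + w_app * appearance)

track = (0.0, 0.0, 10.0, 20.0)
det_right = (11.0, 0.0, 21.0, 20.0)   # consistent with rightward motion
det_left = (-11.0, 0.0, -1.0, 20.0)   # opposite side of the track
cost_right = association_cost(track, (1.0, 0.0), (1.0, 0.0), det_right, (1.0, 0.0))
cost_left = association_cost(track, (1.0, 0.0), (1.0, 0.0), det_left, (0.0, 1.0))
```

In this toy setup the plain IoU between the track and either detection is zero, so an IoU-only matcher could not separate them, while the widened boxes overlap and the direction and appearance terms favor `det_right`. A full tracker would typically build such costs for every track-detection pair and solve the resulting assignment problem.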

Availability of Data and Materials

The DanceTrack [4] dataset is available at https://github.com/DanceTrack/DanceTrack, and the SportsMOT [36] dataset is available at https://github.com/MCG-NJU/SportsMOT.

References

Xia X, Meng Z, Han X et al (2023) An automated driving systems data acquisition and analytics platform. Transp Res C Emerg Technol 151:104120. https://doi.org/10.1016/j.trc.2023.104120
Gloudemans D, Work DB (2021) Fast vehicle turning-movement counting using localization-based tracking. In: 2021 IEEE/CVF Conference on computer vision and pattern recognition workshops (CVPRW), pp 4150–4159. https://doi.org/10.1109/cvprw53098.2021.00469
Cioppa A, Giancola S, Deliege A et al (2022) Soccernet-tracking: multiple object tracking dataset and benchmark in soccer videos. In: 2022 IEEE/CVF Conference on computer vision and pattern recognition workshops (CVPRW), pp 3490–3501. https://doi.org/10.1109/cvprw56347.2022.00393
Sun P, Cao J, Jiang Y et al (2022) Dancetrack: multi-object tracking in uniform appearance and diverse motion. In: 2022 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 20961–20970. https://doi.org/10.1109/cvpr52688.2022.02032
Li X, Zhao Z, Wu J et al (2022) Y-bgd: broiler counting based on multi-object tracking. Comput Electron Agric 202:107347. https://doi.org/10.1016/j.compag.2022.107347
Du Y, Zhao Z, Song Y et al (2023) Strongsort: make deepsort great again. IEEE Trans Multimed 25:8725–8737. https://doi.org/10.1109/tmm.2023.3240881
Maggiolino G, Ahmad A, Cao J et al (2023) Deep oc-sort: multi-pedestrian tracking by adaptive re-identification. In: 2023 IEEE International conference on image processing (ICIP), pp 3025–3029. https://doi.org/10.1109/ICIP49359.2023.10222576
Yang F, Odashima S, Masui S et al (2023) Hard to track objects with irregular motions and similar appearances? make it easier by buffering the matching space. In: 2023 IEEE/CVF Winter conference on applications of computer vision (WACV), pp 4788–4797. https://doi.org/10.1109/WACV56688.2023.00478
Bergmann P, Meinhardt T, Leal-Taixé L (2019) Tracking without bells and whistles. In: 2019 IEEE/CVF International conference on computer vision (ICCV), pp 941–951. https://doi.org/10.1109/ICCV.2019.00103
Pang J, Qiu L, Li X et al (2021) Quasi-dense similarity learning for multiple object tracking. In: 2021 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 164–173. https://doi.org/10.1109/CVPR46437.2021.00023
Cao J, Pang J, Weng X et al (2023) Observation-centric sort: rethinking sort for robust multi-object tracking. In: 2023 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 9686–9696. https://doi.org/10.1109/CVPR52729.2023.00934
Liu Z, Wang X, Wang C et al (2023) Sparsetrack: multi-object tracking by performing scene decomposition based on pseudo-depth. arXiv:2306.05238
He K, Gkioxari G, Dollár P et al (2020) Mask r-cnn. IEEE Trans Pattern Anal Mach Intell 42(2):386–397. https://doi.org/10.1109/TPAMI.2018.2844175
Carion N, Massa F, Synnaeve G et al (2020) End-to-end object detection with transformers. In: Computer vision – ECCV 2020. Springer International Publishing, Cham, pp 213–229. https://doi.org/10.1007/978-3-030-58452-8_13
Zhu X, Su W, Lu L et al (2021) Deformable detr: deformable transformers for end-to-end object detection. In: 2021 International conference on learning representations (ICLR). OpenReview.net, https://openreview.net/forum?id=gZ9hCDWe6ke
Wang CY, Bochkovskiy A, Liao HYM (2023) Yolov7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: 2023 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 7464–7475. https://doi.org/10.1109/CVPR52729.2023.00721
Redmon J, Divvala S, Girshick R et al (2016) You only look once: unified, real-time object detection. In: 2016 IEEE Conference on computer vision and pattern recognition (CVPR), pp 779–788. https://doi.org/10.1109/CVPR.2016.91
Wojke N, Bewley A, Paulus D (2017) Simple online and realtime tracking with a deep association metric. In: 2017 IEEE International conference on image processing (ICIP), pp 3645–3649. https://doi.org/10.1109/ICIP.2017.8296962
Bewley A, Ge Z, Ott L et al (2016) Simple online and realtime tracking. In: 2016 IEEE International conference on image processing (ICIP), pp 3464–3468. https://doi.org/10.1109/ICIP.2016.7533003
Xiao C, Cao Q, Zhong Y et al (2023) Motiontrack: learning motion predictor for multiple object tracking. arXiv:2306.02585
Welch G, Bishop G (1995) An introduction to the kalman filter. Tech. rep, USA
Zhang Y, Sun P, Jiang Y et al (2022) Bytetrack: multi-object tracking by associating every detection box. In: Computer vision – ECCV 2022, Cham, pp 1–21. https://doi.org/10.1007/978-3-031-20047-2_1
Aharon N, Orfaig R, Bobrovsky BZ (2022) Bot-sort: robust associations multi-pedestrian tracking. arXiv:2206.14651
Wang Z, Zheng L, Liu Y et al (2020) Towards real-time multi-object tracking. In: Computer vision – ECCV 2020. Springer International Publishing, Cham, pp 107–122. https://doi.org/10.1007/978-3-030-58621-8_7
Zhou X, Koltun V, Krähenbühl P (2020) Tracking objects as points. In: Computer vision – ECCV 2020. Springer International Publishing, Cham, pp 474–490. https://doi.org/10.1007/978-3-030-58548-8_28
Zhang Y, Wang C, Wang X et al (2021) Fairmot: on the fairness of detection and re-identification in multiple object tracking. Int J Comput Vis 129(11):3069–3087. https://doi.org/10.1007/s11263-021-01513-4
Yan B, Jiang Y, Sun P et al (2022) Towards grand unification of object tracking. In: Computer vision – ECCV 2022, Cham, pp 733–751. https://doi.org/10.1007/978-3-031-19803-8_43
Vaswani A, Shazeer N, Parmar N et al (2017) Attention is all you need. In: Guyon I, Luxburg UV, Bengio S et al (eds) Advances in neural information processing systems, vol 30. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
Sun P, Zhang R, Jiang Y et al (2021) Sparse r-cnn: end-to-end object detection with learnable proposals. In: 2021 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 14449–14458. https://doi.org/10.1109/CVPR46437.2021.01422
Sun P, Cao J, Jiang Y et al (2020) Transtrack: multiple object tracking with transformer. arXiv:2012.15460
Zeng F, Dong B, Zhang Y et al (2022) Motr: end-to-end multiple-object tracking with transformer. In: Computer vision – ECCV 2022. Springer Nature Switzerland, Cham, pp 659–675. https://doi.org/10.1007/978-3-031-19812-0_38
Meinhardt T, Kirillov A, Leal-Taixe L et al (2022) Trackformer: multi-object tracking with transformers. In: 2022 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 8834–8844. https://doi.org/10.1109/cvpr52688.2022.00864
Gao R, Wang L (2023) Memotr: long-term memory-augmented transformer for multi-object tracking. In: 2023 IEEE/CVF International conference on computer vision (ICCV), pp 9867–9876. https://doi.org/10.1109/ICCV51070.2023.00908
Zhang Y, Wang T, Zhang X (2023) Motrv2: bootstrapping end-to-end multi-object tracking by pretrained object detectors. In: 2023 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 22056–22065. https://doi.org/10.1109/CVPR52729.2023.02112
Wieczorek M, Rychalska B, Dąbrowski J (2021) On the unreasonable effectiveness of centroids in image retrieval. In: Neural information processing. Springer International Publishing, Cham, pp 212–223. https://doi.org/10.1007/978-3-030-92273-3_18
Cui Y, Zeng C, Zhao X et al (2023) Sportsmot: a large multi-object tracking dataset in multiple sports scenes. In: 2023 IEEE/CVF International conference on computer vision (ICCV), pp 9887–9897. https://doi.org/10.1109/ICCV51070.2023.00910
Luiten J, Ošep A, Dendorfer P et al (2021) Hota: a higher order metric for evaluating multi-object tracking. Int J Comput Vis 129(2):548–578. https://doi.org/10.1007/s11263-020-01375-2
Ristani E, Solera F, Zou R et al (2016) Performance measures and a data set for multi-target, multi-camera tracking. In: Computer vision – ECCV 2016 Workshops. Springer International Publishing, Cham, pp 17–35. https://doi.org/10.1007/978-3-319-48881-3_2
Bernardin K, Stiefelhagen R (2008) Evaluating multiple object tracking performance: the clear mot metrics. EURASIP J Image Vid Process 1–10. https://doi.org/10.1155/2008/246309
Yan F, Luo W, Zhong Y et al (2023) Bridging the gap between end-to-end and non-end-to-end multi-object tracking. arXiv:2305.12724
Luo R, Song Z, Ma L et al (2024) Diffusiontrack: diffusion model for multi-object tracking. In: 2024 Proceedings of the AAAI conference on artificial intelligence, pp 3991–3999. https://doi.org/10.1609/AAAI.V38I5.28192
Girbau A, Marqués F, Satoh S (2022) Multiple object tracking from appearance by hierarchically clustering tracklets. In: 2022 British machine vision conference (BMVC), p 362. https://bmvc2022.mpi-inf.mpg.de/362/
Wu J, Cao J, Song L et al (2021) Track to detect and segment: an online multi-object tracker. In: 2021 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 12347–12356. https://doi.org/10.1109/CVPR46437.2021.01217
Zhou X, Yin T, Koltun V et al (2022) Global tracking transformers. In: 2022 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 8761–8770. https://doi.org/10.1109/CVPR52688.2022.00857
Rezatofighi H, Tsoi N, Gwak J et al (2019) Generalized intersection over union: a metric and a loss for bounding box regression. In: 2019 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 658–666. https://doi.org/10.1109/CVPR.2019.00075
Zheng Z, Wang P, Liu W et al (2020) Distance-iou loss: faster and better learning for bounding box regression. Proc AAAI Conf Artif Intell 34:12993–13000. https://doi.org/10.1609/aaai.v34i07.6999

Acknowledgements

This research was funded by the National Key Research and Development Program of China, grant number 2018YFC0823002, and the Fundamental Research Fund for the Central Universities of China, grant numbers FRF-TP-20-10B and FRF-GF-19-010A.

Author information

Authors and Affiliations

School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
Zeyong Zhao, Jingyi Wu & Ruicong Zhi
Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing, 100083, China
Zeyong Zhao, Jingyi Wu & Ruicong Zhi

Contributions

Zeyong Zhao: Conceptualization, Methodology, Software, Visualization, Writing - original draft. Jingyi Wu: Methodology, Writing - review. Ruicong Zhi: Supervision, Conceptualization, Writing - review.

Corresponding author

Correspondence to Ruicong Zhi.

Ethics declarations

Competing Interests

The authors declare that there are no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical and Informed Consent for Data Used

The DanceTrack and SportsMOT datasets are both open-source and are used only for non-commercial research purposes.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhao, Z., Wu, J. & Zhi, R. WDTtrack: tracking multiple objects with indistinguishable appearance and irregular motion. Appl Intell 54, 10018–10038 (2024). https://doi.org/10.1007/s10489-024-05682-w

Accepted: 06 July 2024
Published: 01 August 2024
Issue Date: October 2024
DOI: https://doi.org/10.1007/s10489-024-05682-w
