Online multi-object tracking using multi-function integration and tracking simulation training

Applied Intelligence

Abstract

With recent advances in deep learning, the performance of multi-object tracking algorithms based on deep neural networks has improved greatly. However, most methods split the functional modules across multiple networks and train each one independently on its own task. When these modules are used together directly, they are poorly matched to one another and poorly adapted to the multi-object tracking task, which degrades tracking quality. Therefore, a network structure is designed that combines the regression of objects between frames and the extraction of appearance features in a single model, improving the harmony between the functional modules of multi-object tracking. To better support the tracking task, an end-to-end training method is also proposed that simulates the multi-object tracking process during training and expands the training data by combining the historical positions of each target with the predictions of a motion model. In addition, a metric loss that exploits the historical appearance features of the target is used to train the appearance feature extraction module, improving the temporal correlation of the extracted features. Evaluation results on the MOTChallenge benchmark datasets show that the proposed approach achieves state-of-the-art performance.
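The abstract does not give the form of the metric loss, so the following is only a minimal sketch of one way a loss could use a target's historical appearance features. It assumes a triplet-style margin loss in which the anchor is the mean of the track's past features; the function names, the mean-template anchor, and the margin value are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def l2_normalize(v: Vector) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def mean_feature(history: Sequence[Vector]) -> List[float]:
    """Average a track's historical appearance features into one template."""
    if not history:
        raise ValueError("history must contain at least one feature")
    dim = len(history[0])
    return [sum(f[i] for f in history) / len(history) for i in range(dim)]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def historical_triplet_loss(
    history: Sequence[Vector],
    positive: Vector,
    negative: Vector,
    margin: float = 0.3,  # assumed value, chosen only for illustration
) -> float:
    """Triplet-style margin loss anchored on a track's feature history.

    history:  past appearance features of one target (assumed input)
    positive: current feature of the same target
    negative: current feature of a different target
    """
    anchor = l2_normalize(mean_feature([l2_normalize(f) for f in history]))
    d_pos = euclidean(anchor, l2_normalize(positive))
    d_neg = euclidean(anchor, l2_normalize(negative))
    # Zero loss once the same-target feature is closer than the
    # other-target feature by at least the margin.
    return max(0.0, d_pos - d_neg + margin)
```

Under these assumptions, a feature that stays close to its own track history while remaining far from other targets incurs no loss, which is one way to encourage the temporal consistency the abstract describes.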

Acknowledgements

This paper was supported by the Graduate Innovation Foundation of Jiangsu Province [grant No. KYLX16_0781]; the Natural Science Foundation of Jiangsu Province [grant No. BK20181340]; the 111 Project [grant No. B12018]; PAPD of Jiangsu Higher Education Institutions; the National Natural Science Foundation of China [grant No. 61806006]; and the China Postdoctoral Science Foundation [grant No. 2019M660149].

Author information

Corresponding author

Correspondence to Hongwei Ge.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yang, J., Ge, H., Yang, J. et al. Online multi-object tracking using multi-function integration and tracking simulation training. Appl Intell 52, 1268–1288 (2022). https://doi.org/10.1007/s10489-021-02457-5

  • DOI: https://doi.org/10.1007/s10489-021-02457-5
