Simultaneous Detection and Tracking with Motion Modelling for Multiple Object Tracking

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12369)

Abstract

Deep learning based Multiple Object Tracking (MOT) currently relies on off-the-shelf detectors for tracking-by-detection. This results in deep models that are detector-biased and evaluations that are detector-influenced. To resolve this issue, we introduce the Deep Motion Modeling Network (DMM-Net), which estimates the motion parameters of multiple objects to perform joint detection and association in an end-to-end manner. DMM-Net models object features over multiple frames and simultaneously infers object classes, visibility, and motion parameters. These outputs are readily used to update the tracklets for efficient MOT. DMM-Net achieves a PR-MOTA score of 12.80 at 120+ fps on the popular UA-DETRAC challenge, which is both better performance and orders of magnitude faster. We also contribute Omni-MOT, a large-scale synthetic public dataset for vehicle tracking that provides precise ground-truth annotations to eliminate the detector influence in MOT evaluation. This dataset of 14M+ frames is extendable with our public script (code at Dataset, Dataset Recorder, Omni-MOT Source). We demonstrate the suitability of Omni-MOT for deep learning with DMM-Net, and also make the source code of our network public.
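To make the idea of updating tracklets from inferred motion parameters more concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that each box coordinate (center x, center y, width, height) follows a quadratic function of time and that visibility is a per-frame score; the abstract does not specify the functional form, and the names `boxes_from_motion`, `visible_boxes`, and the 0.5 threshold are hypothetical.

```python
import numpy as np


def boxes_from_motion(params, times):
    """Evaluate hypothetical quadratic motion parameters at the given times.

    params: (4, 3) array; rows are (cx, cy, w, h), columns are the
            coefficients (a, b, c) of a*t^2 + b*t + c (an assumed model).
    times:  (T,) array of frame timestamps.
    Returns a (T, 4) array of boxes, one row per timestamp.
    """
    params = np.asarray(params, dtype=float)
    times = np.asarray(times, dtype=float)
    # Basis matrix with columns [t^2, t, 1], shape (T, 3).
    basis = np.stack([times ** 2, times, np.ones_like(times)], axis=1)
    return basis @ params.T


def visible_boxes(params, times, visibility, threshold=0.5):
    """Keep only boxes whose visibility score reaches the (assumed) threshold."""
    boxes = boxes_from_motion(params, times)
    mask = np.asarray(visibility, dtype=float) >= threshold
    return boxes[mask], np.asarray(times)[mask]


# Illustrative object moving right at 2 px per frame with a fixed size.
params = [
    [0.0, 2.0, 10.0],  # cx = 2t + 10
    [0.0, 0.0, 20.0],  # cy = 20
    [0.0, 0.0, 4.0],   # w  = 4
    [0.0, 0.0, 3.0],   # h  = 3
]
times = [0, 1, 2]
boxes = boxes_from_motion(params, times)
kept_boxes, kept_times = visible_boxes(params, times, [0.9, 0.2, 0.8])
```

Under these assumptions, one set of motion parameters yields a box for every frame in the window, so a tracklet can be extended without running a separate detector per frame; frames with a low visibility score are simply skipped.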

Notes

  1. Notation is adopted from the original work.

Author information

Corresponding author

Correspondence to Huansheng Song.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 865 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Sun, S., Akhtar, N., Song, X., Song, H., Mian, A., Shah, M. (2020). Simultaneous Detection and Tracking with Motion Modelling for Multiple Object Tracking. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12369. Springer, Cham. https://doi.org/10.1007/978-3-030-58586-0_37

  • DOI: https://doi.org/10.1007/978-3-030-58586-0_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58585-3

  • Online ISBN: 978-3-030-58586-0

  • eBook Packages: Computer Science, Computer Science (R0)
