Skip to main content

End-to-End Multiple Object Tracking with Siamese Networks

  • Conference paper
  • First Online:
Neural Information Processing (ICONIP 2021)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1517))

Included in the following conference series:

  • 1871 Accesses

Abstract

Multiple Object Tracking (MOT) consists of two components: detection and data association. In the popular tracking-by-detection models, these two components are separate: all objects of interest in a frame are detected first, and then associated with the objects in tracked queues using intersection-over-union (IOU) of bounding box and/or appearance matching. Appearance feature (ID embedding) can be extracted from a separate re-identification model or from a joint model with the detection network. In this paper, a joint detection and tracking model with Siamese structure is proposed for MOT. We apply CenterNet, an anchor free object detector for detection. Motion prediction and appearance matching are implemented with the network for association. Experimental results demonstrate the effectiveness of our model. State-of-the-art MOTA of 69.9% on MOT20 is achieved.

This work is supported by the National Natural Science Foundation of China under Project 61175116.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 99.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 129.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Bergmann, P., Meinhardt, T., Leal-Taixe, L.: Tracking without bells and whistles. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 941–951 (2019)

    Google Scholar 

  2. Bewley, A., Ge, Z., Ott, L., Ramos, F., Upcroft, B.: Simple online and realtime tracking. In: 2016 IEEE International Conference on Image Processing (ICIP), pp. 3464–3468. IEEE (2016)

    Google Scholar 

  3. Bochinski, E., Eiselein, V., Sikora, T.: High-speed tracking-by-detection without using image information. In: 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), pp. 1–6. IEEE (2017)

    Google Scholar 

  4. Chaabane, M., Zhang, P., Beveridge, J.R., et al.: DEFT: detection embeddings for tracking. arXiv:2102.02267v2 (2021)

  5. Ciaparrone, G., Sánchez, F.L., Tabik, S., Troiano, L., Tagliaferri, R., Herrera, F.: Deep learning in video multi-object tracking: a survey. Neurocomputing 381, 61–88 (2020)

    Article  Google Scholar 

  6. Feichtenhofer, C., Pinz, A., Zisserman, A.: Detect to track and track to detect. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3038–3046 (2017)

    Google Scholar 

  7. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)

    Google Scholar 

  8. Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., Brox, T.: FlowNet 2.0: evolution of optical flow estimation with deep networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2462–2470 (2017)

    Google Scholar 

  9. Liang, C., et al.: Rethinking the competition between detection and ReID in multi-object tracking. arXiv preprint arXiv:2010.12138 (2020)

  10. Luo, W., Xing, J., Milan, A., Zhang, X., Liu, W., Kim, T.K.: Multiple object tracking: a literature review. Artif. Intell. 293, 103448 (2020)

    Article  MathSciNet  Google Scholar 

  11. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. Adv. Neural. Inf. Process. Syst. 28, 91–99 (2015)

    Google Scholar 

  12. Shuai, B., Berneshawi, A.G., Modolo, D., Tighe, J.: Multi-object tracking with siamese track-RCNN. arXiv preprint arXiv:2004.07786 (2020)

  13. Sun, P., et al.: TransTrack: multiple-object tracking with transformer. arXiv preprint arXiv:2012.15460 (2020)

  14. Sun, S., Akhtar, N., Song, H., Mian, A., Shah, M.: Deep affinity network for multiple object tracking. IEEE Trans. Pattern Anal. Mach. Intell. 43(1), 104–119 (2019)

    Google Scholar 

  15. Wang, Y., Kitani, K., Weng, X.: Joint object detection and multi-object tracking with graph neural networks. arXiv preprint arXiv:2006.13164 (2020)

  16. Wojke, N., Bewley, A., Paulus, D.: Simple online and realtime tracking with a deep association metric. In: 2017 IEEE International Conference on Image Processing (ICIP), pp. 3645–3649. IEEE (2017)

    Google Scholar 

  17. Xu, Y., Ban, Y., Delorme, G., Gan, C., Rus, D., Alameda-Pineda, X.: TransCenter: transformers with dense queries for multiple-object tracking. arXiv preprint arXiv:2103.15145 (2021)

  18. Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: FairMOT: on the fairness of detection and re-identification in multiple object tracking. Int. J. Comput. Vis. 129, 3069–3087 (2021). https://doi.org/10.1007/s11263-021-01513-4

    Article  Google Scholar 

  19. Zhou, X., Koltun, V., Krähenbühl, P.: Tracking objects as points. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12349, pp. 474–490. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58548-8_28

    Chapter  Google Scholar 

  20. Zhou, X., Wang, D., Krähenbühl, P.: Objects as points. arXiv preprint arXiv:1904.07850 (2019)

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Jinhua Xu .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Qin, J., Huang, C., Xu, J. (2021). End-to-End Multiple Object Tracking with Siamese Networks. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1517. Springer, Cham. https://doi.org/10.1007/978-3-030-92310-5_65

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-92310-5_65

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92309-9

  • Online ISBN: 978-3-030-92310-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics