Instance-Wise Contrastive Learning for Multi-object Tracking

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2022)

Abstract

Multi-Object Tracking (MOT) is an important yet challenging problem in computer vision. We observe that a single motion model struggles to maintain ID consistency in complex scenes, such as those with camera shake or relative motion between pedestrians. We therefore propose an Instance-wise Contrastive Learning (ICL) method that jointly performs detection and embedding in a unified network. Specifically, to handle instance matching in dynamic, cluttered scenes, we introduce an instance-wise contrastive loss that pulls embeddings of the same instance close together while keeping all negatives separated by a specified distance. The learned semantic embedding space thus helps both to detect moving objects and to match instances. Furthermore, we use contextual information to associate moving objects across the frames of a sequence. Comprehensive evaluations on three public tracking datasets (MOT16, MOT17, and MOT20) demonstrate the superiority of our ICLTracker over other state-of-the-art methods for multi-object tracking.
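The idea of pulling embeddings of the same instance together while keeping negatives apart by a specified distance can be illustrated with a classic margin-based pairwise contrastive loss. The following is only a minimal sketch under stated assumptions, not the loss used in ICLTracker: the abstract does not give the exact formulation, so the Euclidean distance, the squared-hinge form, the averaging over pairs, and the names `instance_contrastive_loss` and `margin` are all hypothetical choices made for illustration.

```python
import numpy as np


def instance_contrastive_loss(embeddings, instance_ids, margin=1.0):
    """Margin-based pairwise contrastive loss (illustrative, not the paper's loss).

    Pairs sharing an instance ID are penalised by their squared distance
    (pulled together). Pairs with different IDs are penalised only while
    they are closer than `margin` (pushed apart up to that distance).
    """
    emb = np.asarray(embeddings, dtype=np.float64)
    ids = np.asarray(instance_ids)
    n = emb.shape[0]

    # Pairwise Euclidean distances, shape (n, n).
    diff = emb[:, None, :] - emb[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Count each unordered pair once (strict upper triangle).
    upper = np.triu(np.ones((n, n), dtype=bool), k=1)
    same = ids[:, None] == ids[None, :]
    pos = same & upper
    neg = ~same & upper

    pos_loss = (dist[pos] ** 2).sum()
    neg_loss = (np.maximum(0.0, margin - dist[neg]) ** 2).sum()

    # Average over all pairs; guard against the single-sample case.
    num_pairs = max(int(upper.sum()), 1)
    return (pos_loss + neg_loss) / num_pairs
```

In this sketch, two detections of the same instance with identical embeddings contribute zero loss, and a negative pair contributes nothing once it is at least `margin` apart, which mirrors the "separated by a specified distance" behaviour described in the abstract.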

Notes

  1. SimOTA dynamically selects the top-k positive samples for each ground-truth.
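A rough sketch of the dynamic top-k selection mentioned in the note, written from the commonly described SimOTA procedure rather than from this paper: for each ground-truth box, k is set to the truncated sum of its largest IoUs with candidate predictions (at least 1), and the k lowest-cost candidates become positives. The function name `dynamic_k_select`, the default of 10 IoU candidates, and the assumption that a pairwise cost matrix is already available are illustrative choices, not details confirmed by the source.

```python
import numpy as np


def dynamic_k_select(cost, ious, n_candidates=10):
    """Illustrative dynamic top-k label assignment.

    cost, ious: arrays of shape (num_gt, num_candidates).
    Returns a boolean matrix marking which candidates are positives
    for which ground-truth box.
    """
    cost = np.asarray(cost, dtype=np.float64)
    ious = np.asarray(ious, dtype=np.float64)
    num_gt, num_cand = cost.shape
    matching = np.zeros((num_gt, num_cand), dtype=bool)

    q = min(n_candidates, num_cand)
    for g in range(num_gt):
        # k grows with how many candidates overlap this ground truth well.
        top_ious = np.sort(ious[g])[::-1][:q]
        k = max(int(top_ious.sum()), 1)
        # The k cheapest candidates become positives for this ground truth.
        chosen = np.argsort(cost[g], kind="stable")[:k]
        matching[g, chosen] = True

    # A candidate claimed by several ground truths keeps only the cheapest one.
    multi = matching.sum(axis=0) > 1
    if multi.any():
        cols = np.flatnonzero(multi)
        best_gt = cost[:, cols].argmin(axis=0)
        matching[:, cols] = False
        matching[best_gt, cols] = True
    return matching
```

The point of the dynamic k is that well-covered objects receive more positive samples than poorly covered ones, instead of a fixed number per ground truth.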

Author information

Corresponding author

Correspondence to Chunyan Xu.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Luo, Q., Xu, C. (2022). Instance-Wise Contrastive Learning for Multi-object Tracking. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13537. Springer, Cham. https://doi.org/10.1007/978-3-031-18916-6_52

  • DOI: https://doi.org/10.1007/978-3-031-18916-6_52

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18915-9

  • Online ISBN: 978-3-031-18916-6

  • eBook Packages: Computer Science (R0)