Learning Discriminative Proposal Representation for Multi-object Tracking

  • Conference paper
Image and Graphics (ICIG 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14356))


Abstract

Multiple object tracking (MOT) based on tracklets rather than discrete detections has received increasing attention in recent years. Following the tracking-by-detection paradigm, many approaches treat tracklets as individual units in data association and aim to exploit local or global relationships among them. However, the problem of fragmentation remains. When severe occlusions occur, adjacent trajectories collapse into many ambiguous tracklets, which renders tracklet representations unreliable. To address this, we treat a set of tracklets that could potentially be linked as a proposal, and we propose a trainable tracklet-to-proposal embedding framework based on a graph attention network (GAT). Guided by tracklet-wise information, our framework uses two tracklet-embedding modules to extract intra- and inter-tracklet features and generate discriminative representations of tracklet-based proposals, which improves the accuracy of proposal classification. We experimentally demonstrate that the proposed method significantly outperforms previous state-of-the-art techniques on the MOT17 public benchmark.
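As a rough illustration of the idea of attending over tracklets and then scoring a proposal, the following minimal sketch applies one generic single-head graph attention step (in the standard GAT form, without the output nonlinearity) to a toy proposal of three tracklets, mean-pools the result, and passes it through a logistic classifier. It is not the authors' architecture: the abstract does not specify the intra- and inter-tracklet modules, so the function names, feature sizes, random weights, fully connected neighbourhood, and mean pooling are all assumptions made for this example.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def leaky_relu(v, slope=0.2):
    return v if v > 0 else slope * v

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gat_layer(features, neighbors, W, a):
    """One single-head graph attention step (hypothetical helper).

    features:  list of per-tracklet feature vectors
    neighbors: dict mapping node index -> list of neighbour indices
               (include the node itself to add a self-loop)
    W:         shared linear projection, shape out_dim x in_dim
    a:         attention vector of length 2 * out_dim
    """
    h = [matvec(W, x) for x in features]
    out = []
    for i in range(len(features)):
        nbrs = neighbors[i]
        # h[i] + h[j] is list concatenation, i.e. [W h_i || W h_j].
        scores = [
            leaky_relu(sum(ak * v for ak, v in zip(a, h[i] + h[j])))
            for j in nbrs
        ]
        alpha = softmax(scores)
        # Attention-weighted sum of projected neighbour features.
        out.append([
            sum(al * h[j][k] for al, j in zip(alpha, nbrs))
            for k in range(len(h[i]))
        ])
    return out

def proposal_score(node_embeddings, w_cls, b_cls=0.0):
    """Mean-pool tracklet embeddings and apply a logistic classifier."""
    dim = len(node_embeddings[0])
    n = len(node_embeddings)
    pooled = [sum(e[k] for e in node_embeddings) / n for k in range(dim)]
    logit = sum(w * p for w, p in zip(w_cls, pooled)) + b_cls
    return 1.0 / (1.0 + math.exp(-logit))

# Toy proposal: three tracklets with 4-dimensional features (made-up data).
random.seed(0)
tracklet_feats = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
neighbors = {i: [0, 1, 2] for i in range(3)}  # fully connected, with self-loops
W = [[random.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(2)]
a = [random.uniform(-0.5, 0.5) for _ in range(4)]

embeddings = gat_layer(tracklet_feats, neighbors, W, a)
score = proposal_score(embeddings, w_cls=[0.3, -0.7])
print(score)  # a value in (0, 1); the weights here are untrained
```

In a trained system the projection, attention vector, and classifier weights would be learned from labelled proposals; here they are random and serve only to show the data flow from tracklet features to a proposal-level score.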

Acknowledgements

This work was supported partially by the NSFC (U1911401, U1811461, 62076260, 61772570), Guangdong Natural Science Funds Project (2020B1515120085), Guangdong NSF for Distinguished Young Scholar (2022B1515020009), and the Key-Area Research and Development Program of Guangzhou (202007030004).

Author information

Corresponding author

Correspondence to Jian-Fang Hu.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Huang, Y., Liu, X., Zhang, Y., Hu, JF. (2023). Learning Discriminative Proposal Representation for Multi-object Tracking. In: Lu, H., et al. Image and Graphics. ICIG 2023. Lecture Notes in Computer Science, vol 14356. Springer, Cham. https://doi.org/10.1007/978-3-031-46308-2_25

  • DOI: https://doi.org/10.1007/978-3-031-46308-2_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-46307-5

  • Online ISBN: 978-3-031-46308-2

  • eBook Packages: Computer Science, Computer Science (R0)
