Abstract
Multi-Object Tracking (MOT) is an important yet challenging problem in computer vision. We observe that a single motion model struggles to maintain ID consistency in complex scenes, such as those with camera shake or relative motion among pedestrians. We therefore propose an Instance-wise Contrastive Learning (ICL) method that jointly performs detection and embedding in a unified network. Specifically, to handle instance matching in dynamic, cluttered scenes, we introduce an instance-wise contrastive loss that pulls embeddings of the same instance close together while separating all negatives by a specified distance. The learned semantic embedding space thus helps both to detect moving objects and to match instances. Furthermore, we use contextual information to associate moving objects across the frames of a sequence. Comprehensive evaluations on three public tracking datasets (MOT16, MOT17, and MOT20) demonstrate the superiority of our ICLTracker over other state-of-the-art methods for multi-object tracking.
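The instance-wise contrastive objective described in the abstract can be illustrated with a generic pairwise margin loss. This is only a minimal sketch, not the loss defined in the paper: the function name `margin_contrastive_loss`, the Euclidean distance, the squared penalties, and the default margin of 1.0 are all assumptions made for illustration.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def margin_contrastive_loss(embeddings, instance_ids, margin=1.0):
    """Mean pairwise margin loss over all embedding pairs (hypothetical form).

    Pairs sharing an instance ID are penalised by their squared distance,
    so they are pulled together. Pairs with different IDs are penalised
    only while they are closer than `margin`, so they are pushed apart
    until separated by at least that distance.
    """
    total, pairs = 0.0, 0
    for i, j in combinations(range(len(embeddings)), 2):
        d = euclidean(embeddings[i], embeddings[j])
        if instance_ids[i] == instance_ids[j]:
            total += d ** 2                      # positive pair: pull together
        else:
            total += max(0.0, margin - d) ** 2   # negative pair: enforce margin
        pairs += 1
    return total / pairs if pairs else 0.0


# Toy data: two detections of instance 7 and one detection of instance 9.
instance_ids = [7, 7, 9]
well_separated = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]]
collapsed = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]

loss_separated = margin_contrastive_loss(well_separated, instance_ids)
loss_collapsed = margin_contrastive_loss(collapsed, instance_ids)
print(loss_separated, loss_collapsed)
```

In the toy data the well-separated embeddings incur no penalty, while the collapsed embeddings are penalised on both negative pairs, which is the behaviour the abstract attributes to the loss.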
Notes
1. SimOTA dynamically selects the top-k positive samples for each ground-truth object.
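The dynamic top-k selection mentioned in this note can be sketched as follows. This is a simplified illustration based on our understanding of SimOTA in YOLOX, not code from the paper: the function name `simota_select`, the nested-list inputs `ious[g][a]` and `costs[g][a]` (ground truth `g`, candidate anchor `a`), and the toy values are assumptions. Here k for each ground truth is taken as the integer part of the sum of its largest IoUs, with a minimum of 1, and the k lowest-cost candidates become positives.

```python
def simota_select(ious, costs, n_candidates=10):
    """Assign candidate anchors to ground truths with a per-object dynamic k.

    ious[g][a]  : IoU between ground truth g and candidate anchor a
    costs[g][a] : matching cost between ground truth g and anchor a
    Returns a dict mapping anchor index -> ground-truth index.
    """
    best = {}  # anchor index -> (ground-truth index, cost)
    for g, (iou_row, cost_row) in enumerate(zip(ious, costs)):
        # Dynamic k: integer part of the summed top IoUs, at least 1.
        top_ious = sorted(iou_row, reverse=True)[:n_candidates]
        dynamic_k = max(1, int(sum(top_ious)))
        # Take the k candidates with the lowest cost for this ground truth.
        ranked = sorted(range(len(cost_row)), key=lambda a: cost_row[a])
        for a in ranked[:dynamic_k]:
            # An anchor claimed by several objects keeps the cheapest one.
            if a not in best or cost_row[a] < best[a][1]:
                best[a] = (g, cost_row[a])
    return {a: g for a, (g, _) in best.items()}


# Toy data: 2 ground-truth objects, 4 candidate anchors.
ious = [[0.9, 0.8, 0.7, 0.1],
        [0.1, 0.2, 0.6, 0.5]]
costs = [[0.1, 0.2, 0.3, 2.0],
         [2.0, 1.5, 0.4, 0.5]]

matches = simota_select(ious, costs)
print(matches)
```

With these toy values the first object, which overlaps many candidates well, receives two positives, while the second receives one.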
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Luo, Q., Xu, C. (2022). Instance-Wise Contrastive Learning for Multi-object Tracking. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13537. Springer, Cham. https://doi.org/10.1007/978-3-031-18916-6_52
DOI: https://doi.org/10.1007/978-3-031-18916-6_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18915-9
Online ISBN: 978-3-031-18916-6
eBook Packages: Computer Science, Computer Science (R0)