Abstract
Target association is a central problem in multi-object tracking, especially in pedestrian scenes where targets look alike and are densely distributed. The traditional approach of combining IoU and ReID cues with the Hungarian algorithm only partially addresses these challenges. To improve matching ability, this paper proposes a block-matching model that extracts local features with a Transformer-based Block Matching Module (BMM). The BMM divides features into blocks and mines the effective features of each target to evaluate target similarity. In addition, a Euclidean Distance Module (EDM), which uses a Euclidean-distance association matching strategy, is introduced to further strengthen association. Integrating BMM and EDM into a single multi-object tracking model yields a novel tracker called BWTrack, which achieves excellent performance on MOT16, MOT17, and MOT20 while running at 7 FPS on a single GPU.
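As a rough illustration of what Euclidean-distance association can look like, the sketch below matches track boxes to detection boxes by minimising the total distance between box centres. It is not the paper's EDM, whose details are not given in the abstract. The function names `box_center` and `associate`, the `(x1, y1, x2, y2)` box format, and the `max_dist` gating threshold are all assumptions made for this example, and an exhaustive search over assignments stands in for the Hungarian algorithm so the code needs only the standard library.

```python
from itertools import permutations
from math import dist


def box_center(box):
    """Return the (x, y) centre of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def associate(tracks, detections, max_dist=50.0):
    """Match tracks to detections by minimum total centre distance.

    Hypothetical helper: tries every one-to-one assignment, so it is only
    practical for a handful of boxes. A real tracker would use the
    Hungarian algorithm on the same cost matrix.
    Returns (track_index, detection_index) pairs whose centre distance is
    at most max_dist (an assumed gating threshold, in pixels).
    """
    if not tracks or not detections:
        return []

    # cost[t][d] is the Euclidean distance between the two box centres.
    cost = [
        [dist(box_center(t), box_center(d)) for d in detections]
        for t in tracks
    ]

    n_t, n_d = len(tracks), len(detections)
    if n_t <= n_d:
        # Every track gets a distinct detection.
        candidates = (
            list(zip(range(n_t), p)) for p in permutations(range(n_d), n_t)
        )
    else:
        # Every detection gets a distinct track.
        candidates = (
            list(zip(p, range(n_d))) for p in permutations(range(n_t), n_d)
        )

    best = min(candidates, key=lambda pairs: sum(cost[t][d] for t, d in pairs))

    # Discard matches that are too far apart to be the same pedestrian.
    return [(t, d) for t, d in best if cost[t][d] <= max_dist]


tracks = [(0, 0, 10, 20), (100, 0, 110, 20)]
detections = [(102, 1, 112, 21), (1, 0, 11, 20), (300, 300, 310, 320)]
print(associate(tracks, detections))
```

In this toy input the third detection is far from both tracks and stays unmatched, which a tracker would typically treat as a candidate for a new track.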
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhang, C. (2024). Block-Matching Multi-pedestrian Tracking. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Lecture Notes in Computer Science, vol 14449. Springer, Singapore. https://doi.org/10.1007/978-981-99-8067-3_9
DOI: https://doi.org/10.1007/978-981-99-8067-3_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8066-6
Online ISBN: 978-981-99-8067-3
eBook Packages: Computer Science (R0)