Abstract
Multi-Object Tracking and Segmentation (MOTS) is a critical task in autonomous driving, robotic perception, and video analysis. The challenge lies in accurately identifying and associating objects across video frames, especially when the number of objects is unknown, motion patterns vary, and objects frequently overlap. In this paper, we propose PS-Track, a method with pre-matching and selective association that adaptively combines motion and appearance cues to cope with continuously changing scenes. Rather than relying on a single cue or a predetermined combination scheme, our method assesses how similar the tracks are in appearance in each scenario and dynamically selects a suitable association scheme. Our experiments also show that the method tracks more efficiently than complex models. On the MOTS dataset, our method achieves 61.0 sMOTA, 56.4 IDF1, and 76.0 MOTSA while running at 76.5 FPS.
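The idea of selecting an association scheme from the appearance similarity among tracks can be illustrated with a minimal sketch. It is not the PS-Track implementation: the abstract does not specify the pre-matching step or the selection rule, so everything below is an assumption made for illustration. In particular, the function names, the thresholds `sim_thresh` and `max_cost`, the weight `w_app`, the use of bounding boxes instead of masks, and the greedy matching (in place of an optimal assignment solver) are all hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two appearance feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def select_scheme(track_feats, sim_thresh=0.8):
    """Hypothetical rule: if any two tracks look alike, appearance is
    ambiguous, so fall back to the motion cue alone; otherwise fuse both."""
    for f, g in combinations(track_feats, 2):
        if cosine(f, g) > sim_thresh:
            return "motion"
    return "fused"


def associate(track_boxes, track_feats, det_boxes, det_feats,
              sim_thresh=0.8, w_app=0.5, max_cost=0.7):
    """Match tracks to detections with the selected scheme.

    Returns the chosen scheme and a list of (track_index, det_index) pairs.
    Matching is greedy by ascending cost, a simplification chosen here.
    """
    scheme = select_scheme(track_feats, sim_thresh)
    candidates = []
    for t, (tb, tf) in enumerate(zip(track_boxes, track_feats)):
        for d, (db, df) in enumerate(zip(det_boxes, det_feats)):
            cost = 1.0 - iou(tb, db)  # motion (overlap) cost
            if scheme == "fused":
                app_cost = 1.0 - cosine(tf, df)
                cost = (1.0 - w_app) * cost + w_app * app_cost
            if cost <= max_cost:  # gate out implausible pairs
                candidates.append((cost, t, d))

    matches, used_tracks, used_dets = [], set(), set()
    for _, t, d in sorted(candidates):
        if t not in used_tracks and d not in used_dets:
            matches.append((t, d))
            used_tracks.add(t)
            used_dets.add(d)
    return scheme, matches
```

In this sketch the scheme is chosen once per frame from the track features alone, so the same detections can be matched with different costs depending on how distinguishable the current tracks are.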
L. Chen and G. Liao—Co-first authors.
Acknowledgement
This work is supported by the Xiamen Natural Science Foundation (Grant No. 3502Z202372034), the Research Startup Foundation of Huaqiao University (Grant No. 20201XD022), and Quanzhou Science and Technology Projects (Grant No. 2023N013).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, L., Liao, G., Zhu, G., Zeng, H. (2025). Adaptive Data Association for Enhanced Multi-object Tracking and Segmentation with Pre-matching and Selective Association. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15322. Springer, Cham. https://doi.org/10.1007/978-3-031-78312-8_17
DOI: https://doi.org/10.1007/978-3-031-78312-8_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78311-1
Online ISBN: 978-3-031-78312-8
eBook Packages: Computer Science, Computer Science (R0)