Abstract
Multi-object tracking (MOT) is an important and representative task in computer vision, and tracking-by-detection is its most widely adopted paradigm, so detection quality, feature representation ability, and the association algorithm strongly affect tracking performance. On the one hand, pedestrians moving together in the same group share similar motion patterns, so they can indicate one another's moving state. We extract groups from detections and maintain the group relationships of trajectories during tracking. We propose a state transition mechanism that smooths detection bias, recovers missed detections, and rejects false detections. We also build a two-level group-detection association algorithm that improves association accuracy. On the other hand, different areas of the tracking scene affect the appearance features of detections in diverse and varying ways, which weakens the features' representation ability. We propose a self-adaptive feature fusion strategy based on the tracking scene and the group structure, which yields fused features with stronger representative ability for trajectory-detection association and thereby improves tracking performance. In summary, we propose a novel Group Perception based Self-adaptive Fusion Tracking (GST) framework, comprising the Group concept and Group Exploration Net, a Group Perception based State Transition Mechanism, and a Self-adaptive Feature Fusion Strategy. Experiments on the MOT17 dataset demonstrate the effectiveness of our method, which achieves competitive results compared with state-of-the-art methods.
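The abstract describes extracting groups from detections whose members share a similar motion pattern. As a rough illustration of that idea (not the paper's actual Group Exploration Net; the thresholds, detection tuple layout, and clustering rule below are all assumptions), detections can be clustered into groups when both their positions and velocity vectors are close:

```python
import math

def group_detections(detections, dist_thresh=2.0, vel_thresh=0.5):
    """Cluster detections into groups when both position and velocity
    are similar -- a simple proxy for a shared motion pattern.
    Each detection is a tuple (x, y, vx, vy)."""
    n = len(detections)
    parent = list(range(n))  # union-find forest over detection indices

    def find(i):
        # Find the root of i with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # Link every pair that is spatially close AND moving similarly.
    for i in range(n):
        xi, yi, vxi, vyi = detections[i]
        for j in range(i + 1, n):
            xj, yj, vxj, vyj = detections[j]
            close = math.hypot(xi - xj, yi - yj) <= dist_thresh
            similar = math.hypot(vxi - vxj, vyi - vyj) <= vel_thresh
            if close and similar:
                union(i, j)

    # Collect connected components as groups of detection indices.
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

In this sketch, two pedestrians walking side by side with nearly identical velocities end up in one group, while a nearby pedestrian moving in the opposite direction stays separate; the transitive union-find closure lets a chain of pairwise-similar detections form one larger group.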
Acknowledgements
This study is partially supported by the National Key R&D Program of China (No. 2022YFB3306500) and the National Natural Science Foundation of China (No. 61872025). We also thank the HAWKEYE Group for its support.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Xing, Y. et al. (2024). Group Perception Based Self-adaptive Fusion Tracking. In: Sheng, B., Bi, L., Kim, J., Magnenat-Thalmann, N., Thalmann, D. (eds) Advances in Computer Graphics. CGI 2023. Lecture Notes in Computer Science, vol 14498. Springer, Cham. https://doi.org/10.1007/978-3-031-50078-7_8
DOI: https://doi.org/10.1007/978-3-031-50078-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50077-0
Online ISBN: 978-3-031-50078-7
eBook Packages: Computer Science