Abstract
Most recent multi-object tracking algorithms adopt the tracking-by-detection paradigm, and related studies have shown significant improvements as detectors have developed. However, missed and false detections become more severe under occlusion. Trackers therefore use tracklets (short trajectories) to build more complete trajectories. Many tracklet generation algorithms exist, but fragmentation remains prevalent in crowded scenes, and fixed-window tracklet generation strategies are poorly suited to dynamic environments with occlusions. To address this problem, we propose a reinforcement learning-based framework for tracklet generation: we formulate tracklet generation as a Markov decision process and use reinforcement learning to dynamically predict the window size used to generate each tracklet. Additionally, we introduce a novel scheme that incorporates the temporal order of tracklets into association. Experiments on the MOT17 dataset demonstrate the effectiveness of our method, which achieves competitive results compared with state-of-the-art methods.
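To make the idea of learning a window size concrete, the following is a minimal sketch and not the authors' implementation. It assumes a REINFORCE-style policy gradient with a softmax policy over a few candidate window sizes. The candidate sizes, the two-valued scene state, the toy reward, and all function names are hypothetical and chosen only for illustration; the abstract does not specify any of them.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
WINDOW_SIZES = [2, 4, 8]   # hypothetical candidate window lengths (frames)
SPARSE, CROWDED = 0, 1     # hypothetical discretized scene states


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(state, window):
    """Stand-in reward (an assumption): long windows pay off in sparse
    scenes, short windows pay off when the scene is crowded."""
    return window / 8.0 if state == SPARSE else 2.0 / window


def train(episodes=8000, lr=0.2, seed=0):
    """Learn per-state logits over window sizes with REINFORCE."""
    rng = random.Random(seed)
    theta = [[0.0] * len(WINDOW_SIZES) for _ in range(2)]  # logits per state
    baseline = [0.0, 0.0]  # running mean reward per state (variance reduction)
    for _ in range(episodes):
        s = rng.choice([SPARSE, CROWDED])
        probs = softmax(theta[s])
        a = rng.choices(range(len(WINDOW_SIZES)), weights=probs)[0]
        r = toy_reward(s, WINDOW_SIZES[a])
        adv = r - baseline[s]
        # Gradient of log pi(a|s) w.r.t. logit k is 1[k == a] - pi(k|s).
        for k in range(len(WINDOW_SIZES)):
            theta[s][k] += lr * adv * ((1.0 if k == a else 0.0) - probs[k])
        baseline[s] += 0.05 * (r - baseline[s])
    return theta


def greedy_window(theta, state):
    """Return the window size with the highest learned logit for a state."""
    logits = theta[state]
    return WINDOW_SIZES[logits.index(max(logits))]


if __name__ == "__main__":
    learned = train()
    print("sparse scene window:", greedy_window(learned, SPARSE))
    print("crowded scene window:", greedy_window(learned, CROWDED))
```

In this toy setting the policy is a two-row table of logits, so the "state" carries no image or motion features. A tracker along the lines described in the abstract would presumably condition the policy on richer observations and derive the reward from tracking quality, but those details are not given here.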
Acknowledgement
This study is partially supported by the National Key R&D Program of China (No. 2022YFB3306500) and the National Natural Science Foundation of China (No. 61872025). We thank the HAWKEYE Group for its support.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ouyang, J., Wang, S., Zhang, Y., Wu, Y., Shen, J., Sheng, H. (2024). Reinforce Model Tracklet for Multi-Object Tracking. In: Sheng, B., Bi, L., Kim, J., Magnenat-Thalmann, N., Thalmann, D. (eds) Advances in Computer Graphics. CGI 2023. Lecture Notes in Computer Science, vol 14497. Springer, Cham. https://doi.org/10.1007/978-3-031-50075-6_7
DOI: https://doi.org/10.1007/978-3-031-50075-6_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50074-9
Online ISBN: 978-3-031-50075-6
eBook Packages: Computer Science (R0)