Reinforce Model Tracklet for Multi-Object Tracking

Ouyang, Jianhong; Wang, Shuai; Zhang, Yang; Wu, Yubin; Shen, Jiahao; Sheng, Hao

doi:10.1007/978-3-031-50075-6_7

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14497)

Included in the following conference series:

Computer Graphics International Conference

Abstract

Most recent multi-object tracking algorithms follow the tracking-by-detection paradigm, and related studies report significant improvements as detectors have advanced. However, missed and false detections become more severe under occlusion. Trackers therefore use tracklets (short trajectories) to build more complete trajectories. Many tracklet generation algorithms exist, but fragmentation remains prevalent in crowded scenes, and fixed-window tracklet generation strategies are ill-suited to dynamic environments with occlusions. To solve this problem, we propose a reinforcement learning-based framework for tracklet generation: we regard tracklet generation as a Markov decision process and use reinforcement learning to dynamically predict the window size for generating each tracklet. Additionally, we introduce a novel scheme that incorporates the temporal order of tracklets into association. Experiments on the MOT17 dataset demonstrate the effectiveness of our method, which achieves competitive results compared with state-of-the-art methods.
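
To make the window-size idea concrete, the following is a minimal Python sketch of a REINFORCE-style policy that learns which window size to pick in a given state. It is not the authors' implementation: the abstract does not specify the state, action, or reward definitions, so the two-valued state (clear or occluded), the candidate window sizes, the `toy_reward` function, and all hyperparameters are assumptions made purely for illustration.

```python
import math
import random

WINDOW_SIZES = (2, 4, 8)   # candidate window lengths in frames (assumed)
CLEAR, OCCLUDED = 0, 1     # toy two-valued state (assumed); a real state would be richer


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    peak = max(prefs)
    exps = [math.exp(p - peak) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(state, window):
    """Made-up reward: long windows pay off when clear, short ones when occluded."""
    longest = max(WINDOW_SIZES)
    return window / longest if state == CLEAR else 1.0 - window / longest


def train(episodes=3000, steps=6, lr=0.05, gamma=0.5, seed=0):
    """REINFORCE with a running-mean baseline over a tabular softmax policy."""
    rng = random.Random(seed)
    theta = [[0.0] * len(WINDOW_SIZES) for _ in (CLEAR, OCCLUDED)]
    baseline = 0.0
    for _ in range(episodes):
        # Roll out one episode; each step stands for generating one tracklet.
        trajectory = []
        for _ in range(steps):
            state = rng.choice((CLEAR, OCCLUDED))
            probs = softmax(theta[state])
            action = rng.choices(range(len(WINDOW_SIZES)), weights=probs)[0]
            reward = toy_reward(state, WINDOW_SIZES[action])
            trajectory.append((state, action, probs, reward))
        # Walk backwards to accumulate discounted returns and apply the update.
        ret, returns = 0.0, []
        for state, action, probs, reward in reversed(trajectory):
            ret = reward + gamma * ret
            returns.append(ret)
            advantage = ret - baseline
            for a in range(len(WINDOW_SIZES)):
                # Gradient of log softmax: indicator(a == action) - pi(a | state).
                grad_log = (1.0 if a == action else 0.0) - probs[a]
                theta[state][a] += lr * advantage * grad_log
        # Move the baseline towards this episode's mean return.
        baseline += 0.05 * (sum(returns) / len(returns) - baseline)
    return theta


def best_window(theta, state):
    """Return the window size the learned policy prefers in a state."""
    probs = softmax(theta[state])
    return WINDOW_SIZES[probs.index(max(probs))]


if __name__ == "__main__":
    learned = train()
    print("clear:", best_window(learned, CLEAR))
    print("occluded:", best_window(learned, OCCLUDED))
```

Under this toy reward the policy should come to favour the longest window in the clear state and the shortest in the occluded state, which mirrors the abstract's argument that a fixed window cannot suit both situations.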

Acknowledgement

This study is partially supported by the National Key R&D Program of China (No. 2022YFB3306500) and the National Natural Science Foundation of China (No. 61872025). We thank the HAWKEYE Group for its support.

Author information

Authors and Affiliations

State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beihang University, Beijing, 100191, People’s Republic of China
Jianhong Ouyang, Shuai Wang, Yubin Wu, Jiahao Shen & Hao Sheng
Zhongfa Aviation Institute, Beihang University, Hangzhou, 311115, China
Jianhong Ouyang, Shuai Wang, Yubin Wu, Jiahao Shen & Hao Sheng
College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, People’s Republic of China
Yang Zhang

Corresponding author

Correspondence to Jianhong Ouyang.

Editor information

Editors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Bin Sheng
Shanghai Jiao Tong University, Shanghai, China
Lei Bi
University of Sydney, Sydney, NSW, Australia
Jinman Kim
MIRALab-CUI, University of Geneva, Carouge, Geneve, Switzerland
Nadia Magnenat-Thalmann
Swiss Federal Institute of Technology, Lausanne, Switzerland
Daniel Thalmann

About this paper

Cite this paper

Ouyang, J., Wang, S., Zhang, Y., Wu, Y., Shen, J., Sheng, H. (2024). Reinforce Model Tracklet for Multi-Object Tracking. In: Sheng, B., Bi, L., Kim, J., Magnenat-Thalmann, N., Thalmann, D. (eds) Advances in Computer Graphics. CGI 2023. Lecture Notes in Computer Science, vol 14497. Springer, Cham. https://doi.org/10.1007/978-3-031-50075-6_7

DOI: https://doi.org/10.1007/978-3-031-50075-6_7
Published: 22 January 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50074-9
Online ISBN: 978-3-031-50075-6
eBook Packages: Computer Science, Computer Science (R0)
