Enhancing Active Visual Tracking Under Distractor Environments

Ouyang, Qianying; Zhao, Chenran; Xie, Jing; Biao, Zhang; Li, Tongyue; Zheng, Yuxi; Shi, Dianxi

doi:10.1007/978-981-99-8435-0_36

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14427)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Active Visual Tracking (AVT) faces significant challenges in distracting environments characterized by occlusions and confusion. Current methods address these challenges by combining a mixed multi-agent game with Imitation Learning (IL). However, if the training data that the teacher generates for the student during the IL phase lacks diversity, the performance of the student visual tracker degrades noticeably. Furthermore, existing works neglect visual occlusion caused by distractors that lie beyond the collision distance. To enhance AVT performance, we introduce a novel method with three components. First, to tackle the limited diversity, we propose an intrinsic reward mechanism called Asymmetric Random Network Distillation (AS-RND). This mechanism encourages target exploration, which increases the variety of states among trackers and distractors and thereby enriches the visual tracker’s training data. Second, to address visual occlusion, we present a distractor-occlusion avoidance reward based on the positional distribution of the distractors. Third, we integrate a classification score map prediction module to strengthen the tracker’s discriminative ability. Experiments show that our approach significantly outperforms previous AVT algorithms in a complex distractor environment.
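
The abstract does not give the details of AS-RND, so the following is only a minimal sketch of plain Random Network Distillation, the general idea that AS-RND builds on. It is not the authors' asymmetric variant. A fixed random "target" network and a trained "predictor" network both embed a state, and the prediction error serves as an intrinsic reward that is high for rarely visited states. The class names, the linear stand-in networks, the dimensions, and the learning rate are all assumptions chosen for illustration.

```python
import random


class LinearNet:
    """Tiny linear map y = W x, a stand-in for a neural network (assumption)."""

    def __init__(self, in_dim, out_dim, rng):
        self.w = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)]
                  for _ in range(out_dim)]

    def forward(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]


class RND:
    """Plain random network distillation; hypothetical helper, not AS-RND."""

    def __init__(self, in_dim, out_dim=4, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.target = LinearNet(in_dim, out_dim, rng)     # fixed, never trained
        self.predictor = LinearNet(in_dim, out_dim, rng)  # trained to match target
        self.lr = lr

    def intrinsic_reward(self, state):
        # Mean squared error between predictor and target embeddings.
        t = self.target.forward(state)
        p = self.predictor.forward(state)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t)) / len(t)

    def update(self, state):
        # One gradient-descent step on the same mean squared error.
        t = self.target.forward(state)
        p = self.predictor.forward(state)
        n = len(t)
        for k, row in enumerate(self.predictor.w):
            scale = 2.0 * (p[k] - t[k]) / n
            for j in range(len(row)):
                row[j] -= self.lr * scale * state[j]


if __name__ == "__main__":
    rnd = RND(in_dim=3)
    visited, novel = [1.0, 0.0, 0.5], [0.0, 1.0, 0.0]
    for _ in range(200):
        rnd.update(visited)
    print("visited:", rnd.intrinsic_reward(visited))
    print("novel:  ", rnd.intrinsic_reward(novel))
```

In this sketch, repeated updates on one state shrink its reward, while an unvisited state keeps a comparatively large reward. That difference is what pushes an agent toward new states and, in the setting described above, toward more diverse training data.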

This work was supported by the Science and Technology Innovation 2030 Major Project under Grant No. 2020AAA0104802 and the National Natural Science Foundation of China (Grant No. 91948303). Supplementary material is available online.

Author information

Authors and Affiliations

Intelligent Game and Decision Lab (IGDL), Beijing, China
Qianying Ouyang, Jing Xie, Tongyue Li, Yuxi Zheng & Dianxi Shi
Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
Qianying Ouyang, Chenran Zhao, Jing Xie, Zhang Biao, Tongyue Li, Yuxi Zheng & Dianxi Shi
College of Computer, National University of Defense Technology, Changsha, China
Chenran Zhao & Zhang Biao

Corresponding author

Correspondence to Dianxi Shi.

Editor information

Editors and Affiliations

Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Xiamen University, Xiamen, China
Hanzi Wang
Beijing University of Posts and Telecommunications, Beijing, China
Zhanyu Ma
Sun Yat-sen University, Guangzhou, China
Weishi Zheng
Peking University, Beijing, China
Hongbin Zha
Chinese Academy of Sciences, Beijing, China
Xilin Chen
Chinese Academy of Sciences, Beijing, China
Liang Wang
Xiamen University, Xiamen, China
Rongrong Ji

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 663 KB)

About this paper

Cite this paper

Ouyang, Q. et al. (2024). Enhancing Active Visual Tracking Under Distractor Environments. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14427. Springer, Singapore. https://doi.org/10.1007/978-981-99-8435-0_36

DOI: https://doi.org/10.1007/978-981-99-8435-0_36
Published: 24 December 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8434-3
Online ISBN: 978-981-99-8435-0
eBook Packages: Computer Science, Computer Science (R0)