Abstract
Active Visual Tracking (AVT) faces significant challenges in distracting environments characterized by occlusions and confusion. Current methods address these challenges by combining a mixed multi-agent game with Imitation Learning (IL). However, if the training data that the teacher generates for the student during the IL phase lacks diversity, the performance of the student visual tracker degrades noticeably. Furthermore, existing works neglect visual occlusion caused by distractors beyond the collision distance. To improve AVT performance, we introduce a novel method with three components. First, to tackle the limited diversity, we propose an intrinsic reward mechanism called Asymmetric Random Network Distillation (AS-RND). This mechanism encourages target exploration, increasing the variety of states among trackers and distractors and thereby enriching the visual tracker’s training data. Second, to address visual occlusion, we present a distractor-occlusion avoidance reward based on the positional distribution of the distractors. Third, we integrate a classification score map prediction module to strengthen the tracker’s discriminative ability. Experiments show that our approach significantly outperforms previous AVT algorithms in a complex distractor environment.
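For readers unfamiliar with the exploration bonus that AS-RND builds on, the following is a minimal sketch of a plain Random Network Distillation reward, not the asymmetric variant proposed here, whose details the abstract does not give. The class name `RNDBonus`, the linear predictor, the `tanh` target, and all parameter values are illustrative assumptions.

```python
import numpy as np


class RNDBonus:
    """Plain RND novelty bonus (illustrative; not the paper's AS-RND)."""

    def __init__(self, state_dim, feat_dim=16, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed, randomly initialised target network; it is never trained.
        self.w_target = rng.normal(size=(state_dim, feat_dim))
        # Predictor network, trained to match the target on visited states.
        self.w_pred = rng.normal(size=(state_dim, feat_dim))
        self.lr = lr

    def _error(self, state):
        # Difference between predictor output and frozen target features.
        return state @ self.w_pred - np.tanh(state @ self.w_target)

    def bonus(self, state):
        # Intrinsic reward: mean squared prediction error for this state.
        return float(np.mean(self._error(state) ** 2))

    def update(self, state):
        # One gradient-descent step on the mean squared prediction error.
        err = self._error(state)
        grad = 2.0 * np.outer(state, err) / err.size
        self.w_pred -= self.lr * grad


rnd = RNDBonus(state_dim=4)
state = np.array([1.0, -0.5, 0.25, 2.0])
print("bonus before visits:", rnd.bonus(state))
for _ in range(200):
    rnd.update(state)
print("bonus after visits: ", rnd.bonus(state))
```

The bonus for a state shrinks as the predictor is trained on it, so an agent that receives this bonus is rewarded for reaching states it has visited less often.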
This work was supported by the Science and Technology Innovation 2030 Major Project under Grant No. 2020AAA0104802 and the National Natural Science Foundation of China (Grant No. 91948303). Supplementary material is available at the URL.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ouyang, Q. et al. (2024). Enhancing Active Visual Tracking Under Distractor Environments. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14427. Springer, Singapore. https://doi.org/10.1007/978-981-99-8435-0_36
DOI: https://doi.org/10.1007/978-981-99-8435-0_36
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8434-3
Online ISBN: 978-981-99-8435-0
eBook Packages: Computer Science, Computer Science (R0)