Enhancing Active Visual Tracking Under Distractor Environments

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14427))

Abstract

Active Visual Tracking (AVT) faces significant challenges in distracting environments characterized by occlusions and confusion. Current methods address this challenge by combining a mixed multi-agent game with Imitation Learning (IL). However, if the training data that the teacher generates for the student during the IL phase lacks diversity, the performance of the student visual tracker degrades noticeably. Furthermore, existing works neglect visual occlusion caused by distractors beyond the collision distance. To enhance AVT performance, we introduce a novel method. First, to tackle the limited diversity issue, we propose an intrinsic reward mechanism called Asymmetric Random Network Distillation (AS-RND). This mechanism fosters target exploration and increases the variety of states among trackers and distractors, thereby enriching the diversity of the visual tracker's training data. Second, to address visual occlusion, we present a distractor-occlusion avoidance reward based on the positional distribution of the distractors. Finally, we integrate a classification score map prediction module to strengthen the tracker's discriminative ability. Experiments show that our approach significantly outperforms previous AVT algorithms in a complex distractor environment.
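To make the idea of an exploration-driven intrinsic reward concrete, the following is a minimal sketch of plain Random Network Distillation, the general technique that AS-RND is named after. It is not the authors' asymmetric variant, whose details the abstract does not give. The class name `RandomNetworkDistillation`, the linear predictor, the feature size, the learning rate, and the toy four-dimensional states are all assumptions made for illustration. The intrinsic reward is the error of a trained predictor against a fixed random network, so it shrinks for states seen often and stays high for unseen ones.

```python
import math
import random


class RandomNetworkDistillation:
    """Toy RND: a fixed random target network and a trained linear predictor."""

    def __init__(self, state_dim, feat_dim=16, lr=0.1, seed=0):
        rng = random.Random(seed)
        # Target weights are random and never updated.
        self.target = [[rng.gauss(0.0, 1.0) for _ in range(state_dim)]
                       for _ in range(feat_dim)]
        # Predictor weights start at zero and are trained on visited states.
        self.pred = [[0.0] * state_dim for _ in range(feat_dim)]
        self.lr = lr

    def _errors(self, state):
        # Per-feature difference between predictor and target outputs.
        errs = []
        for t_row, p_row in zip(self.target, self.pred):
            t = math.tanh(sum(w * x for w, x in zip(t_row, state)))
            p = sum(w * x for w, x in zip(p_row, state))
            errs.append(p - t)
        return errs

    def intrinsic_reward(self, state):
        # Mean squared prediction error acts as a novelty bonus.
        errs = self._errors(state)
        return sum(e * e for e in errs) / len(errs)

    def update(self, state):
        # One gradient-descent step on the mean squared error.
        errs = self._errors(state)
        scale = 2.0 * self.lr / len(errs)
        for p_row, e in zip(self.pred, errs):
            for j, x in enumerate(state):
                p_row[j] -= scale * e * x


# Hypothetical states: one visited repeatedly, one never visited.
familiar = [1.0, 0.0, 0.5, 0.0]
novel = [0.0, 1.0, 0.0, -0.5]

rnd = RandomNetworkDistillation(state_dim=4)
reward_before = rnd.intrinsic_reward(familiar)
for _ in range(300):
    rnd.update(familiar)
reward_after = rnd.intrinsic_reward(familiar)
reward_novel = rnd.intrinsic_reward(novel)

print(reward_before, reward_after, reward_novel)
```

In this toy run the bonus for the repeatedly visited state falls toward zero while the unvisited state keeps a larger bonus, which is the property that pushes an agent toward new states. In a reinforcement learning loop such a bonus would typically be added to the environment reward with a weighting coefficient.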

This work was supported by the Science and Technology Innovation 2030 Major Project under Grant No. 2020AAA0104802 and the National Natural Science Foundation of China (Grant No. 91948303). Supplementary material is available from URL.

Corresponding author

Correspondence to Dianxi Shi.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 663 KB)

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ouyang, Q. et al. (2024). Enhancing Active Visual Tracking Under Distractor Environments. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14427. Springer, Singapore. https://doi.org/10.1007/978-981-99-8435-0_36

  • DOI: https://doi.org/10.1007/978-981-99-8435-0_36

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8434-3

  • Online ISBN: 978-981-99-8435-0

  • eBook Packages: Computer Science, Computer Science (R0)
