Educational Pattern Guided Self-knowledge Distillation for Siamese Visual Tracking

Zhang, Quan; Zhang, Xiaowei

doi:10.1007/978-981-99-8181-6_3

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1968)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Existing Siamese-based trackers divide visual tracking into two stages: feature extraction (the backbone subnetwork) and prediction (the head subnetwork). However, they mainly rely on task-level supervision (classification and regression) and barely consider feature-level supervision in the knowledge learning process. This can lead to deficient knowledge interaction among the features of the tracker's targets and to background interference during online tracking. To address these issues, this paper proposes an educational pattern-guided self-knowledge distillation method that guides Siamese-based trackers to learn feature knowledge by themselves and can serve as a generic training protocol for improving any Siamese-based tracker. Our key insight is to use two educational self-distillation patterns, focal self-distillation and discriminative self-distillation, to give the tracker a self-learning ability. The focal self-distillation pattern teaches the tracking network to focus on valuable pixels and channels by decoupling the spatial learning and channel learning of target features. The discriminative self-distillation pattern aims to maximize the discrimination between foreground and background features, so that the trackers are unaffected by background pixels. As one of the first attempts to introduce self-knowledge distillation into the visual tracking field, our method is effective, efficient, and generalizes well, which might be instructive for other research. Code and data are publicly available.
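The abstract does not give the loss formulas, so the following is only an illustrative sketch of the two patterns under stated assumptions, not the authors' implementation. It assumes a feature map stored as `feat[c][p]` (channels by flattened pixels), a "teacher" feature taken from the same network (which features play that role is not stated in the abstract), attention weights obtained by a softmax over mean absolute activations, and cosine similarity as the foreground/background discrimination measure. All function names are hypothetical.

```python
import math

def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def spatial_attention(feat):
    """Weights over pixels: mean |activation| across channels, then softmax."""
    channels, pixels = len(feat), len(feat[0])
    energy = [sum(abs(feat[c][p]) for c in range(channels)) / channels
              for p in range(pixels)]
    return softmax(energy)

def channel_attention(feat):
    """Weights over channels: mean |activation| across pixels, then softmax."""
    energy = [sum(abs(v) for v in row) / len(row) for row in feat]
    return softmax(energy)

def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def focal_loss(student, teacher):
    """Assumed focal term: spatial and channel attention are matched
    separately (decoupled) and the two errors are summed."""
    spatial = mse(spatial_attention(student), spatial_attention(teacher))
    channel = mse(channel_attention(student), channel_attention(teacher))
    return spatial + channel

def discriminative_loss(feat, fg_mask):
    """Assumed discriminative term: cosine similarity between the mean
    foreground vector and the mean background vector. Lower values mean
    the two regions are better separated."""
    fg_idx = [p for p, m in enumerate(fg_mask) if m]
    bg_idx = [p for p, m in enumerate(fg_mask) if not m]
    fg = [sum(row[p] for p in fg_idx) / len(fg_idx) for row in feat]
    bg = [sum(row[p] for p in bg_idx) / len(bg_idx) for row in feat]
    dot = sum(x * y for x, y in zip(fg, bg))
    norm = math.sqrt(sum(x * x for x in fg)) * math.sqrt(sum(y * y for y in bg))
    return dot / (norm + 1e-8)

# Toy data: 2 channels, 4 flattened pixels; pixels 1 and 2 are foreground.
teacher = [[1.0, 0.0, 0.0, 1.0], [0.0, 2.0, 2.0, 0.0]]
student = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]
fg_mask = [0, 1, 1, 0]

print("focal loss:", focal_loss(student, teacher))
print("discriminative loss (student):", discriminative_loss(student, fg_mask))
print("discriminative loss (teacher):", discriminative_loss(teacher, fg_mask))
```

In this toy setup the uniform student features cannot separate foreground from background, so their discriminative term is close to its maximum, while the teacher features, whose foreground and background activate different channels, give a term of zero. In a real tracker both terms would presumably be added to the usual classification and regression losses, with weights that the abstract does not specify.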

Author information

Authors and Affiliations

Qingdao University, Qingdao, 266071, Shandong, China
Quan Zhang & Xiaowei Zhang

Authors

Quan Zhang
Xiaowei Zhang

Corresponding author

Correspondence to Xiaowei Zhang.

Editor information

Editors and Affiliations

School of Automation, Central South University, Changsha, China
Biao Luo
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Long Cheng
Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, China
Zheng-Guang Wu
School of Automation, Guangdong University of Technology, Guangzhou, China
Hongyi Li
School of Electrical Engineering and Telecommunications, UNSW Sydney, Sydney, NSW, Australia
Chaojie Li

About this paper

Cite this paper

Zhang, Q., Zhang, X. (2024). Educational Pattern Guided Self-knowledge Distillation for Siamese Visual Tracking. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1968. Springer, Singapore. https://doi.org/10.1007/978-981-99-8181-6_3

DOI: https://doi.org/10.1007/978-981-99-8181-6_3
Published: 27 November 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8180-9
Online ISBN: 978-981-99-8181-6
eBook Packages: Computer Science, Computer Science (R0)