Abstract
Recently, template-based discriminative trackers, and Siamese network based trackers in particular, have shown great potential in balancing accuracy and tracking speed. However, it remains difficult for Siamese models trained offline to adapt to target variations. In this paper, we introduce an Adaptive Feature Selection Siamese (AFS-Siam) network that learns the most discriminative feature information for better tracking. Features from different convolutional layers contain complementary information for discrimination. The proposed adaptive feature selection module selects the most useful feature information from different convolutional layers while suppressing irrelevant information. The proposed tracking algorithm not only alleviates the over-fitting problem but also improves discriminative ability, and the framework is trained end-to-end. Extensive experimental results on OTB50, OTB100, TC-128, and VOT2017 demonstrate that the proposed tracker performs favorably against other state-of-the-art methods.
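The abstract does not include implementation details, so the following is only a minimal sketch of how an adaptive feature selection module of this kind could be realized in PyTorch: per-channel gating weights are learned for the feature maps of several backbone layers, the re-weighted maps are concatenated, and the fused template and search features are compared with a SiamFC-style cross-correlation. The class name `AdaptiveFeatureSelection`, the squeeze-and-excitation-style gating, the layer and channel choices, and the helper `cross_correlation` are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of adaptive feature selection over
# multi-layer Siamese features. All architectural choices here are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveFeatureSelection(nn.Module):
    """Re-weights per-channel responses from several backbone layers and fuses them."""

    def __init__(self, in_channels, reduction=4):
        super().__init__()
        # One gating branch per input feature map: global average pooling
        # followed by a small bottleneck that outputs per-channel weights in (0, 1).
        self.gates = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(c, c // reduction, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(c // reduction, c, kernel_size=1),
                nn.Sigmoid(),
            )
            for c in in_channels
        ])

    def forward(self, features):
        # Emphasize useful channels and suppress irrelevant ones, then
        # concatenate the selected maps along the channel dimension.
        selected = [f * gate(f) for f, gate in zip(features, self.gates)]
        return torch.cat(selected, dim=1)


def cross_correlation(template_feat, search_feat):
    """SiamFC-style cross-correlation of template features over the search region."""
    return F.conv2d(search_feat, template_feat)


if __name__ == "__main__":
    # Toy feature maps standing in for two backbone layers (channel counts are made up).
    afs = AdaptiveFeatureSelection(in_channels=[256, 384])
    template = [torch.randn(1, 256, 6, 6), torch.randn(1, 384, 6, 6)]
    search = [torch.randn(1, 256, 22, 22), torch.randn(1, 384, 22, 22)]
    score_map = cross_correlation(afs(template), afs(search))
    print(score_map.shape)  # torch.Size([1, 1, 17, 17])
```

Gating each layer separately lets the fusion emphasize shallow, appearance-rich channels or deeper, more semantic channels depending on the target, which is consistent with the complementary-information argument made in the abstract.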
Acknowledgment
This research was supported by the Development Project of Leading Technology for Future Vehicles of Daegu Metropolitan City (No. 20171105).
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Fiaz, M., Rahman, M.M., Mahmood, A., Farooq, S.S., Baek, K.Y., Jung, S.K. (2020). Adaptive Feature Selection Siamese Networks for Visual Tracking. In: Ohyama, W., Jung, S. (eds) Frontiers of Computer Vision. IW-FCV 2020. Communications in Computer and Information Science, vol 1212. Springer, Singapore. https://doi.org/10.1007/978-981-15-4818-5_13
DOI: https://doi.org/10.1007/978-981-15-4818-5_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-4817-8
Online ISBN: 978-981-15-4818-5
eBook Packages: Computer Science, Computer Science (R0)