DOI: 10.1145/3301506.3301544
research-article

Template Attentional Siamese Network for Object Tracking

Published: 29 December 2018

Abstract

In recent years, visual object tracking has attracted increasing attention as a fundamental topic. Many deep-learning-based trackers, especially Siamese-network-based trackers, have achieved state-of-the-art performance on multiple benchmarks. However, most of these trackers use the first frame as the template throughout the tracking process. We propose a Template Attentional Siamese Network called TASNet. The core of TASNet is to combine the detection results of two template frames, where the first frame provides discriminative features and the latest frame captures motion changes, to improve tracking performance. Template-wise weights are computed by an attention mechanism to integrate the detection results of the two templates when tracking the current frame. The proposed architecture is trained end to end on the ILSVRC2015 video dataset. Our tracker runs at real-time frame rates and achieves state-of-the-art tracking accuracy when the object undergoes large deformation.
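
The two-template fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the attention weights are computed, so the sketch below assumes, purely for illustration, that each template's peak response serves as its attention logit and that a softmax turns the two logits into template-wise weights. The function names (`cross_correlate`, `fuse_responses`, `argmax2d`) and the toy feature maps are hypothetical and are not taken from the paper; real features would come from a learned Siamese embedding.

```python
import math


def cross_correlate(search, template):
    """Slide `template` over `search` (2-D lists) and return the response map."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            row.append(sum(search[i + u][j + v] * template[u][v]
                           for u in range(th) for v in range(tw)))
        response.append(row)
    return response


def softmax(logits):
    """Numerically stable softmax over a short list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_responses(resp_first, resp_latest):
    """Blend two response maps using template-wise weights.

    ASSUMPTION (not from the paper): the peak of each response map is
    used as that template's attention logit.
    """
    logits = [max(map(max, resp_first)), max(map(max, resp_latest))]
    w_first, w_latest = softmax(logits)
    fused = [[w_first * a + w_latest * b for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(resp_first, resp_latest)]
    return fused, (w_first, w_latest)


def argmax2d(resp):
    """Return the (row, col) of the largest value in a 2-D list."""
    _, i, j = max((v, i, j) for i, row in enumerate(resp)
                  for j, v in enumerate(row))
    return i, j


# Toy single-channel "feature maps" (hypothetical values).
search = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]
first_template = [[1, 2], [3, 4]]   # stands in for the first-frame template
latest_template = [[1, 1], [1, 1]]  # stands in for the latest-frame template

fused, weights = fuse_responses(
    cross_correlate(search, first_template),
    cross_correlate(search, latest_template),
)
print(argmax2d(fused), weights)
```

In this toy case both templates peak at the same offset, so the fused map peaks there as well; the weights only matter when the two response maps disagree, which is the situation the abstract attributes to large deformation of the object.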

    Published In

    ICVIP '18: Proceedings of the 2018 2nd International Conference on Video and Image Processing
    December 2018
    252 pages
    ISBN:9781450366137
    DOI:10.1145/3301506

    In-Cooperation

    • Kyoto University
    • Tianjin University

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Siamese network
    2. attention mechanism
    3. discriminative features
    4. motion change
    5. object tracking

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICVIP 2018

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 112
    • Downloads (Last 12 months): 0
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 03 Mar 2025
