DOI: 10.1145/3604078.3604133 · ICDIP Conference Proceedings · research-article

Object Tracking Algorithm Using Siamese Attention Mechanism based Key Point Network

Published: 26 October 2023

ABSTRACT

Trackers based on Siamese networks typically cast visual object tracking as a similarity-matching task, computing the cross-correlation between the convolutional features of the object (template) branch and the search branch. However, most such trackers rely on backbone networks that represent multi-scale features in a hierarchical manner, which limits their ability to distinguish similar objects. In addition, the feature information in every channel and spatial region is treated equally during this computation, even though feature importance varies across locations; and anchor-box detection mechanisms require a large number of anchor boxes with redundant hyperparameters, which leads to heavy computation and memory overhead. To address these problems, this paper proposes an object tracking algorithm using a Siamese attention mechanism based key-point network, called SAMR (Siamese Attention Mechanism RepPoints). First, the Res2Net backbone is used to represent fine-grained multi-scale features hierarchically, increasing the receptive field of each layer. Then, the channel and spatial attention of the attention mechanism is applied to enhance the discriminative and localization ability of the feature maps. Finally, a set of anchor-free key points is adopted to detect and track the object. Experiments on the challenging VOT2018, VOT2019, and OTB100 datasets show that the proposed SAMR surpasses other state-of-the-art trackers in tracking accuracy.
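To make the two core operations concrete, the following is a minimal PyTorch sketch of the general mechanism the abstract describes: CBAM-style channel and spatial attention applied to both branches, followed by depthwise cross-correlation of the search features with the template features. All layer sizes and shapes are illustrative assumptions, not the paper's actual architecture (which uses a Res2Net backbone and a RepPoints head).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelSpatialAttention(nn.Module):
    """CBAM-style attention: reweight channels first, then spatial locations."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        # shared MLP for avg- and max-pooled channel descriptors
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        # 7x7 conv over [avg; max] channel-pooled maps -> spatial weights
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)      # channel attention
        s = torch.cat([x.mean(1, keepdim=True),
                       x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))             # spatial attention

def depthwise_xcorr(search, template):
    """Per-channel cross-correlation of search features with the template."""
    b, c, h, w = search.shape
    kernel = template.reshape(b * c, 1, *template.shape[2:])
    out = F.conv2d(search.reshape(1, b * c, h, w), kernel, groups=b * c)
    return out.reshape(b, c, *out.shape[2:])

# toy feature maps standing in for backbone outputs (shapes are assumptions)
template = torch.randn(2, 64, 7, 7)      # object (template) branch
search = torch.randn(2, 64, 31, 31)      # search branch
attn = ChannelSpatialAttention(64)
response = depthwise_xcorr(attn(search), attn(template))
print(tuple(response.shape))             # (2, 64, 25, 25) response map
```

In a full tracker, the response map would then feed an anchor-free head (here, a RepPoints-style point-set head) that regresses object location directly, avoiding anchor-box hyperparameters.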


Published in

ICDIP '23: Proceedings of the 15th International Conference on Digital Image Processing
May 2023, 711 pages
ISBN: 9798400708237
DOI: 10.1145/3604078
Copyright © 2023 ACM. Publication rights licensed to ACM.
Publisher: Association for Computing Machinery, New York, NY, United States
