ABSTRACT
Trackers based on Siamese networks usually cross-correlate the convolutional features of the object branch and the search branch, which casts visual object tracking as a similarity-matching problem. However, most trackers rely on backbone networks that represent multi-scale features layer by layer, which limits their ability to distinguish similar objects. In addition, the features on every channel and spatial region are treated equally during this computation, although their importance is not the same. Moreover, the anchor-box detection mechanism requires a large number of anchor boxes with redundant hyperparameters, which can lead to computational and memory overload. To address these problems, this paper proposes an object tracking algorithm using a Siamese attention mechanism based key point network, called SAMR (Siamese Attention Mechanism Reppoints). First, the Res2Net backbone is used to represent fine-grained multi-scale features hierarchically, increasing the receptive field of each layer. Then, the channel and spatial attention information of the attention mechanism is used to enhance the discriminative ability and localization ability of the feature map. Finally, an anchor-free detection scheme based on a set of key points is adopted to detect and track the object. Experiments on the challenging VOT2018, VOT2019, and OTB100 datasets show that the proposed SAMR outperforms other advanced trackers in tracking accuracy.
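The similarity-matching idea and the role of attention can be illustrated with a minimal sketch. It is not the SAMR implementation: the function names (`cross_correlate`, `channel_attention`, `spatial_attention`, `peak`) are hypothetical, the feature maps are hand-written toy values rather than Res2Net outputs, and the attention gates are simple parameter-free sigmoids that stand in for the learned attention modules.

```python
from math import exp


def cross_correlate(template, search):
    """Slide a (C, h, w) template over a (C, H, W) search map.

    Returns an (H-h+1, W-w+1) response map; a higher score means a better match.
    """
    C = len(template)
    h, w = len(template[0]), len(template[0][0])
    H, W = len(search[0]), len(search[0][0])
    response = []
    for i in range(H - h + 1):
        row = []
        for j in range(W - w + 1):
            score = 0.0
            for c in range(C):
                for u in range(h):
                    for v in range(w):
                        score += template[c][u][v] * search[c][i + u][j + v]
            row.append(score)
        response.append(row)
    return response


def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))


def channel_attention(features):
    """Assumed simplification: gate each channel by the sigmoid of its mean."""
    out = []
    for channel in features:
        n = len(channel) * len(channel[0])
        gate = sigmoid(sum(sum(r) for r in channel) / n)
        out.append([[gate * x for x in r] for r in channel])
    return out


def spatial_attention(features):
    """Assumed simplification: gate each location by the sigmoid of its channel mean."""
    C, H, W = len(features), len(features[0]), len(features[0][0])
    gate = [[sigmoid(sum(features[c][i][j] for c in range(C)) / C)
             for j in range(W)] for i in range(H)]
    return [[[features[c][i][j] * gate[i][j] for j in range(W)]
             for i in range(H)] for c in range(C)]


def peak(response):
    """Return the (row, col) of the highest response."""
    return max(((i, j) for i in range(len(response))
                for j in range(len(response[0]))),
               key=lambda p: response[p[0]][p[1]])


# Toy single-channel features: the template pattern sits at row 1, col 2 of the search map.
template = [[[1.0, 2.0],
             [3.0, 4.0]]]
search = [[[0.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 2.0],
           [0.0, 0.0, 3.0, 4.0],
           [0.0, 0.0, 0.0, 0.0]]]

response = cross_correlate(template, search)
attended = spatial_attention(channel_attention(search))
attended_response = cross_correlate(template, attended)
print(peak(response), peak(attended_response))
```

In this toy case both response maps peak where the template pattern was placed. In a real tracker the gates would be learned, so that informative channels and regions contribute more to the response than background clutter.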
Index Terms
- Object Tracking Algorithm Using Siamese Attention Mechanism based Key Point Network