SiamMN: Siamese modulation network for visual object tracking

Fu, Li-hua; Ding, Yu; Du, Yu-bin; Zhang, Bo; Wang, Lu-yuan; Wang, Dan

doi:10.1007/s11042-020-09546-6

Published: 28 August 2020

Volume 79, pages 32623–32641, (2020)

Multimedia Tools and Applications

Li-hua Fu¹ (ORCID: orcid.org/0000-0003-1271-2518), Yu Ding¹, Yu-bin Du¹, Bo Zhang¹, Lu-yuan Wang¹ & Dan Wang¹

Abstract

Visual object trackers based on Siamese networks often struggle to distinguish the tracking target from objects of the same semantic class or with a similar appearance, because they lack a strategy for discriminating such confusing objects. We propose a visual object tracking method based on a Siamese modulation network. It takes the given bounding box in the target frame and the current frame as input, and fuses their multi-layer convolutional features to capture more appearance information about the target in the bounding box and about the current frame. A feature modulator generates a feature modulation vector from the given bounding box to enhance the visual appearance information of the target instance in the multi-layer features of the current frame, so that the target instance obtains a higher score in the response map of the region proposal network, which enables target-instance-specific tracking. Experiments on two public benchmark datasets, OTB2015 and VOT2018, show that the proposed tracker performs competitively with other state-of-the-art trackers.
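
The modulation idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the modulation vector is computed or applied, so everything below is an assumption made for illustration: the template (bounding-box) features are global-average-pooled, passed through a single linear layer with a sigmoid, and the resulting per-channel gates scale the current-frame features channel by channel. The function names, the `weights` and `bias` parameters, and the plain nested-list feature maps are all hypothetical and are not taken from the paper.

```python
import math


def global_average_pool(feature):
    """Reduce a feature map (C channels, each an H x W nested list) to C means."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature]


def modulation_vector(template_feature, weights, bias):
    """Map pooled template features to one gate in (0, 1) per channel.

    `weights` (C x C) and `bias` (C) stand in for learned parameters.
    A single linear layer followed by a sigmoid is an assumption of this
    sketch, not the design reported by the authors.
    """
    pooled = global_average_pool(template_feature)
    gates = []
    for w_row, b in zip(weights, bias):
        z = sum(w * p for w, p in zip(w_row, pooled)) + b
        gates.append(1.0 / (1.0 + math.exp(-z)))
    return gates


def modulate(search_feature, gates):
    """Scale every spatial position of channel c by gates[c]."""
    return [[[g * v for v in row] for row in ch]
            for ch, g in zip(search_feature, gates)]


def modulate_multi_layer(template_feats, search_feats, params):
    """Apply one modulation vector per backbone layer (hypothetical layout)."""
    return [modulate(s, modulation_vector(t, w, b))
            for t, s, (w, b) in zip(template_feats, search_feats, params)]


if __name__ == "__main__":
    # Toy 2-channel, 2x2 feature maps; real features come from a CNN backbone.
    template = [[[0.0, 0.0], [0.0, 0.0]], [[4.0, 4.0], [4.0, 4.0]]]
    search = [[[2.0, 2.0], [2.0, 2.0]], [[1.0, 3.0], [5.0, 7.0]]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    gates = modulation_vector(template, identity, [0.0, 0.0])
    print(gates)
    print(modulate(search, gates))
```

In this toy setup, channels that respond strongly to the template receive gates close to 1 and are largely preserved in the current-frame features, while weakly responding channels are attenuated. In a full tracker, the modulated multi-layer features would then be passed to the region proposal network; that stage is omitted here.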

Author information

Authors and Affiliations

Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
Li-hua Fu, Yu Ding, Yu-bin Du, Bo Zhang, Lu-yuan Wang & Dan Wang

Corresponding author

Correspondence to Li-hua Fu.

About this article

Cite this article

Fu, Lh., Ding, Y., Du, Yb. et al. SiamMN: Siamese modulation network for visual object tracking. Multimed Tools Appl 79, 32623–32641 (2020). https://doi.org/10.1007/s11042-020-09546-6

Received: 10 January 2020
Revised: 23 July 2020
Accepted: 04 August 2020
Published: 28 August 2020
Issue Date: November 2020
DOI: https://doi.org/10.1007/s11042-020-09546-6
