DOI: 10.1145/3321408.3323927

Deep mutual learning for visual tracking

Published: 17 May 2019

Abstract

In this work, we propose a novel deep learning method to improve both the accuracy and the speed of particle-filter-based object trackers. Our contributions are twofold. First, to enhance the discriminative power of the feature representations, we train two structurally identical CNNs, each of which learns from the other through a Kullback-Leibler (KL) divergence loss. Second, to speed up tracking, we introduce region-of-interest (ROI) Align and optimize the structure-identical networks with knowledge distillation and deep mutual learning, so that the forward pass is computed only once per frame rather than once per particle (hundreds of times). Experimental results on the OTB2015 and VOT2015 benchmarks demonstrate that our method outperforms several state-of-the-art algorithms.
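The two-network objective described in the abstract can be sketched as follows. This is a generic deep-mutual-learning loss in the spirit of Zhang et al.'s formulation, not the authors' released code; the function names and the plain NumPy setting are illustrative assumptions. Each network is trained with its own cross-entropy loss plus a KL term that pulls its predicted distribution toward its peer's:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q, eps=1e-12):
    """Row-wise KL(p || q) for arrays of probability vectors."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def mutual_learning_losses(logits_a, logits_b, labels):
    """Per-network losses for two-network deep mutual learning:
    each network minimizes cross-entropy with the ground-truth label
    plus KL divergence toward the peer network's prediction."""
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    n = labels.shape[0]
    ce_a = -np.log(p_a[np.arange(n), labels] + 1e-12).mean()
    ce_b = -np.log(p_b[np.arange(n), labels] + 1e-12).mean()
    loss_a = ce_a + kl_div(p_b, p_a).mean()  # network A mimics B
    loss_b = ce_b + kl_div(p_a, p_b).mean()  # network B mimics A
    return loss_a, loss_b
```

Note the asymmetry: A's mimicry term is KL(p_b || p_a) while B's is KL(p_a || p_b), so the two losses coincide only when the networks agree; when both produce identical logits, the KL terms vanish and each loss reduces to plain cross-entropy.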


Cited By

  • Dual-domain reciprocal learning design for few-shot image classification. Neural Computing and Applications 35(14), pp. 10649-10662, Feb 2023. DOI: 10.1007/s00521-023-08255-z


Published In

ACM TURC '19: Proceedings of the ACM Turing Celebration Conference - China
May 2019
963 pages
ISBN:9781450371582
DOI:10.1145/3321408
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. deep mutual learning
  2. knowledge distillation
  3. particle filter object tracking
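The third tag points at where the speed-up lands: a particle filter tracker scores hundreds of candidate boxes per frame, and pooling each box's features from one shared feature map lets a single forward pass serve them all. A minimal NumPy sketch of that idea follows; the integer box format and simple average pooling are simplifications of my own (the paper uses ROI Align, which samples bilinearly and avoids quantization):

```python
import numpy as np

def pool_particle_features(feature_map, boxes):
    """Pool a C-dim feature vector for each particle box from ONE
    shared feature map (simplified ROI pooling, not true ROI Align).
    feature_map: (H, W, C); boxes: (N, 4) rows of (y0, x0, y1, x1)."""
    pooled = []
    for y0, x0, y1, x1 in boxes.astype(int):
        region = feature_map[y0:y1, x0:x1]       # a view, no per-particle CNN pass
        pooled.append(region.mean(axis=(0, 1)))  # one C-dim vector per particle
    return np.stack(pooled)

# One "forward pass" produces the feature map; all particles reuse it.
H, W, C, N = 32, 32, 8, 300
rng = np.random.default_rng(0)
feature_map = rng.random((H, W, C))              # stands in for the CNN output
y0 = rng.integers(0, 16, N)
x0 = rng.integers(0, 16, N)
boxes = np.stack([y0, x0, y0 + 8, x0 + 8], axis=1)
features = pool_particle_features(feature_map, boxes)
assert features.shape == (N, C)
```

The cost model is the point: the expensive convolutional backbone runs once per frame, while the per-particle work shrinks to a cheap crop-and-pool, which is what turns hundreds of forward passes into one.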

Qualifiers

  • Research-article

Conference

ACM TURC 2019


