DOI: 10.1145/3321408.3323927

Deep mutual learning for visual tracking

Published: 17 May 2019

Abstract

In this work, we propose a novel deep learning method to improve both the accuracy and the speed of particle-filter-based object trackers. Our contributions are twofold. First, to enhance the discriminative power of the feature representations, we train two structurally identical CNNs, each of which learns from the other through a Kullback-Leibler (KL) divergence loss. Second, to speed up tracking, we introduce region-of-interest (ROI) Align and optimize the structure-identical networks with knowledge distillation and deep mutual learning, so that the forward pass is computed only once per frame rather than once per particle (hundreds of times). Experimental results on the OTB2015 and VOT2015 benchmarks demonstrate that our method outperforms several state-of-the-art algorithms.
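The two-network objective described in the abstract can be sketched as follows. This is a generic deep-mutual-learning loss in the spirit of Zhang et al.'s formulation, not the authors' released code; the function names and the plain NumPy setting are illustrative assumptions. Each network is trained with its own cross-entropy loss plus a KL term that pulls its predicted distribution toward its peer's:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q, eps=1e-12):
    """Row-wise KL(p || q) for arrays of probability vectors."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def mutual_learning_losses(logits_a, logits_b, labels):
    """Per-network losses for two-network deep mutual learning:
    each network minimizes cross-entropy with the ground-truth label
    plus KL divergence toward the peer network's prediction."""
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    n = labels.shape[0]
    ce_a = -np.log(p_a[np.arange(n), labels] + 1e-12).mean()
    ce_b = -np.log(p_b[np.arange(n), labels] + 1e-12).mean()
    loss_a = ce_a + kl_div(p_b, p_a).mean()  # network A mimics B
    loss_b = ce_b + kl_div(p_a, p_b).mean()  # network B mimics A
    return loss_a, loss_b
```

Note the asymmetry: A's mimicry term is KL(p_b || p_a) while B's is KL(p_a || p_b), so the two losses coincide only when the networks agree; when both produce identical logits, the KL terms vanish and each loss reduces to plain cross-entropy.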


Cited By

  • Dual-domain reciprocal learning design for few-shot image classification. Neural Computing and Applications 35(14), pp. 10649-10662, Feb 2023. DOI: 10.1007/s00521-023-08255-z


Published In

ACM TURC '19: Proceedings of the ACM Turing Celebration Conference - China
May 2019
963 pages
ISBN:9781450371582
DOI:10.1145/3321408
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. deep mutual learning
  2. knowledge distillation
  3. particle filter object tracking
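The third tag points at where the speed-up lands: a particle filter tracker scores hundreds of candidate boxes per frame, and pooling each box's features from one shared feature map lets a single forward pass serve them all. A minimal NumPy sketch of that idea follows; the integer box format and simple average pooling are simplifications of my own (the paper uses ROI Align, which samples bilinearly and avoids quantization):

```python
import numpy as np

def pool_particle_features(feature_map, boxes):
    """Pool a C-dim feature vector for each particle box from ONE
    shared feature map (simplified ROI pooling, not true ROI Align).
    feature_map: (H, W, C); boxes: (N, 4) rows of (y0, x0, y1, x1)."""
    pooled = []
    for y0, x0, y1, x1 in boxes.astype(int):
        region = feature_map[y0:y1, x0:x1]       # a view, no per-particle CNN pass
        pooled.append(region.mean(axis=(0, 1)))  # one C-dim vector per particle
    return np.stack(pooled)

# One "forward pass" produces the feature map; all particles reuse it.
H, W, C, N = 32, 32, 8, 300
rng = np.random.default_rng(0)
feature_map = rng.random((H, W, C))              # stands in for the CNN output
y0 = rng.integers(0, 16, N)
x0 = rng.integers(0, 16, N)
boxes = np.stack([y0, x0, y0 + 8, x0 + 8], axis=1)
features = pool_particle_features(feature_map, boxes)
assert features.shape == (N, C)
```

The cost model is the point: the expensive convolutional backbone runs once per frame, while the per-particle work shrinks to a cheap crop-and-pool, which is what turns hundreds of forward passes into one.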

Qualifiers

  • Research-article

Conference

ACM TURC 2019


