DOI: 10.1145/3374587.3374615

Visual Tracking by Gated PixelCNN Model

Published: 04 March 2020

Abstract

This paper proposes a novel visual tracking method based on the Gated PixelCNN and particle filtering. First, unlike the convolutional layers in traditional trackers, the Gated PixelCNN is a generative image model with a tractable likelihood; its training is therefore not limited to existing classification and video datasets. Second, similar to an RNN, its shared convolutional layers extract generic image features by modelling the dependencies between the first pixel and the N-th pixel, rather than only between adjacent pixels. Through comparative experiments, we show that the convolutional layers of the Gated PixelCNN are not inferior to the convolutional layers used in traditional trackers, and we employ this new kind of convolutional layer for tracking. In addition, a particle filter is introduced into our algorithm to increase the tracker's resilience to occlusions. In the experiments, we compared our method with other well-known trackers on 12 video sequences; the results show that the proposed algorithm achieves better overall performance than other state-of-the-art methods.
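As a rough illustration (not the authors' code), the two ingredients that distinguish a Gated PixelCNN layer from an ordinary convolutional layer are the causal mask, which restricts each output pixel to depend only on pixels earlier in raster order, and the gated activation tanh(a) * sigmoid(b). A minimal numpy sketch, with helper names of our own choosing:

```python
import numpy as np

def causal_mask(k, mask_type="B"):
    """Build a k x k autoregressive mask for a PixelCNN convolution.

    Pixels are generated in raster order, so the kernel may only see
    positions above the centre row, plus those left of centre on the
    centre row. Mask type "A" (first layer) also hides the centre
    pixel itself; type "B" (later layers) lets it through.
    """
    m = np.zeros((k, k))
    c = k // 2
    m[:c, :] = 1.0            # all rows above the centre
    m[c, :c] = 1.0            # left of centre on the centre row
    if mask_type == "B":
        m[c, c] = 1.0         # later layers may see the current pixel
    return m

def gated_activation(x):
    """Gated unit from the Gated PixelCNN: tanh(a) * sigmoid(b),
    where the feature map is split in half along the channel axis."""
    a, b = np.split(x, 2, axis=0)
    return np.tanh(a) * (1.0 / (1.0 + np.exp(-b)))
```

Multiplying a convolution kernel by `causal_mask(k, "A")` in the first layer and `causal_mask(k, "B")` thereafter is what makes the model's likelihood tractable: each pixel's distribution is conditioned only on already-generated pixels.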
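The particle-filter component mentioned above can likewise be sketched as one bootstrap predict-reweight-resample step. The `observe` callable below stands in for whatever appearance likelihood the tracker computes; the random-walk motion model, the function name, and its parameters are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def particle_filter_step(particles, weights, observe, motion_std=2.0, rng=None):
    """One predict-reweight-resample step of a bootstrap particle filter.

    particles : (N, 2) candidate target positions (x, y)
    weights   : (N,) normalised importance weights
    observe   : callable mapping a position to an appearance likelihood
    """
    rng = np.random.default_rng() if rng is None else rng
    # Predict: propagate each particle with a random-walk motion model.
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    # Update: reweight each candidate by its observation likelihood.
    weights = weights * np.array([observe(p) for p in particles])
    weights = weights / weights.sum()
    # Resample in proportion to the weights, so low-likelihood
    # candidates (e.g. on an occluder) die out over a few frames.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    particles = particles[idx]
    weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights
```

Because the posterior is carried by the whole particle cloud rather than a single box, a brief occlusion only flattens the weights instead of locking the tracker onto the occluder, which is the resilience property the abstract refers to.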

References

[1]
X. Zhou, K. Jin, Q. Chen, M. Xu, and Y. Shang. Multiple face tracking and recognition with identity-specific localized metric learning. Pattern Recognition, 75:41--50, 2018.
[2]
R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 580--587, 2014.
[3]
P. Feng, C. Xu, Z. Zhao, F. Liu, J. Guo, C. Yuan, T. Wang, and K. Duan. A deep features based generative model for visual tracking. Neurocomputing, 308:245--254, 2018.
[4]
H. Yang, D. Zhong, C. Liu, K. Song, and Z. Yin. Robust visual tracking based on deep convolutional neural networks and kernelized correlation filters. Journal of Electronic Imaging, 27(2):023008.
[5]
S. Pang, J. J. del Coz, Z. Yu, O. Luaces, and J. Díez. Deep learning to frame objects for visual target tracking. Engineering Applications of Artificial Intelligence, 65:406--420, 2017.
[6]
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211--252, 2015.
[7]
H. Nam and B. Han. Learning multi-domain convolutional neural networks for visual tracking. arXiv preprint arXiv:1510.07945, 2015.
[8]
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248--255. IEEE, 2009.
[9]
A. van den Oord, N. Kalchbrenner, L. Espeholt, O. Vinyals, A. Graves, et al. Conditional image generation with PixelCNN decoders. In Advances in Neural Information Processing Systems, pages 4790--4798, 2016.
[10]
A. van den Oord, N. Kalchbrenner, and K. Kavukcuoglu. Pixel recurrent neural networks. arXiv preprint arXiv:1601.06759, 2016.
[11]
H. Nam and B. Han. Learning multi-domain convolutional neural networks for visual tracking. arXiv preprint arXiv:1510.07945, 2015.
[12]
K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770--778, 2016.
[13]
A. Doucet, S. Godsill, and C. Andrieu. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10(3):197--208, 2000.
[14]
J. V. Candy. Bayesian signal processing: classical, modern, and particle filtering methods, volume 54. John Wiley & Sons, 2016.
[15]
Y. Wu, J. Lim, and M.-H. Yang. Online object tracking: A benchmark. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2411--2418, 2013.
[16]
C. Ma, Z. Miao, X.-P. Zhang, and M. Li. A saliency prior context model for real time object tracking. IEEE Transactions on Multimedia, 19(11):2415--2424, 2017.
[17]
X. Jia, H. Lu, and M.-H. Yang. Visual tracking via adaptive structural local sparse appearance model. In 2012 IEEE Conference on computer vision and pattern recognition, pages 1822--1829. IEEE, 2012.
[18]
T. B. Dinh, N. Vo, and G. Medioni. Context tracker: Exploring supporters and distracters in unconstrained environments. In CVPR 2011, pages 1177--1184. IEEE, 2011.
[19]
D. A. Ross, J. Lim, R.-S. Lin, and M.-H. Yang. Incremental learning for robust visual tracking. International journal of computer vision, 77(1-3):125--141, 2008.
[20]
Y. Wu, B. Shen, and H. Ling. Online robust image alignment via iterative convex optimization. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pages 1808--1814. IEEE, 2012.
[21]
R. T. Collins et al. Mean-shift blob tracking through scale space. In CVPR (2), pages 234--240, 2003.
[22]
K. Zhang, L. Zhang, Q. Liu, D. Zhang, and M.-H. Yang. Fast visual tracking via dense spatio-temporal context learning. In European conference on computer vision, pages 127--141. Springer, 2014.
[23]
K. Zhang, L. Zhang, and M.-H. Yang. Real-time compressive tracking. In European conference on computer vision, pages 864--877. Springer, 2012.
[24]
J. F. Henriques, R. Caseiro, P. Martins, and J. Batista. Exploiting the circulant structure of tracking-by-detection with kernels. In European conference on computer vision, pages 702--715. Springer, 2012.
[25]
Z. Kalal, K. Mikolajczyk, and J. Matas. Tracking-learning-detection. IEEE transactions on pattern analysis and machine intelligence, 34(7):1409--1422, 2011.
[26]
B. Babenko, M.-H. Yang, and S. Belongie. Robust object tracking with online multiple instance learning. IEEE transactions on pattern analysis and machine intelligence, 33(8):1619--1632, 2010.

Published In

CSAI '19: Proceedings of the 2019 3rd International Conference on Computer Science and Artificial Intelligence
December 2019
370 pages
ISBN:9781450376273
DOI:10.1145/3374587

In-Cooperation

  • Shenzhen University

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Gated PixelCNN
  2. Visual tracking
  3. convolutional neural network
  4. particle filter

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CSAI2019
