
Wasserstein Distance-Based Auto-Encoder Tracking

  • Published in: Neural Processing Letters

Abstract

Most existing visual object trackers rely on deep convolutional feature maps, and comparatively little work has explored new features for tracking. This paper proposes a novel tracking framework based on a fully convolutional auto-encoder appearance model, which is trained using the Wasserstein distance and maximum mean discrepancy (MMD). Compared with previous works, the proposed framework improves performance in three aspects: the appearance model, the update scheme, and state estimation. To address the shortcomings of the original update scheme, namely poor discriminative performance under limited supervisory information, sample pollution caused by long-term object occlusion, and sample importance imbalance, this paper proposes a novel latent-space importance weighting algorithm, a novel sample-space management algorithm, and a novel IoU-based label smoothing algorithm, respectively. In addition, an improved weighted loss function is adopted to address the sample imbalance issue. Finally, to improve state estimation accuracy, a combination of the Kullback-Leibler divergence and generalized intersection over union (GIoU) is introduced. Extensive experiments on three widely used benchmarks demonstrate the state-of-the-art performance of the proposed method. Code and models are available at https://github.com/wahahamyt/CAT.git.
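Two of the building blocks named in the abstract, the maximum mean discrepancy used alongside the Wasserstein distance to train the auto-encoder, and the generalized intersection over union used in state estimation, have standard textbook definitions. The sketch below is an illustration of those standard definitions only, not the authors' implementation; the RBF kernel bandwidth `sigma` and the box format `(x1, y1, x2, y2)` are assumptions.

```python
import numpy as np

def mmd_rbf(x, y, sigma=1.0):
    """Biased MMD^2 estimator with an RBF kernel.

    x, y: (n, d) and (m, d) sample arrays. MMD^2 compares the two
    empirical distributions; it is 0 when the samples coincide.
    """
    def k(a, b):
        # Pairwise squared distances via broadcasting, then RBF kernel.
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2).

    Equals plain IoU for overlapping boxes, and decreases toward -1
    as disjoint boxes move apart (unlike IoU, which saturates at 0).
    """
    xa1, ya1, xa2, ya2 = box_a
    xb1, yb1, xb2, yb2 = box_b
    # Intersection and union areas.
    inter_w = max(0.0, min(xa2, xb2) - max(xa1, xb1))
    inter_h = max(0.0, min(ya2, yb2) - max(ya1, yb1))
    inter = inter_w * inter_h
    area_a = (xa2 - xa1) * (ya2 - ya1)
    area_b = (xb2 - xb1) * (yb2 - yb1)
    union = area_a + area_b - inter
    iou = inter / union
    # Smallest axis-aligned box enclosing both inputs.
    c_area = (max(xa2, xb2) - min(xa1, xb1)) * (max(ya2, yb2) - min(ya1, yb1))
    return iou - (c_area - union) / c_area
```

For identical boxes `giou` returns 1.0; for two disjoint unit boxes separated by one unit it returns -1/3, which is what makes GIoU usable as a regression loss even when predicted and ground-truth boxes do not overlap.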




Acknowledgements

This work is supported by the National Natural Science Foundation of China (Grant No. 61871106), the Key R&D Projects of Liaoning Province, China (Grant No. 2020JH2/10100029), and the Open Project Program Foundation of the Key Laboratory of Opto-Electronics Information Processing, Chinese Academy of Sciences (OEIP-O-202002).

Author information


Corresponding author

Correspondence to Ying Wei.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Xu, L., Wei, Y., Dong, C. et al. Wasserstein Distance-Based Auto-Encoder Tracking. Neural Process Lett 53, 2305–2329 (2021). https://doi.org/10.1007/s11063-021-10507-9
