Real-time stage-wise object tracking in traffic scenes: an online tracker selection method via deep reinforcement learning

Lu, Xiao; Cao, Yihong; Liu, Sheng; Zhou, Xuanyu; Yang, Yimin

doi:10.1007/s00521-021-06439-z

Review
Published: 19 September 2021

Volume 33, pages 16831–16846, (2021)

Neural Computing and Applications

Xiao Lu¹, Yihong Cao¹, Sheng Liu¹, Xuanyu Zhou² (ORCID: orcid.org/0000-0003-0823-2247) & Yimin Yang³

Abstract

Combining strong representational power for discriminating the target from its background with fast adaptation to appearance changes, while still running in real time, remains an open problem in object tracking, especially in complex urban traffic scenes. Motivated by the observation that existing trackers each have strengths in handling different kinds of tracking difficulties, we propose a new real-time stage-wise object tracking method that lets different trackers complement each other and combines their respective advantages. A tracker selection agent is trained to learn a policy for switching to the most appropriate candidate tracker according to the current tracking environment. To capture the dynamics of the tracking environment effectively, we formulate tracker selection as a Partially Observable Markov Decision Process (POMDP). A lightweight deep neural network with a recurrent unit is designed to learn the optimal policy accurately and rapidly. We also collected the Traffic Scenes Object Tracking Annotated Dataset (TS-OTAD) to demonstrate the effectiveness of our method. Experimental results on TS-OTAD and OTB-100 show that our method outperforms each of the candidate trackers and achieves a good trade-off between accuracy and efficiency compared with other state-of-the-art methods. Moreover, our stage-wise tracking framework is not limited to any specific tracker: any strong tracker can serve as a candidate, which provides a new way to boost object tracking accuracy and efficiency.
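
The stage-wise selection idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `RecurrentSelector`, `track_stagewise`, `observe`, and `stage_len` are hypothetical, the recurrent unit is a plain Elman-style cell with random, untrained weights rather than the network from the paper, and the training procedure is omitted. It only shows how a hidden state can summarize past observations (the partially observable setting) and how one candidate tracker is chosen per stage.

```python
import math
import random


class RecurrentSelector:
    """Toy recurrent Q-function (assumption, not the paper's network).

    h_t = tanh(W_x x_t + W_h h_{t-1}),  q_t = W_q h_t
    Weights are random and untrained; a real agent would learn them.
    """

    def __init__(self, obs_dim, hidden_dim, num_trackers, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_x = mat(hidden_dim, obs_dim)
        self.w_h = mat(hidden_dim, hidden_dim)
        self.w_q = mat(num_trackers, hidden_dim)
        self.hidden = [0.0] * hidden_dim

    def reset(self):
        """Clear the memory at the start of a new sequence."""
        self.hidden = [0.0] * len(self.hidden)

    def q_values(self, obs):
        """Update the hidden state with one observation, return one score per tracker."""
        new_hidden = []
        for row_x, row_h in zip(self.w_x, self.w_h):
            pre = sum(w * o for w, o in zip(row_x, obs))
            pre += sum(w * h for w, h in zip(row_h, self.hidden))
            new_hidden.append(math.tanh(pre))
        self.hidden = new_hidden
        return [sum(w * h for w, h in zip(row, self.hidden)) for row in self.w_q]

    def select(self, obs, epsilon=0.0, rng=random):
        """Epsilon-greedy choice of a tracker index."""
        q = self.q_values(obs)
        if rng.random() < epsilon:
            return rng.randrange(len(q))
        return max(range(len(q)), key=q.__getitem__)


def track_stagewise(frames, trackers, selector, observe, stage_len=5):
    """Run one candidate tracker per stage and re-select at each stage boundary.

    trackers: list of callables (frame, previous_box) -> box  (hypothetical interface)
    observe:  callable (frame, previous_box) -> list of floats (hypothetical features)
    """
    selector.reset()
    choices, boxes = [], []
    box, active = None, 0
    for t, frame in enumerate(frames):
        if t % stage_len == 0:
            active = selector.select(observe(frame, box))
            choices.append(active)
        box = trackers[active](frame, box)
        boxes.append(box)
    return choices, boxes
```

With 12 frames and `stage_len=5`, this loop makes three selections (at frames 0, 5, and 10) and produces one box per frame. Because the hidden state persists between selections, the same observation can lead to different scores depending on what was seen earlier, which is the property a recurrent unit contributes in a partially observable setting.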

Acknowledgements

The authors would like to thank the editor and anonymous reviewers for their invaluable suggestions. This work is supported in part by the National Natural Science Foundation of China (Grant Nos. 62007007, 61703155), the Natural Sciences and Engineering Research Council of Canada, and the Hunan Provincial Innovation Foundation for Postgraduate (Grant No. CX20190404).

Author information

Authors and Affiliations

College of Engineering and Design, Hunan Normal University, Changsha, China
Xiao Lu, Yihong Cao & Sheng Liu
Key Laboratory of Big Data Research and Application for Basic Education, Hunan Normal University, Changsha, China
Xuanyu Zhou
Computer Science Department, Lakehead University, Ontario, Canada
Yimin Yang

Corresponding author

Correspondence to Xuanyu Zhou.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lu, X., Cao, Y., Liu, S. et al. Real-time stage-wise object tracking in traffic scenes: an online tracker selection method via deep reinforcement learning. Neural Comput & Applic 33, 16831–16846 (2021). https://doi.org/10.1007/s00521-021-06439-z

Received: 03 February 2021
Accepted: 17 August 2021
Published: 19 September 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s00521-021-06439-z
