
Reinforcement learning applied to machine vision: state of the art

  • Trends and Surveys
  • Published in: International Journal of Multimedia Information Retrieval

Abstract

Reinforcement learning (RL) is gaining a foothold in artificial intelligence research. More and more applications are coming to the fore in which RL is applied in novel and successful ways. Among its diverse areas of application, machine vision stands out for its industrial and research significance. This survey first discusses the basics of RL in order to give the reader an overview of the technology. It then discusses important, novel and upcoming state-of-the-art applications of RL in machine vision, covering image segmentation, object detection, object tracking, robotic vision, autonomous driving and image classification/retrieval. Various state-of-the-art works are reviewed to convey the potential and impact of RL in the field of machine vision.



Author information


Corresponding author

Correspondence to A. M. Hafiz.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Hafiz, A.M., Parah, S.A. & Bhat, R.A. Reinforcement learning applied to machine vision: state of the art. Int J Multimed Info Retr 10, 71–82 (2021). https://doi.org/10.1007/s13735-021-00209-2

