
Augmented Curiosity: Depth and Optical Flow Prediction for Efficient Exploration

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11844)

Abstract

Exploring novel environments for a specific target poses the challenge of how to adequately provide positive external rewards to an artificial agent. In scenarios with sparse external rewards, a reinforcement learning algorithm often cannot develop a successful policy function to govern an agent's behavior. However, intrinsic rewards can provide feedback on an agent's actions and enable policy updates even in sparse scenarios. Our approaches, the Optical Flow-Augmented Curiosity Module (OF-ACM) and the Depth-Augmented Curiosity Module (D-ACM), extend the Intrinsic Curiosity Model (ICM) of Pathak et al. The ICM forms an intrinsic reward signal from the error between a prediction and the ground truth of the next state. Through experiments in visually rich and feature-sparse ViZDoom scenarios, we show that our predictive modules exhibit improved exploration and faster learning of a suitable policy function. The modules leverage additional sources of information, such as depth images and optical flow, to generate superior embeddings that serve as inputs for next-state prediction. With D-ACM we show a 63.3% average improvement over ICM in time to policy convergence on "My Way Home" scenarios.
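The prediction-error reward can be made concrete with a short sketch. Below is a minimal PyTorch illustration of an ICM-style forward model and intrinsic reward in the spirit of Pathak et al. [8]; the embedding dimension, layer sizes, action count, and scaling factor eta are illustrative assumptions rather than the authors' exact configuration (OF-ACM and D-ACM differ chiefly in how the embedding phi is produced, e.g., via depth or optical flow prediction).

```python
# Minimal sketch of an ICM-style intrinsic reward (after Pathak et al. [8]).
# All network sizes and hyperparameters here are assumptions for
# illustration, not the configuration used in the paper.
import torch
import torch.nn as nn

class ForwardModel(nn.Module):
    """Predicts the embedding of the next state from the current
    state's embedding and the action taken."""
    def __init__(self, embed_dim=288, n_actions=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim + n_actions, 256),
            nn.ReLU(),
            nn.Linear(256, embed_dim),
        )

    def forward(self, phi_s, action_onehot):
        return self.net(torch.cat([phi_s, action_onehot], dim=-1))

def intrinsic_reward(model, phi_s, phi_s_next, action_onehot, eta=0.01):
    """r_t = (eta / 2) * ||phi_hat(s_{t+1}) - phi(s_{t+1})||^2:
    the agent is rewarded where its forward model predicts poorly."""
    phi_hat = model(phi_s, action_onehot)
    return 0.5 * eta * (phi_hat - phi_s_next).pow(2).sum(dim=-1)

# Example: a batch of 8 transitions with random embeddings.
fm = ForwardModel()
phi_s = torch.randn(8, 288)
phi_s_next = torch.randn(8, 288)
actions = nn.functional.one_hot(torch.randint(0, 4, (8,)), num_classes=4).float()
print(intrinsic_reward(fm, phi_s, phi_s_next, actions))  # shape: (8,)
```

This intrinsic reward is typically added to the (sparse) external reward before the policy update, so the agent still receives a learning signal when the environment itself pays out nothing.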

J. Carvajal, T. Molnar, and L. Burzawa contributed equally.


References

  1. Brockman, G., et al.: OpenAI gym. CoRR abs/1606.01540 (2016)


  2. Farnebäck, G.: Two-frame motion estimation based on polynomial expansion. In: Bigun, J., Gustavsson, T. (eds.) SCIA 2003. LNCS, vol. 2749, pp. 363–370. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-45103-X_50


  3. Grześ, M.: Reward shaping in episodic reinforcement learning. In: Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, AAMAS 2017, pp. 565–573. International Foundation for Autonomous Agents and Multiagent Systems, Richland (2017)


  4. He, Y., Chen, S.: Advances in sensing and processing methods for three-dimensional robot vision. Int. J. Adv. Robot. Syst. 15(2) (2018). https://doi.org/10.1177/1729881418760623


  5. Kempka, M., Wydmuch, M., Runc, G., Toczek, J., Jaskowski, W.: ViZDoom: a doom-based AI research platform for visual reinforcement learning. CoRR abs/1605.02097 (2016)


  6. Lu, F., Milios, E.: Globally consistent range scan alignment for environment mapping. Auton. Robots 4(4), 333–349 (1997)


  7. Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. CoRR abs/1602.01783 (2016)


  8. Pathak, D., Agrawal, P., Efros, A.A., Darrell, T.: Curiosity-driven exploration by self-supervised prediction. In: ICML (2017)


  9. Tai, L., Liu, M.: Towards cognitive exploration through deep reinforcement learning for mobile robots. CoRR abs/1610.01733 (2016)


  10. Wu, Y., Mansimov, E., Liao, S., Grosse, R.B., Ba, J.: Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation. CoRR abs/1708.05144 (2017)


  11. Zhang, M., Levine, S., McCarthy, Z., Finn, C., Abbeel, P.: Policy learning with continuous memory states for partially observed robotic control. CoRR abs/1507.01273 (2015)



Author information

Correspondence to Juan Carvajal.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Carvajal, J., Molnar, T., Burzawa, L., Culurciello, E. (2019). Augmented Curiosity: Depth and Optical Flow Prediction for Efficient Exploration. In: Bebis, G., et al. (eds.) Advances in Visual Computing. ISVC 2019. Lecture Notes in Computer Science, vol. 11844. Springer, Cham. https://doi.org/10.1007/978-3-030-33720-9_20


  • DOI: https://doi.org/10.1007/978-3-030-33720-9_20


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33719-3

  • Online ISBN: 978-3-030-33720-9

  • eBook Packages: Computer Science, Computer Science (R0)
