Abstract
We take a step toward robust embodied AI by analyzing the performance of two successful Habitat Challenge 2021 agents under various visual corruptions (low lighting, blur, noise, etc.) and robot dynamics corruptions (noisy egomotion). Overall, the agents underperformed under corruption. However, one of the agents handled multiple corruptions with ease, as its authors had deliberately designed for robustness. For specific corruptions, we concur with observations from the literature that there is still a long way to go to recover the performance lost to corruption, warranting more research on the robustness of embodied AI.
Code available at m43.github.io/projects/embodied-ai-robustness.
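The corruption types named in the abstract can be sketched minimally in NumPy. This is an illustrative assumption, not the paper's actual benchmark code (that lives at the linked repository): the function names, corruption strengths (darkening factor, noise sigma, kernel size), and the additive noisy-egomotion model below are all hypothetical choices made here for demonstration.

```python
import numpy as np

def low_lighting(img: np.ndarray, factor: float = 0.3) -> np.ndarray:
    """Darken an RGB uint8 image by scaling pixel intensities."""
    return (img.astype(np.float32) * factor).clip(0, 255).astype(np.uint8)

def gaussian_noise(img: np.ndarray, sigma: float = 25.0, seed: int = 0) -> np.ndarray:
    """Add zero-mean Gaussian noise to an RGB uint8 image."""
    rng = np.random.default_rng(seed)
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return noisy.clip(0, 255).astype(np.uint8)

def box_blur(img: np.ndarray, k: int = 5) -> np.ndarray:
    """Blur with a k x k box filter by averaging shifted copies (edge-padded)."""
    pad = k // 2
    padded = np.pad(img.astype(np.float32), ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros(img.shape, dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return (out / (k * k)).clip(0, 255).astype(np.uint8)

def noisy_egomotion(dx: float, dy: float, dtheta: float,
                    sigma_t: float = 0.02, sigma_r: float = 0.05,
                    seed: int = 0) -> tuple:
    """Perturb a commanded planar motion (translation in meters, rotation in
    radians) with additive Gaussian actuation noise."""
    rng = np.random.default_rng(seed)
    return (dx + rng.normal(0.0, sigma_t),
            dy + rng.normal(0.0, sigma_t),
            dtheta + rng.normal(0.0, sigma_r))

# Stand-in for an agent's RGB observation; real Habitat frames are e.g. 480x640x3.
frame = np.full((8, 8, 3), 128, dtype=np.uint8)
corrupted = gaussian_noise(box_blur(low_lighting(frame)))
```

Chaining the visual corruptions, as above, mimics evaluating an agent under compound degradation; the egomotion helper would wrap the simulator's actuation step so that commanded and executed motion diverge.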
Acknowledgements
This paper is based on a course project for the CS-503 Visual Intelligence course at EPFL. The author thanks Donggyun Park for helpful discussions and Ivan Stresec for proofreading the paper. The author also thanks Ruslan Partsey and team UCU MLab for privately sharing their agent checkpoint for testing.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Rajič, F. (2023). Robustness of Embodied Point Navigation Agents. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13806. Springer, Cham. https://doi.org/10.1007/978-3-031-25075-0_15
DOI: https://doi.org/10.1007/978-3-031-25075-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25074-3
Online ISBN: 978-3-031-25075-0
eBook Packages: Computer Science (R0)