Abstract
We take a step toward robust embodied AI by analyzing the performance of two successful Habitat Challenge 2021 agents under various visual corruptions (low lighting, blur, noise, etc.) and robot dynamics corruptions (noisy egomotion). Overall, the agents underperformed under corruption. However, one of the agents handled multiple corruptions with ease, as its authors had deliberately designed for robustness. For specific corruptions, we concur with observations from the literature that there is still a long way to go to recover the performance lost to corruption, warranting more research on the robustness of embodied AI.
Code available at m43.github.io/projects/embodied-ai-robustness.
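The corruption types named in the abstract can be sketched minimally in NumPy. This is an illustrative assumption, not the paper's actual benchmark code (that lives at the linked repository): the function names, corruption strengths (darkening factor, noise sigma, kernel size), and the additive noisy-egomotion model below are all hypothetical choices made here for demonstration.

```python
import numpy as np

def low_lighting(img: np.ndarray, factor: float = 0.3) -> np.ndarray:
    """Darken an RGB uint8 image by scaling pixel intensities."""
    return (img.astype(np.float32) * factor).clip(0, 255).astype(np.uint8)

def gaussian_noise(img: np.ndarray, sigma: float = 25.0, seed: int = 0) -> np.ndarray:
    """Add zero-mean Gaussian noise to an RGB uint8 image."""
    rng = np.random.default_rng(seed)
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return noisy.clip(0, 255).astype(np.uint8)

def box_blur(img: np.ndarray, k: int = 5) -> np.ndarray:
    """Blur with a k x k box filter by averaging shifted copies (edge-padded)."""
    pad = k // 2
    padded = np.pad(img.astype(np.float32), ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros(img.shape, dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return (out / (k * k)).clip(0, 255).astype(np.uint8)

def noisy_egomotion(dx: float, dy: float, dtheta: float,
                    sigma_t: float = 0.02, sigma_r: float = 0.05,
                    seed: int = 0) -> tuple:
    """Perturb a commanded planar motion (translation in meters, rotation in
    radians) with additive Gaussian actuation noise."""
    rng = np.random.default_rng(seed)
    return (dx + rng.normal(0.0, sigma_t),
            dy + rng.normal(0.0, sigma_t),
            dtheta + rng.normal(0.0, sigma_r))

# Stand-in for an agent's RGB observation; real Habitat frames are e.g. 480x640x3.
frame = np.full((8, 8, 3), 128, dtype=np.uint8)
corrupted = gaussian_noise(box_blur(low_lighting(frame)))
```

Chaining the visual corruptions, as above, mimics evaluating an agent under compound degradation; the egomotion helper would wrap the simulator's actuation step so that commanded and executed motion diverge.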
Acknowledgements
This paper is based on a course project for the CS-503 Visual Intelligence course at EPFL. The author thanks Donggyun Park for helpful discussions and Ivan Stresec for proofreading the paper. The author also thanks Ruslan Partsey and team UCU MLab for privately sharing their agent checkpoint for testing.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Rajič, F. (2023). Robustness of Embodied Point Navigation Agents. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13806. Springer, Cham. https://doi.org/10.1007/978-3-031-25075-0_15
DOI: https://doi.org/10.1007/978-3-031-25075-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25074-3
Online ISBN: 978-3-031-25075-0
eBook Packages: Computer Science (R0)