Abstract
The research field of Embodied AI has witnessed substantial progress in visual navigation and exploration, thanks to powerful simulation platforms and the availability of photorealistic 3D scans of indoor environments. These two factors have opened the door to a new generation of intelligent agents capable of near-perfect PointGoal Navigation. However, such architectures are commonly trained with millions, if not billions, of frames and evaluated only in simulation, which raises a natural question alongside the enthusiasm: how many researchers will actually benefit from these advances? In this work, we detail how to transfer the knowledge acquired in simulation to the real world. To that end, we describe the architectural discrepancies that hinder the Sim2Real adaptation of models trained on the Habitat simulator and propose a novel solution tailored to deployment in real-world scenarios. We then deploy our models on a LoCoBot, a low-cost robot equipped with a single Intel RealSense camera. Unlike previous work, our testing scene has no counterpart in simulation, and the environment is inaccessible to the agent beforehand, so it cannot rely on scene-specific semantic priors. In this way, we reproduce a setting in which a research group (potentially from another field) needs to employ the agent's visual navigation capabilities as a service. Our experiments indicate that satisfying results can be achieved when deploying the resulting model in the real world. Our code and models are available at https://github.com/aimagelab/LoCoNav.
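To make the deployment setting concrete, the sketch below illustrates what a Sim2Real control loop of this kind can look like: RGB-D frames are read from the onboard RealSense through PyRobot, and discrete Habitat-style actions are mapped onto base motion commands. This is a minimal sketch, not the authors' released code: the `policy` object and its `act` method are placeholders for a trained navigation model, and the step size and turn angle are illustrative values, not necessarily the paper's configuration. The PyRobot calls (`Robot`, `camera.get_rgb_depth`, `base.go_to_relative`) follow the public PyRobot API.

```python
import math

from pyrobot import Robot  # PyRobot: high-level Python interface to the LoCoBot


# Illustrative Habitat-style discrete action space (assumed values).
FORWARD_STEP = 0.25            # forward step, in metres
TURN_ANGLE = math.radians(30)  # turn angle, in radians


def execute_action(robot, action):
    """Map a discrete policy action onto a relative base motion."""
    if action == "move_forward":
        # (x, y, theta) displacement expressed in the robot frame
        robot.base.go_to_relative([FORWARD_STEP, 0.0, 0.0],
                                  smooth=False, close_loop=True)
    elif action == "turn_left":
        robot.base.go_to_relative([0.0, 0.0, TURN_ANGLE],
                                  smooth=False, close_loop=True)
    elif action == "turn_right":
        robot.base.go_to_relative([0.0, 0.0, -TURN_ANGLE],
                                  smooth=False, close_loop=True)


def navigation_loop(policy, max_steps=500):
    """Run a trained policy on the physical robot for one episode."""
    robot = Robot("locobot")
    for _ in range(max_steps):
        # The single RealSense camera provides aligned RGB and depth frames.
        rgb, depth = robot.camera.get_rgb_depth()
        # `policy` is a placeholder: it is assumed to return one of
        # "move_forward", "turn_left", "turn_right", or "stop".
        action = policy.act(rgb, depth)
        if action == "stop":
            break
        execute_action(robot, action)
```

A real deployment additionally has to cope with sensor noise and actuation error that are absent from simulation, which is part of what the Sim2Real adaptation described in the paper addresses.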
Acknowledgment
This work has been supported by "Fondazione di Modena" under the project "AI for Digital Humanities" and by the national project "IDEHA" (PON ARS01_00421), co-funded by the Italian Ministry of University and Research.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Bigazzi, R., Landi, F., Cornia, M., Cascianelli, S., Baraldi, L., Cucchiara, R. (2021). Out of the Box: Embodied Navigation in the Real World. In: Tsapatsoulis, N., Panayides, A., Theocharides, T., Lanitis, A., Pattichis, C., Vento, M. (eds) Computer Analysis of Images and Patterns. CAIP 2021. Lecture Notes in Computer Science, vol 13052. Springer, Cham. https://doi.org/10.1007/978-3-030-89128-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89127-5
Online ISBN: 978-3-030-89128-2