skip to main content
10.1145/3669754.3669798acmotherconferencesArticle/Chapter ViewAbstractPublication PagesiccaiConference Proceedingsconference-collections
research-article
Open access

HFN-SLAM:Hybrid Scene Neural Representation SLAM Based on Frame Alignment and Normal Consistency

Published: 30 August 2024 Publication History

Abstract

Recent advancements in SLAM based on neural radiance fields have demonstrated promising performance. However, existing methods still exhibit shortcomings in terms of reconstruction and pose estimation accuracy, particularly in medium-to-large indoor scenes. These limitations stem from inadequate utilization of structural scene information and ineffective constraint handling for cumulative errors between sequences. To address these challenges, we introduce HFN-SLAM, a neurovisual SLAM system designed to achieve real-time, high-fidelity scene reconstruction and robust camera tracking. To attain fine-grained scene reconstruction without compromising real-time performance, we propose a hybrid representation method for scenes. This method integrates high-resolution, dense 3D hash grid features and 2D plane features, enhancing scene reconstruction accuracy while minimizing parameter overhead. To effectively leverage information between input frames and mitigate accumulated sequence errors, we introduce a frame-aligned algorithm. This algorithm globally aligns input sequences by constraining reprojection errors between keyframes. Furthermore, to enhance scene details, we propose a region-aware normal consistency method. This method implements constraints on large planar scenes, facilitating detailed scene reconstruction. Experimental results demonstrate that our method operates at 3-5 Hz on a desktop PC, surpassing existing methods in both scene reconstruction and camera tracking performance.

References

[1]
John Ashburner and Karl J Friston. 2000. Voxel-based morphometry—the methods. Neuroimage 11, 6 (2000), 805–821.
[2]
Angela Dai, Angel X Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nießner. 2017. Scannet: Richly-annotated 3d reconstructions of indoor scenes. In Proceedings of the IEEE conference on computer vision and pattern recognition. 5828–5839.
[3]
Jakob Engel, Thomas Schöps, and Daniel Cremers. 2014. LSD-SLAM: Large-scale direct monocular SLAM. In European conference on computer vision. Springer, 834–849.
[4]
Sara Fridovich-Keil, Alex Yu, Matthew Tancik, Qinhong Chen, Benjamin Recht, and Angjoo Kanazawa. 2022. Plenoxels: Radiance fields without neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5501–5510.
[5]
Petr Kellnhofer, Lars C Jebe, Andrew Jones, Ryan Spicer, Kari Pulli, and Gordon Wetzstein. 2021. Neural lumigraph rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 4287–4297.
[6]
Zhaoshuo Li, Thomas Müller, Alex Evans, Russell H Taylor, Mathias Unberath, Ming-Yu Liu, and Chen-Hsuan Lin. 2023. Neuralangelo: High-Fidelity Neural Surface Reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8456–8465.
[7]
Donald Meagher. 1982. Geometric modeling using octree encoding. Computer graphics and image processing 19, 2 (1982), 129–147.
[8]
Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. 2021. Nerf: Representing scenes as neural radiance fields for view synthesis. Commun. ACM 65, 1 (2021), 99–106.
[9]
Thomas Müller, Brian McWilliams, Fabrice Rousselle, Markus Gross, and Jan Novák. 2019. Neural importance sampling. ACM Transactions on Graphics (ToG) 38, 5 (2019), 1–19.
[10]
Richard A Newcombe, Steven J Lovegrove, and Andrew J Davison. 2011. DTAM: Dense tracking and mapping in real-time. In 2011 international conference on computer vision. IEEE, 2320–2327.
[11]
Michael Oechsle, Songyou Peng, and Andreas Geiger. 2021. Unisurf: Unifying neural implicit surfaces and radiance fields for multi-view reconstruction. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 5589–5599.
[12]
Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. 2019. Deepsdf: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 165–174.
[13]
Erik Sandström, Yue Li, Luc Van Gool, and Martin R Oswald. 2023. Point-slam: Dense neural point cloud-based slam. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 18433–18444.
[14]
Thomas Schops, Torsten Sattler, and Marc Pollefeys. 2019. Bad slam: Bundle adjusted direct rgb-d slam. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 134–144.
[15]
Julian Straub, Thomas Whelan, Lingni Ma, Yufan Chen, Erik Wijmans, Simon Green, Jakob J Engel, Raul Mur-Artal, Carl Ren, Shobhit Verma, 2019. The Replica dataset: A digital replica of indoor spaces. arXiv preprint arXiv:1906.05797 (2019).
[16]
Edgar Sucar, Shikun Liu, Joseph Ortiz, and Andrew J Davison. 2021. iMAP: Implicit mapping and positioning in real-time. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 6229–6238.
[17]
Cheng Sun, Min Sun, and Hwann-Tzong Chen. 2022. Direct voxel grid optimization: Super-fast convergence for radiance fields reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5459–5469.
[18]
Jingwen Wang, Tymoteusz Bleja, and Lourdes Agapito. 2022. Go-surf: Neural feature grid optimization for fast, high-fidelity rgb-d surface reconstruction. In 2022 International Conference on 3D Vision (3DV). IEEE, 433–442.
[19]
Peng Wang, Lingjie Liu, Yuan Liu, Christian Theobalt, Taku Komura, and Wenping Wang. 2021. Neus: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. arXiv preprint arXiv:2106.10689 (2021).
[20]
Zirui Wang, Shangzhe Wu, Weidi Xie, Min Chen, and Victor Adrian Prisacariu. 2021. NeRF–: Neural radiance fields without known camera parameters. arXiv preprint arXiv:2102.07064 (2021).
[21]
Thomas Whelan, Stefan Leutenegger, Renato Salas-Moreno, Ben Glocker, and Andrew Davison. 2015. ElasticFusion: Dense SLAM without a pose graph. Robotics: Science and Systems.
[22]
Qi Wu, David Bauer, Michael J Doyle, and Kwan-Liu Ma. 2023. Interactive volume visualization via multi-resolution hash encoding based neural representation. IEEE Transactions on Visualization and Computer Graphics (2023).
[23]
Xingrui Yang, Hai Li, Hongjia Zhai, Yuhang Ming, Yuqian Liu, and Guofeng Zhang. 2022. Vox-Fusion: Dense tracking and mapping with voxel-based neural implicit representation. In 2022 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). IEEE, 499–507.
[24]
Lior Yariv, Jiatao Gu, Yoni Kasten, and Yaron Lipman. 2021. Volume rendering of neural implicit surfaces. Advances in Neural Information Processing Systems 34 (2021), 4805–4815.
[25]
Lin Yen-Chen, Pete Florence, Jonathan T Barron, Alberto Rodriguez, Phillip Isola, and Tsung-Yi Lin. 2021. inerf: Inverting neural radiance fields for pose estimation. In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 1323–1330.
[26]
Zihan Zhu, Songyou Peng, Viktor Larsson, Weiwei Xu, Hujun Bao, Zhaopeng Cui, Martin R Oswald, and Marc Pollefeys. 2022. Nice-slam: Neural implicit scalable encoding for slam. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 12786–12796.

Index Terms

  1. HFN-SLAM:Hybrid Scene Neural Representation SLAM Based on Frame Alignment and Normal Consistency

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Other conferences
    ICCAI '24: Proceedings of the 2024 10th International Conference on Computing and Artificial Intelligence
    April 2024
    491 pages
    ISBN:9798400717055
    DOI:10.1145/3669754
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 30 August 2024

    Check for updates

    Author Tags

    1. Simultaneous localization and mapping
    2. frame aligned
    3. hybrid representation
    4. neural radiance fields
    5. normal consistency

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICCAI 2024

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • 0
      Total Citations
    • 98
      Total Downloads
    • Downloads (Last 12 months)98
    • Downloads (Last 6 weeks)25
    Reflects downloads up to 05 Mar 2025

    Other Metrics

    Citations

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    HTML Format

    View this article in HTML Format.

    HTML Format

    Login options

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media