Abstract
Most 3D Gaussian Splatting (3D-GS) based methods for urban scenes initialize 3D Gaussians directly from 3D LiDAR points, which not only underutilizes the LiDAR data but also overlooks the potential benefits of fusing LiDAR with camera data. In this paper, we design a novel Tightly Coupled LiDAR-Camera Gaussian Splatting (TCLC-GS) framework that fully leverages the combined strengths of LiDAR and camera sensors, enabling rapid, high-quality 3D reconstruction and novel-view RGB/depth synthesis. TCLC-GS builds a hybrid explicit (colorized 3D mesh) and implicit (hierarchical octree feature) 3D representation from LiDAR-camera data to enrich the properties of the 3D Gaussians used for splatting. The Gaussians' properties are not only initialized in alignment with the 3D mesh, which provides more complete 3D shape and color information, but are also endowed with broader contextual information through retrieved octree implicit features. During Gaussian Splatting optimization, the 3D mesh supplies dense depth supervision, which strengthens training by encouraging the model to learn robust geometry. Comprehensive evaluations on the Waymo Open Dataset and the nuScenes Dataset validate our method's state-of-the-art (SOTA) performance. On a single NVIDIA RTX 3090 Ti, our method trains quickly and achieves real-time RGB and depth rendering in urban scenarios at 90 FPS at a resolution of 1920×1280 (Waymo) and 120 FPS at a resolution of 1600×900 (nuScenes).
C. Zhao and S. Sun contributed equally as co-first authors.
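The abstract compresses two concrete mechanisms: initializing the Gaussians' positions and colors from the colorized 3D mesh, and supervising rendered depth with dense depth derived from that mesh during optimization. The sketch below illustrates both ideas in PyTorch. It is a minimal reconstruction from the abstract's description, not the authors' released code; all function and parameter names (`sample_mesh_surface`, `splatting_loss`, `depth_weight`) are illustrative assumptions.

```python
# Minimal sketch of two ideas from the abstract, under stated assumptions:
# (1) initialize Gaussian means/colors by sampling the colorized mesh surface;
# (2) add a dense-depth term (rendered vs. mesh-derived depth) to the RGB loss.
import torch

def sample_mesh_surface(vertices, faces, vertex_colors, n_samples):
    """Uniformly sample points (and interpolated colors) on a triangle mesh.

    vertices: (V, 3) float, faces: (F, 3) long, vertex_colors: (V, 3) float.
    Returns sampled points and colors, each (n_samples, 3), usable as the
    initial means and base colors of 3D Gaussians.
    """
    # Pick random faces, then a random barycentric coordinate per sample.
    face_idx = torch.randint(faces.shape[0], (n_samples,))
    tri = vertices[faces[face_idx]]       # (n, 3 verts, 3 xyz)
    col = vertex_colors[faces[face_idx]]  # (n, 3 verts, 3 rgb)
    u, v = torch.rand(n_samples, 1), torch.rand(n_samples, 1)
    flip = (u + v) > 1.0                  # fold samples back into the triangle
    u = torch.where(flip, 1.0 - u, u)
    v = torch.where(flip, 1.0 - v, v)
    w = 1.0 - u - v
    bary = torch.cat([w, u, v], dim=1).unsqueeze(-1)  # (n, 3, 1)
    points = (bary * tri).sum(dim=1)
    colors = (bary * col).sum(dim=1)
    return points, colors

def splatting_loss(rendered_rgb, gt_rgb, rendered_depth, mesh_depth,
                   depth_weight=0.1):
    """Photometric L1 plus dense depth supervision from the mesh.

    mesh_depth is assumed to be a per-pixel depth map obtained from the
    colorized mesh (e.g., by rasterizing it into the training view), with
    zeros where the mesh gives no coverage.
    """
    rgb_loss = (rendered_rgb - gt_rgb).abs().mean()
    valid = mesh_depth > 0  # supervise only pixels the mesh actually covers
    depth_loss = (rendered_depth[valid] - mesh_depth[valid]).abs().mean()
    return rgb_loss + depth_weight * depth_loss
```

In this reading, the mesh plays both roles the abstract assigns to it: its surface seeds the Gaussians with more complete shape and color than raw LiDAR points, and its rasterized depth provides the dense geometric supervision during training. The octree implicit features mentioned in the abstract (retrieved per Gaussian to add context) are omitted here for brevity.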