Text2LiDAR: Text-Guided LiDAR Point Cloud Generation via Equirectangular Transformer

Wu, Yang; Zhang, Kaihua; Qian, Jianjun; Xie, Jin; Yang, Jian

doi:10.1007/978-3-031-72992-8_17

Yang Wu¹³,
Kaihua Zhang¹⁶,
Jianjun Qian¹³,
Jin Xie^14,15 &
…
Jian Yang¹³

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 15114))

Included in the following conference series:

European Conference on Computer Vision

363 Accesses

Abstract

The complex traffic environment and various weather conditions make the collection of LiDAR data expensive and challenging. Achieving high-quality and controllable LiDAR data generation is urgently needed, controlling with text is a common practice, but there is little research in this field. To this end, we propose Text2LiDAR, the first efficient, diverse, and text-controllable LiDAR data generation model. Specifically, we design an equirectangular transformer architecture, utilizing the designed equirectangular attention to capture LiDAR features in a manner with data characteristics. Then, we design a control-signal embedding injector to efficiently integrate control signals through the global-to-focused attention mechanism. Additionally, we devise a frequency modulator to assist the model in recovering high-frequency details, ensuring the clarity of the generated point cloud. To foster development in the field and optimize text-controlled generation performance, we construct nuLiDARtext which offers diverse text descriptors for 34,149 LiDAR point clouds from 850 scenes. Experiments on uncontrolled and text-controlled generation in various forms on KITTI-360 and nuScenes datasets demonstrate the superiority of our approach. The project can be found at https://github.com/wuyang98/Text2LiDAR.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 64.99; Price excludes VAT (USA)

Softcover Book: USD 79.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Learning to Generate Realistic LiDAR Point Clouds

RangeLDM: Fast Realistic LiDAR Point Cloud Generation

Rethinking LiDAR Domain Generalization: Single Source as Multiple Density Domains

References

Achlioptas, P., Diamanti, O., Mitliagkas, I., Guibas, L.: Learning representations and generative models for 3D point clouds. In: International Conference on Machine Learning, pp. 40–49. PMLR (2018)
Google Scholar
Bakhshi, R., Sandborn, P.: Maximizing the returns of LIDAR systems in wind farms for yaw error correction applications. Wind Energy 23(6), 1408–1421 (2020)
Article Google Scholar
Behley, J., et al.: SemanticKITTI: a dataset for semantic scene understanding of LiDAR sequences. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)
Google Scholar
Caccia, L., Van Hoof, H., Courville, A., Pineau, J.: Deep generative modeling of LiDAR data. In: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5034–5040. IEEE (2019)
Google Scholar
Caesar, H., et al.: nuScenes: a multimodal dataset for autonomous driving. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11621–11631 (2020)
Google Scholar
Chai, Y., et al.: To the point: efficient 3D object detection in the range image with graph convolution kernels. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16000–16009 (2021)
Google Scholar
Chen, K., Choy, C.B., Savva, M., Chang, A.X., Funkhouser, T., Savarese, S.: Text2Shape: generating shapes from natural language by learning joint embeddings. In: Jawahar, C.V., Li, H., Mori, G., Schindler, K. (eds.) ACCV 2018, Part III. LNCS, vol. 11363, pp. 100–116. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-20893-6_7
Chapter Google Scholar
Chen, R., Chen, Y., Jiao, N., Jia, K.: Fantasia3D: disentangling geometry and appearance for high-quality text-to-3D content creation. arXiv preprint arXiv:2303.13873 (2023)
Chen, Z., Wang, F., Liu, H.: Text-to-3D using gaussian splatting. arXiv preprint arXiv:2309.16585 (2023)
Cho, J., Zala, A., Bansal, M.: DALL-Eval: probing the reasoning skills and social biases of text-to-image generation models. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3043–3054 (2023)
Google Scholar
Crowson, K., et al.: VQGAN-CLIP: open domain image generation and editing with natural language guidance. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022. LNCS, vol. 13697, pp. 88–105. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19836-6_6
Chapter Google Scholar
Cui, C., et al.: A survey on multimodal large language models for autonomous driving. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 958–979 (2024)
Google Scholar
Deliry, S.I., Avdan, U.: Accuracy of unmanned aerial systems photogrammetry and structure from motion in surveying and mapping: a review. J. Indian Soc. Remote Sens. 49(8), 1997–2017 (2021)
Article Google Scholar
Dosovitskiy, A., Ros, G., Codevilla, F., Lopez, A., Koltun, V.: CARLA: an open urban driving simulator. In: Conference on Robot Learning, pp. 1–16. PMLR (2017)
Google Scholar
Dreissig, M., Scheuble, D., Piewak, F., Boedecker, J.: Survey on LiDAR perception in adverse weather conditions. arXiv preprint arXiv:2304.06312 (2023)
Fu, M., Liu, H., Yu, Y., Chen, J., Wang, K.: DW-GAN: a discrete wavelet transform GAN for nonhomogeneous dehazing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 203–212 (2021)
Google Scholar
Ge, S., Park, T., Zhu, J.Y., Huang, J.B.: Expressive text-to-image generation with rich text. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7545–7556 (2023)
Google Scholar
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3354–3361. IEEE (2012)
Google Scholar
Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems, vol. 27 (2014)
Google Scholar
Gulino, C., et al.: Waymax: an accelerated, data-driven simulator for large-scale autonomous driving research. In: Advances in Neural Information Processing Systems, vol. 36 (2024)
Google Scholar
Gupta, A., Dollar, P., Girshick, R.: LVIS: a dataset for large vocabulary instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5356–5364 (2019)
Google Scholar
Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)
Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: Advances in Neural Information Processing Systems, vol. 33, pp. 6840–6851 (2020)
Google Scholar
Hu, E.J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., Chen, W.: Lora: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685 (2021)
Hui, L., Xu, R., Xie, J., Qian, J., Yang, J.: Progressive point cloud deconvolution generation network. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020, Part XV. LNCS, vol. 12360, pp. 397–413. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58555-6_24
Chapter Google Scholar
Janai, J., Güney, F., Behl, A., Geiger, A., et al.: Computer vision for autonomous vehicles: problems, datasets and state of the art. Found. Trends® Comput. Graph. Vis. 12(1–3), 1–308 (2020)
Article Google Scholar
Kasten, Y., Rahamim, O., Chechik, G.: Point cloud completion with pretrained text-to-image diffusion models. In: Advances in Neural Information Processing Systems, vol. 36 (2024)
Google Scholar
Kerbl, B., Kopanas, G., Leimkühler, T., Drettakis, G.: 3D Gaussian splatting for real-time radiance field rendering. ACM Trans. Graph. 42(4), 139-1 (2023)
Google Scholar
Kim, Y., Lee, J., Kim, J.H., Ha, J.W., Zhu, J.Y.: Dense text-to-image generation with attention modulation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7701–7711 (2023)
Google Scholar
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Kingma, D.P., Welling, M.: Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2013)
Klokov, R., Boyer, E., Verbeek, J.: Discrete point flow networks for efficient point cloud generation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12368, pp. 694–710. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58592-1_41
Chapter Google Scholar
Kong, L., et al.: Robo3D: towards robust and reliable 3D perception against corruptions. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 19994–20006 (2023)
Google Scholar
Kuo, W., Cui, Y., Gu, X., Piergiovanni, A., Angelova, A.: F-VLM: open-vocabulary object detection upon frozen vision and language models. arXiv preprint arXiv:2209.15639 (2022)
Li, Z., et al.: PromptKD: unsupervised prompt distillation for vision-language models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 26617–26626 (2024)
Google Scholar
Li, Z., et al.: Curriculum temperature for knowledge distillation. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 1504–1512 (2023)
Google Scholar
Liao, Y., Xie, J., Geiger, A.: KITTI-360: a novel dataset and benchmarks for urban scene understanding in 2D and 3D. IEEE Trans. Pattern Anal. Mach. Intell. 45(3), 3292–3310 (2022)
Article Google Scholar
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part V. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Chapter Google Scholar
Liu, Z., Wang, Y., Qi, X., Fu, C.W.: Towards implicit text-guided 3D shape generation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 17896–17906 (2022)
Google Scholar
Lugmayr, A., Danelljan, M., Romero, A., Yu, F., Timofte, R., Van Gool, L.: RePaint: inpainting using denoising diffusion probabilistic models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11461–11471 (2022)
Google Scholar
Luo, S., Hu, W.: Diffusion probabilistic models for 3D point cloud generation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2837–2845 (2021)
Google Scholar
Manivasagam, S., et al.: LiDARsim: realistic LiDAR simulation by leveraging the real world. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11167–11176 (2020)
Google Scholar
Meyer, G.P., Laddha, A., Kee, E., Vallespi-Gonzalez, C., Wellington, C.K.: LaserNet: an efficient probabilistic 3D object detector for autonomous driving. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12677–12686 (2019)
Google Scholar
Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: NeRF: representing scenes as neural radiance fields for view synthesis. Commun. ACM 65(1), 99–106 (2021)
Article Google Scholar
Milioto, A., Vizzo, I., Behley, J., Stachniss, C.: RangeNet++: fast and accurate LiDAR semantic segmentation. In: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4213–4220. IEEE (2019)
Google Scholar
Mohsan, S.A.H., Othman, N.Q.H., Li, Y., Alsharif, M.H., Khan, M.A.: Unmanned aerial vehicles (UAVs): practical aspects, applications, open challenges, security issues, and future trends. Intell. Serv. Robot. 16(1), 109–137 (2023)
Google Scholar
Nakashima, K., Iwashita, Y., Kurazume, R.: Generative range imaging for learning scene priors of 3D LiDAR data. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1256–1266 (2023)
Google Scholar
Nakashima, K., Kurazume, R.: Learning to drop points for LiDAR scan synthesis. In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 222–229. IEEE (2021)
Google Scholar
Nakashima, K., Kurazume, R.: LiDAR data synthesis with denoising diffusion probabilistic models. arXiv preprint arXiv:2309.09256 (2023)
Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
Google Scholar
Piroli, A., Dallabetta, V., Kopp, J., Walessa, M., Meissner, D., Dietmayer, K.: Energy-based detection of adverse weather effects in LiDAR data. IEEE Robot. Autom. Lett. 8(7), 4322–4329 (2023)
Article Google Scholar
Poole, B., Jain, A., Barron, J.T., Mildenhall, B.: DreamFusion: Text-to-3D using 2D diffusion. arXiv preprint arXiv:2209.14988 (2022)
Qi, C.R., Su, H., Mo, K., Guibas, L.J.: PointNet: deep learning on point sets for 3D classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660 (2017)
Google Scholar
Radford, A., et al.: Learning transferable visual models from natural language supervision. In: International Conference on Machine Learning, pp. 8748–8763. PMLR (2021)
Google Scholar
Ramesh, A., Dhariwal, P., Nichol, A., Chu, C., Chen, M.: Hierarchical text-conditional image generation with CLIP latents. arXiv preprint arXiv:2204.06125 (2022)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015, Part III. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Chapter Google Scholar
Saharia, C., et al.: Photorealistic text-to-image diffusion models with deep language understanding. In: Advances in Neural Information Processing Systems, vol. 35, pp. 36479–36494 (2022)
Google Scholar
Sauer, A., Chitta, K., Müller, J., Geiger, A.: Projected GANs converge faster. In: Advances in Neural Information Processing Systems, vol. 34, pp. 17480–17492 (2021)
Google Scholar
Schubert, S., Neubert, P., Pöschmann, J., Protzel, P.: Circular convolutional neural networks for panoramic images and laser data. In: 2019 IEEE Intelligent Vehicles Symposium (IV), pp. 653–660. IEEE (2019)
Google Scholar
Schuhmann, C., et al.: LAION-5B: an open large-scale dataset for training next generation image-text models. In: Advances in Neural Information Processing Systems, vol. 35, pp. 25278–25294 (2022)
Google Scholar
Song, Y., Ermon, S.: Generative modeling by estimating gradients of the data distribution. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
Google Scholar
Tancik, M., et al.: Fourier features let networks learn high frequency functions in low dimensional domains. In: Advances in Neural Information Processing Systems, vol. 33, pp. 7537–7547 (2020)
Google Scholar
Tyszkiewicz, M.J., Fua, P., Trulls, E.: GECCO: geometrically-conditioned point diffusion models. arXiv preprint arXiv:2303.05916 (2023)
Valsesia, D., Fracastoro, G., Magli, E.: Learning localized generative models for 3D point clouds via graph convolution. In: International Conference on Learning Representations (2018)
Google Scholar
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Google Scholar
Wang, J., Yuan, H., Chen, D., Zhang, Y., Wang, X., Zhang, S.: ModelScope text-to-video technical report. arXiv preprint arXiv:2308.06571 (2023)
Wang, Y., et al.: Multi-modal 3D object detection in autonomous driving: a survey. Int. J. Comput. Vis. 131(8), 2122–2152 (2023)
Article Google Scholar
Wang, Z., Liu, W., He, Q., Wu, X., Yi, Z.: CLIP-GEN: language-free training of a text-to-image generator with CLIP. arXiv preprint arXiv:2203.00386 (2022)
Wang, Z.J., Montoya, E., Munechika, D., Yang, H., Hoover, B., Chau, D.H.: DiffusionDB: a large-scale prompt gallery dataset for text-to-image generative models. arXiv preprint arXiv:2210.14896 (2022)
Wen, C., Yu, B., Tao, D.: Learning progressive point embeddings for 3D point cloud generation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10266–10275 (2021)
Google Scholar
Wu, J.Z., et al.: Tune-A-Video: one-shot tuning of image diffusion models for text-to-video generation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7623–7633 (2023)
Google Scholar
Wu, L., et al.: Fast point cloud generation with straight flows. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9445–9454 (2023)
Google Scholar
Wu, Z., Wang, Y., Feng, M., Xie, H., Mian, A.: Sketch and text guided diffusion model for colored point cloud generation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 8929–8939 (2023)
Google Scholar
Xiang, P., et al.: Snowflake point deconvolution for point cloud completion and generation with skip-transformer. IEEE Trans. Pattern Anal. Mach. Intell. 45(5), 6320–6338 (2022)
Google Scholar
Xu, J., et al.: ImageReward: learning and evaluating human preferences for text-to-image generation. In: Advances in Neural Information Processing Systems, vol. 36 (2024)
Google Scholar
Xu, Z., Xing, S., Sangineto, E., Sebe, N.: SpectralCLIP: preventing artifacts in text-guided style transfer from a spectral perspective. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 5121–5130 (2024)
Google Scholar
Xue, M., He, J., He, Y., Liu, Z., Wang, W., Zhou, M.: Low-light image enhancement via CLIP-Fourier guided wavelet diffusion. arXiv preprint arXiv:2401.03788 (2024)
Yan, Z., Li, X., Wang, K., Zhang, Z., Li, J., Yang, J.: Multi-modal masked pre-training for monocular panoramic depth completion. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022. LNCS, vol. 13661, pp. 378–395. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19769-7_22
Chapter Google Scholar
Yan, Z., et al.: Tri-perspective view decomposition for geometry-aware depth completion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4874–4884 (2024)
Google Scholar
Yan, Z., Wang, K., Li, X., Zhang, Z., Li, J., Yang, J.: RigNet: repetitive image guided network for depth completion. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022. LNCS, vol. 13687, pp. 214–230. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19812-0_13
Chapter Google Scholar
Yang, G., Huang, X., Hao, Z., Liu, M.Y., Belongie, S., Hariharan, B.: PointFlow: 3D point cloud generation with continuous normalizing flows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4541–4550 (2019)
Google Scholar
Yang, X., Zhou, D., Feng, J., Wang, X.: Diffusion probabilistic model made slim. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 22552–22562 (2023)
Google Scholar
Yin, H., Lin, Z., Yeoh, J.K.: Semantic localization on BIM-generated maps using a 3D LiDAR sensor. Autom. Constr. 146, 104641 (2023)
Article Google Scholar
Yuan, L., et al.: Tokens-to-Token ViT: training vision transformers from scratch on ImageNet. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 558–567 (2021)
Google Scholar
Zamorski, M., et al.: Adversarial autoencoders for compact representations of 3D point clouds. Comput. Vis. Image Underst. 193, 102921 (2020)
Article Google Scholar
Zhou, Y., et al.: Towards language-free training for text-to-image generation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 17907–17917 (2022)
Google Scholar
Zou, Q., Sun, Q., Chen, L., Nie, B., Li, Q.: A comparative analysis of LiDAR SLAM-based indoor navigation for autonomous vehicles. IEEE Trans. Intell. Transp. Syst. 23(7), 6907–6921 (2021)
Article Google Scholar
Zyrianov, V., Zhu, X., Wang, S.: Learning to generate realistic LiDAR point clouds. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022. LNCS, vol. 13683, pp. 17–35. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-20050-2_2
Chapter Google Scholar

Download references

Acknowledgements

We are very grateful for the reviewers’ critical and constructive comments. This work was supported in part by the NSFC under Grant No. 62276141, 62176124, 62276144, 62361166670.

Author information

Authors and Affiliations

PCA Lab, Nanjing University of Science and Technology, Nanjing, China
Yang Wu, Jianjun Qian & Jian Yang
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Jin Xie
School of Intelligence Science and Technology, Nanjing University, Suzhou, China
Jin Xie
B-DAT and CICAEET, Nanjing University of Information Science and Technology, Nanjing, China
Kaihua Zhang

Authors

Yang Wu
View author publications
You can also search for this author in PubMed Google Scholar
Kaihua Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Jianjun Qian
View author publications
You can also search for this author in PubMed Google Scholar
Jin Xie
View author publications
You can also search for this author in PubMed Google Scholar
Jian Yang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jin Xie .

Editor information

Editors and Affiliations

University of Birmingham, Birmingham, UK
Aleš Leonardis
University of Trento, Trento, Italy
Elisa Ricci
Technical University of Darmstadt, Darmstadt, Hessen, Germany
Stefan Roth
Princeton University, Palo Alto, CA, USA
Olga Russakovsky
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
École des Ponts ParisTech, Marne-la-Vallée, France
Gül Varol

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 4242 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Wu, Y., Zhang, K., Qian, J., Xie, J., Yang, J. (2025). Text2LiDAR: Text-Guided LiDAR Point Cloud Generation via Equirectangular Transformer. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15114. Springer, Cham. https://doi.org/10.1007/978-3-031-72992-8_17

Download citation

DOI: https://doi.org/10.1007/978-3-031-72992-8_17
Published: 30 October 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72991-1
Online ISBN: 978-3-031-72992-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Text2LiDAR: Text-Guided LiDAR Point Cloud Generation via Equirectangular Transformer