Abstract
Progress in 3D computer vision tasks demands a huge amount of data, yet annotating multi-view images with 3D-consistent annotations, or point clouds with part segmentations, is both time-consuming and challenging. This paper introduces DatasetNeRF, a novel approach capable of generating infinite, high-quality, 3D-consistent 2D annotations alongside 3D point cloud segmentations, while requiring only minimal 2D human-labeled annotations. Specifically, we leverage the semantic prior within a 3D generative model to train a semantic decoder from only a handful of fine-grained labeled samples. Once trained, the decoder generalizes across the latent space, enabling the generation of effectively unlimited data. The generated data is applicable across various computer vision tasks, including video segmentation and 3D point cloud segmentation in both synthetic and real-world scenarios. Our approach not only surpasses baseline models in segmentation quality, achieving superior 3D consistency and segmentation precision on individual images, but also demonstrates versatility by being applicable to both articulated and non-articulated generative models. Furthermore, we explore applications stemming from our approach, such as 3D-aware semantic editing and 3D inversion. Code can be found at /GenIntel/DatasetNeRF.
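The pipeline sketched in the abstract (a frozen generative model supplies per-pixel features, a small semantic decoder is fitted on a handful of annotated samples, and new labelled data is then produced by sampling fresh latents) can be illustrated with a toy sketch. Everything here is a hypothetical stand-in, not the paper's architecture: the "generator" is a fixed random projection rather than a generative radiance field, the "annotator" is a hidden linear rule, and the decoder is plain softmax regression.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT, N_PIX, FEAT, N_CLS = 8, 64, 16, 3

# Frozen "generator": maps a latent code to a per-pixel feature map.
# (In DatasetNeRF these would be volume-rendered features of a 3D GAN;
# here it is a fixed random tensor so the example stays self-contained.)
G = rng.normal(size=(LATENT, N_PIX, FEAT))

def render_features(z):
    return np.tanh(np.einsum("l,lpf->pf", z, G))  # (N_PIX, FEAT)

# Hidden ground-truth labelling rule, standing in for a human annotator.
W_true = rng.normal(size=(FEAT, N_CLS))

def annotate(feats):
    return np.argmax(feats @ W_true, axis=1)      # (N_PIX,)

# 1) Collect a handful of "human-labelled" samples (4 latents here).
train_z = rng.normal(size=(4, LATENT))
X = np.concatenate([render_features(z) for z in train_z])
y = np.concatenate([annotate(render_features(z)) for z in train_z])

# 2) Fit a small softmax decoder on the frozen generator features.
W = np.zeros((FEAT, N_CLS))
onehot = np.eye(N_CLS)[y]
for _ in range(2000):
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    W -= 0.5 * X.T @ (p - onehot) / len(X)        # gradient step on CE loss

# 3) "Data factory": sample fresh latents, emit (features, predicted labels).
def generate_labelled_sample():
    feats = render_features(rng.normal(size=LATENT))
    return feats, np.argmax(feats @ W, axis=1)

feats, labels = generate_labelled_sample()
acc = np.mean(labels == annotate(feats))
print(f"decoder agreement with the hidden oracle on a fresh sample: {acc:.2f}")
```

Because the decoder operates on the generator's features rather than on pixels, it transfers to any latent code, which is what lets a few annotated samples seed an arbitrarily large labelled dataset.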
Acknowledgements
Adam Kortylewski gratefully acknowledges support for his Emmy Noether Research Group, funded by the German Research Foundation (DFG) under Grant No. 468670075.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chi, Y., Zhan, F., Wu, S., Theobalt, C., Kortylewski, A. (2025). DatasetNeRF: Efficient 3D-Aware Data Factory with Generative Radiance Fields. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15120. Springer, Cham. https://doi.org/10.1007/978-3-031-73033-7_20
DOI: https://doi.org/10.1007/978-3-031-73033-7_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73032-0
Online ISBN: 978-3-031-73033-7
eBook Packages: Computer Science (R0)