Learning 3D Semantic Reconstruction on Octrees

  • Conference paper
Pattern Recognition (DAGM GCPR 2019)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11824)

Abstract

We present a fully convolutional neural network that jointly predicts a semantic 3D reconstruction of a scene as well as a corresponding octree representation. This approach leverages the efficiency of an octree data structure to improve the capacities of volumetric semantic 3D reconstruction methods, especially in terms of scalability. At every octree level, the network predicts a semantic class for every voxel and decides which voxels should be further split in order to refine the reconstruction, thus working in a coarse-to-fine manner. The semantic prediction part of our method builds on recent work that combines traditional variational optimization and neural networks. In contrast to previous networks that work on dense voxel grids, our network is much more efficient in terms of memory consumption and inference, while achieving similar reconstruction performance. This allows for high-resolution reconstruction when memory is limited. We perform experiments on the SUNCG and ScanNetv2 datasets, on which our network shows reconstruction results comparable to those of the corresponding dense network while consuming less memory.
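
The coarse-to-fine procedure described in the abstract (label every voxel at a level, then split only the voxels that need refinement) can be illustrated with a minimal sketch. This is not the authors' network: `coarse_to_fine`, `sphere_predict`, and the parameters `root_res`, `levels`, and `radius` are hypothetical names introduced here, and a hand-written geometric rule stands in for the learned label and split predictions.

```python
import numpy as np


def sphere_predict(centers, size, radius=0.3):
    """Toy stand-in for the network heads (an assumption, not the paper's model).

    Labels a voxel 1 if its center lies inside a sphere around (0.5, 0.5, 0.5)
    and marks it for splitting if the sphere surface may pass through it.
    """
    d = np.linalg.norm(centers - 0.5, axis=1)
    labels = (d < radius).astype(int)
    # The surface can only cross a voxel if it lies within half a diagonal of the center.
    split = np.abs(d - radius) <= size * np.sqrt(3) / 2
    return labels, split


def coarse_to_fine(predict, root_res=2, levels=4):
    """Build octree leaves over the unit cube, refining level by level."""
    size = 1.0 / root_res
    idx = np.stack(
        np.meshgrid(*[np.arange(root_res)] * 3, indexing="ij"), -1
    ).reshape(-1, 3)
    centers = (idx + 0.5) * size  # voxel centers of the coarsest level
    # Child centers sit a quarter of the parent edge length away along each axis.
    offsets = 0.25 * np.array(
        [[dx, dy, dz] for dx in (-1, 1) for dy in (-1, 1) for dz in (-1, 1)]
    )
    leaves = []  # (center, edge length, label) of every voxel that is not split
    for level in range(levels):
        if len(centers) == 0:
            break
        labels, split = predict(centers, size)
        if level == levels - 1:
            split = np.zeros_like(split)  # finest level: nothing is split further
        for c, l in zip(centers[~split], labels[~split]):
            leaves.append((c, size, l))
        # Replace every split voxel by its eight children at half the edge length.
        parents = centers[split]
        centers = (parents[:, None, :] + offsets[None] * size).reshape(-1, 3)
        size /= 2
    return leaves


if __name__ == "__main__":
    leaves = coarse_to_fine(sphere_predict, root_res=2, levels=4)
    dense = (2 * 2 ** 3) ** 3  # voxel count of a dense grid at the finest level
    print(f"octree leaves: {len(leaves)}, dense voxels: {dense}")
```

In this toy setting only voxels near the sphere surface reach the finest level, so the number of stored leaves stays below the dense voxel count at the same resolution. That is the source of the memory saving the abstract refers to, although the actual savings reported in the paper depend on the scenes and the learned split decisions.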

Acknowledgements

This research was partially supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior/Interior Business Center (DOI/IBC) contract number D17PC00280. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DOI/IBC, or the U.S. Government.

Corresponding author

Correspondence to Xiaojuan Wang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, X., Oswald, M.R., Cherabier, I., Pollefeys, M. (2019). Learning 3D Semantic Reconstruction on Octrees. In: Fink, G., Frintrop, S., Jiang, X. (eds) Pattern Recognition. DAGM GCPR 2019. Lecture Notes in Computer Science, vol 11824. Springer, Cham. https://doi.org/10.1007/978-3-030-33676-9_41

  • DOI: https://doi.org/10.1007/978-3-030-33676-9_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33675-2

  • Online ISBN: 978-3-030-33676-9

  • eBook Packages: Computer Science, Computer Science (R0)
