Learning 3D Semantic Reconstruction on Octrees

Wang, Xiaojuan; Oswald, Martin R.; Cherabier, Ian; Pollefeys, Marc

doi:10.1007/978-3-030-33676-9_41


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11824)

Included in the following conference series:

German Conference on Pattern Recognition


Abstract

We present a fully convolutional neural network that jointly predicts a semantic 3D reconstruction of a scene and a corresponding octree representation. The approach leverages the efficiency of the octree data structure to improve the capabilities of volumetric semantic 3D reconstruction methods, especially their scalability. At every octree level, the network predicts a semantic class for every voxel and decides which voxels should be split further to refine the reconstruction, thus working in a coarse-to-fine manner. The semantic prediction part of our method builds on recent work that combines traditional variational optimization and neural networks. In contrast to previous networks that operate on dense voxel grids, our network is much more efficient in both memory consumption and inference, while achieving similar reconstruction performance. This allows for high-resolution reconstruction when memory is limited. We perform experiments on the SUNCG and ScanNetv2 datasets, on which our network shows reconstruction results comparable to those of the corresponding dense network while consuming less memory.
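
The coarse-to-fine scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' network: the function `refine`, the callback `predict`, and the toy rule `toy_predict` are hypothetical names introduced here, and the learned per-voxel prediction is replaced by a hand-written rule. The sketch only shows the control flow (label every active voxel at a level, then split the flagged voxels into eight children) and why this visits fewer voxels than a dense grid of the same resolution.

```python
from typing import Callable, Dict, List, Tuple

Voxel = Tuple[int, int, int]  # integer voxel coordinates at one octree level


def refine(
    predict: Callable[[int, Voxel], Tuple[int, bool]],
    max_level: int,
) -> Dict[int, Dict[Voxel, int]]:
    """Coarse-to-fine traversal: label each active voxel, split flagged ones.

    `predict` is a stand-in (assumption) for the network's per-voxel output:
    it returns (semantic_label, should_split) for a voxel at a given level.
    """
    labels: Dict[int, Dict[Voxel, int]] = {}
    active: List[Voxel] = [(0, 0, 0)]  # single root voxel at level 0
    for level in range(max_level + 1):
        labels[level] = {}
        next_active: List[Voxel] = []
        for (x, y, z) in active:
            label, split = predict(level, (x, y, z))
            labels[level][(x, y, z)] = label
            if split and level < max_level:
                # Each split voxel yields 8 children at the next finer level.
                next_active.extend(
                    (2 * x + dx, 2 * y + dy, 2 * z + dz)
                    for dx in (0, 1)
                    for dy in (0, 1)
                    for dz in (0, 1)
                )
        active = next_active
    return labels


def toy_predict(level: int, voxel: Voxel) -> Tuple[int, bool]:
    """Toy rule (assumption): a 'surface' lies along the x == 0 slab.

    Voxels touching it get label 1 and are split; all others get label 0.
    """
    on_surface = voxel[0] == 0
    return (1 if on_surface else 0, on_surface)


max_level = 3
labels = refine(toy_predict, max_level)
counts = [len(labels[level]) for level in range(max_level + 1)]
dense_finest = (2 ** max_level) ** 3  # voxels in a dense grid at the finest level
print("voxels visited per level:", counts)
print("finest-level voxels, octree vs dense:", counts[-1], "vs", dense_finest)
```

In this toy setting only the voxels near the assumed surface are refined, so the finest level holds a fraction of the voxels a dense grid would need. The actual savings reported in the paper depend on the scene and on the learned split decisions, which this sketch does not model.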



Acknowledgements

This research was partially supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior/Interior Business Center (DOI/IBC) contract number D17PC00280. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DOI/IBC, or the U.S. Government.

Author information

Authors and Affiliations

ETH Zurich, Zürich, Switzerland
Xiaojuan Wang, Martin R. Oswald, Ian Cherabier & Marc Pollefeys
Microsoft, Redmond, USA
Marc Pollefeys

Corresponding author

Correspondence to Xiaojuan Wang.

Editor information

Editors and Affiliations

TU Dortmund University, Dortmund, Germany
Gernot A. Fink
University of Hamburg, Hamburg, Germany
Simone Frintrop
University of Münster, Münster, Germany
Xiaoyi Jiang


About this paper

Cite this paper

Wang, X., Oswald, M.R., Cherabier, I., Pollefeys, M. (2019). Learning 3D Semantic Reconstruction on Octrees. In: Fink, G., Frintrop, S., Jiang, X. (eds) Pattern Recognition. DAGM GCPR 2019. Lecture Notes in Computer Science, vol 11824. Springer, Cham. https://doi.org/10.1007/978-3-030-33676-9_41


DOI: https://doi.org/10.1007/978-3-030-33676-9_41
Published: 25 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33675-2
Online ISBN: 978-3-030-33676-9
eBook Packages: Computer Science, Computer Science (R0)
