Abstract
We present 3D CoMPaT, a richly annotated large-scale dataset of more than 7.19 million rendered compositions of Materials on Parts of 7,262 unique 3D models, with 990 compositions per model on average. 3D CoMPaT covers 43 shape categories, 235 unique part names, and 167 unique material classes that can be applied to parts of 3D objects. Each object with its applied part-material composition is rendered from four equally spaced views as well as four randomized views, for a total of 58 million renderings (7.19 million compositions × 8 views). The dataset primarily focuses on stylizing 3D shapes at the part level with compatible materials. We introduce a new task, called Grounded CoMPaT Recognition (GCR), to collectively recognize and ground compositions of materials on parts of 3D objects. We present two variations of this task and adapt state-of-the-art 2D/3D deep learning methods as baselines for future research. We hope our work will help ease future research on compositional 3D vision. The dataset and code are publicly available at https://www.3dcompat-dataset.org/.
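The headline numbers above follow from simple arithmetic; the sketch below is an illustrative consistency check using only figures quoted in the abstract (the variable names are ours, not part of the dataset's API):

```python
# Headline statistics quoted in the 3D CoMPaT abstract (illustrative check only).
NUM_MODELS = 7_262                 # unique 3D models
NUM_COMPOSITIONS = 7_190_000       # "more than 7.19 million" part-material compositions
VIEWS_PER_COMPOSITION = 4 + 4      # four equally spaced + four randomized views

# ~990 compositions per model on average, as stated.
compositions_per_model = NUM_COMPOSITIONS / NUM_MODELS

# 7.19M compositions x 8 views ~= 57.5M renderings, reported as 58 million.
total_renderings = NUM_COMPOSITIONS * VIEWS_PER_COMPOSITION

print(f"compositions per model: {compositions_per_model:.0f}")
print(f"total renderings: {total_renderings / 1e6:.1f} million")
```

Note that 7.19 million is a lower bound ("more than"), so 7.19M × 8 ≈ 57.5M is consistent with the rounded figure of 58 million renderings.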
Y. Li, U. Upadhyay, and H. Slim contributed equally (co-first authors).
Acknowledgments
The authors thank the participants from Poly9 Inc. for their hard work, without which this work would not have been possible. This research is supported by King Abdullah University of Science and Technology (KAUST).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Li, Y. et al. (2022). 3D CoMPaT: Composition of Materials on Parts of 3D Things. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13668. Springer, Cham. https://doi.org/10.1007/978-3-031-20074-8_7
Print ISBN: 978-3-031-20073-1
Online ISBN: 978-3-031-20074-8