
3D CoMPaT: Composition of Materials on Parts of 3D Things

Conference paper, in Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

We present 3D CoMPaT, a richly annotated large-scale dataset of more than 7.19 million rendered compositions of Materials on Parts of 7262 unique 3D Models, with 990 compositions per model on average. 3D CoMPaT covers 43 shape categories, 235 unique part names, and 167 unique material classes that can be applied to parts of 3D objects. Each object with the applied part-material compositions is rendered from four equally spaced views as well as four randomized views, leading to a total of 58 million renderings (7.19 million compositions × 8 views). This dataset focuses primarily on stylizing 3D shapes at the part level with compatible materials. We introduce a new task, called Grounded CoMPaT Recognition (GCR), to collectively recognize and ground compositions of materials on parts of 3D objects. We present two variations of this task and adapt state-of-the-art 2D/3D deep learning methods to solve the problem as baselines for future research. We hope our work will facilitate future research on compositional 3D vision. The dataset and code are publicly available at https://www.3dcompat-dataset.org/.
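The headline statistics above follow from simple counting, and a short sanity check makes the relationships explicit. The Python sketch below is not part of the released dataset code; the variable names are ours, and the only inputs are the figures quoted in the abstract.

```python
# Minimal sanity check (not official 3D CoMPaT code) of the statistics quoted
# in the abstract. All constants are taken directly from the text; the
# rounding to "million" values is our own assumption.

N_MODELS = 7262            # unique 3D models
COMPS_PER_MODEL = 990      # stated average compositions per model
N_COMPOSITIONS = 7.19e6    # stated total part-material compositions
N_VIEWS = 8                # 4 equally spaced + 4 randomized views per composition

# Average compositions per model implied by the stated totals.
implied_avg = N_COMPOSITIONS / N_MODELS
print(f"implied compositions/model: {implied_avg:.0f} (stated: {COMPS_PER_MODEL})")

# Total renderings = compositions x views.
total_renderings = N_COMPOSITIONS * N_VIEWS
print(f"total renderings: {total_renderings / 1e6:.1f} million (stated: ~58 million)")
```

Running this reproduces roughly 990 compositions per model and about 57.5 million renderings, consistent with the "58 million" figure after rounding.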

Y. Li, U. Upadhyay, and H. Slim are co-first authors.



Acknowledgments

The authors wish to thank the Poly9 Inc. participants for their hard work, without which this work would not have been possible. This research is supported by King Abdullah University of Science and Technology (KAUST).

Author information

Corresponding author

Correspondence to Mohamed Elhoseiny.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Li, Y. et al. (2022). 3D CoMPaT: Composition of Materials on Parts of 3D Things. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13668. Springer, Cham. https://doi.org/10.1007/978-3-031-20074-8_7


  • DOI: https://doi.org/10.1007/978-3-031-20074-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20073-1

  • Online ISBN: 978-3-031-20074-8

