Abstract
The availability of ShapeNet, a dataset with vast numbers of 3D objects, has led to the development of successful 3D reconstruction models. However, evaluating these models on datasets that closely resemble ShapeNet is often misleading. We propose a novel benchmark to address this assessment problem. To demonstrate its effectiveness, we selected three state-of-the-art models for comparison: the voxel-based 3D-C2FT and Pix2Vox, and the occupancy-function-based Occupancy Networks. We adapted a novel dataset, 3DCoMPaT++, which offers rich material and part annotations, for the evaluation of 3D reconstructions. We assessed reconstruction performance while changing viewpoints and varying styles in the 2D input images. The results show that the models struggle to adapt to novel settings. We also evaluated the models at the part level to identify the most challenging parts. We propose the Part F1-Score@0.01 metric for evaluation. Our experiments show quantitatively that performance degrades drastically and that the methods perform poorly on finer details and thin parts.
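The abstract does not define Part F1-Score@0.01, so the following is only an illustrative sketch of how a thresholded, part-level F1 score between point sets could be computed. It assumes the common F-score@d convention (precision is the fraction of predicted points within d of the ground truth, recall is the fraction of ground-truth points within d of the prediction), that coordinates are normalized so that 0.01 is a meaningful distance, and that each predicted point inherits the part label of its nearest ground-truth point. The function names and the label-transfer step are hypothetical and are not taken from the paper.

```python
import numpy as np


def nearest_distances(src, dst):
    """For each point in src, Euclidean distance to its nearest point in dst."""
    diff = src[:, None, :] - dst[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def f1_at_threshold(pred, gt, tau=0.01):
    """F1 score between two point sets at distance threshold tau."""
    if len(pred) == 0 or len(gt) == 0:
        return 0.0
    precision = float((nearest_distances(pred, gt) <= tau).mean())
    recall = float((nearest_distances(gt, pred) <= tau).mean())
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def part_f1_at_threshold(pred, gt, gt_part_labels, tau=0.01):
    """Per-part F1 scores, returned as a dict keyed by part label.

    Assumption (not from the paper): the prediction carries no part labels,
    so each predicted point is assigned the label of its nearest
    ground-truth point before the per-part scores are computed.
    """
    diff = pred[:, None, :] - gt[None, :, :]
    nearest_idx = (diff ** 2).sum(axis=-1).argmin(axis=1)
    pred_labels = gt_part_labels[nearest_idx]
    scores = {}
    for part in np.unique(gt_part_labels):
        scores[part.item()] = f1_at_threshold(
            pred[pred_labels == part], gt[gt_part_labels == part], tau
        )
    return scores


# Toy example: two ground-truth points, one per part.
gt_points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
gt_labels = np.array([0, 1])
# The first predicted point is close to part 0; the second misses part 1.
pred_points = np.array([[0.0, 0.0, 0.005], [1.0, 0.0, 0.5]])

overall = f1_at_threshold(pred_points, gt_points)
per_part = part_f1_at_threshold(pred_points, gt_points, gt_labels)
```

The brute-force pairwise distance matrix keeps the sketch dependency-light; for point clouds with many thousands of points, a spatial index such as a k-d tree would be the more practical choice.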
Data Availability
No datasets were generated or analysed during the current study.
Contributions
All authors wrote and reviewed the main manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Kantarcı, M.G., Gökberk, B. & Akarun, L. Unveiling limitations of 3D object reconstruction models through a novel benchmark. SIViP 19, 45 (2025). https://doi.org/10.1007/s11760-024-03663-7
DOI: https://doi.org/10.1007/s11760-024-03663-7