ABSTRACT
Simultaneously recovering the 3D shape of an object and its surface color from a single image remains very challenging. In this paper, we substantially improve Soft Rasterizer, a state-of-the-art method for colored 3D object reconstruction. The model adopts an encoder-decoder structure and takes a single image as input. First, the encoder extracts features, which are sent simultaneously to a shape generator and a color generator to obtain the shape estimate and the corresponding surface color. A differentiable renderer then renders the final colored 3D model. To preserve the details of the reconstructed 3D model, we introduce an attention mechanism into the encoder, which further improves reconstruction quality. For surface color reconstruction, we propose a combination loss. Experimental results show that, compared with the 3D reconstruction networks 3D-R2N2 and OccNet, our model increases intersection-over-union (IoU) by 10% and 3%, respectively. Compared with the open-source project SoftRas_O, our model increases structural similarity (SSIM) by 3.8% and decreases mean square error (MSE) by 1.2%.
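As a point of reference for the metrics named above, the following is a minimal sketch of how voxel IoU and per-pixel MSE are commonly computed. It is not the authors' evaluation code: the function names `voxel_iou` and `mse`, the flat-sequence inputs, and the convention for two empty grids are all assumptions made for illustration.

```python
from typing import Sequence


def voxel_iou(pred: Sequence[bool], target: Sequence[bool]) -> float:
    """Intersection-over-union of two occupancy grids given as flat boolean sequences."""
    if len(pred) != len(target):
        raise ValueError("grids must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    # Assumed convention: two completely empty grids count as a perfect match.
    return intersection / union if union else 1.0


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two images flattened to intensity sequences."""
    if len(pred) != len(target):
        raise ValueError("images must have the same number of pixels")
    if not pred:
        raise ValueError("images must not be empty")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


if __name__ == "__main__":
    pred_grid = [True, True, False, False]
    true_grid = [True, False, True, False]
    # One voxel is shared and three voxels are occupied in at least one grid.
    print(voxel_iou(pred_grid, true_grid))
    print(mse([0.0, 0.5, 1.0], [0.0, 0.0, 1.0]))
```

A higher IoU indicates better shape overlap, while a lower MSE indicates a rendered image closer to the reference, which is why the abstract reports an increase for the former and a decrease for the latter.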