EA-EDNet: encapsulated attention encoder-decoder network for 3D reconstruction in low-light-level environment

  • Regular Paper
  • Published in: Multimedia Systems

Abstract

3D reconstruction via neural networks has attracted considerable attention in recent years. However, existing works perform reconstruction in information-rich environments and do not address the Low-Light-Level (LLL) environment, where information is extremely scarce. Performing 3D reconstruction in such an environment is an urgent requirement in the military, aerospace, and other fields. We therefore introduce an Encapsulated Attention Encoder-Decoder Network (EA-EDNet). It incorporates multiple levels of semantics to fully extract the limited information in images captured in the LLL environment, infers missing morphological data, and strengthens attention on the focused parts. EA-EDNet is trained in a two-stage, coarse-to-fine fashion. We additionally create 3LNet-12, a dataset captured in a realistic LLL environment, and propose an accompanying analysis method for filtering it. In experiments, the proposed method not only outperforms state-of-the-art methods but also produces more finely detailed reconstruction models.
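To make the architectural idea concrete, below is a minimal, purely illustrative PyTorch sketch of an attention-gated encoder-decoder in the spirit the abstract describes: decoder features gate the encoder skip connection so attention concentrates on informative regions. All names (AttentionGate, TinyEncoderDecoder), layer sizes, and the gating scheme are assumptions for illustration, not the EA-EDNet architecture itself.

# Hypothetical sketch of an attention-gated encoder-decoder;
# NOT the EA-EDNet architecture, just the general mechanism.
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Re-weights a skip connection using a gating signal from the decoder."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Sequential(
            nn.Conv2d(channels * 2, channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, skip, gate):
        # Per-pixel attention map in [0, 1], shape (B, 1, H, W)
        attn = self.score(torch.cat([skip, gate], dim=1))
        return skip * attn  # suppress uninformative regions

class TinyEncoderDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True))
        self.down = nn.Conv2d(16, 32, 3, stride=2, padding=1)   # coarse features
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)       # back to skip resolution
        self.gate = AttentionGate(16)
        self.head = nn.Conv2d(16, 1, 1)  # e.g. a depth / occupancy map

    def forward(self, x):
        skip = self.enc(x)                  # high-resolution features
        gate = self.up(self.down(skip))     # semantic gating signal
        fused = self.gate(skip, gate) + gate
        return self.head(fused)

if __name__ == "__main__":
    out = TinyEncoderDecoder()(torch.randn(1, 3, 64, 64))
    print(out.shape)  # torch.Size([1, 1, 64, 64])

A two-stage, coarse-to-fine regime of the kind the abstract mentions would typically train the coarse path first and then fine-tune the full network end to end; the paper's specific schedule is not reproduced here.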


Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.


Acknowledgements

This work was supported by the National Natural Science Foundation of China (NSFC) (62101310) and Natural Science Foundation of Shandong Province, China (ZR2020MF127).

Author information


Contributions

YD: conceptualization, methodology, software, writing, review and editing. LY: visualization, investigation, supervision. XG: data curation, software, validation. HZ: writing, original draft preparation. ZW, GZ: software.

Corresponding author

Correspondence to Liju Yin.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Deng, Y., Yin, L., Gao, X. et al. EA-EDNet: encapsulated attention encoder-decoder network for 3D reconstruction in low-light-level environment. Multimedia Systems 29, 2263–2279 (2023). https://doi.org/10.1007/s00530-023-01100-2

