
Investigation of Single Image Depth Prediction Under Different Lighting Conditions: A Case Study of Ancient Roman Coins

Published: 16 July 2021

Abstract

This article investigates the limitations of single image depth prediction (SIDP) under different lighting conditions and proposes a new approach for obtaining an ideal input condition for SIDP. To satisfy the data requirement, we exploit a photometric stereo dataset consisting of several images of each object under different light properties. Specifically, we use a dataset of ancient Roman coins captured under 54 different lighting conditions, which emulates the varied shading and reflectance commonly found in natural environments. The ground-truth depth in the dataset was obtained with the photometric stereo method and serves as training data. We investigate the ability of three state-of-the-art methods to reconstruct ancient Roman coins under these lighting scenarios. In the first investigation, each network is applied with its previously trained weights to assess cross-domain performance. In the second, each model is fine-tuned from its pre-trained weights using 70% of the ancient Roman coin dataset. Both variants are tested on the remaining 30% of the data, with root mean square error (RMSE) and visual inspection as evaluation metrics. The methods show different characteristics depending on the lighting condition of the test data. Overall, they perform best at light angles of 51° and 71°, referred to hereafter as the ideal condition. They perform worse at 13° and 32° because of the high density of shadows, and they also fail to reach their best performance at 82° because of reflections appearing in the image. Based on these findings, we propose a new approach that reduces shadows and reflections using intrinsic image decomposition to achieve a synthetic ideal condition. Results on the synthetic images show that this approach can enhance the performance of SIDP and, for some state-of-the-art methods, even outperforms the original RGB images.
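The abstract describes two concrete components that lend themselves to a short sketch: the RMSE evaluation metric and the proposed use of intrinsic image decomposition (I ≈ R · S) to produce a shading-reduced "synthetic ideal condition" input for depth prediction. The following Python sketch is not the authors' implementation: `decompose_intrinsics` and `predict_depth` are hypothetical placeholders for a learned decomposition network and an SIDP network; only the RMSE computation and the overall pipeline order follow the abstract.

```python
# Minimal sketch (not the authors' code) of the evaluation pipeline outlined
# in the abstract: decompose the input into reflectance and shading
# (intrinsic image decomposition, I ≈ R * S), use the shading-free
# reflectance as a "synthetic ideal condition" input for depth prediction,
# and score the result against photometric-stereo ground truth with RMSE.
from __future__ import annotations

import numpy as np


def decompose_intrinsics(image: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Hypothetical decomposition stub returning (reflectance, shading) with
    image ≈ reflectance * shading; a real system would use a learned model."""
    shading = image.mean(axis=-1, keepdims=True)        # crude luminance proxy
    reflectance = image / np.clip(shading, 1e-6, None)  # divide out the shading
    return reflectance, shading


def predict_depth(image: np.ndarray) -> np.ndarray:
    """Hypothetical SIDP stub; a real pipeline would call one of the trained
    state-of-the-art depth networks compared in the article."""
    return image.mean(axis=-1)                          # placeholder depth map


def rmse(pred: np.ndarray, gt: np.ndarray) -> float:
    """Root mean square error between predicted and ground-truth depth."""
    return float(np.sqrt(np.mean((pred - gt) ** 2)))


if __name__ == "__main__":
    rgb = np.random.rand(256, 256, 3)       # stand-in for a coin image
    gt_depth = np.random.rand(256, 256)     # stand-in photometric-stereo depth

    reflectance, _ = decompose_intrinsics(rgb)  # synthetic ideal condition
    print("RMSE, original RGB input:   ", rmse(predict_depth(rgb), gt_depth))
    print("RMSE, shading-removed input:", rmse(predict_depth(reflectance), gt_depth))
```

In the article's actual setting, the placeholder models would be replaced by the compared SIDP networks and a learned intrinsic decomposition, with the photometric-stereo depth maps serving as ground truth.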


Cited By

  • (2024) Low-Cost Hand Gesture Control for Swarm Quadrotor using Wearable Device in Indoor Environments. 2024 IEEE 33rd International Symposium on Industrial Electronics (ISIE), 1–6. https://doi.org/10.1109/ISIE54533.2024.10595765. Online publication date: 18-Jun-2024.
  • (2023) Heritage Coin Identification using Convolutional Neural Networks: A Multi-Classification Approach for Numismatic Research. 2023 Second International Conference on Augmented Intelligence and Sustainable Systems (ICAISS), 1–6. https://doi.org/10.1109/ICAISS58487.2023.10250481. Online publication date: 23-Aug-2023.
  • (2023) Digital Restoration of Cultural Heritage With Data-Driven Computing: A Survey. IEEE Access 11, 53939–53977. https://doi.org/10.1109/ACCESS.2023.3280639. Online publication date: 2023.


      Published In

      Journal on Computing and Cultural Heritage, Volume 14, Issue 4
      December 2021
      328 pages
      ISSN: 1556-4673
      EISSN: 1556-4711
      DOI: 10.1145/3476246
      © 2021 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 16 July 2021
      Accepted: 01 May 2021
      Received: 01 November 2020
      Published in JOCCH Volume 14, Issue 4


      Author Tags

      1. Investigation
      2. Roman Coins
      3. depth prediction
      4. different lighting
      5. single-image
      6. state-of-the-arts

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • Indonesian-Austrian Scholarship Program
