Abstract
Single-image depth prediction is difficult because depth cannot be estimated from pixel correspondences. Prior knowledge, such as registered pixel and depth information supplied by the user, is therefore required. A further problem arises when targeting a specific domain, since the number of freely available training datasets is limited. Because of the color problem in relief images, we present a new outdoor Registered Relief Depth (RRD) Prambanan dataset, consisting of outdoor images of Prambanan temple reliefs with registered depth information, supervised by archaeologists and computer scientists. To solve the problem, we also propose a new depth predictor, called Multi-Color Cascade Network (MCCNet), with weight transfer. Applied to the new RRD Prambanan dataset, our method performs better across different materials than the baseline, with an RMSE of 2.53 mm. On the NYU Depth V2 dataset, our method outperforms the baselines and is in line with other state-of-the-art works.
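The abstract reports accuracy as a root-mean-square error (RMSE) in millimetres. The text shown here does not spell out the evaluation protocol, so the following is only a minimal sketch of the standard RMSE definition applied to a predicted and a registered depth map. The function name `rmse_mm`, the nested-list representation, and the optional `valid` mask for pixels without registered depth are illustrative assumptions, not the authors' evaluation code.

```python
import math


def rmse_mm(predicted, ground_truth, valid=None):
    """Standard RMSE between two depth maps (nested lists, values in mm).

    `valid` is a hypothetical optional mask of the same shape; pixels whose
    mask entry is falsy are skipped (an assumption, e.g. for scanner holes).
    """
    total, count = 0.0, 0
    for r, (pred_row, gt_row) in enumerate(zip(predicted, ground_truth)):
        for c, (p, g) in enumerate(zip(pred_row, gt_row)):
            if valid is not None and not valid[r][c]:
                continue  # skip pixels without registered depth
            total += (p - g) ** 2
            count += 1
    if count == 0:
        raise ValueError("no valid pixels to evaluate")
    return math.sqrt(total / count)


# Toy 2x2 depth maps in millimetres (made-up values for illustration only).
pred = [[10.0, 12.0], [11.0, 9.0]]
gt = [[10.0, 10.0], [13.0, 9.0]]
mask = [[True, False], [True, True]]

print(rmse_mm(pred, gt))        # all four pixels
print(rmse_mm(pred, gt, mask))  # top-right pixel excluded
```

Because the error is squared before averaging, a few large per-pixel deviations dominate the score, which is worth keeping in mind when comparing results across materials.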
Notes
1. The dataset can be obtained by sending an email to aufaclav@ugm.ac.id or aufaclav@cvl.tuwien.ac.at.
Acknowledgment
This work is funded by a collaboration scheme between the Ministry of Research and Technology of the Republic of Indonesia and OeAD-GmbH within the Indonesian-Austrian Scholarship Program (IASP). This work is also supported by the Ministry of Education and Culture of the Republic of Indonesia and the Institute for Preservation of Cultural Heritage (BPCB) D.I. Yogyakarta, which granted permission to capture the relief dataset.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Frisky, A.Z.K., Putranto, A., Zambanini, S., Sablatnig, R. (2021). MCCNet: Multi-Color Cascade Network with Weight Transfer for Single Image Depth Prediction on Outdoor Relief Images. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12667. Springer, Cham. https://doi.org/10.1007/978-3-030-68787-8_19
DOI: https://doi.org/10.1007/978-3-030-68787-8_19
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68786-1
Online ISBN: 978-3-030-68787-8
eBook Packages: Computer Science, Computer Science (R0)