Abstract
In this paper, we study realistic bokeh rendering from a single all-in-focus image. Existing computational bokeh rendering methods generate bokeh effects by adding a simple flat background blur. As a result, their renderings differ from the real bokeh produced by DSLR cameras. To address this issue, we propose a multi-stage network that learns shallow depth-of-field from a single bokeh-free image. In particular, our network consists of four modules: defocus estimation, radiance, rendering, and upsampling. The four modules are trained on different sizes to learn global features as well as local details around the boundaries of in-focus objects. Experimental results show that our approach is capable of rendering a pleasing and distinctive bokeh effect in complex scenes.
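To make the order of the four stages concrete, the following is a minimal sketch of the data flow only. It is not the authors' method: in the paper each stage is a learned network module, whereas every function below is a hypothetical hand-written stand-in. The defocus map is supplied by the caller rather than estimated, the radiance step assumes a plain gamma curve (the value 2.2 is an assumption), the rendering step is a toy box-shaped gather blur, and upsampling is nearest-neighbour. Images are small grayscale grids with values in [0, 1].

```python
GAMMA = 2.2  # assumed display gamma (illustrative, not a value from the paper)


def to_radiance(image):
    """Map gamma-encoded intensities in [0, 1] to approximate linear radiance."""
    return [[p ** GAMMA for p in row] for row in image]


def from_radiance(radiance):
    """Map linear radiance back to gamma-encoded intensities."""
    return [[r ** (1.0 / GAMMA) for r in row] for row in radiance]


def render(radiance, defocus, max_radius=2):
    """Toy gather blur: average each pixel over a box whose radius follows defocus."""
    h, w = len(radiance), len(radiance[0])
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            # Defocus in [0, 1] is scaled to an integer blur radius in pixels.
            r = round(defocus[y][x] * max_radius)
            window = [
                radiance[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def upsample(image, factor=2):
    """Nearest-neighbour upsampling by an integer factor."""
    return [
        [p for p in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]


def bokeh_pipeline(image, defocus):
    """Chain the stages: (given) defocus map -> radiance -> rendering -> upsampling."""
    blurred = render(to_radiance(image), defocus)
    return upsample(from_radiance(blurred))
```

Calling `bokeh_pipeline(image, defocus)` with an all-zero defocus map returns the input unchanged apart from the 2x upsampling, while larger defocus values spread each pixel's radiance over a wider window. Averaging in the linear radiance domain rather than on gamma-encoded values is what lets bright highlights stay prominent after blurring.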
X. Luo, J. Peng—Equal contributions.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Grant No. U1913602).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Luo, X., Peng, J., Xian, K., Wu, Z., Cao, Z. (2020). Bokeh Rendering from Defocus Estimation. In: Bartoli, A., Fusiello, A. (eds) Computer Vision – ECCV 2020 Workshops. ECCV 2020. Lecture Notes in Computer Science, vol 12537. Springer, Cham. https://doi.org/10.1007/978-3-030-67070-2_15
DOI: https://doi.org/10.1007/978-3-030-67070-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67069-6
Online ISBN: 978-3-030-67070-2
eBook Packages: Computer Science (R0)