Abstract
Illumination estimation is a highly challenging problem. Many existing methods learn from multiple images via unsupervised learning, but because they train on unrelated images they cannot exploit the information shared between them. In this paper, we propose a new unsupervised method that learns illumination by observing indoor video sequences under changing illumination and learning the intrinsic decomposition of those sequences. This approach lets us train without ground truth (GT) while leveraging the information contained in multiple consecutive frames. Building on these ideas, we propose a new network framework and introduce two new dense spatio-temporal smoothness loss functions: an albedo smoothness loss and an illumination smoothness loss. These losses take full advantage of the information shared between frames to constrain the entire image sequence. In our evaluation, our approach shows good performance on several specific metrics. Experiments show that our method generalizes well and can be easily applied to other classical datasets, including Intrinsic Images in the Wild (IIW) and Shading Annotations in the Wild (SAW).
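To make the idea of dense spatio-temporal smoothness concrete, the sketch below shows one plausible form such losses could take for a predicted albedo/shading decomposition of a video clip. This is a minimal NumPy illustration under assumptions of our own: the function names, the weights `w_a`/`w_s`, and the exact gradient penalties are hypothetical, not the paper's actual loss definitions.

```python
import numpy as np

def spatial_smoothness(x):
    """Mean absolute spatial gradient of a (T, H, W) sequence."""
    dy = np.abs(np.diff(x, axis=1)).mean()
    dx = np.abs(np.diff(x, axis=2)).mean()
    return dy + dx

def temporal_smoothness(x):
    """Mean absolute frame-to-frame difference over the sequence."""
    return np.abs(np.diff(x, axis=0)).mean()

def dense_spatio_temporal_loss(albedo, shading, w_a=1.0, w_s=1.0):
    # Intuition: the albedo of a static indoor scene should stay constant
    # over time (temporal term) and be piecewise-smooth within each frame
    # (spatial term), while shading/illumination should vary smoothly in space.
    loss_albedo = spatial_smoothness(albedo) + temporal_smoothness(albedo)
    loss_shading = spatial_smoothness(shading)
    return w_a * loss_albedo + w_s * loss_shading
```

For a perfectly constant albedo sequence the albedo term vanishes, so only genuine spatial or temporal variation is penalized, which is the sense in which the constraint is "dense" across the whole sequence.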
Availability of data and materials
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
References
Barron, J.T., Malik, J.: Intrinsic scene properties from a single RGB-D image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2013)
Fan, Q., Yang, J., Hua, G., et al.: Revisiting deep intrinsic image decompositions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Garces, E., Munoz, A., Lopez-Moreno, J.: Intrinsic images by clustering. Comput. Gr. Forum 31(4), 1415–1424 (2012)
Gardner, M.A., Sunkavalli, K., Yumer, E., et al.: Learning to predict indoor illumination from a single image. arXiv preprint (2017)
Kovacs, B., Bell, S., Snavely, N., et al.: Shading annotations in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Land, E.: Lightness and retinex theory. J. Opt. Soc. Am. 61(1), 1–11 (1971)
Li, Z., Snavely, N.: CGIntrinsics: better intrinsic image decomposition through physically-based rendering. In: Proceedings of the European Conference on Computer Vision (ECCV). Springer, Cham (2018)
Li, Z., Snavely, N.: Learning intrinsic image decomposition from watching the world. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Li, Z., Shafiei, M., Ramamoorthi, R., et al.: Inverse rendering for complex indoor scenes: shape, spatially-varying lighting and svbrdf from a single image (2019)
Liu, Y., Li, Y., You, S., et al.: Unsupervised learning for intrinsic image decomposition from a single image. In: 2020 IEEE/CVF conference on computer vision and Pattern recognition (CVPR) (2020)
Luo, J., Huang, Z., Li, Y., et al.: Niid-net: Adapting surface normal knowledge for intrinsic image decomposition in indoor scenes. IEEE Trans. Vis. Comput. Gr. 26(12), 3434–3445 (2020)
Ma, W.C., Chu, H., Zhou, B.: Single image intrinsic decomposition without a single intrinsic image. In: Proceedings of the European Conference on Computer Vision (ECCV). Springer, Cham (2018)
Nestmeyer, T., Gehler, P.V.: Reflectance adaptive filtering improves intrinsic image estimation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6789–6798 (2017)
Zhao, Q., Tan, P., Dai, Q., et al.: A closed-form solution to intrinsic image decomposition with retinex and non-local texture constraints. IEEE Trans. Pattern Anal. Mach. Intell. 34(7), 1437–1444 (2012)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, Cham (2015)
Bell, S., Bala, K., Snavely, N.: Intrinsic images in the wild. ACM Trans. Graph. 33(4), 1–12 (2014)
Yu, Y., Smith, W.A.P.: Outdoor inverse rendering from a single image using multiview self-supervision. IEEE Trans. Pattern Anal. Mach. Intell. (2021)
Yu, Y., Smith, W.A.: Inverserendernet: Learning single image inverse rendering. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 3155–3164 (2019)
Zhou, H., Hadap, S.: Deep single-image portrait relighting. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2019)
Zhou, H., Yu, X., Jacobs, D.: Glosh: global-local spherical harmonics for intrinsic image decomposition. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2019)
Zhou, T., Krahenbuhl, P.: Learning data-driven reflectance priors for intrinsic image decomposition. In: 2015 IEEE International Conference on Computer Vision (ICCV) (2015)
Zhuo, H., Chakrabarti, A.: Learning to separate multiple illuminants in a single image. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Acknowledgements
This work was supported by the National Natural Science Foundation of China (62162007) and the Scientific Research Project of the Guizhou University Talents Fund (No. GDRJHZ-2017-31). Special thanks to Dr. ChuHua Huang for his encouragement and guidance during the writing and experiments.
Funding
National Natural Science Foundation of China (62162007) and Scientific Research Project of Guizhou University Talents Fund (No. GDRJHZ-2017-31).
Author information
Authors and Affiliations
Contributions
The idea was proposed by ZZ, who performed the experiments, analysed the data, and wrote the manuscript; CH, RH, YL and YC provided necessary help with this work and put forward valuable opinions.
Corresponding author
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Ethical approval and consent to participate
Not applicable.
Consent for publication
Informed consent was obtained from all individual participants included in the study.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, Z., Huang, C., Huang, R. et al. Illu-NASNet: unsupervised illumination estimation based on dense spatio-temporal smoothness. Multimedia Systems 29, 1453–1462 (2023). https://doi.org/10.1007/s00530-023-01057-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00530-023-01057-2