
RELIEF: Joint Low-Light Image Enhancement and Super-Resolution with Transformers

  • Conference paper

Published in: Image Analysis (SCIA 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13885)


Abstract

The goal of Single-Image Super-Resolution (SISR) is to reconstruct a High-Resolution (HR) version of a degraded Low-Resolution (LR) image. Existing Super-Resolution (SR) methods mostly assume that the LR image is a result of blurring and downsampling the HR image, while in reality LR images are often degraded by additional factors such as low-light, low-contrast, noise, and color distortion. Due to this, current State-of-the-Art (SoTA) SR methods cannot reconstruct real low-light low-resolution images, and a straightforward strategy is, therefore, to first perform Low-Light Enhancement (LLE), followed by SR, using dedicated methods for each task. Unfortunately, this approach leads to poor performance, which motivates us to propose a method for joint LLE and SR. However, since LLE and SR are both ill-posed and ill-conditioned inverse problems, the joint reconstruction task becomes highly challenging, which calls for efficient ways to leverage as much as possible of the available information in the degraded image during reconstruction. In this paper, we propose REsolution and LIght Enhancement transFormer (RELIEF), a novel Transformer-based multi-scale hierarchical encoder-decoder network with efficient cross-shaped attention mechanisms that can extract informative features from large training patches due to its strong long-range dependency modeling capabilities. This in turn leads to significant improvements in reconstruction performance on real Low-Light Low-Resolution (LLLR) images. We evaluate our method on two publicly available datasets and present SoTA results on both.
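To make the cross-shaped attention idea concrete, the following is a minimal numpy sketch of CSWin-style cross-shaped window self-attention: the channels are split in half, one half attends within horizontal stripes and the other within vertical stripes, so the two halves jointly cover the full row and column of each pixel. This is an illustrative sketch only, not the paper's implementation: the function names `stripe_attention` and `cross_shaped_attention` are mine, and identity Q/K/V projections with a single head are assumed to keep the example short.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def stripe_attention(x, stripe_h, stripe_w):
    """Self-attention restricted to non-overlapping stripes of a feature map.

    x: (H, W, C) feature map; stripes of shape (stripe_h, stripe_w) tile it.
    Identity Q/K/V projections keep the sketch minimal.
    """
    H, W, C = x.shape
    out = np.zeros_like(x)
    for i in range(0, H, stripe_h):
        for j in range(0, W, stripe_w):
            win = x[i:i + stripe_h, j:j + stripe_w].reshape(-1, C)  # stripe tokens
            attn = softmax(win @ win.T / np.sqrt(C))                # scaled dot-product
            out[i:i + stripe_h, j:j + stripe_w] = (attn @ win).reshape(
                stripe_h, stripe_w, C)
    return out

def cross_shaped_attention(x, stripe=2):
    """Half the channels attend in horizontal stripes, half in vertical ones."""
    H, W, C = x.shape
    half = C // 2
    out = np.empty_like(x)
    out[..., :half] = stripe_attention(x[..., :half], stripe, W)  # horizontal
    out[..., half:] = stripe_attention(x[..., half:], H, stripe)  # vertical
    return out

x = np.random.rand(8, 8, 4)
y = cross_shaped_attention(x, stripe=2)
print(y.shape)  # (8, 8, 4)
```

Because each stripe spans a full row or column of the feature map, stacking such blocks enlarges the receptive field quickly, which is the property the abstract credits for extracting informative features from large training patches.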


Notes

  1. https://github.com/cszn/KAIR/blob/master/options/swinir/train_swinir_sr_realworld_x4_psnr.json


Author information

Correspondence to Andreas Aakerberg.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Aakerberg, A., Nasrollahi, K., Moeslund, T.B. (2023). RELIEF: Joint Low-Light Image Enhancement and Super-Resolution with Transformers. In: Gade, R., Felsberg, M., Kämäräinen, JK. (eds) Image Analysis. SCIA 2023. Lecture Notes in Computer Science, vol 13885. Springer, Cham. https://doi.org/10.1007/978-3-031-31435-3_11


  • DOI: https://doi.org/10.1007/978-3-031-31435-3_11


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-31434-6

  • Online ISBN: 978-3-031-31435-3

