Abstract
This paper addresses the limitations of adverse-weather image restoration approaches that are trained on synthetic data and then applied to real-world scenarios. We formulate a semi-supervised learning framework that employs vision-language models to improve restoration performance under diverse adverse weather conditions in real-world settings. Specifically, vision-language models assess image clearness and provide semantics on real data, and these assessments serve as supervision signals for training the restoration model. For clearness enhancement, we exploit real-world data through a dual-step strategy that combines pseudo-labels assessed by vision-language models with weather prompt learning. For semantic enhancement, we integrate real-world data by adjusting the weather conditions in vision-language model descriptions while preserving the semantic meaning. In addition, we introduce an effective training strategy to bootstrap restoration performance. Our approach achieves superior results on real-world adverse weather image restoration, as demonstrated by qualitative and quantitative comparisons with state-of-the-art methods. A minimal sketch of the pseudo-label idea is given below.
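To illustrate the abstract's core idea, the following is a minimal sketch, not the authors' implementation: a vision-language model (here, off-the-shelf CLIP with fixed text prompts; the paper instead learns weather prompts) scores the "clearness" of a real image, and a restored output is accepted as a pseudo-label only if the model judges it clearer than the degraded input. The prompt strings, threshold, and function names are hypothetical.

```python
import torch
import clip  # OpenAI CLIP: https://github.com/openai/CLIP
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Hypothetical fixed prompts; the paper learns weather prompts rather than hand-writing them.
prompts = ["a clear photo", "a photo degraded by rain, haze, or snow"]
text_tokens = clip.tokenize(prompts).to(device)

@torch.no_grad()
def clearness_score(image_path: str) -> float:
    """Probability that the image matches the 'clear' prompt rather than the degraded one."""
    image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    logits_per_image, _ = model(image, text_tokens)
    probs = logits_per_image.softmax(dim=-1)
    return probs[0, 0].item()  # index 0 corresponds to the "clear" prompt

def accept_pseudo_label(input_path: str, restored_path: str, margin: float = 0.1) -> bool:
    """Keep a restored output as a pseudo-label only if it is judged clearer than the
    degraded input by a margin (the 0.1 threshold is an illustrative assumption)."""
    return clearness_score(restored_path) > clearness_score(input_path) + margin
```

In a semi-supervised loop, such accepted restorations on unlabeled real images would supplement the synthetic paired data as supervision; the paper's full pipeline additionally uses VLM-provided semantics with weather terms adjusted in the descriptions.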
Acknowledgements
The work was supported by the National Key R&D Program of China (Grant No. 2022ZD0160100), the Research Grants Council of the Hong Kong Special Administrative Region, China (Grant No. 14201620), and the Hong Kong Innovation and Technology Fund (Grant No. MHP/092/22).