L-DiffER: Single Image Reflection Removal with Language-Based Diffusion Model

Hong, Yuchen; Zhong, Haofeng; Weng, Shuchen; Liang, Jinxiu; Shi, Boxin

doi:10.1007/978-3-031-72661-3_4

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15078)

Included in the following conference series:

European Conference on Computer Vision

Abstract

In this paper, we introduce L-DiffER, a language-based diffusion model designed for the ill-posed single image reflection removal task. Although existing language-based diffusion models have shown impressive performance in image generation, they struggle with precise control and faithfulness in image restoration. To overcome these limitations, we propose an iterative condition refinement strategy that addresses the problem of inaccurate control conditions. A multi-condition constraint mechanism ensures faithful recovery of image color and structure while retaining the generation capability needed to handle low-transmitted reflections. Extensive experiments demonstrate both quantitative and qualitative improvements over existing methods.

Y. Hong and H. Zhong contributed equally.

Notes

1. Note that the spatial conditions vary with timesteps in the proposed method.
2. Details of $\beta _t$ and $\gamma _t$ are explained in the supplementary material.
3. Since ControlNet [74] destroys image color and structure in a generative manner, as shown in Fig. 1(c), we only run it for qualitative comparisons.
4. Evaluations on reflection layers are provided in the supplementary material.
5. More ablation studies are provided in the supplementary material.

References

Chang, Y., Jung, C., Sun, J.: Joint reflection removal and depth estimation from a single image. IEEE Trans. Cybern. 51(12), 5836–5849 (2020)
Chang, Y., Jung, C., Sun, J., Wang, F.: Siamese dense network for reflection removal with flash and no-flash image pairs. Int. J. Comput. Vision 128, 1673–1698 (2020)
Chang, Z., Weng, S., Li, Y., Li, S., Shi, B.: L-CoDer: language-based colorization with color-object decoupling transformer. In: Proceedings of European Conference on Computer Vision (2022)
Chang, Z., Weng, S., Zhang, P., Li, Y., Li, S., Shi, B.: L-CAD: language-based colorization with any-level descriptions using diffusion priors. In: Proceedings of Advances in Neural Information Processing Systems (2023)
Chang, Z., Weng, S., Zhang, P., Li, Y., Li, S., Shi, B.: L-CoIns: language-based colorization with instance awareness. In: Proceedings of Computer Vision and Pattern Recognition (2023)
Chen, X., et al.: Microsoft coco captions: data collection and evaluation server. arXiv preprint arXiv:1504.00325 (2015)
Diamant, Y., Schechner, Y.Y.: Overcoming visual reverberations. In: Proceedings of Computer Vision and Pattern Recognition (2008)
Dong, Z., Xu, K., Yang, Y., Bao, H., Xu, W., Lau, R.W.: Location-aware single image reflection removal. In: Proceedings of International Conference on Computer Vision (2021)
Fan, Q., Yang, J., Hua, G., Chen, B., Wipf, D.: A generic deep architecture for single image reflection removal and image smoothing. In: Proceedings of International Conference on Computer Vision (2017)
Han, B.J., Sim, J.Y.: Zero-shot learning for reflection removal of single 360-degree image. In: Proceedings of European Conference on Computer Vision (2022)
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In: Proceedings of International Conference on Computer Vision (2015)
Hertz, A., Mokady, R., Tenenbaum, J., Aberman, K., Pritch, Y., Cohen-Or, D.: Prompt-to-prompt image editing with cross attention control. arXiv preprint arXiv:2208.01626 (2022)
Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local nash equilibrium. In: Proceedings of Advances in Neural Information Processing Systems (2017)
Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: Proceedings of Advances in Neural Information Processing Systems (2020)
Ho, J., Salimans, T.: Classifier-free diffusion guidance. arXiv preprint arXiv:2207.12598 (2022)
Hong, Y., Chang, Y., Liang, J., Ma, L., Huang, T., Shi, B.: Light flickering guided reflection removal. Int. J. Comput. Vision (2024)
Hong, Y., Lyu, Y., Li, S., Cao, G., Shi, B.: Reflection removal with NIR and RGB image feature fusion. IEEE Trans. Multimedia 25, 7101–7112 (2022)
Hong, Y., Lyu, Y., Li, S., Shi, B.: Near-infrared image guided reflection removal. In: Proceedings of International Conference on Multimedia and Expo (2020)
Hong, Y., Zheng, Q., Zhao, L., Jiang, X., Kot, A.C., Shi, B.: Panoramic image reflection removal. In: Proceedings of Computer Vision and Pattern Recognition (2021)
Hong, Y., Zheng, Q., Zhao, L., Jiang, X., Kot, A.C., Shi, B.: PAR$^2$Net: end-to-end panoramic image reflection removal. IEEE Trans. Pattern Anal. Mach. Intell. 45(10), 12192–12205 (2023)
Hu, Q., Guo, X.: Trash or treasure? An interactive dual-stream strategy for single image reflection separation. In: Proceedings of Advances in Neural Information Processing Systems (2021)
Hu, Q., Guo, X.: Single image reflection separation via component synergy. In: Proceedings of International Conference on Computer Vision (2023)
Huynh-Thu, Q., Ghanbari, M.: Scope of validity of PSNR in image/video quality assessment. Electron. Lett. 44(13), 800–801 (2008)
Kim, S., Huo, Y., Yoon, S.E.: Single image reflection removal with physically-based training images. In: Proceedings of Computer Vision and Pattern Recognition (2020)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Kong, N., Tai, Y.W., Shin, S.Y.: A physically-based approach to reflection separation. In: Proceedings of Computer Vision and Pattern Recognition (2012)
Lei, C., Chen, Q.: Robust reflection removal with reflection-free flash-only cues. In: Proceedings of Computer Vision and Pattern Recognition (2021)
Lei, C., Huang, X., Zhang, M., Yan, Q., Sun, W., Chen, Q.: Polarized reflection removal with perfect alignment in the wild. In: Proceedings of Computer Vision and Pattern Recognition (2020)
Lei, C., Jiang, X., Chen, Q.: Robust reflection removal with flash-only cues in the wild. IEEE Trans. Pattern Anal. Mach. Intell. (2023)
Levin, A., Weiss, Y.: User assisted separation of reflections from a single image using a sparsity prior. IEEE Trans. Pattern Anal. Mach. Intell. 29(9), 1647–1654 (2007)
Li, C., Yang, Y., He, K., Lin, S., Hopcroft, J.E.: Single image reflection removal through cascaded refinement. In: Proceedings of Computer Vision and Pattern Recognition (2020)
Li, Y., Brown, M.S.: Exploiting reflection change for automatic reflection removal. In: Proceedings of International Conference on Computer Vision (2013)
Li, Y., Brown, M.S.: Single image layer separation using relative smoothness. In: Proceedings of Computer Vision and Pattern Recognition (2014)
Liu, Y.L., Lai, W.S., Yang, M.H., Chuang, Y.Y., Huang, J.B.: Learning to see through obstructions. In: Proceedings of Computer Vision and Pattern Recognition (2020)
Liu, Y.L., Lai, W.S., Yang, M.H., Chuang, Y.Y., Huang, J.B.: Learning to see through obstructions with layered decomposition. IEEE Trans. Pattern Anal. Mach. Intell. 44(11), 8387–8402 (2021)
Luo, J., et al.: 3D-SPS: single-stage 3D visual grounding via referred point progressive selection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 16454–16463 (2022)
Lyu, Y., Cui, Z., Li, S., Pollefeys, M., Shi, B.: Reflection separation using a pair of unpolarized and polarized images. In: Proceedings of Advances in Neural Information Processing Systems (2019)
Lyu, Y., Cui, Z., Li, S., Pollefeys, M., Shi, B.: Physics-guided reflection separation from a pair of unpolarized and polarized images. IEEE Trans. Pattern Anal. Mach. Intell. 45(2), 2151–2165 (2022)
Ma, D., Wan, R., Shi, B., Kot, A.C., Duan, L.Y.: Learning to jointly generate and separate reflections. In: Proceedings of International Conference on Computer Vision (2019)
Meng, C., et al.: SDEdit: guided image synthesis and editing with stochastic differential equations. In: Proceedings of International Conference on Learning Representations (2021)
Mittal, A., Soundararajan, R., Bovik, A.C.: Making a “completely blind” image quality analyzer. IEEE Signal Process. Lett. 20(3), 209–212 (2013)
Mou, C., et al.: T2i-adapter: learning adapters to dig out more controllable ability for text-to-image diffusion models. arXiv preprint arXiv:2302.08453 (2023)
Nam, S., Brubaker, M.A., Brown, M.S.: Neural image representations for multi-image fusion and layer separation. In: Proceedings of European Conference on Computer Vision (2022)
Nayar, S.K., Fang, X.S., Boult, T.: Separation of reflection components using color and polarization. Int. J. Comput. Vision 21(3), 163–186 (1997)
Park, J., Kim, H., Park, E., Sim, J.Y.: Fully-automatic reflection removal for 360-degree images. In: Proceedings of Winter Conference on Applications of Computer Vision (2024)
Paszke, A., et al.: Pytorch: an imperative style, high-performance deep learning library. In: Proceedings of Advances in Neural Information Processing Systems (2019)
Qiu, J., Jiang, P.T., Zhu, Y., Yin, Z.X., Cheng, M.M., Ren, B.: Looking through the glass: neural surface reconstruction against high specular reflections. In: Proceedings of Computer Vision and Pattern Recognition (2023)
Radford, A., et al.: Learning transferable visual models from natural language supervision. In: Proceedings of International Conference on Machine Learning. PMLR (2021)
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. In: Proceedings of Computer Vision and Pattern Recognition (2022)
Schechner, Y.Y., Kiryati, N., Basri, R.: Separation of transparent layers using focus. Int. J. Comput. Vision 39, 25–39 (2000)
Shih, Y., Krishnan, D., Durand, F., Freeman, W.T.: Reflection removal using ghosting cues. In: Proceedings of Computer Vision and Pattern Recognition (2015)
Simon, C., Kyu Park, I.: Reflection removal for in-vehicle black box videos. In: Proceedings of Computer Vision and Pattern Recognition (2015)
Song, J., Meng, C., Ermon, S.: Denoising diffusion implicit models. arXiv preprint arXiv:2010.02502 (2020)
Sun, H., et al.: CoSeR: bridging image and language for cognitive super-resolution. arXiv preprint arXiv:2311.16512 (2023)
Sun, J., Weng, S., Chang, Z., Li, S., Shi, B.: UniCoRN: a unified conditional image repainting network. In: Proceedings of Computer Vision and Pattern Recognition (2022)
Tang, J., Zhong, H., Weng, S., Shi, B.: LuminAIRe: illumination-aware conditional image repainting for lighting-realistic generation. In: Proceedings of Advances in Neural Information Processing Systems (2023)
Wan, R., Shi, B., Duan, L.Y., Tan, A.H., Gao, W., Kot, A.C.: Region-aware reflection removal with unified content and gradient priors. IEEE Trans. Image Process. 27(6), 2927–2941 (2018)
Wan, R., Shi, B., Duan, L.Y., Tan, A.H., Kot, A.C.: Benchmarking single-image reflection removal algorithms. In: Proceedings of International Conference on Computer Vision (2017)
Wan, R., Shi, B., Duan, L.Y., Tan, A.H., Kot, A.C.: CRRN: multi-scale guided concurrent reflection removal network. In: Proceedings of Computer Vision and Pattern Recognition (2018)
Wan, R., Shi, B., Li, H., Duan, L.Y., Kot, A.C.: Face image reflection removal. Int. J. Comput. Vision 129, 385–399 (2021)
Wan, R., Shi, B., Li, H., Duan, L.Y., Tan, A.H., Kot, A.C.: CoRRN: cooperative reflection removal network. IEEE Trans. Pattern Anal. Mach. Intell. 42(12), 2969–2982 (2019)
Wan, R., Shi, B., Li, H., Hong, Y., Duan, L.Y., Kot, A.C.: Benchmarking single-image reflection removal algorithms. IEEE Trans. Pattern Anal. Mach. Intell. 45(2), 1424–1441 (2022)
Wang, Z., et al.: CRIS: CLIP-driven referring image segmentation. In: Proceedings of Computer Vision and Pattern Recognition (2022)
Wang, Z., Simoncelli, E.P., Bovik, A.C.: Multiscale structural similarity for image quality assessment. In: The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers (2003)
Wei, K., Yang, J., Fu, Y., Wipf, D., Huang, H.: Single image reflection removal exploiting misaligned training data and network enhancements. In: Proceedings of Computer Vision and Pattern Recognition (2019)
Wen, Q., Tan, Y., Qin, J., Liu, W., Han, G., He, S.: Single image reflection removal beyond linearity. In: Proceedings of Computer Vision and Pattern Recognition (2019)
Weng, S., Li, W., Li, D., Jin, H., Shi, B.: MISC: multi-condition injection and spatially-adaptive compositing for conditional person image synthesis. In: Proceedings of Computer Vision and Pattern Recognition (2020)
Weng, S., Shi, B.: Conditional image repainting. IEEE Trans. Pattern Anal. Mach. Intell. (2023)
Weng, S., Wu, H., Chang, Z., Tang, J., Li, S., Shi, B.: L-CoDe: language-based colorization using color-object decoupled conditions. In: Proceedings of the AAAI Conference on Artificial Intelligence (2022)
Yang, J., Gong, D., Liu, L., Shi, Q.: Seeing deeply and bidirectionally: a deep learning approach for single image reflection removal. In: Proceedings of European Conference on Computer Vision (2018)
Yang, Y., Ma, W., Zheng, Y., Cai, J.F., Xu, W.: Fast single image reflection suppression via convex optimization. In: Proceedings of Computer Vision and Pattern Recognition (2019)
Yang, Z., Wang, J., Tang, Y., Chen, K., Zhao, H., Torr, P.H.: LAVT: language-aware vision transformer for referring image segmentation. In: Proceedings of Computer Vision and Pattern Recognition (2022)
Young, P., Lai, A., Hodosh, M., Hockenmaier, J.: From image descriptions to visual denotations: new similarity metrics for semantic inference over event descriptions. Trans. Assoc. Comput. Linguist. 2, 67–78 (2014)
Zhang, L., Rao, A., Agrawala, M.: Adding conditional control to text-to-image diffusion models. In: Proceedings of International Conference on Computer Vision (2023)
Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of Computer Vision and Pattern Recognition (2018)
Zhang, X., Ng, R., Chen, Q.: Single image reflection separation with perceptual losses. In: Proceedings of Computer Vision and Pattern Recognition (2018)
Zhang, Y.N., Shen, L., Li, Q.: Content and gradient model-driven deep network for single image reflection removal. In: Proceedings of ACM International Conference on Multimedia (2022)
Zhao, S., et al.: Uni-controlnet: all-in-one control to text-to-image diffusion models. In: Proceedings of Advances in Neural Information Processing Systems (2024)
Zheng, Q., et al.: What does plate glass reveal about camera calibration? In: Proceedings of Computer Vision and Pattern Recognition (2020)
Zheng, Q., Shi, B., Chen, J., Jiang, X., Duan, L.Y., Kot, A.C.: Single image reflection removal with absorption effect. In: Proceedings of Computer Vision and Pattern Recognition (2021)
Zhong, H., Hong, Y., Weng, S., Liang, J., Shi, B.: Language-guided image reflection separation. In: Proceedings of Computer Vision and Pattern Recognition (2024)

Acknowledgement

This work was supported by the National Natural Science Foundation of China under Grant Nos. 62136001 and 62088102.

Author information

Authors and Affiliations

State Key Laboratory for Multimedia Information Processing, School of Computer Science, Peking University, Beijing, China
Yuchen Hong, Haofeng Zhong, Shuchen Weng, Jinxiu Liang & Boxin Shi
National Engineering Research Center of Visual Technology, School of Computer Science, Peking University, Beijing, China
Yuchen Hong, Haofeng Zhong, Shuchen Weng, Jinxiu Liang & Boxin Shi
AI Innovation Center, School of Computer Science, Peking University, Beijing, China
Haofeng Zhong & Boxin Shi

Corresponding author

Correspondence to Boxin Shi.

Editor information

Editors and Affiliations

University of Birmingham, Birmingham, UK
Aleš Leonardis
University of Trento, Trento, Italy
Elisa Ricci
Technical University of Darmstadt, Darmstadt, Germany
Stefan Roth
Princeton University, Princeton, NJ, USA
Olga Russakovsky
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
École des Ponts ParisTech, Marne-la-Vallée, France
Gül Varol

Cite this paper

Hong, Y., Zhong, H., Weng, S., Liang, J., Shi, B. (2025). L-DiffER: Single Image Reflection Removal with Language-Based Diffusion Model. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15078. Springer, Cham. https://doi.org/10.1007/978-3-031-72661-3_4

DOI: https://doi.org/10.1007/978-3-031-72661-3_4
Published: 27 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72660-6
Online ISBN: 978-3-031-72661-3
eBook Packages: Computer Science, Computer Science (R0)
