L-DiffER: Single Image Reflection Removal with Language-Based Diffusion Model

  • Conference paper
Computer Vision – ECCV 2024 (ECCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15078)

Abstract

In this paper, we introduce L-DiffER, a language-based diffusion model designed for the ill-posed task of single image reflection removal. Although existing language-based diffusion models have shown impressive performance in image generation, they struggle with precise control and faithfulness in image restoration. To overcome these limitations, we propose an iterative condition refinement strategy that addresses the problem of inaccurate control conditions. A multi-condition constraint mechanism ensures faithful recovery of image color and structure while retaining the generative capability needed to handle low-transmitted reflections. We demonstrate the superiority of the proposed method through extensive experiments, which show both quantitative and qualitative improvements over existing methods.
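The abstract does not specify how the conditions are refined, so the following is only a toy sketch of the general idea of updating a control condition inside a diffusion sampling loop, not the authors' method. It assumes a deterministic DDIM-style update and a linear noise schedule. The names `toy_noise_predictor`, `refine_condition`, `hidden_target`, and the blending weight are all hypothetical stand-ins for a trained network and for whatever refinement rule the paper actually uses.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta_t) for an assumed linear noise schedule."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def toy_noise_predictor(x_t, cond, hidden_target, a_t):
    """Hypothetical stand-in for a trained noise predictor.

    It believes the clean pixel lies halfway between the current condition
    and a hidden target, and returns the noise consistent with that belief.
    """
    x0_guess = 0.5 * (cond + hidden_target)
    return (x_t - math.sqrt(a_t) * x0_guess) / math.sqrt(1.0 - a_t)


def refine_condition(cond, x0_hat, weight=0.5):
    """Hypothetical refinement rule: blend the condition with the clean estimate."""
    return (1.0 - weight) * cond + weight * x0_hat


def sample_with_refinement(init_cond, hidden_target, num_steps=50, seed=0):
    """Deterministic DDIM-style sampling with a condition updated at every step."""
    rng = random.Random(seed)
    alpha_bars = make_alpha_bars(num_steps)
    cond = list(init_cond)
    x_t = [rng.gauss(0.0, 1.0) for _ in cond]  # start from pure noise
    for t in range(num_steps - 1, -1, -1):
        a_t = alpha_bars[t]
        a_prev = alpha_bars[t - 1] if t > 0 else 1.0
        for i in range(len(cond)):
            eps = toy_noise_predictor(x_t[i], cond[i], hidden_target[i], a_t)
            # Clean-image estimate implied by the predicted noise.
            x0_hat = (x_t[i] - math.sqrt(1.0 - a_t) * eps) / math.sqrt(a_t)
            # The condition used at the next timestep depends on this estimate.
            cond[i] = refine_condition(cond[i], x0_hat)
            # DDIM update with zero stochasticity.
            x_t[i] = math.sqrt(a_prev) * x0_hat + math.sqrt(1.0 - a_prev) * eps
    return x_t, cond


if __name__ == "__main__":
    target = [0.8] * 4       # what the toy network implicitly "knows"
    inaccurate = [0.0] * 4   # deliberately wrong initial condition
    restored, final_cond = sample_with_refinement(inaccurate, target)
    print(restored, final_cond)
```

In this toy setting the initial condition is deliberately wrong, and each step pulls it toward the current clean estimate, so both the condition and the sample drift toward the hidden target. A real system would replace the toy predictor with a learned network and would need its own refinement rule.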

Y. Hong and H. Zhong—Equal contributions.

Notes

  1. Note that the spatial conditions vary with timesteps in the proposed method.

  2. Details of \(\beta_t\) and \(\gamma_t\) are explained in the supplementary material.

  3. Since ControlNet [74] destroys image color and structure in a generative manner, as shown in Fig. 1(c), we run it only for qualitative comparisons.

  4. Evaluations on reflection layers are provided in the supplementary material.

  5. More ablation studies are provided in the supplementary material.

References

  1. Chang, Y., Jung, C., Sun, J.: Joint reflection removal and depth estimation from a single image. IEEE Trans. Cybern. 51(12), 5836–5849 (2020)

  2. Chang, Y., Jung, C., Sun, J., Wang, F.: Siamese dense network for reflection removal with flash and no-flash image pairs. Int. J. Comput. Vision 128, 1673–1698 (2020)

  3. Chang, Z., Weng, S., Li, Y., Li, S., Shi, B.: L-CoDer: language-based colorization with color-object decoupling transformer. In: Proceedings of European Conference on Computer Vision (2022)

  4. Chang, Z., Weng, S., Zhang, P., Li, Y., Li, S., Shi, B.: L-CAD: language-based colorization with any-level descriptions using diffusion priors. In: Proceedings of Advances in Neural Information Processing Systems (2023)

  5. Chang, Z., Weng, S., Zhang, P., Li, Y., Li, S., Shi, B.: L-CoIns: language-based colorization with instance awareness. In: Proceedings of Computer Vision and Pattern Recognition (2023)

  6. Chen, X., et al.: Microsoft COCO captions: data collection and evaluation server. arXiv preprint arXiv:1504.00325 (2015)

  7. Diamant, Y., Schechner, Y.Y.: Overcoming visual reverberations. In: Proceedings of Computer Vision and Pattern Recognition (2008)

  8. Dong, Z., Xu, K., Yang, Y., Bao, H., Xu, W., Lau, R.W.: Location-aware single image reflection removal. In: Proceedings of International Conference on Computer Vision (2021)

  9. Fan, Q., Yang, J., Hua, G., Chen, B., Wipf, D.: A generic deep architecture for single image reflection removal and image smoothing. In: Proceedings of International Conference on Computer Vision (2017)

  10. Han, B.J., Sim, J.Y.: Zero-shot learning for reflection removal of single 360-degree image. In: Proceedings of European Conference on Computer Vision (2022)

  11. He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of International Conference on Computer Vision (2015)

  12. Hertz, A., Mokady, R., Tenenbaum, J., Aberman, K., Pritch, Y., Cohen-Or, D.: Prompt-to-prompt image editing with cross attention control. arXiv preprint arXiv:2208.01626 (2022)

  13. Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: Proceedings of Advances in Neural Information Processing Systems (2017)

  14. Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: Proceedings of Advances in Neural Information Processing Systems (2020)

  15. Ho, J., Salimans, T.: Classifier-free diffusion guidance. arXiv preprint arXiv:2207.12598 (2022)

  16. Hong, Y., Chang, Y., Liang, J., Ma, L., Huang, T., Shi, B.: Light flickering guided reflection removal. Int. J. Comput. Vision (2024)

  17. Hong, Y., Lyu, Y., Li, S., Cao, G., Shi, B.: Reflection removal with NIR and RGB image feature fusion. IEEE Trans. Multimedia 25, 7101–7112 (2022)

  18. Hong, Y., Lyu, Y., Li, S., Shi, B.: Near-infrared image guided reflection removal. In: Proceedings of International Conference on Multimedia and Expo (2020)

  19. Hong, Y., Zheng, Q., Zhao, L., Jiang, X., Kot, A.C., Shi, B.: Panoramic image reflection removal. In: Proceedings of Computer Vision and Pattern Recognition (2021)

  20. Hong, Y., Zheng, Q., Zhao, L., Jiang, X., Kot, A.C., Shi, B.: PAR\(^2\)Net: end-to-end panoramic image reflection removal. IEEE Trans. Pattern Anal. Mach. Intell. 45(10), 12192–12205 (2023)

  21. Hu, Q., Guo, X.: Trash or treasure? An interactive dual-stream strategy for single image reflection separation. In: Proceedings of Advances in Neural Information Processing Systems (2021)

  22. Hu, Q., Guo, X.: Single image reflection separation via component synergy. In: Proceedings of International Conference on Computer Vision (2023)

  23. Huynh-Thu, Q., Ghanbari, M.: Scope of validity of PSNR in image/video quality assessment. Electron. Lett. 44(13), 800–801 (2008)

  24. Kim, S., Huo, Y., Yoon, S.E.: Single image reflection removal with physically-based training images. In: Proceedings of Computer Vision and Pattern Recognition (2020)

  25. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  26. Kong, N., Tai, Y.W., Shin, S.Y.: A physically-based approach to reflection separation. In: Proceedings of Computer Vision and Pattern Recognition (2012)

  27. Lei, C., Chen, Q.: Robust reflection removal with reflection-free flash-only cues. In: Proceedings of Computer Vision and Pattern Recognition (2021)

  28. Lei, C., Huang, X., Zhang, M., Yan, Q., Sun, W., Chen, Q.: Polarized reflection removal with perfect alignment in the wild. In: Proceedings of Computer Vision and Pattern Recognition (2020)

  29. Lei, C., Jiang, X., Chen, Q.: Robust reflection removal with flash-only cues in the wild. IEEE Trans. Pattern Anal. Mach. Intell. (2023)

  30. Levin, A., Weiss, Y.: User assisted separation of reflections from a single image using a sparsity prior. IEEE Trans. Pattern Anal. Mach. Intell. 29(9), 1647–1654 (2007)

  31. Li, C., Yang, Y., He, K., Lin, S., Hopcroft, J.E.: Single image reflection removal through cascaded refinement. In: Proceedings of Computer Vision and Pattern Recognition (2020)

  32. Li, Y., Brown, M.S.: Exploiting reflection change for automatic reflection removal. In: Proceedings of International Conference on Computer Vision (2013)

  33. Li, Y., Brown, M.S.: Single image layer separation using relative smoothness. In: Proceedings of Computer Vision and Pattern Recognition (2014)

  34. Liu, Y.L., Lai, W.S., Yang, M.H., Chuang, Y.Y., Huang, J.B.: Learning to see through obstructions. In: Proceedings of Computer Vision and Pattern Recognition (2020)

  35. Liu, Y.L., Lai, W.S., Yang, M.H., Chuang, Y.Y., Huang, J.B.: Learning to see through obstructions with layered decomposition. IEEE Trans. Pattern Anal. Mach. Intell. 44(11), 8387–8402 (2021)

  36. Luo, J., et al.: 3D-SPS: single-stage 3D visual grounding via referred point progressive selection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 16454–16463 (2022)

  37. Lyu, Y., Cui, Z., Li, S., Pollefeys, M., Shi, B.: Reflection separation using a pair of unpolarized and polarized images. In: Proceedings of Advances in Neural Information Processing Systems (2019)

  38. Lyu, Y., Cui, Z., Li, S., Pollefeys, M., Shi, B.: Physics-guided reflection separation from a pair of unpolarized and polarized images. IEEE Trans. Pattern Anal. Mach. Intell. 45(2), 2151–2165 (2022)

  39. Ma, D., Wan, R., Shi, B., Kot, A.C., Duan, L.Y.: Learning to jointly generate and separate reflections. In: Proceedings of International Conference on Computer Vision (2019)

  40. Meng, C., et al.: SDEdit: guided image synthesis and editing with stochastic differential equations. In: Proceedings of International Conference on Learning Representations (2021)

  41. Mittal, A., Soundararajan, R., Bovik, A.C.: Making a “completely blind” image quality analyzer. IEEE Signal Process. Lett. 20(3), 209–212 (2013)

  42. Mou, C., et al.: T2I-Adapter: learning adapters to dig out more controllable ability for text-to-image diffusion models. arXiv preprint arXiv:2302.08453 (2023)

  43. Nam, S., Brubaker, M.A., Brown, M.S.: Neural image representations for multi-image fusion and layer separation. In: Proceedings of European Conference on Computer Vision (2022)

  44. Nayar, S.K., Fang, X.S., Boult, T.: Separation of reflection components using color and polarization. Int. J. Comput. Vision 21(3), 163–186 (1997)

  45. Park, J., Kim, H., Park, E., Sim, J.Y.: Fully-automatic reflection removal for 360-degree images. In: Proceedings of Winter Conference on Applications of Computer Vision (2024)

  46. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Proceedings of Advances in Neural Information Processing Systems (2019)

  47. Qiu, J., Jiang, P.T., Zhu, Y., Yin, Z.X., Cheng, M.M., Ren, B.: Looking through the glass: neural surface reconstruction against high specular reflections. In: Proceedings of Computer Vision and Pattern Recognition (2023)

  48. Radford, A., et al.: Learning transferable visual models from natural language supervision. In: Proceedings of International Conference on Machine Learning. PMLR (2021)

  49. Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. In: Proceedings of Computer Vision and Pattern Recognition (2022)

  50. Schechner, Y.Y., Kiryati, N., Basri, R.: Separation of transparent layers using focus. Int. J. Comput. Vision 39, 25–39 (2000)

  51. Shih, Y., Krishnan, D., Durand, F., Freeman, W.T.: Reflection removal using ghosting cues. In: Proceedings of Computer Vision and Pattern Recognition (2015)

  52. Simon, C., Kyu Park, I.: Reflection removal for in-vehicle black box videos. In: Proceedings of Computer Vision and Pattern Recognition (2015)

  53. Song, J., Meng, C., Ermon, S.: Denoising diffusion implicit models. arXiv preprint arXiv:2010.02502 (2020)

  54. Sun, H., et al.: CoSeR: bridging image and language for cognitive super-resolution. arXiv preprint arXiv:2311.16512 (2023)

  55. Sun, J., Weng, S., Chang, Z., Li, S., Shi, B.: UniCoRN: a unified conditional image repainting network. In: Proceedings of Computer Vision and Pattern Recognition (2022)

  56. Tang, J., Zhong, H., Weng, S., Shi, B.: LuminAIRe: illumination-aware conditional image repainting for lighting-realistic generation. In: Proceedings of Advances in Neural Information Processing Systems (2023)

  57. Wan, R., Shi, B., Duan, L.Y., Tan, A.H., Gao, W., Kot, A.C.: Region-aware reflection removal with unified content and gradient priors. IEEE Trans. Image Process. 27(6), 2927–2941 (2018)

  58. Wan, R., Shi, B., Duan, L.Y., Tan, A.H., Kot, A.C.: Benchmarking single-image reflection removal algorithms. In: Proceedings of International Conference on Computer Vision (2017)

  59. Wan, R., Shi, B., Duan, L.Y., Tan, A.H., Kot, A.C.: CRRN: multi-scale guided concurrent reflection removal network. In: Proceedings of Computer Vision and Pattern Recognition (2018)

  60. Wan, R., Shi, B., Li, H., Duan, L.Y., Kot, A.C.: Face image reflection removal. Int. J. Comput. Vision 129, 385–399 (2021)

  61. Wan, R., Shi, B., Li, H., Duan, L.Y., Tan, A.H., Kot, A.C.: CoRRN: cooperative reflection removal network. IEEE Trans. Pattern Anal. Mach. Intell. 42(12), 2969–2982 (2019)

  62. Wan, R., Shi, B., Li, H., Hong, Y., Duan, L.Y., Kot, A.C.: Benchmarking single-image reflection removal algorithms. IEEE Trans. Pattern Anal. Mach. Intell. 45(2), 1424–1441 (2022)

  63. Wang, Z., et al.: CRIS: CLIP-driven referring image segmentation. In: Proceedings of Computer Vision and Pattern Recognition (2022)

  64. Wang, Z., Simoncelli, E.P., Bovik, A.C.: Multiscale structural similarity for image quality assessment. In: The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers (2003)

  65. Wei, K., Yang, J., Fu, Y., Wipf, D., Huang, H.: Single image reflection removal exploiting misaligned training data and network enhancements. In: Proceedings of Computer Vision and Pattern Recognition (2019)

  66. Wen, Q., Tan, Y., Qin, J., Liu, W., Han, G., He, S.: Single image reflection removal beyond linearity. In: Proceedings of Computer Vision and Pattern Recognition (2019)

  67. Weng, S., Li, W., Li, D., Jin, H., Shi, B.: MISC: multi-condition injection and spatially-adaptive compositing for conditional person image synthesis. In: Proceedings of Computer Vision and Pattern Recognition (2020)

  68. Weng, S., Shi, B.: Conditional image repainting. IEEE Trans. Pattern Anal. Mach. Intell. (2023)

  69. Weng, S., Wu, H., Chang, Z., Tang, J., Li, S., Shi, B.: L-CoDe: language-based colorization using color-object decoupled conditions. In: Proceedings of the AAAI Conference on Artificial Intelligence (2022)

  70. Yang, J., Gong, D., Liu, L., Shi, Q.: Seeing deeply and bidirectionally: a deep learning approach for single image reflection removal. In: Proceedings of European Conference on Computer Vision (2018)

  71. Yang, Y., Ma, W., Zheng, Y., Cai, J.F., Xu, W.: Fast single image reflection suppression via convex optimization. In: Proceedings of Computer Vision and Pattern Recognition (2019)

  72. Yang, Z., Wang, J., Tang, Y., Chen, K., Zhao, H., Torr, P.H.: LAVT: language-aware vision transformer for referring image segmentation. In: Proceedings of Computer Vision and Pattern Recognition (2022)

  73. Young, P., Lai, A., Hodosh, M., Hockenmaier, J.: From image descriptions to visual denotations: new similarity metrics for semantic inference over event descriptions. Trans. Assoc. Comput. Linguist. 2, 67–78 (2014)

  74. Zhang, L., Rao, A., Agrawala, M.: Adding conditional control to text-to-image diffusion models. In: Proceedings of International Conference on Computer Vision (2023)

  75. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of Computer Vision and Pattern Recognition (2018)

  76. Zhang, X., Ng, R., Chen, Q.: Single image reflection separation with perceptual losses. In: Proceedings of Computer Vision and Pattern Recognition (2018)

  77. Zhang, Y.N., Shen, L., Li, Q.: Content and gradient model-driven deep network for single image reflection removal. In: Proceedings of ACM International Conference on Multimedia (2022)

  78. Zhao, S., et al.: Uni-ControlNet: all-in-one control to text-to-image diffusion models. In: Proceedings of Advances in Neural Information Processing Systems (2024)

  79. Zheng, Q., et al.: What does plate glass reveal about camera calibration? In: Proceedings of Computer Vision and Pattern Recognition (2020)

  80. Zheng, Q., Shi, B., Chen, J., Jiang, X., Duan, L.Y., Kot, A.C.: Single image reflection removal with absorption effect. In: Proceedings of Computer Vision and Pattern Recognition (2021)

  81. Zhong, H., Hong, Y., Weng, S., Liang, J., Shi, B.: Language-guided image reflection separation. In: Proceedings of Computer Vision and Pattern Recognition (2024)

Acknowledgement

This work is supported by the National Natural Science Foundation of China under Grant Nos. 62136001 and 62088102.

Author information

Corresponding author

Correspondence to Boxin Shi.

Electronic supplementary material

Supplementary material 1 (PDF, 2270 KB)

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Hong, Y., Zhong, H., Weng, S., Liang, J., Shi, B. (2025). L-DiffER: Single Image Reflection Removal with Language-Based Diffusion Model. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15078. Springer, Cham. https://doi.org/10.1007/978-3-031-72661-3_4

  • DOI: https://doi.org/10.1007/978-3-031-72661-3_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72660-6

  • Online ISBN: 978-3-031-72661-3

  • eBook Packages: Computer Science, Computer Science (R0)
