Abstract
Dynamic scene video deblurring aims to remove undesirable blurry artifacts captured during the exposure process. Although previous video deblurring methods have achieved impressive results, they suffer from significant performance drops due to the domain gap between training and testing videos, especially for those captured in real-world scenarios. To address this issue, we propose a domain adaptation scheme based on a blurring model to achieve test-time fine-tuning for deblurring models in unseen domains. Since blurred and sharp pairs are unavailable for fine-tuning during inference, our scheme can generate domain-adaptive training pairs to calibrate a deblurring model for the target domain. First, a Relative Sharpness Detection Module is proposed to identify relatively sharp regions from the blurry input images and regard them as pseudo-sharp images. Next, we utilize a blurring model to produce blurred images based on the pseudo-sharp images extracted during testing. To synthesize blurred images in compliance with the target data distribution, we propose a Domain-adaptive Blur Condition Generation Module to create domain-specific blur conditions for the blurring model. Finally, the generated pseudo-sharp and blurred pairs are used to fine-tune a deblurring model for better performance. Extensive experimental results demonstrate that our approach can significantly improve state-of-the-art video deblurring methods, providing performance gains of up to 7.54 dB on various real-world video deblurring datasets. The source code is available at https://github.com/Jin-Ting-He/DADeblur.
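To make the test-time adaptation pipeline described above concrete, the following is a minimal sketch of the overall loop: detect pseudo-sharp patches in the blurry test video, estimate a domain-specific blur condition, reblur the patches to form training pairs, and fine-tune the deblurring network on them. All component names (sharpness_detector, condition_generator, blurring_model, deblur_model) and the L1 fine-tuning objective are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def test_time_adapt(deblur_model, blurring_model, condition_generator,
                    sharpness_detector, blurry_frames, steps=10, lr=1e-5):
    """Hypothetical sketch of test-time adaptation via reblurring.

    Assumed components (names are illustrative):
      - sharpness_detector(frame): returns relatively sharp patches of a frame
      - condition_generator(frames): returns a domain-specific blur condition
      - blurring_model(patch, condition): synthesizes a blurred patch
      - deblur_model: pretrained video deblurring network to be fine-tuned
    """
    optimizer = torch.optim.Adam(deblur_model.parameters(), lr=lr)

    # 1. Collect pseudo-sharp patches from the blurry test video.
    pseudo_sharp = [p for f in blurry_frames for p in sharpness_detector(f)]

    # 2. Estimate a blur condition matching the target-domain statistics.
    condition = condition_generator(blurry_frames)

    # 3. Reblur the pseudo-sharp patches to build adaptive training pairs,
    #    then fine-tune the deblurring model on those pairs.
    for _ in range(steps):
        for sharp in pseudo_sharp:
            with torch.no_grad():
                blurred = blurring_model(sharp, condition)
            restored = deblur_model(blurred)
            loss = F.l1_loss(restored, sharp)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return deblur_model
```

The adapted model is then applied to the full blurry video; the sketch omits details such as patch batching and how the blur condition is actually sampled by the proposed modules.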
J.-T. He and F.-J. Tsai—Equal contribution.
Notes
1. The authors from the universities in Taiwan completed the experiments on the datasets.
Acknowledgments
This work was supported in part by the National Science and Technology Council (NSTC) under grants 112-2221-E-A49-090-MY3, 111-2628-E-A49-025-MY3, 112-2634-F-002-005, 112-2221-E-004-005, 113-2923-E-A49-003-MY2, 113-2221-E-004-001-MY3, and 113-2622-E-004-001. This work was also funded in part by Qualcomm through a Taiwan University Research Collaboration Project.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
He, JT. et al. (2025). Domain-Adaptive Video Deblurring via Test-Time Blurring. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15088. Springer, Cham. https://doi.org/10.1007/978-3-031-73404-5_8