research-article

LoFormer: Local Frequency Transformer for Image Deblurring

Authors:

Jiansheng Wang,

Yan Wang

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 10382 - 10391

https://doi.org/10.1145/3664647.3680888

Published: 28 October 2024

Abstract

Because self-attention (SA) is computationally expensive, prevalent image deblurring methods resort to either localized SA or coarse-grained global SA, which respectively compromise global modeling or lack fine-grained correlation. To model long-range dependencies without sacrificing fine-grained details, we introduce the Local Frequency Transformer (LoFormer). Each unit of LoFormer incorporates a Local Channel-wise SA in the frequency domain (Freq-LC) that simultaneously captures cross-covariance within low- and high-frequency local windows. These operations (1) ensure equitable learning opportunities for both coarse-grained structures and fine-grained details, and (2) explore a broader range of representational properties than coarse-grained global SA methods. We also introduce an MLP Gating mechanism complementary to Freq-LC, which filters out irrelevant features while enhancing global learning capabilities. Experiments show that LoFormer significantly improves image deblurring performance, achieving a PSNR of 34.09 dB on the GoPro dataset with 126G FLOPs. Code: https://github.com/DeepMed-Lab-ECNU/Single-Image-Deblur
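
As a rough illustration of the idea behind Freq-LC, the Python sketch below moves a feature map into a frequency domain, splits the spectrum into local windows, and mixes channels inside each window with a channel-wise attention map. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not name the transform, so an orthonormal DCT-II is assumed here; the learned query/key/value projections, the MLP Gating branch, and the inverse transform back to the spatial domain are omitted; and all function names (`dct_2d`, `channel_attention`, `freq_local_channel_attention`) are hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1D sequence (assumed transform, see lead-in)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(img):
    """Apply the 1D DCT along every row, then along every column."""
    rows = [dct_1d(r) for r in img]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def softmax(v):
    """Numerically stable softmax over a list of floats."""
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def channel_attention(tokens):
    """Channel-wise attention inside one window.

    tokens: C lists, each holding the n coefficients of one channel.
    Returns (attn, out) where attn is C x C and out is C x n.
    Queries, keys and values are the raw tokens (no learned projections).
    """
    normed = []
    for t in tokens:
        norm = math.sqrt(sum(a * a for a in t)) or 1.0  # avoid division by zero
        normed.append([a / norm for a in t])
    # Cross-covariance between channels, turned into weights row by row.
    attn = [softmax([sum(a * b for a, b in zip(q, k)) for k in normed])
            for q in normed]
    n = len(tokens[0])
    out = [[sum(attn[i][j] * tokens[j][p] for j in range(len(tokens)))
            for p in range(n)]
           for i in range(len(tokens))]
    return attn, out


def freq_local_channel_attention(feat, win):
    """feat: C x H x W nested lists; win: window size dividing H and W."""
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])
    spec = [dct_2d(ch) for ch in feat]  # per-channel spectrum
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for y0 in range(0, H, win):
        for x0 in range(0, W, win):
            # Flatten the window of every channel into one token per channel.
            tokens = [[spec[c][y][x]
                       for y in range(y0, y0 + win)
                       for x in range(x0, x0 + win)]
                      for c in range(C)]
            _, mixed = channel_attention(tokens)
            for c in range(C):
                for idx, val in enumerate(mixed[c]):
                    out[c][y0 + idx // win][x0 + idx % win] = val
    return out  # still in the frequency domain


if __name__ == "__main__":
    feat = [[[float(c + y * 4 + x) for x in range(4)] for y in range(4)]
            for c in range(2)]
    mixed = freq_local_channel_attention(feat, win=2)
    print(len(mixed), len(mixed[0]), len(mixed[0][0]))
```

In this sketch each window yields a C x C attention map, so the size of the map depends on the channel count rather than on the number of coefficients in the window, and windows near the top-left of the spectrum (low frequencies) are processed in the same way as windows far from it (high frequencies).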

Cited By

Zhang Z, Chen Z, Hu D, Li M, Xu Z, Feng H, Li Q, Chen Y (2025). Jitter-Aware Restoration With Equivalent Jitter Model for Remote Sensing Push-Broom Image. IEEE Transactions on Geoscience and Remote Sensing 63 (1-14). Online publication date: 2025. https://doi.org/10.1109/TGRS.2025.3529671

Yang D, Zhu Z, Ge H, Xu C, Zhang J (2025). Wavelet-Transform-Based Neural Network for Tidal Flat Remote Sensing Image Deblurring. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 18 (6152-6163). Online publication date: 2025. https://doi.org/10.1109/JSTARS.2025.3529704

Wang P, Kurihara T, Yu J (2025). CorNet: Enhancing Motion Deblurring in Challenging Scenarios Using Correlation Image Sensor. IEEE Access 13 (33834-33848). Online publication date: 2025. https://doi.org/10.1109/ACCESS.2025.3543599

Index Terms

LoFormer: Local Frequency Transformer for Image Deblurring
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Reconstruction

Information & Contributors

Information

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN:9798400706868

DOI:10.1145/3664647

General Chairs: Jianfei Cai (Monash University, Australia), Mohan Kankanhalli (NUS, Singapore), Balakrishnan Prabhakaran (UT Dallas, USA), Susanne Boll (University of Oldenburg, Germany)

Program Chairs: Ramanathan Subramanian (University of Canberra & IIT Ropar, Australia), Liang Zheng (Australian National University, Australia), Vivek K. Singh (Rutgers University, USA), Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands), Lexing Xie (Australian National University, Australia), Dong Xu (University of Hong Kong, Hong Kong)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

MM '24

Sponsor:

SIGMM

MM '24: The 32nd ACM International Conference on Multimedia

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics

Total Citations: 3
Total Downloads: 417
Downloads (Last 12 months): 417
Downloads (Last 6 weeks): 238

Reflects downloads up to 08 Mar 2025
