DOI: 10.1145/3664647.3680888
research-article

LoFormer: Local Frequency Transformer for Image Deblurring

Published: 28 October 2024

Abstract

Due to the computational complexity of self-attention (SA), prevalent image deblurring techniques resort to either localized SA or coarse-grained global SA, which respectively compromise global modeling and lack fine-grained correlation. To model long-range dependencies without sacrificing fine-grained details, we introduce the Local Frequency Transformer (LoFormer). Within each unit of LoFormer, we incorporate a Local Channel-wise SA in the frequency domain (Freq-LC) to simultaneously capture cross-covariance within low- and high-frequency local windows. These operations (1) ensure equitable learning opportunities for both coarse-grained structures and fine-grained details, and (2) explore a broader range of representational properties than coarse-grained global SA methods. Additionally, we introduce an MLP Gating mechanism complementary to Freq-LC, which filters out irrelevant features while enhancing global learning capabilities. Our experiments demonstrate that LoFormer significantly improves performance in the image deblurring task, achieving a PSNR of 34.09 dB on the GoPro dataset with 126G FLOPs. Code: https://github.com/DeepMed-Lab-ECNU/Single-Image-Deblur
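
To make the Freq-LC idea more concrete, the following is a minimal NumPy sketch of channel-wise self-attention applied inside local windows of a frequency-domain feature map. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the frequency transform, so an orthonormal DCT is assumed here; the learned query/key/value projections, multi-head splitting, and the MLP Gating branch are omitted; and the function names and the window size are hypothetical.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an (n, n) matrix; its transpose is its inverse."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # spatial index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def freq_local_channel_attention(x: np.ndarray, window: int = 8) -> np.ndarray:
    """Hypothetical Freq-LC-style block for a (C, H, W) feature map.

    Assumptions: DCT as the frequency transform, Q = K = V (no learned
    projections), a single head, and H and W divisible by `window`.
    """
    c, h, w = x.shape
    assert h % window == 0 and w % window == 0
    dh, dw = dct_matrix(h), dct_matrix(w)

    # 1) Spatial -> frequency domain for every channel: F = D_h X D_w^T.
    f = dh @ x @ dw.T

    # 2) Split the spectrum into non-overlapping local windows, so low- and
    #    high-frequency regions are attended separately.
    nh, nw = h // window, w // window
    win = f.reshape(c, nh, window, nw, window).transpose(1, 3, 0, 2, 4)
    win = win.reshape(nh * nw, c, window * window)

    # 3) Channel-wise attention per window: a (C, C) cross-covariance map
    #    built from L2-normalized channel vectors.
    q = win / (np.linalg.norm(win, axis=-1, keepdims=True) + 1e-8)
    attn = softmax(q @ q.transpose(0, 2, 1), axis=-1)
    out = attn @ win

    # 4) Merge the windows and return to the spatial domain: X = D_h^T F D_w.
    out = out.reshape(nh, nw, c, window, window).transpose(2, 0, 3, 1, 4)
    out = out.reshape(c, h, w)
    return dh.T @ out @ dw


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((4, 16, 16))
    print(freq_local_channel_attention(features, window=8).shape)
```

Because the attention map in each window is C x C rather than (HW) x (HW), its cost grows with the number of channels instead of the number of pixels, which is the usual motivation for channel-wise attention on high-resolution inputs.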

Cited By

  • (2025) Jitter-Aware Restoration With Equivalent Jitter Model for Remote Sensing Push-Broom Image. IEEE Transactions on Geoscience and Remote Sensing 63, 1-14. DOI: 10.1109/TGRS.2025.3529671. Online publication date: 2025
  • (2025) Wavelet-Transform-Based Neural Network for Tidal Flat Remote Sensing Image Deblurring. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 18, 6152-6163. DOI: 10.1109/JSTARS.2025.3529704. Online publication date: 2025
  • (2025) CorNet: Enhancing Motion Deblurring in Challenging Scenarios Using Correlation Image Sensor. IEEE Access 13, 33834-33848. DOI: 10.1109/ACCESS.2025.3543599. Online publication date: 2025

    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
    ISBN:9798400706868
    DOI:10.1145/3664647

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. frequency domain
    2. image deblurring
    3. self-attention

    Qualifiers

    • Research-article

    Conference

    MM '24
    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (last 12 months): 417
    • Downloads (last 6 weeks): 238
    Reflects downloads up to 08 Mar 2025
