DOI: 10.1145/3581783.3612353

Separable Modulation Network for Efficient Image Super-Resolution

Published: 27 October 2023 Publication History

Abstract

Deep learning-based models have demonstrated unprecedented success in image super-resolution (SR) tasks. Lately, however, lightweight SR models have received increasing attention due to the growing demand for on-device inference. In this paper, we propose a novel Separable Modulation Network (SMN) for efficient image SR. The key components of SMN are the Separable Modulation Unit (SMU) and the Locality Self-enhanced Network (LSN). SMU enables global relational interactions but significantly eases the process by separating spatial modulation from channel aggregation, making long-range interaction efficient. Specifically, spatial modulation extracts global contexts along the spatial dimension, channel aggregation condenses all global context features into a channel modulator, and the aggregated contexts are finally fused into the output features. In addition, LSN guides the network toward finer image attributes by encoding local contextual information. By coupling these two complementary components, SMN captures both short- and long-range contexts for accurate image reconstruction. Extensive experimental results demonstrate that SMN achieves state-of-the-art performance among existing efficient SR methods with lower complexity.
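To illustrate the separation the abstract describes, below is a minimal NumPy sketch of the general idea: a spatial step that summarizes global context per channel, followed by a channel step that condenses those contexts into a modulator which re-weights the features. This is not the paper's actual SMU; the pooling and projection choices (global average pooling, a single sigmoid-gated linear projection) and all names are illustrative assumptions. The point of the separation is cost: the interaction is linear in the number of spatial positions, rather than quadratic as in full self-attention.

```python
import numpy as np

def separable_modulation(x, seed=0):
    """Hypothetical sketch of a separable modulation step.

    x: feature map of shape (C, H, W).
    Spatial step: summarize global spatial context per channel.
    Channel step: condense the contexts into a channel modulator
    that re-weights the input features.
    """
    C, H, W = x.shape

    # Spatial modulation: one global context value per channel
    # (plain global average pooling as a stand-in).
    spatial_ctx = x.mean(axis=(1, 2))  # shape (C,)

    # Channel aggregation: project the contexts into a modulator
    # (random weights here purely for illustration).
    rng = np.random.default_rng(seed)
    w_agg = rng.standard_normal((C, C)) / np.sqrt(C)
    modulator = 1.0 / (1.0 + np.exp(-(w_agg @ spatial_ctx)))  # sigmoid gate, (C,)

    # Fusion: modulate the input features channel-wise.
    return x * modulator[:, None, None]

x = np.ones((4, 8, 8), dtype=np.float32)
y = separable_modulation(x)
print(y.shape)  # → (4, 8, 8)
```

Because the spatial summary collapses H×W positions before any cross-channel mixing, the cost is O(HW·C + C²) per map instead of the O((HW)²·C) of dense pairwise attention.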

Cited By

  • (2024) A Channel-Wise Multi-Scale Network for Single Image Super-Resolution. IEEE Signal Processing Letters, 31, 805-809. DOI: 10.1109/LSP.2024.3372781
  • (2024) Positive incentive CNN structure coupled nonconvex model for image super-resolution. Physica Scripta, 99(6), 065249. DOI: 10.1088/1402-4896/ad4215. Published 21 May 2024
  • (2023) RepECN: Making ConvNets Better Again for Efficient Image Super-Resolution. Sensors, 23(23), 9575. DOI: 10.3390/s23239575. Published 2 Dec 2023


Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia
October 2023
9913 pages
ISBN:9798400701085
DOI:10.1145/3581783

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. attention
  2. deep learning
  3. efficiency
  4. image super-resolution

Qualifiers

  • Research-article

Funding Sources

  • NSFC

Conference

MM '23: The 31st ACM International Conference on Multimedia
October 29 - November 3, 2023
Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
