Abstract
Deep convolutional neural networks, particularly large models with large kernels (3 × 3 or more), have achieved significant progress in single image super-resolution (SISR) tasks. However, the heavy computational footprint of such models prevents their deployment in real-time, resource-constrained environments. Conversely, 1 × 1 convolutions have substantial computational efficiency, but struggle with aggregating local spatial representations, which is an essential capability for SISR models. In response to this dichotomy, we propose to harmonize the merits of both 3 × 3 and 1 × 1 kernels, and exploit their great potential for lightweight SISR tasks. Specifically, we propose a simple yet effective fully 1 × 1 convolutional network, named shift-Conv-based network (SCNet). By incorporating a parameter-free spatial-shift operation, the fully 1 × 1 convolutional network is equipped with a powerful representation capability and impressive computational efficiency. Extensive experiments demonstrate that SCNets, despite their fully 1 × 1 convolutional structure, consistently match or even surpass the performance of existing lightweight SR models that employ regular convolutions. The code and pretrained models can be found at https://github.com/Aitical/SCNet.
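To make the shift-then-1 × 1 idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a five-group channel split with four shift directions, a shift step of one pixel and zero padding at the borders; the function names (spatial_shift, pointwise_conv, shift_conv, conv_params) are hypothetical and chosen only for illustration. The released SCNet code may use a different grouping or set of directions.

```python
import numpy as np


def spatial_shift(x, step=1):
    """Parameter-free spatial shift on a (C, H, W) feature map.

    Assumption for illustration: channels are split into five groups.
    Four groups are moved by `step` pixels (right, left, down, up) with
    zero padding, and the remaining channels stay in place.
    """
    assert step >= 1, "step must be a positive integer"
    c = x.shape[0]
    g = c // 5
    out = np.zeros_like(x)
    out[0 * g:1 * g, :, step:] = x[0 * g:1 * g, :, :-step]   # content moves right
    out[1 * g:2 * g, :, :-step] = x[1 * g:2 * g, :, step:]   # content moves left
    out[2 * g:3 * g, step:, :] = x[2 * g:3 * g, :-step, :]   # content moves down
    out[3 * g:4 * g, :-step, :] = x[3 * g:4 * g, step:, :]   # content moves up
    out[4 * g:] = x[4 * g:]                                  # unshifted channels
    return out


def pointwise_conv(x, weight):
    """1 x 1 convolution: a per-pixel linear map over channels.

    x has shape (C_in, H, W); weight has shape (C_out, C_in); bias is omitted.
    """
    return np.einsum("oc,chw->ohw", weight, x)


def shift_conv(x, weight, step=1):
    """Shift first, then mix channels with a 1 x 1 convolution."""
    return pointwise_conv(spatial_shift(x, step), weight)


def conv_params(c_in, c_out, k):
    """Weight count of a k x k convolution layer, ignoring bias."""
    return k * k * c_in * c_out


if __name__ == "__main__":
    # A single impulse at the centre of every channel.
    x = np.zeros((5, 3, 3))
    x[:, 1, 1] = 1.0
    y = shift_conv(x, np.ones((1, 5)))
    print(y[0])  # cross-shaped pattern around the centre pixel
    print(conv_params(64, 64, 1), conv_params(64, 64, 3))
```

In this sketch the shift itself has no learnable weights, so the layer costs the same number of parameters as a plain 1 × 1 convolution (one ninth of a 3 × 3 layer with the same channel counts), while each output pixel still mixes information from its four neighbours and itself.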
Acknowledgements
The research was supported by the National Natural Science Foundation of China, China (Nos. U23B2009 and 92270116), and was partially supported by the Fundamental Research Funds for the Central Universities, China.
Ethics declarations
The authors declare that they have no conflicts of interest related to this work.
Additional information
Gang Wu received the B. Eng. degree in computer science from the School of Computer Science and Technology, Soochow University, China in 2020. He is currently a Ph. D. candidate in the Faculty of Computing, Harbin Institute of Technology, China.
His research interests include image restoration, representation learning and self-supervised learning.
Junjun Jiang received the B. Sc. degree in mathematics from Huaqiao University, China in 2009, and the Ph. D. degree in computer science from Wuhan University, China in 2014. From 2015 to 2018, he was an associate professor with the School of Computer Science, China University of Geosciences, China. From 2016 to 2018, he was a project researcher with the National Institute of Informatics (NII), Japan. He is currently a professor with the School of Computer Science and Technology, Harbin Institute of Technology, China. He won the Best Student Paper Runner-up Award at MMM 2017, was a finalist for the World’s FIRST 10K Best Paper Award at ICME 2017, and won the Best Paper Award at IFTC 2018. He received the 2016 China Computer Federation (CCF) Outstanding Doctoral Dissertation Award and the 2015 ACM Wuhan Doctoral Dissertation Award.
His research interests include image processing and computer vision.
Kui Jiang received the M. Eng. and Ph. D. degrees in computer science from the School of Computer Science, Wuhan University, China in 2019 and 2022, respectively. Before July 2023, he was a research scientist with the Cloud BU, Huawei, China. He is currently an associate professor with the School of Computer Science and Technology, Harbin Institute of Technology, China. He received the 2022 ACM Wuhan Doctoral Dissertation Award, China.
His research interests include image/video processing and computer vision.
Xianming Liu received the B. Sc., M. Sc., and Ph. D. degrees in computer science from the Harbin Institute of Technology (HIT), China in 2006, 2008 and 2012, respectively. In 2011, he spent half a year at the Department of Electrical and Computer Engineering, McMaster University, Canada, as a visiting student, where he was a post-doctoral fellow from 2012 to 2013. He was a project researcher with the National Institute of Informatics (NII), Japan from 2014 to 2017. He is currently a professor with the School of Computer Science and Technology, HIT, China. He was a recipient of the IEEE ICME 2016 Best Student Paper Award.
His research interests include trustworthy AI, computational imaging, biomedical signal compression, and 3D signal processing and analysis.
Cite this article
Wu, G., Jiang, J., Jiang, K. et al. Fully 1 × 1 Convolutional Network for Lightweight Image Super-resolution. Mach. Intell. Res. 21, 1062–1076 (2024). https://doi.org/10.1007/s11633-024-1501-9
DOI: https://doi.org/10.1007/s11633-024-1501-9