
CIDBNet: A Consecutively-Interactive Dual-Branch Network for JPEG Compressed Image Super-Resolution

  • Conference paper

Computer Vision – ECCV 2022 Workshops (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13802)

Abstract

The compressed image super-resolution (SR) task arises in practical scenarios, such as mobile communication and the internet, where images are usually downsampled and compressed due to limited bandwidth and storage capacity. However, the combination of compression and downsampling degradations makes the SR problem more challenging. To restore high-quality, high-resolution images, both local context and long-range dependency modeling are crucial. In this paper, we propose a consecutively-interactive dual-branch network (CIDBNet) for JPEG compressed image SR that takes advantage of both convolution and transformer operations, which excel at extracting local features and modeling global interactions, respectively. To better aggregate the two-branch information, we introduce an adaptive cross-branch fusion module (ACFM), which adopts a cross-attention scheme to enhance the two-branch features and then fuses them, weighted by a content-adaptive map. Experiments show the effectiveness of CIDBNet; in particular, CIDBNet achieves higher performance than a larger variant of HAT (HAT-L).
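The abstract describes the ACFM only at a high level: each branch is enhanced by cross-attending to the other, and the enhanced features are blended with a content-adaptive map. Purely as an illustration, the following is a minimal NumPy sketch of that idea on flattened token features of shape (N, C); the function names, shapes, and the linear-projection fusion map are assumptions for the sketch, not the paper's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_attention(queries, keys_values):
    """Attend from one branch's tokens (queries) to the other branch's
    tokens (keys/values). Both inputs have shape (N, C)."""
    scale = queries.shape[-1] ** -0.5
    attn = softmax(queries @ keys_values.T * scale, axis=-1)  # (N, N)
    return attn @ keys_values                                 # (N, C)

def acfm(conv_feat, trans_feat, w_map):
    """Adaptive cross-branch fusion sketch: enhance each branch with
    cross-attention to the other, then fuse with a content-adaptive
    map m in (0, 1). w_map has shape (2C, C)."""
    conv_enh = conv_feat + cross_attention(conv_feat, trans_feat)
    trans_enh = trans_feat + cross_attention(trans_feat, conv_feat)
    # Content-adaptive fusion map from a linear projection of the
    # concatenated enhanced features (a stand-in for learned layers).
    m = sigmoid(np.concatenate([conv_enh, trans_enh], axis=-1) @ w_map)
    return m * conv_enh + (1.0 - m) * trans_enh
```

As a usage sketch, with N = 16 tokens of C = 8 channels per branch, `acfm(conv, trans, w)` returns a fused (16, 8) feature map; the sigmoid keeps the fusion weights per position and channel, so the blend adapts to content rather than using a fixed scalar.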

X. Qin and Y. Zhu contributed equally to this work.



Acknowledgements

This work was supported by the National Key Research and Development Program of China (Grant No. 2021ZD0201504) and the National Natural Science Foundation of China (No. 62106267).

Author information

Corresponding authors

Correspondence to Chenghua Li or Jian Cheng.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Qin, X., Zhu, Y., Li, C., Wang, P., Cheng, J. (2023). CIDBNet: A Consecutively-Interactive Dual-Branch Network for JPEG Compressed Image Super-Resolution. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13802. Springer, Cham. https://doi.org/10.1007/978-3-031-25063-7_28


  • DOI: https://doi.org/10.1007/978-3-031-25063-7_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25062-0

  • Online ISBN: 978-3-031-25063-7

