
A Nonlinear, Regularized, and Data-independent Modulation for Continuously Interactive Image Processing Network

Published in: International Journal of Computer Vision

Abstract

Most studies on convolutional neural network (CNN)-based image processing propose networks that can be optimized only for a single level. Here, the term "level" refers to the specific objective defined for each task, such as the degree of noise in a denoising task. Consequently, these networks underperform at other levels and must be retrained to deliver optimal performance, while covering multiple levels with multiple models incurs very high computational costs. To address these problems, recent approaches train a network on two different levels and propose modulation methods to reach arbitrary intermediate levels. However, many of them (1) have difficulty adapting from one level to the other, (2) suffer from unintended artifacts at intermediate levels, or (3) require large memory and computational cost. In this paper, we propose a novel framework based on the Filter Transition Network (FTN), a nonlinear module that adapts easily to new levels, is regularized to prevent undesirable side effects, and is extremely lightweight because it is data-independent. Additionally, for stable learning of the FTN, we propose a method to initialize nonlinear CNNs with identity mappings. Extensive results on various image processing tasks indicate that the FTN is stable in both adaptation and modulation and performs comparably to other, much heavier frameworks.
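The abstract leaves the FTN's internals to the main text; as an illustration only, the PyTorch sketch below shows one plausible shape for a data-independent, identity-initialized filter modulation: a small nonlinear network that rewrites a convolution kernel and blends it with the original via a coefficient alpha. The class name FTNSketch, the two 1x1 convolutions, the PReLU slope trick, and the alpha-blending are assumptions made for illustration, not the authors' exact design.

```python
import torch
import torch.nn as nn


class FTNSketch(nn.Module):
    """Hypothetical sketch of a filter-transition-style module.

    It transforms a main-network conv *kernel* (never feature maps, hence
    data-independent) and blends original and transformed kernels with a
    coefficient alpha in [0, 1] to realize intermediate levels.
    """

    def __init__(self, in_ch: int):
        super().__init__()
        # Two 1x1 convolutions with a nonlinearity in between, applied to
        # the kernel tensor itself rather than to image features.
        self.conv1 = nn.Conv2d(in_ch, in_ch, kernel_size=1)
        self.act = nn.PReLU(init=1.0)  # slope 1.0 => exact identity at init
        self.conv2 = nn.Conv2d(in_ch, in_ch, kernel_size=1)
        for conv in (self.conv1, self.conv2):
            self._init_identity(conv)

    @staticmethod
    def _init_identity(conv: nn.Conv2d) -> None:
        # Identity initialization: the module returns its input unchanged
        # before training, so the pretrained main network is not disturbed.
        with torch.no_grad():
            conv.weight.zero_()
            for i in range(conv.out_channels):
                conv.weight[i, i, 0, 0] = 1.0
            conv.bias.zero_()

    def forward(self, kernel: torch.Tensor, alpha: float) -> torch.Tensor:
        # kernel: (out_ch, in_ch, k, k) weights of one main-network conv layer;
        # dim 0 acts as the batch dimension for the 1x1 convolutions.
        transformed = self.conv2(self.act(self.conv1(kernel)))
        # Interpolate between the original filters (alpha = 0, source level)
        # and the fully transformed filters (alpha = 1, target level).
        return (1.0 - alpha) * kernel + alpha * transformed


# Usage sketch with placeholder weights for a 3x3 conv layer.
ftn = FTNSketch(in_ch=64)
base = torch.randn(64, 64, 3, 3)            # stand-in for pretrained weights
mid_level = ftn(base, alpha=0.5)            # an arbitrary intermediate level
assert torch.allclose(mid_level, base)      # identity holds at initialization
```

Because such a module consumes kernels rather than feature maps, its cost is independent of image resolution, which is consistent with the abstract's claim that a data-independent module is extremely lightweight.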


Data Availability

Our manuscript has no associated data, and the source code will be uploaded to GitHub.


Acknowledgements

This research was supported by the R&D program for Advanced Integrated-intelligence for Identification (AIID) through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (NRF-2018M3E3A1057289).

Author information


Corresponding author

Correspondence to Sangyoun Lee.

Additional information

Communicated by Chen Change Loy.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Lee, H., Kim, T., Son, H. et al. A Nonlinear, Regularized, and Data-independent Modulation for Continuously Interactive Image Processing Network. Int J Comput Vis 132, 74–94 (2024). https://doi.org/10.1007/s11263-023-01874-y

