Abstract
Most studies on convolutional neural network (CNN)-based image processing have proposed networks that can be optimized for a single level. Here, the term "level" refers to the specific objective defined for each task, such as the degree of noise in denoising tasks. Hence, these networks underperform on other levels and must be retrained to deliver optimal performance. Using multiple models to cover multiple levels incurs very high computational costs. To solve these problems, recent approaches train a network on two different levels and propose modulation methods to enable arbitrary intermediate levels. However, many of them 1) have difficulty adapting from one level to the other, 2) suffer from unintended artifacts at intermediate levels, or 3) require large memory and computational cost. In this paper, we propose a novel framework using a Filter Transition Network (FTN), a non-linear module that easily adapts to new levels, is regularized to prevent undesirable side effects, and is extremely lightweight because it is a data-independent module. Additionally, for stable learning of the FTN, we propose a method to initialize nonlinear CNNs with identity mappings. Extensive results on various image processing tasks indicate that the FTN is stable in both adaptation and modulation, with performance comparable to that of other, heavier frameworks.
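The abstract mentions initializing nonlinear CNNs as identity mappings so that training of the modulation module starts from a neutral state. The paper's exact scheme is not reproduced here; as a rough illustration of the identity-mapping idea only, the sketch below (in NumPy, with a naive convolution written for clarity) builds a convolution weight that is zero everywhere except a 1 at the kernel centre for each matching channel pair, so the layer initially passes features through unchanged:

```python
import numpy as np

def identity_conv_weight(channels, k=3):
    """k x k conv weight acting as an identity mapping: zeros everywhere
    except a 1 at the kernel centre, one per matching in/out channel."""
    w = np.zeros((channels, channels, k, k))
    c = k // 2
    for ch in range(channels):
        w[ch, ch, c, c] = 1.0
    return w

def conv2d_same(x, w):
    """Naive 'same'-padded 2-D convolution. x: (C, H, W), w: (Cout, Cin, k, k)."""
    cout, cin, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, H, W = x.shape
    y = np.zeros((cout, H, W))
    for o in range(cout):
        for i in range(cin):
            for u in range(k):
                for v in range(k):
                    y[o] += w[o, i, u, v] * xp[i, u:u + H, v:v + W]
    return y

x = np.random.rand(4, 8, 8)        # a random 4-channel feature map
w = identity_conv_weight(4)
out = conv2d_same(x, w)            # identical to x at initialization
```

Handling the interaction with nonlinear activations (the harder part the paper addresses) is beyond this sketch, which only shows the linear-layer case.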
Data Availability
Our manuscript has no associated data, and the source code will be uploaded to GitHub.
Acknowledgements
This research was supported by the R&D program for Advanced Integrated-Intelligence for Identification (AIID) through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (NRF-2018M3E3A1057289).
Additional information
Communicated by Chen Change Loy.
Cite this article
Lee, H., Kim, T., Son, H. et al. A Nonlinear, Regularized, and Data-independent Modulation for Continuously Interactive Image Processing Network. Int J Comput Vis 132, 74–94 (2024). https://doi.org/10.1007/s11263-023-01874-y