Abstract
With large amounts of training data, Convolutional Neural Network (CNN)-based methods have made great progress in colorizing visible grayscale images. However, existing models rarely consider colorizing near-infrared (NIR) images, which are usually captured with little color or texture information. In this paper, we propose a novel two-stage framework to recover visible color for NIR images with limited data. First, we propose a grayscale preprocessing network to cope with the visual blur and low contrast of NIR gray images; it produces grayscale images as close as possible to the ground-truth visible ones. For the second stage, i.e., image colorization, a novel bilateral Res-Unet is proposed to transfer color features on both the encoder and decoder sides, improving the semantic correctness of the colorized image. To deal with limited data, a feature memory module is adopted to memorize the color features of training images, providing color conditions for the colorization network. Furthermore, a multi-feature semantic perception loss is reformulated to endow the final results with more semantic information and to increase their vividness and naturalness. To verify the proposed method, we conduct extensive experiments on four limited datasets and achieve state-of-the-art performance compared with existing CNN-based methods.
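The feature memory module described above can be illustrated with a minimal key-value memory sketch. This is a hypothetical simplification, not the paper's actual implementation: the class name, slot layout, and cosine-similarity retrieval are illustrative assumptions. Keys stand in for content (grayscale) features and values for the stored color features that condition the colorization network.

```python
import numpy as np

class FeatureMemory:
    """Illustrative key-value feature memory (not the paper's actual API).

    Keys approximate content features of training images; values hold the
    corresponding color features. Retrieval returns the value whose key is
    most similar (cosine similarity) to the query feature.
    """

    def __init__(self, size, key_dim, value_dim):
        self.keys = np.zeros((size, key_dim))
        self.values = np.zeros((size, value_dim))
        self.size = size
        self.n = 0  # number of writes so far

    def write(self, key, value):
        # Fill slots round-robin, overwriting the oldest entry when full.
        idx = self.n % self.size
        self.keys[idx] = key
        self.values[idx] = value
        self.n += 1

    def read(self, query):
        # Cosine similarity between the query and every used key.
        used = min(self.n, self.size)
        keys = self.keys[:used]
        sims = keys @ query / (
            np.linalg.norm(keys, axis=1) * np.linalg.norm(query) + 1e-8
        )
        # Return the color feature of the best-matching key.
        return self.values[int(np.argmax(sims))]
```

At inference time, the grayscale feature of a test image would query the memory, and the retrieved color feature would be injected as a color condition, which is how a memory of this kind compensates for scarce training data.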
Data availability
Data openly available in a public repository. The data that support the findings of this study are openly available at: Oxford102: https://www.robots.ox.ac.uk/~vgg/data/flowers/102/. CUB-200: http://www.vision.caltech.edu/datasets/cub_200_2011/. Hero: https://github.com/dongheehand/MemoPainter-PyTorch. VSIAD: https://github.com/huster-wgm/VSIAD.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grant 62071384 and the Key Research and Development Project of Shaanxi Province under Grant 2023-YBGY-239.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, Y., Guo, Z., Guo, H. et al. Learning to colorize near-infrared images with limited data. Neural Comput & Applic 35, 19865–19884 (2023). https://doi.org/10.1007/s00521-023-08768-7