
Learning to colorize near-infrared images with limited data

  • Original Article
  • Published in Neural Computing and Applications

Abstract

With large amounts of training data, Convolutional Neural Network (CNN)-based methods have made great progress in colorizing visible grayscale images. However, existing models rarely consider colorizing near-infrared (NIR) images, which are usually captured with less color and texture information. In this paper, we propose a novel two-stage framework to recover visible color for NIR images with limited data. First, we propose a grayscale preprocessing network to cope with the visual blur and low contrast of NIR gray images; it produces a grayscale image as close as possible to the luminance of the visible ground truth. For the second stage, i.e., image colorization, a novel bilateral Res-Unet is proposed to transfer color features on both the encoder and decoder sides, improving the semantic correctness of the colorized image. To deal with limited data, a feature memory module is adopted to memorize the color features of training images, providing color conditions for the colorization network. Furthermore, a multi-feature semantic perception loss is reformulated to endow the final results with more semantic information and to increase their vividness and naturalness. To verify the proposed method, we conduct extensive experiments on four limited datasets and achieve state-of-the-art performance against existing CNN-based methods.
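To make the two-stage pipeline concrete, the following is a minimal PyTorch sketch of the idea described above. All module names, layer sizes, and the cosine-similarity memory lookup are illustrative assumptions; the paper's actual grayscale preprocessing network, bilateral Res-Unet, and feature memory module are substantially larger and differ in detail.

```python
# A minimal sketch of the two-stage NIR colorization idea (assumed forms,
# not the authors' architecture): stage 1 refines the NIR input into a
# clean grayscale image; stage 2 colorizes it, conditioned on a color
# feature retrieved from a key-value memory of training-image features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GrayscalePreprocessNet(nn.Module):
    """Stage 1: map a blurry, low-contrast NIR image to a grayscale image
    closer to the luminance of the visible ground truth."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 3, padding=1),
        )
    def forward(self, nir):
        return torch.sigmoid(self.body(nir))  # refined grayscale in [0, 1]

class FeatureMemory(nn.Module):
    """Key-value memory: keys hold semantic features of training images,
    values hold their color features. At test time the nearest key yields
    a color condition for the colorization network."""
    def __init__(self, slots=512, key_dim=128, val_dim=128):
        super().__init__()
        self.register_buffer("keys", F.normalize(torch.randn(slots, key_dim), dim=1))
        self.register_buffer("values", torch.randn(slots, val_dim))
    def forward(self, query):                              # query: (B, key_dim)
        sim = F.normalize(query, dim=1) @ self.keys.t()    # cosine similarity
        idx = sim.argmax(dim=1)                            # top-1 matching slot
        return self.values[idx]                            # (B, val_dim)

class BilateralResUnet(nn.Module):
    """Stage 2 (heavily simplified stand-in): encoder-decoder predicting ab
    chrominance from the refined grayscale plus the retrieved color code."""
    def __init__(self, val_dim=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(1, 64, 4, 2, 1), nn.ReLU(inplace=True))
        self.fuse = nn.Conv2d(64 + val_dim, 64, 1)
        self.dec = nn.Sequential(nn.ConvTranspose2d(64, 2, 4, 2, 1), nn.Tanh())
    def forward(self, gray, color_code):
        h = self.enc(gray)
        code = color_code[:, :, None, None].expand(-1, -1, h.size(2), h.size(3))
        return self.dec(self.fuse(torch.cat([h, code], dim=1)))  # ab channels

# End-to-end shape check on a dummy batch.
nir = torch.rand(2, 1, 64, 64)
gray = GrayscalePreprocessNet()(nir)
code = FeatureMemory()(torch.randn(2, 128))
ab = BilateralResUnet()(gray, code)
print(gray.shape, ab.shape)  # (2, 1, 64, 64) and (2, 2, 64, 64)
```

During training, the memory would be written with (semantic feature, color feature) pairs extracted from the limited training set; the top-1 read shown here is the simplest possible lookup, chosen only to illustrate how the retrieved color condition enters the colorization network.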


Data availability

The data that support the findings of this study are openly available in public repositories: Oxford102: https://www.robots.ox.ac.uk/~vgg/data/flowers/102/. CUB-200: http://www.vision.caltech.edu/datasets/cub_200_2011/. Hero: https://github.com/dongheehand/MemoPainter-PyTorch. VSIAD: https://github.com/huster-wgm/VSIAD.
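For readers who want to set up a colorization experiment on the listed RGB datasets, the following is a minimal sketch of building grayscale/color training pairs in CIELab space. The local folder layout ("oxford102/jpg") and the use of scikit-image for the color conversion are assumptions for illustration, not part of the paper.

```python
# A minimal sketch: turn RGB images into (L, ab) pairs, i.e., a luminance
# input and a chrominance target, for training a colorization model.
from pathlib import Path
import numpy as np
from PIL import Image
from skimage import color

def load_pair(path, size=256):
    """Return (L, ab) arrays in Lab space, normalized for network input."""
    rgb = np.asarray(Image.open(path).convert("RGB").resize((size, size))) / 255.0
    lab = color.rgb2lab(rgb)          # L in [0, 100], ab roughly in [-128, 127]
    L = lab[..., :1] / 100.0          # luminance, normalized to [0, 1]
    ab = lab[..., 1:] / 110.0         # chrominance, roughly in [-1, 1]
    return L.astype(np.float32), ab.astype(np.float32)

# Example over an unpacked Oxford102 image folder (path is an assumption).
for p in sorted(Path("oxford102/jpg").glob("*.jpg"))[:4]:
    L, ab = load_pair(p)
    print(p.name, L.shape, ab.shape)  # (256, 256, 1) and (256, 256, 2)
```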


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 62071384 and in part by the Key Research and Development Project of Shaanxi Province under Grant 2023-YBGY-239.

Author information


Corresponding author

Correspondence to Huaxin Xiao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Liu, Y., Guo, Z., Guo, H. et al. Learning to colorize near-infrared images with limited data. Neural Comput & Applic 35, 19865–19884 (2023). https://doi.org/10.1007/s00521-023-08768-7

