Visible-to-infrared image translation based on an improved CGAN

Abstract

This study proposes an infrared generative adversarial network (IR-GAN) that generates high-quality infrared (IR) images from visible images, building on a conditional generative adversarial network. IR-GAN reduces the texture loss and edge distortion that arise during infrared image generation and includes a novel generator implementing a U-Net architecture based on ConvNeXt (UConvNeXt). The generator uses two types of skip connections to better exploit shallow and deep image features during upsampling, thereby improving texture detail. IR-GAN also adds a gradient vector loss to generator training, which effectively improves the generator's edge extraction capability. In addition, a multi-scale PatchGAN discriminator is included to enrich local and global image features. Results produced by the proposed model were compared to those of the Pix2Pix and ThermalGAN architectures on the IVFG dataset and assessed using five evaluation metrics. Our method achieved a structural similarity index measure (SSIM) 10.1% higher than that of Pix2Pix and 12.4% higher than that of ThermalGAN.
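The abstract only names the gradient vector loss, but its general shape can be sketched. Below is a minimal, hypothetical PyTorch implementation of a Sobel-based gradient loss of the kind described: it assumes Sobel filters (cf. reference [42]) and an L1 penalty on the difference between gradient fields. All function and variable names (`image_gradients`, `gradient_vector_loss`) are illustrative, not the authors' code, and the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

# Horizontal Sobel kernel; the vertical kernel is its transpose (cf. [42]).
SOBEL_X = torch.tensor([[-1.0, 0.0, 1.0],
                        [-2.0, 0.0, 2.0],
                        [-1.0, 0.0, 1.0]]).view(1, 1, 3, 3)
SOBEL_Y = SOBEL_X.transpose(2, 3)


def image_gradients(img: torch.Tensor):
    """Per-channel Sobel gradients of an (N, C, H, W) image batch."""
    c = img.shape[1]
    kx = SOBEL_X.to(img).repeat(c, 1, 1, 1)  # one depthwise kernel per channel
    ky = SOBEL_Y.to(img).repeat(c, 1, 1, 1)
    gx = F.conv2d(img, kx, padding=1, groups=c)
    gy = F.conv2d(img, ky, padding=1, groups=c)
    return gx, gy


def gradient_vector_loss(fake_ir: torch.Tensor, real_ir: torch.Tensor) -> torch.Tensor:
    """L1 distance between the gradient fields of generated and real IR images.

    One plausible form of an edge-preserving loss; shown for illustration only.
    """
    fx, fy = image_gradients(fake_ir)
    rx, ry = image_gradients(real_ir)
    return (fx - rx).abs().mean() + (fy - ry).abs().mean()
```

In practice such a term would be added to the generator objective with a weighting coefficient, alongside the adversarial and pixel-wise losses.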

Data availability

The code and data used in this study are available as follows. Pix2Pix code: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix. ThermalGAN code: https://github.com/vlkniaz/ThermalGAN. VEDAI data: https://downloads.greyc.fr/vedai/. The IVFG data used to support this research were collected by the authors using a UAV equipped with a thermal infrared camera and a visible camera (coaxially installed) to capture the desired targets and scenes in a designated area.

References

  1. Han, T., Kang, W., Choi, G.: IR-UWB sensor based fall detection method using CNN algorithm. Sensors 20(20), 5948 (2020)

  2. Maheepala, M., Kouzani, A.Z., Joordens, M.A.: Light-based indoor positioning systems: A review. IEEE Sens. J. 20(8), 3971–3995 (2020)

  3. Chen, C.P., Li, H., Wei, Y., Xia, T., Tang, Y.Y.: A local contrast method for small infrared target detection. IEEE Trans. Geosci. Remote Sens. 52(1), 574–581 (2013)

  4. Yilmaz, A., Shafique, K., Shah, M.: Target tracking in airborne forward looking infrared imagery. Image Vis. Comput. 21(7), 623–635 (2003)

  5. Jacobs, P.A.: Thermal infrared characterization of ground targets and backgrounds (Vol. 70). SPIE Press (2006)

  6. Ben-Yosef, N., Rahat, B., Feigin, G.: Simulation of IR images of natural backgrounds. Appl. Opt. 22(1), 190–193 (1983)

  7. Ross, V., & Dion, D.: SMART and SMARTI: visible and IR atmospheric radiative-transfer libraries optimized for wide-band applications. In: Infrared Imaging Systems: Design, Analysis, Modeling, and Testing XXII (Vol. 8014, pp. 257–266), SPIE (2011)

  8. Dion, D.: EOSPEC: a complementary toolbox for MODTRAN calculations. In: Laser Communication and Propagation through the Atmosphere and Oceans V (Vol. 9979, pp. 239–244), SPIE (2016)

  9. Thompson, D.R., Natraj, V., Green, R.O., Helmlinger, M.C., Gao, B.C., Eastwood, M.L.: Optimal estimation for imaging spectrometer atmospheric correction. Remote Sens. Environ. 216, 355–373 (2018)

  10. Zheng, L., Sun, S., & Zhang, T.: A method for dynamic infrared image simulation under various natural conditions. In: MIPPR 2009: Multispectral Image Acquisition and Processing (Vol. 7494, p. 74940B). International Society for Optics and Photonics (2009)

  11. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. Adv. Neural Inf. Process. Syst. 27 (2014)

  12. Bai, J., Chen, R., Liu, M.: Feature-attention module for context-aware image-to-image translation. Vis. Comput. 36(10), 2145–2159 (2020)

  13. Liu, H., Li, C., Lei, D., Zhu, Q.: Unsupervised video-to-video translation with preservation of frame modification tendency. Vis. Comput. 36(10), 2105–2116 (2020)

  14. Li, L., Tang, J., Shao, Z., Tan, X., & Ma, L.: Sketch-to-photo face generation based on semantic consistency preserving and similar connected component refinement. The Visual Computer, pp. 1–18 (2021)

  15. Wang, L., Sun, Y., & Wang, Z.: CCS-GAN: a semi-supervised generative adversarial network for image classification. The Visual Computer, pp. 1–13 (2021)

  16. Abbas, F., & Babahenini, M. C.: Forest fog rendering using generative adversarial networks. The Visual Computer, pp. 1–10 (2022)

  17. Bi, F., Han, J., Tian, Y., Wang, Y.: SSGAN: generative adversarial networks for the stroke segmentation of calligraphic characters. Vis. Comput. 38(7), 2581–2590 (2022)

  18. Rao, J., Ke, A., Liu, G., & Ming, Y.: MS-GAN: multi-scale GAN with parallel class activation maps for image reconstruction. The Visual Computer, pp. 1–16 (2022)

  19. Zhang, S., Su, S., Li, L., Lu, J., Zhou, Q., Chang, C.C.: CSST-Net: an arbitrary image style transfer network of coverless steganography. Vis. Comput. 38(6), 2125–2137 (2022)

  20. Manu, C. M., & Sreeni, K. G.: GANID: a novel generative adversarial network for image dehazing. The Visual Computer, pp. 1–14 (2022)

  21. Soroush, R., & Baleghi, Y.: NIR/RGB image fusion for scene classification using deep neural networks. The Visual Computer, pp. 1–15 (2022)

  22. Reisfeld, E., & Sharf, A.: OneSketch: learning high-level shape features from simple sketches. The Visual Computer, pp. 1–12 (2022)

  23. Isola, P., Zhu, J. Y., Zhou, T., & Efros, A. A.: Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1125–1134) (2017)

  24. Kniaz, V. V., Knyaz, V. A., Hladuvka, J., Kropatsch, W. G., & Mizginov, V.: ThermalGAN: multimodal color-to-thermal image translation for person re-identification in multispectral dataset. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2018)

  25. Mizginov, V., Kniaz, V.V., & Fomin, N.: A method for synthesizing thermal images using GAN multi-layered approach. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, pp. 155–162 (2021)

  26. Li, B., Xian, Y., Su, J., Zhang, D. Q., & Guo, W. L.: I-GANs for Infrared Image Generation. Complexity, 2021 (2021)

  27. Ma, Y., Hua, Y., & Zuo, Z.: Infrared Image Generation By Pix2pix Based on Multi-receptive Field Feature Fusion. In: 2021 International Conference on Control, Automation and Information Sciences (ICCAIS) (pp. 1029–1036), IEEE (2021)

  28. Aslahishahri, M., Stanley, K. G., Duddu, H., Shirtliffe, S., Vail, S., Bett, K., & Stavness, I.: From RGB to NIR: Predicting of near infrared reflectance from visible spectrum aerial images of crops. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 1312–1322) (2021)

  29. Uddin, M.S., Hoque, R., Islam, K.A., Kwan, C., Gribben, D., Li, J.: Converting optical videos to infrared videos using attention gan and its impact on target detection and classification performance. Remote Sensing 13(16), 3257 (2021)

  30. Özkanoğlu, M.A., Ozer, S.: InfraGAN: A GAN architecture to transfer visible images to infrared domain. Pattern Recogn. Lett. 155, 69–76 (2022)

  31. Li, Y., Ko, Y., Lee, W.: RGB image-based hybrid model for automatic prediction of flashover in compartment fires. Fire Saf. J. 132, 103629 (2022)

  32. Mozaffari, M. H., Li, Y., & Ko, Y.: Detecting Flashover in a Room Fire based on the Sequence of Thermal Infrared Images using Convolutional Neural Networks. In: Proceedings of the Canadian Conference on Artificial Intelligence (2022). https://doi.org/10.21428/594757db.7c1cd4e1

  33. Mirza, M., & Osindero, S.: Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014)

  34. Schonfeld, E., Schiele, B., & Khoreva, A.: A u-net based discriminator for generative adversarial networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 8207–8216) (2020)

  35. Liu, Z., Mao, H., Wu, C. Y., Feichtenhofer, C., Darrell, T., & Xie, S.: A convnet for the 2020s. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 11976–11986) (2022)

  36. Li, C., & Wand, M.: Precomputed real-time texture synthesis with markovian generative adversarial networks. In: European conference on computer vision (pp. 702–716). Springer, Cham (2016)

  37. Chandaliya, P. K., & Nain, N.: Child Face Age Progression and Regression using Self-Attention Multi-Scale Patch GAN. IEEE/CVF IJCB, pp. 1–8 (2021)

  38. Siddique, N., Paheding, S., Elkin, C. P., & Devabhaktuni, V.: U-net and its variants for medical image segmentation: A review of theory and applications. IEEE Access (2021)

  39. Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition (pp. 248–255). IEEE (2009)

  40. Zeiler, M. D., Taylor, G. W., & Fergus, R.: Adaptive deconvolutional networks for mid and high level feature learning. In: 2011 International Conference on Computer Vision (pp. 2018–2025). IEEE (2011)

  41. Xu, J., Liu, W., Xing, W., & Wei, X.: MSPENet: multi-scale adaptive fusion and position enhancement network for human pose estimation. The Visual Computer, pp. 1–15 (2022)

  42. Sobel, I.: An isotropic 3×3 image gradient operator. Presentation at the Stanford Artificial Intelligence Project, 1968 (2014)

  43. Gupta, S., Gupta, C., & Chakarvarti, S. K.: Image Edge Detection: A Review. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 2(7) (2013)

  44. Razakarivony, S., Jurie, F.: Vehicle detection in aerial imagery: A small target detection benchmark. J. Vis. Commun. Image Represent. 34, 187–203 (2016)

  45. Hore, A., & Ziou, D.: Image quality metrics: PSNR vs. SSIM. In: 2010 20th international conference on pattern recognition (pp. 2366–2369). IEEE (2010)

  46. Wang, Z., Simoncelli, E. P., & Bovik, A. C.: Multiscale structural similarity for image quality assessment. In: The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003 (Vol. 2, pp. 1398–1402). IEEE (2003)

  47. Zhang, L., Zhang, L., Mou, X., Zhang, D.: FSIM: A feature similarity index for image quality assessment. IEEE Trans. Image Process. 20(8), 2378–2386 (2011)

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 62103432.

Author information

Corresponding author

Correspondence to Decao Ma.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ma, D., Xian, Y., Li, B. et al. Visible-to-infrared image translation based on an improved CGAN. Vis Comput 40, 1289–1298 (2024). https://doi.org/10.1007/s00371-023-02847-5
