Visible-to-infrared image translation based on an improved CGAN

Abstract

This study proposes an infrared generative adversarial network (IR-GAN) that generates high-quality infrared (IR) images from visible images, building on a conditional generative adversarial network. IR-GAN reduces the texture loss and edge distortion that arise during infrared image generation and includes a novel generator implementing a U-Net architecture based on ConvNeXt (UConvNeXt). The generator uses two types of skip connections to better exploit shallow and deep image features during upsampling, thereby improving texture detail. IR-GAN also adds a gradient vector loss to generator training, which effectively improves the generator's edge extraction capability. In addition, a multi-scale PatchGAN discriminator is included to enrich local and global image features. Results produced by the proposed model were compared to those of the Pix2Pix and ThermalGAN architectures on the IVFG dataset and assessed using five evaluation metrics. Our method achieved a structural similarity index measure (SSIM) 10.1% higher than that of Pix2Pix and 12.4% higher than that of ThermalGAN.
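The abstract only names the gradient vector loss, but its general shape can be sketched. Below is a minimal, hypothetical PyTorch implementation of a Sobel-based gradient loss of the kind described: it assumes Sobel filters (cf. reference [42]) and an L1 penalty on the difference between gradient fields. All function and variable names (`image_gradients`, `gradient_vector_loss`) are illustrative, not the authors' code, and the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

# Horizontal Sobel kernel; the vertical kernel is its transpose (cf. [42]).
SOBEL_X = torch.tensor([[-1.0, 0.0, 1.0],
                        [-2.0, 0.0, 2.0],
                        [-1.0, 0.0, 1.0]]).view(1, 1, 3, 3)
SOBEL_Y = SOBEL_X.transpose(2, 3)


def image_gradients(img: torch.Tensor):
    """Per-channel Sobel gradients of an (N, C, H, W) image batch."""
    c = img.shape[1]
    kx = SOBEL_X.to(img).repeat(c, 1, 1, 1)  # one depthwise kernel per channel
    ky = SOBEL_Y.to(img).repeat(c, 1, 1, 1)
    gx = F.conv2d(img, kx, padding=1, groups=c)
    gy = F.conv2d(img, ky, padding=1, groups=c)
    return gx, gy


def gradient_vector_loss(fake_ir: torch.Tensor, real_ir: torch.Tensor) -> torch.Tensor:
    """L1 distance between the gradient fields of generated and real IR images.

    One plausible form of an edge-preserving loss; shown for illustration only.
    """
    fx, fy = image_gradients(fake_ir)
    rx, ry = image_gradients(real_ir)
    return (fx - rx).abs().mean() + (fy - ry).abs().mean()
```

In practice such a term would be added to the generator objective with a weighting coefficient, alongside the adversarial and pixel-wise losses.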

Data availability

The code and data used in this study are available as follows. Pix2Pix code: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix. ThermalGAN code: https://github.com/vlkniaz/ThermalGAN. VEDAI data: https://downloads.greyc.fr/vedai/. The IVFG data used to support this research were collected by the authors using a UAV equipped with a thermal infrared camera and a visible camera (coaxially installed) to capture the desired targets and scenes in a designated area.

References

  1. Han, T., Kang, W., Choi, G.: IR-UWB sensor based fall detection method using CNN algorithm. Sensors 20(20), 5948 (2020)

  2. Maheepala, M., Kouzani, A.Z., Joordens, M.A.: Light-based indoor positioning systems: A review. IEEE Sens. J. 20(8), 3971–3995 (2020)

  3. Chen, C.P., Li, H., Wei, Y., Xia, T., Tang, Y.Y.: A local contrast method for small infrared target detection. IEEE Trans. Geosci. Remote Sens. 52(1), 574–581 (2013)

  4. Yilmaz, A., Shafique, K., Shah, M.: Target tracking in airborne forward looking infrared imagery. Image Vis. Comput. 21(7), 623–635 (2003)

  5. Jacobs, P.A.: Thermal infrared characterization of ground targets and backgrounds (Vol. 70). SPIE Press (2006)

  6. Ben-Yosef, N., Rahat, B., Feigin, G.: Simulation of IR images of natural backgrounds. Appl. Opt. 22(1), 190–193 (1983)

  7. Ross, V., & Dion, D.: SMART and SMARTI: visible and IR atmospheric radiative-transfer libraries optimized for wide-band applications. In: Infrared Imaging Systems: Design, Analysis, Modeling, and Testing XXII (Vol. 8014, pp. 257–266), SPIE (2011)

  8. Dion, D.: EOSPEC: a complementary toolbox for MODTRAN calculations. In: Laser Communication and Propagation through the Atmosphere and Oceans V (Vol. 9979, pp. 239–244), SPIE (2016)

  9. Thompson, D.R., Natraj, V., Green, R.O., Helmlinger, M.C., Gao, B.C., Eastwood, M.L.: Optimal estimation for imaging spectrometer atmospheric correction. Remote Sens. Environ. 216, 355–373 (2018)

  10. Zheng, L., Sun, S., & Zhang, T.: A method for dynamic infrared image simulation under various natural conditions. In: MIPPR 2009: Multispectral Image Acquisition and Processing (Vol. 7494, p. 74940B). International Society for Optics and Photonics (2009)

  11. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. Adv. Neural Inf. Process. Syst. 27 (2014)

  12. Bai, J., Chen, R., Liu, M.: Feature-attention module for context-aware image-to-image translation. Vis. Comput. 36(10), 2145–2159 (2020)

  13. Liu, H., Li, C., Lei, D., Zhu, Q.: Unsupervised video-to-video translation with preservation of frame modification tendency. Vis. Comput. 36(10), 2105–2116 (2020)

  14. Li, L., Tang, J., Shao, Z., Tan, X., & Ma, L.: Sketch-to-photo face generation based on semantic consistency preserving and similar connected component refinement. The Visual Computer, pp. 1–18 (2021)

  15. Wang, L., Sun, Y., & Wang, Z.: CCS-GAN: a semi-supervised generative adversarial network for image classification. The Visual Computer, pp. 1–13 (2021)

  16. Abbas, F., & Babahenini, M. C.: Forest fog rendering using generative adversarial networks. The Visual Computer, pp. 1–10 (2022)

  17. Bi, F., Han, J., Tian, Y., Wang, Y.: SSGAN: generative adversarial networks for the stroke segmentation of calligraphic characters. Vis. Comput. 38(7), 2581–2590 (2022)

  18. Rao, J., Ke, A., Liu, G., & Ming, Y.: MS-GAN: multi-scale GAN with parallel class activation maps for image reconstruction. The Visual Computer, pp. 1–16 (2022)

  19. Zhang, S., Su, S., Li, L., Lu, J., Zhou, Q., Chang, C.C.: CSST-Net: an arbitrary image style transfer network of coverless steganography. Vis. Comput. 38(6), 2125–2137 (2022)

  20. Manu, C. M., & Sreeni, K. G.: GANID: a novel generative adversarial network for image dehazing. The Visual Computer, pp. 1–14 (2022)

  21. Soroush, R., & Baleghi, Y.: NIR/RGB image fusion for scene classification using deep neural networks. The Visual Computer, pp. 1–15 (2022)

  22. Reisfeld, E., & Sharf, A.: OneSketch: learning high-level shape features from simple sketches. The Visual Computer, pp. 1–12 (2022)

  23. Isola, P., Zhu, J. Y., Zhou, T., & Efros, A. A.: Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1125–1134) (2017)

  24. Kniaz, V. V., Knyaz, V. A., Hladuvka, J., Kropatsch, W. G., & Mizginov, V.: ThermalGAN: multimodal color-to-thermal image translation for person re-identification in multispectral dataset. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2018)

  25. Mizginov, V., Kniaz, V.V., & Fomin, N.: A method for synthesizing thermal images using GAN multi-layered approach. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, pp. 155–162 (2021)

  26. Li, B., Xian, Y., Su, J., Zhang, D. Q., & Guo, W. L.: I-GANs for Infrared Image Generation. Complexity, 2021 (2021)

  27. Ma, Y., Hua, Y., & Zuo, Z.: Infrared Image Generation By Pix2pix Based on Multi-receptive Field Feature Fusion. In: 2021 International Conference on Control, Automation and Information Sciences (ICCAIS) (pp. 1029–1036), IEEE (2021)

  28. Aslahishahri, M., Stanley, K. G., Duddu, H., Shirtliffe, S., Vail, S., Bett, K., & Stavness, I.: From RGB to NIR: Predicting of near infrared reflectance from visible spectrum aerial images of crops. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 1312–1322) (2021)

  29. Uddin, M.S., Hoque, R., Islam, K.A., Kwan, C., Gribben, D., Li, J.: Converting optical videos to infrared videos using attention gan and its impact on target detection and classification performance. Remote Sensing 13(16), 3257 (2021)

  30. Özkanoğlu, M.A., Ozer, S.: InfraGAN: A GAN architecture to transfer visible images to infrared domain. Pattern Recogn. Lett. 155, 69–76 (2022)

  31. Li, Y., Ko, Y., Lee, W.: RGB image-based hybrid model for automatic prediction of flashover in compartment fires. Fire Saf. J. 132, 103629 (2022)

  32. Mozaffari, M. H., Li, Y., & Ko, Y.: Detecting Flashover in a Room Fire based on the Sequence of Thermal Infrared Images using Convolutional Neural Networks. In: Proceedings of the Canadian Conference on Artificial Intelligence (2022). https://doi.org/10.21428/594757db.7c1cd4e1

  33. Mirza, M., & Osindero, S.: Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014)

  34. Schonfeld, E., Schiele, B., & Khoreva, A.: A u-net based discriminator for generative adversarial networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 8207–8216) (2020)

  35. Liu, Z., Mao, H., Wu, C. Y., Feichtenhofer, C., Darrell, T., & Xie, S.: A convnet for the 2020s. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 11976–11986) (2022)

  36. Li, C., & Wand, M.: Precomputed real-time texture synthesis with markovian generative adversarial networks. In: European conference on computer vision (pp. 702–716). Springer, Cham (2016)

  37. Chandaliya, P. K., & Nain, N.: Child Face Age Progression and Regression using Self-Attention Multi-Scale Patch GAN. IEEE/CVF IJCB, pp. 1–8 (2021)

  38. Siddique, N., Paheding, S., Elkin, C. P., & Devabhaktuni, V.: U-net and its variants for medical image segmentation: A review of theory and applications. IEEE Access (2021)

  39. Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition (pp. 248–255). IEEE (2009)

  40. Zeiler, M. D., Taylor, G. W., & Fergus, R.: Adaptive deconvolutional networks for mid and high level feature learning. In: 2011 International Conference on Computer Vision (pp. 2018–2025). IEEE (2011)

  41. Xu, J., Liu, W., Xing, W., & Wei, X.: MSPENet: multi-scale adaptive fusion and position enhancement network for human pose estimation. The Visual Computer, pp. 1–15 (2022)

  42. Sobel, I.: An isotropic 3×3 image gradient operator. Presentation at the Stanford Artificial Intelligence Project, 1968 (2014)

  43. Gupta, S., Gupta, C., & Chakarvarti, S. K.: Image Edge Detection: A Review. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 2(7) (2013)

  44. Razakarivony, S., Jurie, F.: Vehicle detection in aerial imagery: A small target detection benchmark. J. Vis. Commun. Image Represent. 34, 187–203 (2016)

  45. Hore, A., & Ziou, D.: Image quality metrics: PSNR vs. SSIM. In: 2010 20th international conference on pattern recognition (pp. 2366–2369). IEEE (2010)

  46. Wang, Z., Simoncelli, E. P., & Bovik, A. C.: Multiscale structural similarity for image quality assessment. In: The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003 (Vol. 2, pp. 1398–1402). IEEE (2003)

  47. Zhang, L., Zhang, L., Mou, X., Zhang, D.: FSIM: A feature similarity index for image quality assessment. IEEE Trans. Image Process. 20(8), 2378–2386 (2011)

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 62103432.

Author information

Corresponding author

Correspondence to Decao Ma.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ma, D., Xian, Y., Li, B. et al. Visible-to-infrared image translation based on an improved CGAN. Vis Comput 40, 1289–1298 (2024). https://doi.org/10.1007/s00371-023-02847-5
