TVA-GAN: attention guided generative adversarial network for thermal to visible image transformations

  • Original Article
Neural Computing and Applications

Abstract

Among recent deep learning approaches to realistic image generation and translation, Generative Adversarial Networks (GANs) have delivered favorable results: a GAN generates novel samples that look indistinguishable from authentic images. This paper proposes a novel generative network for thermal-to-visible image translation. Thermal-to-visible synthesis is challenging because accurate semantic and textural information is not available in thermal images. Thermal sensors acquire face images by capturing the object's luminance, with few details of the actual facial appearance. However, thermal imaging is advantageous for low-light and night-time vision, where an RGB camera cannot capture image information in a complex environment. We design a new Attention-guided cyclic Generative Adversarial Network for Thermal-to-Visible face transformation (TVA-GAN) by integrating a new attention network. We use attention guidance with a recurrent block that incorporates an Inception module to simplify the learning space toward the optimum solution. The proposed TVA-GAN is trained and evaluated for thermal-to-visible face synthesis on three benchmark datasets: WHU-IIP, Tufts Face Thermal2RGB, and CVBL-CHILD. The results show promising improvement in face synthesis over state-of-the-art GAN methods. Code for the proposed TVA-GAN is available at: https://github.com/GANGREEK/TVA-GAN.
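To make the design described in the abstract more concrete, the following is a minimal PyTorch sketch of an attention-guided generator that combines a recurrent convolutional block and an Inception-style block, and blends the translated content with the input through a learned attention mask. It is an illustrative approximation only: the channel widths, block layouts, and attention head here are assumptions made for demonstration and do not reproduce the actual TVA-GAN architecture; the authors' official implementation is available at the GitHub link above.

# Minimal PyTorch sketch of an attention-guided thermal-to-visible generator.
# Illustrative approximation, NOT the authors' TVA-GAN implementation: the
# recurrent block, Inception branch widths, and attention head are simplified
# assumptions. See https://github.com/GANGREEK/TVA-GAN for the official code.
import torch
import torch.nn as nn


class RecurrentConv(nn.Module):
    """Convolution applied recurrently: the output is fed back with the input t times."""
    def __init__(self, channels, t=2):
        super().__init__()
        self.t = t
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        out = self.conv(x)
        for _ in range(self.t):
            out = self.conv(x + out)
        return out


class InceptionBlock(nn.Module):
    """Parallel 1x1 / 3x3 / 5x5 and pooling branches whose outputs are concatenated."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        branch_ch = out_ch // 4
        self.b1 = nn.Conv2d(in_ch, branch_ch, 1)
        self.b3 = nn.Conv2d(in_ch, branch_ch, 3, padding=1)
        self.b5 = nn.Conv2d(in_ch, branch_ch, 5, padding=2)
        self.pool = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                  nn.Conv2d(in_ch, out_ch - 3 * branch_ch, 1))

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.pool(x)], dim=1)


class AttentionGuidedGenerator(nn.Module):
    """Translates a 1-channel thermal image to a 3-channel visible image.

    A per-pixel attention mask decides how much translated content versus
    (replicated) input is kept in the final output.
    """
    def __init__(self, base=64):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(1, base, 7, padding=3), nn.InstanceNorm2d(base), nn.ReLU(True),
            nn.Conv2d(base, base * 2, 3, stride=2, padding=1),
            nn.InstanceNorm2d(base * 2), nn.ReLU(True),
        )
        self.body = nn.Sequential(RecurrentConv(base * 2),
                                  InceptionBlock(base * 2, base * 2),
                                  RecurrentConv(base * 2))
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(base * 2, base, 3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(base), nn.ReLU(True),
        )
        self.to_rgb = nn.Sequential(nn.Conv2d(base, 3, 7, padding=3), nn.Tanh())
        self.to_attn = nn.Sequential(nn.Conv2d(base, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, thermal):
        feat = self.decode(self.body(self.encode(thermal)))
        content = self.to_rgb(feat)               # candidate visible image
        attn = self.to_attn(feat)                 # attention mask in [0, 1]
        background = thermal.repeat(1, 3, 1, 1)   # pass-through of the input
        return attn * content + (1 - attn) * background


if __name__ == "__main__":
    g = AttentionGuidedGenerator()
    fake_visible = g(torch.randn(2, 1, 128, 128))  # -> (2, 3, 128, 128)
    print(fake_visible.shape)

In a full cycle-consistent setup of this kind, a second generator mapping visible back to thermal and two discriminators would be trained jointly with adversarial and cycle-consistency losses, as in CycleGAN-style frameworks.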

Data availability

The WHU-IIP, Tufts Face Thermal2RGB (http://tdface.ece.tufts.edu/), and CVBL-CHILD datasets (https://cvbl.iiita.ac.in/dataset.php) used in the paper were taken from [67,68,69], respectively. Datasets are available from the authors upon reasonable request.

Notes

  1. https://github.com/junyanz/pytorch-CycleGAN-and-Pix2pix.

  2. https://github.com/duxingren14/DualGAN.

  3. https://github.com/AlamiMejjati/Unsupervised-Attention-guided-Image-to-Image-Translation.

  4. https://github.com/Ha0Tang/AttentionGAN.

  5. https://github.com/vlkniaz/ThermalGAN.

  6. https://cvbl.iiita.ac.in/dataset.php.

References

  1. Siesler HW, Ozaki Y, Kawata S, Heise HM (2008) Near-infrared spectroscopy: principles, instruments, applications. John Wiley & Sons, London

  2. Havens KJ, Sharp EJ (2016) Chapter 7—thermal imagers and system considerations. In: Havens KJ, Sharp EJ (eds) Thermal imaging techniques to survey and monitor animals in the wild. Academic Press, Boston, pp 101–119. https://doi.org/10.1016/B978-0-12-803384-5.00007-5

  3. Banfield D, Conrath B, Pearl J, Smith M, Christensen P (2000) Thermal tides and stationary waves on mars as revealed by mars global surveyor thermal emission spectrometer. J Geophys Res 105:9521–9537

  4. FLIR A (2010) The ultimate infrared handbook for R&D professionals. FLIR Systems, Boston

  5. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444

  6. Mao X, Shen C, Yang YB (2016) Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. Adv Neural Inf Process Syst, pp 2802–2810

  7. Wang L, Sindagi V, Patel V (2018) High-quality facial photo-sketch synthesis using multi-adversarial networks. In: 2018 13th IEEE international conference on automatic face and gesture recognition (FG 2018). IEEE, pp 83–90

  8. Shen Y, Luo P, Yan J, Wang X, Tang X (2018) Faceid-gan: Learning a symmetry three-player gan for identity-preserving face synthesis. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 821–830

  9. Peng C, Wang N, Li J, Gao X (2020) Face sketch synthesis in the wild via deep patch representation-based probabilistic graphical model. IEEE Trans Inf Forensics Security 15:172–183

  10. Xia Y, Zheng W, Wang Y, Yu H, Dong J, Wang FY (2021) Local and global perception generative adversarial network for facial expression synthesis. IEEE Trans Circuits Syst Video Technol

  11. Yang Y, Liu J, Huang S, Wan W, Wen W, Guan J (2021) Infrared and visible image fusion via texture conditional generative adversarial network. IEEE Trans Circuits Syst Video Technol

  12. Isola P, Zhu J, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 5967–5976

  13. Bharti V, Biswas B, Shukla KK (2021) Emocgan: a novel evolutionary multiobjective cyclic generative adversarial network and its application to unpaired image translation. Neural Comput Appl, pp 1–15

  14. Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1724–1734

  15. Xu S, Zhu Q, Wang J (2020) Generative image completion with image-to-image translation. Neural Comput Appl 32(11):7333–7345

  16. Isola P, Zhu JY, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR). https://doi.org/10.1109/CVPR.2017.632

  17. Zhu JY, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision, pp 2223–2232

  18. Yi Z, Zhang H, Tan P, Gong M (2017) Dualgan: Unsupervised dual learning for image-to-image translation. In: Proceedings of the IEEE international conference on computer vision, pp 2849–2857

  19. Liao B, Chen Y (2007) An image quality assessment algorithm based on dual-scale edge structure similarity. In: Second international conference on innovative computing, informatio and control (ICICIC 2007). IEEE, pp 56–56

  20. Zhang R, Isola P, Efros AA (2016) Colorful image colorization. In: European conference on computer vision. Springer, pp 649–666

  21. Ledig C, Theis L, Huszár F, Caballero J, Cunningham A, Acosta A, Aitken A, Tejani A, Totz J, Wang Z, et al (2017) Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4681–4690

  22. Souly N, Spampinato C, Shah M (2017) Semi supervised semantic segmentation using generative adversarial network. In: Proceedings of the IEEE international conference on computer vision, pp 5688–5696

  23. Abdal R, Qin Y, Wonka P (2019) Image2stylegan: How to embed images into the stylegan latent space? In: Proceedings of the IEEE international conference on computer vision, pp 4432–4441

  24. Yuan M, Peng Y (2019) Bridge-gan: interpretable representation learning for text-to-image synthesis. IEEE Trans Circuits Syst Video Technol 30(11):4258–4268

  25. Liao K, Lin C, Zhao Y, Gabbouj M (2019) Dr-gan: Automatic radial distortion rectification using conditional gan in real-time. IEEE Trans Circuits Syst Video Technol 30(3):725–733

  26. Zhang S, Ji R, Hu J, Lu X, Li X (2018) Face sketch synthesis by multidomain adversarial learning. IEEE Trans Neural Netw Learn Syst 30(5):1419–1428

  27. Serengil SI, Ozpinar A (2020) Lightface: A hybrid deep face recognition framework. In: 2020 Innovations in intelligent systems and applications conference (ASYU). IEEE, pp 23–27. https://doi.org/10.1109/ASYU50717.2020.9259802

  28. Li J, Hao P, Zhang C, Dou M (2008) Hallucinating faces from thermal infrared images. In: 2008 15th IEEE international conference on image processing. IEEE, pp 465–468

  29. Choi J, Hu S, Young SS, Davis LS (2012) Thermal to visible face recognition. In: Sensing technologies for global health, military medicine, disaster response, and environmental monitoring II; and biometric technology for human identification IX, vol 8371. International Society for Optics and Photonics, p 83711L

  30. Chen C, Ross A (2016) Matching thermal to visible face images using hidden factor analysis in a cascaded subspace learning framework. Pattern Recogn Lett 72:25–32

  31. Zhang H, Riggan BS, Hu S, Short NJ, Patel VM (2019) Synthesis of high-quality visible faces from polarimetric thermal faces using generative adversarial networks. Int J Comput Vis 127(6–7):845–862

  32. Hu S, Short NJ, Riggan BS, Gordon C, Gurton KP, Thielke M, Gurram P, Chan AL (2016) A polarimetric thermal database for face recognition research. In: 2016 IEEE conference on computer vision and pattern recognition workshops (CVPRW). IEEE, pp 187–194

  33. Iranmanesh SM, Dabouei A, Kazemi H, Nasrabadi NM (2018) Deep cross polarimetric thermal-to-visible face recognition. In: 2018 international conference on biometrics (ICB). IEEE, pp 166–173

  34. Isola P, Zhu JY, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1125–1134

  35. Babu KK, Dubey SR (2020) Pcsgan: Perceptual cyclic-synthesized generative adversarial networks for thermal and nir to visible image transformation. Neurocomputing 413:41–50

  36. Mejjati YA, Richardt C, Tompkin J, Cosker D, Kim KI (2018) Unsupervised attention-guided image-to-image translation. In: Adv Neural Inf Process Syst, pp 3693–3703

  37. Tang H, Xu D, Sebe N, Yan Y (2019) Attention-guided generative adversarial networks for unsupervised image-to-image translation. In: 2019 International joint conference on neural networks (IJCNN). IEEE, pp 1–8

  38. Zhang H, Goodfellow I, Metaxas D, Odena A (2019) Self-attention generative adversarial networks. In: International conference on machine learning, pp 7354–7363

  39. Mirza M, Osindero S (2014) Conditional generative adversarial nets. arXiv:1411.1784

  40. Liu MY, Tuzel O (2016) Coupled generative adversarial networks. Adv Neural Inf Process Syst, pp 469–477

  41. Mejjati YA, Richardt C, Tompkin J, Cosker D, Kim KI (2018) Unsupervised attention-guided image-to-image translation. Adv Neural Inf Process Syst, pp 3693–3703

  42. Zhang H, Goodfellow IJ, Metaxas DN, Odena A (2018) Self-attention generative adversarial networks. arXiv:1805.08318

  43. Lejbølle AR, Nasrollahi K, Krogh B, Moeslund TB (2020) Person re-identification using spatial and layer-wise attention. IEEE Trans Inf Forensics Security 15:1216–1231. https://doi.org/10.1109/TIFS.2019.2938870

  44. Tang H, Liu HC, Xu D, Torr PHS, Sebe N (2019) Attentiongan: Unpaired image-to-image translation using attention-guided generative adversarial networks. arXiv:1911.11897

  45. Tang H, Xu D, Sebe N, Wang Y, Corso JJ, Yan Y (2019) Multi-channel attention selection gan with cascaded semantic guidance for cross-view image translation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2417–2426

  46. Tang H, Chen X, Wang W, Xu D, Corso JJ, Sebe N, Yan Y (2019) Attribute-guided sketch generation. In: 2019 14th IEEE international conference on automatic face and gesture recognition (FG 2019). IEEE, pp 1–7

  47. Chen H, Hu G, Lei Z, Chen Y, Robertson NM, Li SZ (2020) Attention-based two-stream convolutional networks for face spoofing detection. IEEE Trans Inf Forensics Security 15:578–593. https://doi.org/10.1109/TIFS.2019.2922241

  48. Nyberg A, Eldesokey A, Bergstrom D, Gustafsson D (2018) Unpaired thermal to visible spectrum transfer using adversarial training. In: Proceedings of the European conference on computer vision (ECCV) Workshops

  49. Kuang X, Zhu J, Sui X, Liu Y, Liu C, Chen Q, Gu G (2020) Thermal infrared colorization via conditional generative adversarial network. Infrared Phys Technol 107:103338

  50. Zhang T, Wiliem A, Yang S, Lovell B (2018) Tv-gan: Generative adversarial network based thermal to visible face recognition. In: 2018 International conference on biometrics (ICB). IEEE, pp 174–181

  51. Bhat N, Saggu N, Kumar S, et al (2020) Generating visible spectrum images from thermal infrared using conditional generative adversarial networks. In: 2020 5th International conference on communication and electronics systems (ICCES). IEEE, pp 1390–1394

  52. Kantarci A, Ekenel HK (2019) Thermal to visible face recognition using deep autoencoders. In: 2019 International conference of the biometrics special interest group (BIOSIG), pp 1–5

  53. Kezebou L, Oludare V, Panetta K, Agaian S (2020) Tr-gan: thermal to rgb face synthesis with generative adversarial network for cross-modal face recognition. In: Mobile multimedia/image processing, security, and applications 2020, vol 11399. International Society for Optics and Photonics, p 113990P

  54. Lahiri A, Bairagya S, Bera S, Haldar S, Biswas PK (2021) Lightweight modules for efficient deep learning based image restoration. IEEE Trans Circuits Syst Video Technol 31(4):1395–1410. https://doi.org/10.1109/TCSVT.2020.3007723

  55. Tan DS, Lin YX, Hua KL (2021) Incremental learning of multi-domain image-to-image translations. IEEE Trans Circuits Syst Video Technol 31(4):1526–1539. https://doi.org/10.1109/TCSVT.2020.3005311

  56. Xu S, Liu D, Xiong Z (2021) E2i: Generative inpainting from edge to image. IEEE Trans Circuits Syst Video Technol 31(4):1308–1322. https://doi.org/10.1109/TCSVT.2020.3001267

  57. Zhong X, Lu T, Huang W, Ye M, Jia X, Lin CW (2021) Grayscale enhancement colorization network for visible-infrared person re-identification. IEEE Trans Circuits Syst Video Technol, pp 1–1. https://doi.org/10.1109/TCSVT.2021.3072171

  58. Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. Springer, pp 234–241

  59. Oktay O, Schlemper J, Folgoc LL, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla NY, Kainz B, et al (2018) Attention u-net: learning where to look for the pancreas. arXiv preprint arXiv:1804.03999

  60. Luong MT, Pham H, Manning CD (2015) Effective approaches to attention-based neural machine translation. In: Proceedings of the 2015 conference on empirical methods in natural language processing, pp 1412–1421

  61. Liang M, Hu X (2015) Recurrent convolutional neural network for object recognition. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR), pp 3367–3375. https://doi.org/10.1109/CVPR.2015.7298958

  62. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Pereira F, Burges C, Bottou L, Weinberger K (eds) Advances in neural information processing systems, vol 25. Curran Associates Inc, Red Hook

  63. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 2818–2826. https://doi.org/10.1109/CVPR.2016.308

  64. Mao X, Li Q, Xie H, Lau RY, Wang Z, Smolley SP (2017) Least squares generative adversarial networks. In: 2017 IEEE international conference on computer vision (ICCV). IEEE, pp 2813–2821

  65. Kancharagunta KB, Dubey SR (2019) Csgan: Cyclic-synthesized generative adversarial networks for image-to-image transformation. arXiv preprint arXiv:1901.03554

  66. Kniaz VV, Knyaz VA, Hladůvka J, Kropatsch WG, Mizginov VA (2018) ThermalGAN: multimodal color-to-thermal image translation for person re-identification in multispectral dataset. In: Computer vision—ECCV 2018 workshops. Springer International Publishing

  67. Wang Z, Chen Z, Wu F (2018) Thermal to visible facial image translation using generative adversarial networks. IEEE Signal Process Lett 25:1161–1165

  68. Panetta K, Wan Q, Agaian S, Rajeev S, Kamath S, Rajendran R, Rao S, Kaszowska A, Taylor H, Samani A, et al (2018) A comprehensive database for benchmarking imaging systems. IEEE Trans Pattern Anal Mach Intell

  69. Kumar S, Singh SK (2018) A comparative analysis on the performance of different handcrafted descriptors over thermal and low resolution visible image dataset. In: 2018 5th IEEE Uttar Pradesh section international conference on electrical, electronics and computer engineering (UPCON), pp 1–6. https://doi.org/10.1109/UPCON.2018.8596897

  70. Dubey SR, Chakraborty S, Roy SK, Mukherjee S, Singh SK, Chaudhuri BB (2020) diffgrad: An optimization method for convolutional neural networks. IEEE Trans Neural Netw Learn Syst 31(11):4500–4511. https://doi.org/10.1109/TNNLS.2019.2955777

  71. Kingma DP, Ba J (2015) Adam: A method for stochastic optimization. In: International conference on learning representation

  72. Zhang R, Isola P, Efros AA, Shechtman E, Wang O (2018) The unreasonable effectiveness of deep features as a perceptual metric. In: CVPR

  73. Sheikh HR, Bovik AC (2006) Image information and visual quality. IEEE Trans Image Process 15(2):430–444. https://doi.org/10.1109/TIP.2005.859378

  74. Simonyan K, Vedaldi A, Zisserman A (2014) Deep inside convolutional networks: Visualising image classification models and saliency maps. CoRR abs/1312.6034

  75. Lahitani AR, Permanasari AE, Setiawan NA (2016) Cosine similarity to determine similarity measure: Study case in online essay assessment. In: 2016 4th International conference on cyber and IT service management, pp 1–6. https://doi.org/10.1109/CITSM.2016.7577578

Acknowledgements

The authors acknowledge the High Performance Computing facility of IIIT Allahabad used for the experiments in this paper.

Funding

We gratefully acknowledge the Indian Institute of Information Technology Allahabad, Ministry of Education, Govt. of India, for providing the fellowship to pursue this research work.

Author information

Corresponding author

Correspondence to Nand Kumar Yadav.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Human and animal rights

We use the publicly available WHU-IIP and Tufts Thermal2RGB datasets for the experiments. The CVBL-CHILD dataset was collected following due process and with consent from the subjects. No images are used in a way that could cause embarrassment to the subjects.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yadav, N.K., Singh, S.K. & Dubey, S.R. TVA-GAN: attention guided generative adversarial network for thermal to visible image transformations. Neural Comput & Applic 35, 19729–19749 (2023). https://doi.org/10.1007/s00521-023-08724-5
