Thermal infrared image semantic segmentation for night-time driving scenes based on deep learning

Maheswari, B.; Reeja, S. R.

doi:10.1007/s11042-023-15882-0

Published: 10 June 2023

Volume 82, pages 44885–44910, (2023)

Multimedia Tools and Applications

B. Maheswari¹ & S. R. Reeja¹

Abstract

Semantic segmentation of thermal infrared (ThIR) images is challenging because the images are highly complex: image regions are difficult to discriminate, and traditional techniques fail to recover the crucial semantic information completely. To overcome this issue, this paper introduces a novel network model for ThIR image semantic segmentation that facilitates effective image-to-image translation and reduces semantic encoding ambiguity. The proposed model is named the top-down attention and gradient alignment-based graph neural network (AGAGNN). It uses a top-down guided attention module (GAM) to deal with semantic encoding ambiguity, and an elaborate attention loss to ensure hierarchical coding of features. In addition, an organized gradient alignment loss reduces the edge distortion caused by image translation. The model is implemented in Python and evaluated on the KAIST dataset using pixel-level annotations. It achieves 98.3% accuracy, and a comparative analysis indicates that it preserves semantic information more effectively than existing models.
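
The abstract does not define the gradient alignment loss, so the following is only a minimal sketch of the general idea, assuming the loss penalizes differences between the spatial gradients (edges) of the input ThIR image and those of the translated image. The function names `image_gradients` and `gradient_alignment_loss`, the forward-difference gradients, and the mean absolute difference are all illustrative assumptions, not the authors' formulation.

```python
def image_gradients(img):
    """Forward-difference gradients of a 2-D image given as a list of rows."""
    h, w = len(img), len(img[0])
    # Horizontal gradient: difference between each pixel and its right neighbour.
    gx = [[img[y][x + 1] - img[y][x] for x in range(w - 1)] for y in range(h)]
    # Vertical gradient: difference between each pixel and the pixel below it.
    gy = [[img[y + 1][x] - img[y][x] for x in range(w)] for y in range(h - 1)]
    return gx, gy


def gradient_alignment_loss(source, translated):
    """Mean absolute difference between the gradients of two same-sized images.

    Hypothetical stand-in: the paper's actual loss is not given in the abstract.
    """
    if len(source) != len(translated) or len(source[0]) != len(translated[0]):
        raise ValueError("images must have the same shape")
    total, count = 0.0, 0
    for g_src, g_out in zip(image_gradients(source), image_gradients(translated)):
        for row_src, row_out in zip(g_src, g_out):
            for a, b in zip(row_src, row_out):
                total += abs(a - b)
                count += 1
    return total / count if count else 0.0


# Toy 2x4 "thermal" image with one vertical edge in the middle.
thermal = [[0, 0, 1, 1],
           [0, 0, 1, 1]]
# Same edge, different overall intensity: the edges are aligned.
edges_kept = [[5, 5, 6, 6],
              [5, 5, 6, 6]]
# Flat output: the edge has been lost in translation.
edges_lost = [[0, 0, 0, 0],
              [0, 0, 0, 0]]

loss_kept = gradient_alignment_loss(thermal, edges_kept)
loss_lost = gradient_alignment_loss(thermal, edges_lost)
```

Under these assumptions the loss is zero when the translated image keeps the edge, whatever its absolute intensity, and grows when the edge is smeared or removed, which is the behaviour one would want from a term meant to limit edge distortion.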

Data availability

Data sharing is not applicable to this article.

Author information

Authors and Affiliations

School of Computer Science and Engineering, VIT-AP University, Amaravati, Andhra Pradesh, 522237, India
B. Maheswari & S. R. Reeja

Contributions

All authors contributed equally to this work.

Corresponding author

Correspondence to B. Maheswari.

Ethics declarations

Conflict of interest

The authors have no conflict of interest to declare.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Consent to participate

All the authors involved have agreed to participate in this submitted article.

Consent to publish

All the authors involved in this manuscript fully consent to publish this submitted article.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Maheswari, B., Reeja, S.R. Thermal infrared image semantic segmentation for night-time driving scenes based on deep learning. Multimed Tools Appl 82, 44885–44910 (2023). https://doi.org/10.1007/s11042-023-15882-0

Received: 02 August 2022
Revised: 28 March 2023
Accepted: 22 May 2023
Published: 10 June 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s11042-023-15882-0
