Abstract
A multi-spectral imaging technique for the swift fusion of red–green–blue (RGB) and near-infrared (NIR) image pairs, combined with a deep-learning-based resolution enhancement technique, is proposed, empirically investigated, and compared with several state-of-the-art techniques in the current work. The results of the proposed multi-spectral image fusion demonstrate good chrominance preservation, improved sharpness, and optimised lighting in low-light dawn and dusk scenes. The fused image combines the edges inherent to both the RGB and NIR spectrum images. Examples include increased visibility between vegetation and the sky and between shadowed and non-shadowed areas, and increased optical depth in tree branches and vehicles. A hue, saturation, value (HSV)–NIR fusion is also evaluated by simply converting the RGB image to the HSV colour space. Owing to its high colour strength, HSV-based fusion illuminates high-colour-contrast artefacts such as road signs and the rear of vehicles better than the RGB-based fused image equivalent. Empirical evaluation shows that RGB–NIR fusion outperforms other strategies on a contrast restoration metric (r), two image quality assessment metrics, and a peak signal-to-noise ratio (PSNR) metric. The two image fusion models are implemented in a deep learning semantic segmentation network to investigate their perceived consistency in real-world scenarios. The proposed coarse-grained semantic segmentation network is trained to auto-annotate pixels as belonging to one of 10 classes. The per-class performance of the RGB–NIR- and HSV–NIR-based semantic segmentation in comparison with other methods is discussed in detail in the current work.
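The HSV–NIR fusion route mentioned above can be sketched in a few lines. This is a minimal illustration, not the paper's exact method: it assumes the NIR intensity is blended into the value (V) channel with a hypothetical weight `alpha`, while hue and saturation are left untouched.

```python
import colorsys


def fuse_hsv_nir(rgb_pixel, nir, alpha=0.5):
    """Blend a NIR intensity into the V channel of one RGB pixel.

    rgb_pixel: (r, g, b) floats in [0, 1]; nir: float in [0, 1].
    alpha is a hypothetical weight for the NIR contribution.
    """
    h, s, v = colorsys.rgb_to_hsv(*rgb_pixel)
    # Luminance comes partly from NIR; hue/saturation (chrominance) are untouched.
    v_fused = (1.0 - alpha) * v + alpha * nir
    return colorsys.hsv_to_rgb(h, s, v_fused)


def fuse_image(rgb_img, nir_img, alpha=0.5):
    """Apply the pixel-wise fusion over aligned images stored as nested lists."""
    return [
        [fuse_hsv_nir(p, n, alpha) for p, n in zip(row, nir_row)]
        for row, nir_row in zip(rgb_img, nir_img)
    ]
```

Because only V is modified, the chrominance of the RGB input is preserved by construction, which is consistent with the chrominance-preservation behaviour the fusion results report.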
Abbreviations
- BRISQUE: Blind/Referenceless Image Spatial Quality Evaluator
- CNN: Convolutional neural network
- CRF: Conditional random field
- DCNN: Deep convolutional neural network
- DWT: Discrete wavelet transform
- FAAGKFCM: Fast and automatically adjustable GRBF kernel-based FCM
- FN: False negative
- FP: False positive
- GPU: Graphics processing unit
- IDWT: Inverse discrete wavelet transform
- IOU: Intersection over union
- IQA: Image quality assessment
- ILSVRC: ImageNet Large Scale Visual Recognition Challenge
- MEITY: Ministry of Electronics and Information Technology
- NIR: Near-infrared
- NIQE: Naturalness Image Quality Evaluator
- PSNR: Peak signal-to-noise ratio
- RANUS: RGB and NIR urban scene dataset
- RGB: Red–green–blue
- SGDM: Stochastic gradient descent with momentum
- SIFT: Scale-invariant feature transform
- SISR: Single-image super-resolution
- SSIM: Structural Similarity Index
- TN: True negative
- TP: True positive
- UAV: Unmanned aerial vehicle
- VDSR: Very deep super-resolution
- VGG: Visual Geometry Group
Acknowledgements
The current work is supported by a research grant from The Ministry of Electronics and Information Technology (MEITY), Govt. of India vide Grant No. 4(6)/2018-ITEA.
Cite this article
Kumar, W., Singh, N., Singh, A. et al. Enhanced machine perception by a scalable fusion of RGB–NIR image pairs in diverse exposure environments. Machine Vision and Applications 32, 88 (2021). https://doi.org/10.1007/s00138-021-01210-9