Abstract
A multi-spectral imaging technique for the swift fusion of red–green–blue (RGB) and near-infrared (NIR) image pairs, combined with a deep-learning-based resolution enhancement technique, is proposed, empirically investigated, and compared with several state-of-the-art techniques in the current work. The results of the proposed multi-spectral image fusion demonstrate good chrominance preservation, improved sharpness, and optimised lighting in low-light dawn and dusk scenes. The fused image combines the edges inherent to both the RGB and NIR spectrum images. Examples include increased visibility between vegetation and the sky and between shadowed and non-shadowed areas, and increased optical depth in tree branches and vehicles. A hue, saturation, value (HSV)–NIR fusion is also evaluated by simply converting the RGB image to the HSV colour space. Owing to its high colour strength, HSV-based fusion illuminates high-colour-contrast artefacts such as road signs and the rear of vehicles better than the RGB-based fused image equivalent. Empirical evaluation shows that RGB–NIR fusion outperforms other strategies on a contrast restoration metric (r), two image quality assessment metrics, and a peak signal-to-noise ratio (PSNR) metric. The two image fusion models are implemented in a deep learning semantic segmentation network to investigate their perceived consistency in real-world scenarios. The proposed coarse-grained semantic segmentation network is trained to auto-annotate pixels as belonging to one of 10 classes. The per-class performance of the RGB–NIR- and HSV–NIR-based semantic segmentation in comparison with other methods is discussed in detail in the current work.
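The HSV–NIR fusion route mentioned above can be sketched in a few lines. This is a minimal illustration, not the paper's exact method: it assumes the NIR intensity is blended into the value (V) channel with a hypothetical weight `alpha`, while hue and saturation are left untouched.

```python
import colorsys


def fuse_hsv_nir(rgb_pixel, nir, alpha=0.5):
    """Blend a NIR intensity into the V channel of one RGB pixel.

    rgb_pixel: (r, g, b) floats in [0, 1]; nir: float in [0, 1].
    alpha is a hypothetical weight for the NIR contribution.
    """
    h, s, v = colorsys.rgb_to_hsv(*rgb_pixel)
    # Luminance comes partly from NIR; hue/saturation (chrominance) are untouched.
    v_fused = (1.0 - alpha) * v + alpha * nir
    return colorsys.hsv_to_rgb(h, s, v_fused)


def fuse_image(rgb_img, nir_img, alpha=0.5):
    """Apply the pixel-wise fusion over aligned images stored as nested lists."""
    return [
        [fuse_hsv_nir(p, n, alpha) for p, n in zip(row, nir_row)]
        for row, nir_row in zip(rgb_img, nir_img)
    ]
```

Because only V is modified, the chrominance of the RGB input is preserved by construction, which is consistent with the chrominance-preservation behaviour the fusion results report.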
Abbreviations
- BRISQUE: Blind/Referenceless Image Spatial Quality Evaluator
- CNN: Convolutional neural network
- CRF: Conditional random field
- DCNN: Deep convolutional neural network
- DWT: Discrete wavelet transform
- FAAGKFCM: Fast and automatically adjustable GRBF kernel-based FCM
- FN: False negative
- FP: False positive
- GPU: Graphics processing unit
- IDWT: Inverse discrete wavelet transform
- IOU: Intersection over union
- IQA: Image quality assessment
- ILSVRC: ImageNet Large Scale Visual Recognition Challenge
- MEITY: Ministry of Electronics and Information Technology
- NIR: Near-infrared
- NIQE: Naturalness Image Quality Evaluator
- PSNR: Peak signal-to-noise ratio
- RANUS: RGB and NIR urban scene dataset
- RGB: Red–green–blue
- SGDM: Stochastic gradient descent with momentum
- SIFT: Scale-invariant feature transform
- SISR: Single-image super-resolution
- SSIM: Structural Similarity Index
- TN: True negative
- TP: True positive
- UAV: Unmanned aerial vehicle
- VDSR: Very deep super-resolution
- VGG: Visual Geometry Group
Acknowledgements
The current work is supported by a research grant from The Ministry of Electronics and Information Technology (MEITY), Govt. of India vide Grant No. 4(6)/2018-ITEA.
Cite this article
Kumar, W., Singh, N., Singh, A. et al. Enhanced machine perception by a scalable fusion of RGB–NIR image pairs in diverse exposure environments. Machine Vision and Applications 32, 88 (2021). https://doi.org/10.1007/s00138-021-01210-9