Object matching between visible and infrared images using a Siamese network

Applied Intelligence

Abstract

We propose a method for object matching between visible and infrared images. We treat this task as a patch-matching problem, whose solution rests on computing the similarity between objects in the target image and the search image. To this end, we propose a Siamese neural network that takes a pair of visible and infrared images as input. The network uses a convolutional neural network (CNN), built from convolutional and pooling layers without padding, to extract features effectively from both modalities. We compute the cross-correlation between the objects in the visible image and the entire infrared image and regard the regions with the highest similarity as the matched targets. During training, we use focal loss to address the imbalance between positive and negative samples in the dataset; we then use interpolation to determine the locations of the target patches in the infrared images. Experiments on different classes of targets demonstrate that the proposed approach achieves greater accuracy and precision than other methods.
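Two ingredients named in the abstract, cross-correlation of feature maps and focal loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `cross_correlate`, `focal_loss`, and `locate_peak` are hypothetical, the feature maps are stand-in arrays rather than CNN outputs, and the defaults `alpha=0.25` and `gamma=2.0` are the values commonly quoted for focal loss, not values reported in this abstract. The interpolation step is not reproduced here.

```python
import numpy as np


def cross_correlate(target_feat, search_feat):
    """Slide a target feature map over a search feature map without padding.

    target_feat has shape (C, h, w); search_feat has shape (C, H, W)
    with H >= h and W >= w. Returns a score map of shape (H-h+1, W-w+1).
    """
    _, h, w = target_feat.shape
    _, H, W = search_feat.shape
    scores = np.empty((H - h + 1, W - w + 1))
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            window = search_feat[:, i:i + h, j:j + w]
            scores[i, j] = np.sum(window * target_feat)
    return scores


def focal_loss(logits, labels, alpha=0.25, gamma=2.0):
    """Binary focal loss averaged over all score-map positions.

    labels holds 1 for positive positions and 0 for negative ones.
    The (1 - p_t) ** gamma factor shrinks the loss of well-classified
    samples so that the many easy negatives do not dominate training.
    """
    p = 1.0 / (1.0 + np.exp(-logits))            # sigmoid probability
    p_t = np.where(labels == 1, p, 1.0 - p)      # probability of the true class
    alpha_t = np.where(labels == 1, alpha, 1.0 - alpha)
    eps = 1e-12                                  # guards against log(0)
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t + eps)))


def locate_peak(score_map):
    """Return the (row, col) of the highest-similarity position."""
    row, col = np.unravel_index(np.argmax(score_map), score_map.shape)
    return int(row), int(col)
```

The explicit loops keep the sliding-window operation visible; a practical version would more likely use a batched convolution routine from a deep learning framework.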

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant nos. 11773018, 61727802), the Key Research & Development Programs in Jiangsu, China (Grant no. BE2018126), the Fundamental Research Funds for the Central Universities (Grant nos. 30919011401, 30920010001), the Leading Technology of Jiangsu Basic Research Plan (BK20192003), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX21_0270).

Author information

Corresponding author

Correspondence to Xiubao Sui.

About this article

Cite this article

Li, W., Chen, Q., Gu, G. et al. Object matching between visible and infrared images using a Siamese network. Appl Intell 52, 7734–7746 (2022). https://doi.org/10.1007/s10489-021-02841-1

  • DOI: https://doi.org/10.1007/s10489-021-02841-1
