Two-view correspondence learning via complex information extraction

Multimedia Tools and Applications

Abstract

Establishing reliable correspondences plays a vital role in many feature-matching-based computer vision tasks. Given putative correspondences between feature points in two images, we propose a novel network that infers the probability of each correspondence being an inlier or an outlier and regresses the relative pose encoded by the essential matrix. Previous research proposed an end-to-end permutation-equivariant classification network based on multi-layer perceptrons and context normalization. However, context normalization treats every correspondence equally and ignores channel information, which can weaken the representation capability of potential inliers. To solve this problem, we apply attention mechanisms in our network to capture complex information in the feature maps. Specifically, we introduce two types of attention blocks: a spatial attention block that captures complex spatial contextual information, and a channel attention block that extracts rich channel information. To obtain richer contextual information and feature maps with stronger representative capacity, we combine these attention blocks with the PointCN block to form a new network. Experimental results on several benchmark datasets show that performance on outlier removal and camera pose estimation is significantly improved over the state of the art.
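
The operations named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Context normalization is written in its usual form, normalizing each channel across the N correspondences. The two attention functions are hypothetical stand-ins: a squeeze-and-excitation style gate for the channel block and a per-correspondence sigmoid gate for the spatial block. All names, shapes, and the reduction ratio `r` are assumptions, and the weights are random rather than learned.

```python
import numpy as np

def context_normalize(x, eps=1e-5):
    """Normalize each channel across the N correspondences; x has shape (N, C).
    Every correspondence contributes equally to the mean and the deviation."""
    mean = x.mean(axis=0, keepdims=True)
    std = x.std(axis=0, keepdims=True)
    return (x - mean) / (std + eps)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """Hypothetical channel gate (squeeze-and-excitation style, an assumption).
    Squeeze: average over correspondences. Excite: two small linear maps."""
    squeezed = x.mean(axis=0)                 # (C,)
    hidden = np.maximum(squeezed @ w1, 0.0)   # ReLU, shape (C // r,)
    gate = sigmoid(hidden @ w2)               # (C,), one weight per channel
    return x * gate                           # broadcast over the N rows

def spatial_attention(x, w):
    """Hypothetical spatial gate (an assumption): one weight per correspondence,
    so likely inliers can be emphasized instead of being treated equally."""
    gate = sigmoid(x @ w)                     # (N,)
    return x * gate[:, None]

# Toy data: N correspondences with C feature channels (illustrative values only).
rng = np.random.default_rng(0)
N, C, r = 6, 4, 2
x = rng.normal(size=(N, C))
w1 = rng.normal(size=(C, C // r))
w2 = rng.normal(size=(C // r, C))
w = rng.normal(size=C)

normalized = context_normalize(x)
channel_out = channel_attention(x, w1, w2)
spatial_out = spatial_attention(x, w)
```

The blocks are shown in isolation on purpose. Chaining the channel gate directly into context normalization would cancel it, since a per-channel rescaling is removed by per-channel normalization. A full network would therefore place learned layers and nonlinearities between these steps.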

Acknowledgements

This work was supported by the National Natural Science Foundation of China under grant nos. 62073304, 41977242, and 61973283.

Author information

Corresponding author

Correspondence to Chen Jun.

About this article

Cite this article

Jun, C., Yue, G., Linbo, L. et al. Two-view correspondence learning via complex information extraction. Multimed Tools Appl 81, 3939–3957 (2022). https://doi.org/10.1007/s11042-021-11731-0

  • DOI: https://doi.org/10.1007/s11042-021-11731-0
