Abstract
Establishing reliable correspondences plays a vital role in many feature-matching-based computer vision tasks. Given putative correspondences between feature points in two images, we propose a novel network that infers the probability of each correspondence being an inlier or an outlier and regresses the relative pose encoded by the essential matrix. Previous research proposed an end-to-end permutation-equivariant classification network based on multi-layer perceptrons and context normalization. However, context normalization treats each correspondence equally and ignores channel information, which can reduce the representation capability of potential inliers. To solve this problem, we apply an attention mechanism in our network to capture complex information in the feature maps. Specifically, we introduce two types of attention blocks: a spatial attention block that captures complex spatial contextual information, and a channel attention block that captures rich channel information. To obtain richer contextual information and feature maps with stronger representative capacity, we combine these attention blocks with the PointCN block to form a new network. Experimental results on several benchmark datasets show that performance on outlier removal and camera pose estimation is significantly improved over state-of-the-art methods.
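The ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the block designs, so the squeeze-and-excitation-style channel gate, the sigmoid per-correspondence spatial gate, the order in which they are applied, and all names and weight shapes (`w1`, `w2`, `w`) are assumptions made for illustration. The sketch only shows how context normalization acts across the N correspondences of one image pair and how two gating steps can reweight channels and correspondences while staying permutation-equivariant.

```python
import numpy as np


def context_normalize(features, eps=1e-5):
    """Normalize each channel across the N correspondences of one image pair.

    features: array of shape (N, C), one row per putative correspondence.
    """
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(features, w1, w2):
    """Assumed squeeze-and-excitation-style gate: one weight per channel.

    w1: (C, H) and w2: (H, C) are hypothetical learned weights.
    """
    squeezed = features.mean(axis=0)          # (C,) average over correspondences
    hidden = np.maximum(squeezed @ w1, 0.0)   # ReLU bottleneck, shape (H,)
    gates = sigmoid(hidden @ w2)              # (C,) values in (0, 1)
    return features * gates                   # broadcast over the N rows


def spatial_attention(features, w):
    """Assumed per-correspondence gate: one weight per row.

    w: (C,) is a hypothetical learned projection to a scalar score.
    """
    gates = sigmoid(features @ w)             # (N,) values in (0, 1)
    return features * gates[:, None]


def attention_block(features, w1, w2, w):
    """Context normalization followed by the two illustrative gates."""
    normalized = context_normalize(features)
    return spatial_attention(channel_attention(normalized, w1, w2), w)
```

Because every step either aggregates over all correspondences or acts on each row independently, reordering the input correspondences reorders the output rows in the same way, which is the permutation-equivariance property the abstract refers to.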
Acknowledgements
This work was supported by the National Natural Science Foundation of China under grant nos. 62073304, 41977242 and 61973283.
Cite this article
Jun, C., Yue, G., Linbo, L. et al. Two-view correspondence learning via complex information extraction. Multimed Tools Appl 81, 3939–3957 (2022). https://doi.org/10.1007/s11042-021-11731-0
DOI: https://doi.org/10.1007/s11042-021-11731-0