Beyond homography: nonparametric image alignment via graph convolutional networks

Kim, Mijeong; Chu, Sanghyeok; Han, Bohyung

doi:10.1007/s00138-022-01331-9

Original Paper
Published: 19 September 2022

Volume 33, article number 89 (2022)

Machine Vision and Applications

Abstract

We propose a weakly supervised image alignment algorithm that identifies the correspondence between a pair of reference and target images without pixel-level supervision. Because most existing methods rely on a predefined geometric model such as a homography, they often lack flexibility and generalizability. To address this limitation, we propose a novel nonparametric transformation model based on graph convolutional networks without an explicit geometric constraint. The proposed method is generic and flexible in the sense that it is applicable to image pairs undergoing diverse local and/or global transformations. To make the algorithm more suitable for real-world scenarios, where moving objects can introduce noise, we exclude those objects using an off-the-shelf semantic segmentation model. The proposed algorithm is evaluated on the Cityscapes dataset with annotated pixel-level correspondences and outperforms baseline methods that rely on global parametric transformations.
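The abstract does not specify the network architecture, so the following is only a minimal sketch of the general idea: a generic graph convolution layer in the style of Kipf and Welling, applied to a regular grid of nodes, with a final layer that outputs one 2-D displacement per node instead of the parameters of a single global transformation. The grid size, feature dimensions, random weights, and the function names `grid_adjacency`, `normalize_adjacency`, and `gcn_layer` are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def grid_adjacency(rows, cols):
    """4-neighbour adjacency matrix for a rows x cols grid of nodes."""
    n = rows * cols
    adj = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:  # edge to the right-hand neighbour
                adj[i, i + 1] = adj[i + 1, i] = 1.0
            if r + 1 < rows:  # edge to the neighbour below
                adj[i, i + cols] = adj[i + cols, i] = 1.0
    return adj


def normalize_adjacency(adj):
    """Symmetric normalisation D^-1/2 (A + I) D^-1/2 with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, feats, weight, activation=True):
    """One graph convolution: aggregate neighbours, then project."""
    out = a_norm @ feats @ weight
    return np.maximum(out, 0.0) if activation else out


# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
rows, cols, feat_dim, hidden = 6, 8, 16, 32
a_norm = normalize_adjacency(grid_adjacency(rows, cols))
feats = rng.normal(size=(rows * cols, feat_dim))
w1 = rng.normal(scale=0.1, size=(feat_dim, hidden))
w2 = rng.normal(scale=0.1, size=(hidden, 2))

# Two layers; the last one is linear and yields a (dx, dy) offset per node.
offsets = gcn_layer(a_norm, gcn_layer(a_norm, feats, w1), w2, activation=False)
print(offsets.shape)  # (48, 2)
```

Because each node receives its own offset, smoothed only through its graph neighbours, such a model can in principle represent local deformations that a single homography cannot.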

Notes

Refer to Appendix A in the supplementary document for the list of images used for the construction of test sets.
We set the threshold to 60 pixels, which is a reasonable choice for the \(512 \times 384\) images used in our experiment.
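For a sense of scale, the short calculation below relates the 60-pixel threshold to the image diagonal. The note does not state here what the threshold is applied to, so the sketch only compares it with the image size; the variable names are illustrative.

```python
import math

width, height, threshold_px = 512, 384, 60

# Length of the image diagonal in pixels.
diagonal = math.hypot(width, height)

# Threshold expressed as a fraction of the diagonal.
ratio = threshold_px / diagonal

print(f"{diagonal:.1f}")  # 640.0
print(f"{ratio:.5f}")     # 0.09375
```

The threshold therefore corresponds to a little under 10% of the image diagonal.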

Acknowledgements

This work was supported by the Center for Applied Research in Artificial Intelligence (CARAI) grant funded by the Defense Acquisition Program Administration (DAPA) and the Agency for Defense Development (ADD) (UD190031RD).

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Seoul National University, Seoul, Korea
Mijeong Kim, Sanghyeok Chu & Bohyung Han

Corresponding author

Correspondence to Bohyung Han.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 61 KB)

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kim, M., Chu, S. & Han, B. Beyond homography: nonparametric image alignment via graph convolutional networks. Machine Vision and Applications 33, 89 (2022). https://doi.org/10.1007/s00138-022-01331-9

Received: 10 June 2021
Revised: 08 May 2022
Accepted: 10 July 2022
Published: 19 September 2022
DOI: https://doi.org/10.1007/s00138-022-01331-9
