Beyond homography: nonparametric image alignment via graph convolutional networks

  • Original Paper
Machine Vision and Applications

Abstract

We propose a weakly supervised image alignment algorithm that identifies the correspondence between a reference image and a target image without pixel-level supervision. Most existing methods rely on a predefined geometric model such as a homography, so they often lack flexibility and generalize poorly. To address this limitation, we propose a novel nonparametric transformation model based on graph convolutional networks that imposes no explicit geometric constraint. The proposed method is generic and flexible in the sense that it applies to image pairs undergoing diverse local and/or global transformations. To make the algorithm more suitable for real-world scenes, where moving objects introduce noise, we exclude those objects using an off-the-shelf semantic segmentation model. The proposed algorithm is evaluated on the Cityscapes dataset with annotated pixel-level correspondences and outperforms baseline methods that rely on global parametric transformations.
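To make the idea of a nonparametric transformation concrete, the following is a minimal sketch, not the authors' architecture. It assumes a 4-connected grid of control points as the graph, random per-node features and weights, and the propagation rule of Kipf and Welling's graph convolutional network. Each node receives its own 2D offset instead of all pixels sharing one global homography. The names `grid_adjacency`, `normalize_adjacency`, `gcn_layer`, and `predict_offsets`, as well as all dimensions, are hypothetical.

```python
import numpy as np

def grid_adjacency(rows, cols):
    """4-connected adjacency matrix for a rows x cols grid of control points (assumed graph)."""
    n = rows * cols
    A = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:   # edge to the right neighbour
                A[i, i + 1] = A[i + 1, i] = 1.0
            if r + 1 < rows:   # edge to the neighbour below
                A[i, i + cols] = A[i + cols, i] = 1.0
    return A

def normalize_adjacency(A):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_norm, H, W, activation=None):
    """One graph convolution: aggregate neighbour features, then apply a linear map."""
    out = A_norm @ H @ W
    return activation(out) if activation is not None else out

def predict_offsets(features, A_norm, W1, W2):
    """Two-layer GCN mapping per-node features to per-node (dx, dy) offsets."""
    hidden = gcn_layer(A_norm, features, W1, activation=lambda x: np.maximum(x, 0.0))
    return gcn_layer(A_norm, hidden, W2)

# Hypothetical sizes; random weights stand in for a trained model.
rng = np.random.default_rng(0)
rows, cols, feat_dim, hidden_dim = 4, 6, 16, 32
A_norm = normalize_adjacency(grid_adjacency(rows, cols))
features = rng.normal(size=(rows * cols, feat_dim))
W1 = rng.normal(scale=0.1, size=(feat_dim, hidden_dim))
W2 = rng.normal(scale=0.1, size=(hidden_dim, 2))

offsets = predict_offsets(features, A_norm, W1, W2)
print(offsets.shape)  # (24, 2): one 2D displacement per grid node
```

Because the output has one displacement per node, the model can represent local deformations that a single 3×3 homography cannot. The graph convolution lets neighbouring nodes share information, which tends to keep nearby offsets consistent.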

Notes

  1. Refer to Appendix A in the supplementary document for the list of images used for the construction of test sets.

  2. We set the threshold to 60 pixels, which is a reasonable choice for the \(512 \times 384\) images used in our experiment.

References

  1. Alcantarilla, P.F., Nuevo, J., Bartoli, A.: Fast explicit diffusion for accelerated features in nonlinear scale spaces. In: British Machine Vision Conference (2013)

  2. Balntas, V., Johns, E., Tang, L., Mikolajczyk, K.: PN-Net: conjoined triple deep network for learning local image descriptors. arXiv preprint arXiv:1601.05030 (2016)

  3. Bay, H., Tuytelaars, T., Van Gool, L.: SURF: speeded up robust features. In: European Conference on Computer Vision (2006)

  4. Brown, M., Lowe, D.G.: Automatic panoramic image stitching using invariant features. Int. J. Comput. Vis. 74, 59–73 (2007)

  5. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Conference on Computer Vision and Pattern Recognition (2005)

  6. DeTone, D., Malisiewicz, T., Rabinovich, A.: SuperPoint: self-supervised interest point detection and description. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (2018)

  7. Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24, 381–395 (1981)

  8. Han, X., Leung, T., Jia, Y., Sukthankar, R., Berg, A.C.: MatchNet: unifying feature and metric learning for patch-based matching. In: IEEE Conference on Computer Vision and Pattern Recognition (2015)

  9. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: IEEE International Conference on Computer Vision (2017)

  10. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (2016)

  11. Horn, B.K., Schunck, B.G.: Determining optical flow. Artif. Intell. 17, 185–203 (1981)

  12. Jaderberg, M., Simonyan, K., Zisserman, A., et al.: Spatial transformer networks. Adv. Neural Inform. Process. Syst. (2015)

  13. Kanade, T., Okutomi, M.: A stereo matching algorithm with an adaptive window: theory and experiment. IEEE Trans. Pattern Anal. Mach. Intell. 16, 920–932 (1994)

  14. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations (2017)

  15. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60, 91–110 (2004)

  16. Noh, H., Araujo, A., Sim, J., Weyand, T., Han, B.: Large-scale image retrieval with attentive deep local features. In: International Conference on Computer Vision (2017)

  17. Raguram, R., Frahm, J.M., Pollefeys, M.: A comparative analysis of RANSAC techniques leading to adaptive real-time random sample consensus. In: European Conference on Computer Vision (2008)

  18. Rocco, I., Arandjelović, R., Sivic, J.: End-to-end weakly-supervised semantic alignment. In: IEEE Conference on Computer Vision and Pattern Recognition (2018)

  19. Rocco, I., Cimpoi, M., Arandjelović, R., Torii, A., Pajdla, T., Sivic, J.: Neighbourhood consensus networks. In: Advances in Neural Information Processing Systems (2018)

  20. Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF. In: International Conference on Computer Vision (2011)

  21. Sarlin, P.E., DeTone, D., Malisiewicz, T., Rabinovich, A.: SuperGlue: learning feature matching with graph neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition (2020)

  22. Yi, K.M., Trulls, E., Lepetit, V., Fua, P.: LIFT: learned invariant feature transform. In: European Conference on Computer Vision (2016)

  23. Zhang, J., Wang, C., Liu, S., Jia, L., Ye, N., Wang, J., Zhou, J., Sun, J.: Content-aware unsupervised deep homography estimation. In: European Conference on Computer Vision (2020)

Acknowledgements

This work was supported by the Center for Applied Research in Artificial Intelligence (CARAI) grant funded by the Defense Acquisition Program Administration (DAPA) and the Agency for Defense Development (ADD) (UD190031RD).

Author information

Corresponding author

Correspondence to Bohyung Han.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 61 KB)

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kim, M., Chu, S. & Han, B. Beyond homography: nonparametric image alignment via graph convolutional networks. Machine Vision and Applications 33, 89 (2022). https://doi.org/10.1007/s00138-022-01331-9

  • DOI: https://doi.org/10.1007/s00138-022-01331-9
