Learning geometric invariants through neural networks

Rai, Arpit

doi:10.1007/s00371-024-03398-z

Original article
Published: 22 July 2024

Volume 40, pages 7093–7106, (2024)

The Visual Computer

Arpit Rai (ORCID: orcid.org/0000-0002-2884-1122)

Abstract

Convolutional neural networks have become a fundamental model for solving various computer vision tasks. However, convolution operations are invariant only to translations of objects, and their performance suffers under rotation and other affine transformations. This work proposes a novel neural network that leverages geometric invariants, including curvature, higher-order differentials of curves extracted from object boundaries at multiple scales, and the relative orientations of edges. These features are invariant to affine transformations and can improve the robustness of shape recognition in neural networks. In our experiments on the smallNORB dataset, a 2-layer network operating over these geometric invariants outperforms a 3-layer convolutional network by 9.69% while being more robust to affine transformations, even when trained without any data augmentation. Notably, our network exhibits a mere 6% degradation in test accuracy when test images are rotated by $40^{\circ}$, in contrast to the significant drops of 51.7% and 69% observed in VGG networks and convolutional networks, respectively, under the same transformations. Additionally, our models are more robust than invariant feature descriptors such as the SIFT-based bag-of-words classifier and its rotation-invariant extension, the RIFT descriptor, which suffer drops of 35% and 14.1%, respectively, under similar image transformations. Our experimental results further show improved robustness against scale and shear transformations. Furthermore, the multi-scale extension of our geometric invariant network, which extracts curve differentials of higher orders, shows enhanced robustness to scaling and shearing transformations.
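To make the notion of curvature as an image feature concrete, the following is a minimal sketch, not the article's own curvature filters (which are not reproduced on this page). It assumes a grayscale image stored as a NumPy array and uses the standard level-curve (isophote) curvature formula $\kappa = (I_{xx} I_y^2 - 2 I_x I_y I_{xy} + I_{yy} I_x^2) / (I_x^2 + I_y^2)^{3/2}$, with derivatives approximated by finite differences. The function name `isophote_curvature` and the parameter `eps` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def isophote_curvature(image, eps=1e-8):
    """Curvature of the level curves of a 2-D grayscale image.

    Illustrative only: finite-difference derivatives via np.gradient,
    with `eps` (an assumed stabiliser) guarding against zero gradients.
    """
    img = np.asarray(image, dtype=np.float64)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    Iy, Ix = np.gradient(img)
    Ixy, Ixx = np.gradient(Ix)   # d(Ix)/dy and d(Ix)/dx
    Iyy, _ = np.gradient(Iy)     # d(Iy)/dy
    num = Ixx * Iy**2 - 2.0 * Ix * Iy * Ixy + Iyy * Ix**2
    den = (Ix**2 + Iy**2 + eps) ** 1.5
    return num / den

# Test image: a cone I(x, y) = sqrt(x^2 + y^2), whose level curves are
# circles of radius r, so the curvature should be close to 1 / r.
yy, xx = np.mgrid[-32:33, -32:33]
cone = np.hypot(xx, yy)
kappa = isophote_curvature(cone)
print(kappa[32, 52], 1.0 / 20.0)  # pixel at x = 20, y = 0
```

Because the formula depends only on how the level curves bend, rotating the input by a multiple of 90 degrees simply rotates the resulting curvature map. The sketch illustrates that rotation behaviour only; it does not demonstrate the broader affine robustness reported in the abstract.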

Data Availability Statement

All datasets analyzed and used in our experiments are freely available for public use and research at https://cs.nyu.edu/~ylclab/data/norb-v1.0-small/ and on Mendeley Data at https://data.mendeley.com/datasets/55xv4y25rs, and can be cited as: Rai, Arpit (2022), “smallNORB”, Mendeley Data, V1, doi: 10.17632/55xv4y25rs.1. The transformations applied to the datasets during the experiments are part of TensorFlow, the TensorFlow image processing module, and the tensorflow-addons library.

Acknowledgements

The author would like to thank the University of Edinburgh, as the parent institution, for its resources, and the Google Colaboratory team for providing the GPU used for training and testing the models.

Funding

No funding was received from any organization for conducting the study and the experiments.

Author information

Authors and Affiliations

DPDGroup UK, DPDDepot, Broadwell Rd, Oldbury, Birmingham, B69 4DA, Scotland, UK
Arpit Rai

Corresponding author

Correspondence to Arpit Rai.

Ethics declarations

Conflict of interest

The author has no conflict of interest to declare that is relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Code for the curvature filter operation $C_{x}, C_{y}$ over the image

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Rai, A. Learning geometric invariants through neural networks. Vis Comput 40, 7093–7106 (2024). https://doi.org/10.1007/s00371-024-03398-z

Accepted: 30 March 2024
Published: 22 July 2024
Issue Date: October 2024
DOI: https://doi.org/10.1007/s00371-024-03398-z
