Rotation-equivariant spherical vector networks for objects recognition with unknown poses

Chen, Hao; Zhao, Jieyu; Zhang, Qiang

doi:10.1007/s00371-023-02904-z

Original article
Published: 22 June 2023

Volume 40, pages 2089–2101 (2024)

The Visual Computer

Abstract

Analyzing 3D objects without pose priors using neural networks is challenging. Spherical convolutional networks do not construct a rotation-equivariant part–whole hierarchy for recognizing 3D objects with unknown poses, so the rotation-equivariant features of the whole object cannot be effectively preserved. To address this shortcoming, this paper proposes a rotation-equivariant part–whole hierarchy spherical vector network. In our experiments, we map a 3D object onto the unit sphere, construct an ordered list of vectors from the convolutional layers of a rotation-equivariant spherical convolutional network, and then build a part–whole hierarchy to classify the object using the proposed rotation-equivariant routing algorithm. The experimental results show that, compared with previous spherical convolutional neural networks, the proposed method improves the recognition of 3D objects with both known and unknown poses. This finding validates the construction of the rotation-equivariant part–whole hierarchy.
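
The first step named in the abstract, mapping a 3D object onto the unit sphere, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the sampling scheme, so the equiangular grid, the max-radius binning and the function names `spherical_signal`, `rotate_z` and `shift_columns` are all assumptions made for illustration. The sketch only checks the simplest case of equivariance, where rotating the object about the z-axis by one grid step cyclically shifts the spherical signal by one column; the paper itself concerns general 3D rotations.

```python
import math


def spherical_signal(points, n_theta=4, n_phi=8):
    """Bin 3D points by direction on an equiangular grid (hypothetical
    sampling scheme) and keep the largest radius seen in each bin."""
    grid = [[0.0] * n_phi for _ in range(n_theta)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # the origin has no direction on the sphere
        theta = math.acos(max(-1.0, min(1.0, z / r)))  # polar angle in [0, pi]
        phi = math.atan2(y, x) % (2.0 * math.pi)       # azimuth in [0, 2*pi)
        i = min(int(theta / math.pi * n_theta), n_theta - 1)
        j = min(int(phi / (2.0 * math.pi) * n_phi), n_phi - 1)
        grid[i][j] = max(grid[i][j], r)
    return grid


def rotate_z(points, angle):
    """Rotate every point about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def shift_columns(grid, k):
    """Cyclically shift each row of the grid by k azimuth bins."""
    n = len(grid[0])
    return [[row[(j - k) % n] for j in range(n)] for row in grid]


# Toy "object": four points placed at bin centres with different radii.
points = []
for i, j, r in [(0, 0, 1.0), (1, 3, 2.0), (2, 7, 0.5), (3, 5, 1.5)]:
    t = (i + 0.5) * math.pi / 4
    p = (j + 0.5) * 2.0 * math.pi / 8
    points.append((r * math.sin(t) * math.cos(p),
                   r * math.sin(t) * math.sin(p),
                   r * math.cos(t)))

signal = spherical_signal(points)
rotated_signal = spherical_signal(rotate_z(points, 2.0 * math.pi / 8))
shifted_signal = shift_columns(signal, 1)
```

Here `rotated_signal` and `shifted_signal` agree up to floating-point error: transforming the input and then mapping it to the sphere gives the same result as mapping first and then transforming the output, which is the equivariance property in its most restricted form.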

Data availability

We evaluate our method on the public MNIST, ModelNet40 and SHREC15 datasets, which are available at http://yann.lecun.com/exdb/mnist/, http://modelnet.cs.princeton.edu/ and https://www.icst.pku.edu.cn/zlian/representa/3d15/dataset/index.htm, respectively.

Funding

This research was supported by the National Natural Science Foundation of China (Grant Nos. 62071260 and 62006131) and the Natural Science Foundation of Zhejiang Province (Grant Nos. LZ22F020001 and LQ21F020009).

Author information

Authors and Affiliations

Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, 315000, China
Hao Chen, Jieyu Zhao & Qiang Zhang

Corresponding author

Correspondence to Jieyu Zhao.

Ethics declarations

Conflict of interest

We declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, H., Zhao, J. & Zhang, Q. Rotation-equivariant spherical vector networks for objects recognition with unknown poses. Vis Comput 40, 2089–2101 (2024). https://doi.org/10.1007/s00371-023-02904-z

Accepted: 28 April 2023
Published: 22 June 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s00371-023-02904-z
