A viewpoint-guided prototype network for 3D shape classification

  • Regular Paper
  • Multimedia Systems

Abstract

Multi-view learning methods have achieved remarkable results in 3D shape recognition. However, most of them focus on visual feature extraction and feature aggregation, while the viewpoints (the spatial positions of the virtual cameras) used to generate the multiple views are often ignored. In this paper, we explore the correlation between viewpoints and shape descriptors, and propose a novel viewpoint-guided prototype learning network (VGP-Net). We introduce a prototype representation for each class that consists of a viewpoint prototype and a feature prototype. The viewpoint prototype is the average weight of each viewpoint, learned from a small support set via a Score Unit and stored in a weight dictionary. Our VGP model adaptively learns the view-wise weights by dynamically assembling the viewpoint prototypes from the weight dictionary and applying an element-wise operation in the view pooling layer. Under the guidance of the viewpoint prototype, important visual features are enhanced, while negligible features are eliminated. These refined features are then fused to generate a compact shape descriptor. All shape descriptors are clustered in the feature embedding space, and each cluster center represents the feature prototype of a class. Classification can thus be performed by finding the nearest feature prototype. To boost the learning process, we further present a multi-stream regularization mechanism in both the feature space and the viewpoint space. Extensive experiments demonstrate that VGP-Net is efficient and that the learned deep features are more discriminative. As a result, it achieves better performance than state-of-the-art methods.
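The two ideas named in the abstract, view-wise weighting before pooling and classification by nearest feature prototype, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the normalization of weights to sum to one, the use of the class mean as the cluster center, and the Euclidean distance are all assumptions made for illustration. In VGP-Net the weights come from a learned Score Unit and the features from a deep network, neither of which is modeled here.

```python
import math


def weighted_view_pooling(view_features, view_weights):
    """Fuse per-view feature vectors into one descriptor.

    Assumption: weights are normalized to sum to one and applied
    element-wise to each view before summation.
    """
    if len(view_features) != len(view_weights):
        raise ValueError("one weight is required per view")
    total = sum(view_weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    dim = len(view_features[0])
    descriptor = [0.0] * dim
    for feature, weight in zip(view_features, view_weights):
        for i, value in enumerate(feature):
            descriptor[i] += (weight / total) * value
    return descriptor


def feature_prototypes(descriptors_by_class):
    """Compute one prototype per class as the mean of its descriptors."""
    prototypes = {}
    for label, descriptors in descriptors_by_class.items():
        dim = len(descriptors[0])
        count = len(descriptors)
        prototypes[label] = [
            sum(d[i] for d in descriptors) / count for i in range(dim)
        ]
    return prototypes


def classify(descriptor, prototypes):
    """Return the label whose prototype is nearest (Euclidean distance)."""
    return min(
        prototypes,
        key=lambda label: math.dist(descriptor, prototypes[label]),
    )


# Toy data: three views with 2-D features; the second view gets zero weight.
views = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [0.5, 0.0, 0.5]
descriptor = weighted_view_pooling(views, weights)

prototypes = feature_prototypes({
    "chair": [[1.0, 0.4], [1.0, 0.6]],
    "table": [[0.0, 1.0], [0.2, 1.0]],
})
predicted = classify(descriptor, prototypes)
```

In this toy example the zero-weighted view contributes nothing to the descriptor, which mirrors the abstract's statement that negligible features are eliminated under viewpoint guidance.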

Data availability

The datasets generated and/or analysed during the current study are available from the corresponding author by request.

Acknowledgements

We thank the anonymous reviewers for their helpful comments, and Chuanming Song, Bo Fu, and Yang Liu for valuable discussions. The research presented in this paper is supported by a grant from NSFC (61702246), grants from research projects of Liaoning province (2019lsktyb-084, LJ2020015, 2020JH4/10100045, 2023020196-JH2/1013) and a fund of Dalian Science and Technology (2019J12GX038).

Author information

Contributions

HL, HJ and DF wrote the main manuscript text. MH and XX provided some of the experimental results in Figs. 5, 6 and 7. YW prepared Figs. 8 and 9. All authors reviewed the manuscript.

Corresponding author

Correspondence to Li Han.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Additional information

Communicated by T. Li.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Han, L., He, J., Dou, F. et al. A viewpoint-guided prototype network for 3D shape classification. Multimedia Systems 29, 3531–3547 (2023). https://doi.org/10.1007/s00530-023-01177-9

  • DOI: https://doi.org/10.1007/s00530-023-01177-9
