Principal views selection based on growing graph convolution network for multi-view 3D model recognition

Applied Intelligence

Abstract

With the development of 3D technologies, 3D model recognition has attracted substantial attention in areas such as autonomous driving, virtual/augmented reality, and computer-aided design, and considerable progress has been made. However, the rich structural information of a 3D model makes model representation a major challenge. In recent years, many researchers have applied classical computer vision techniques to represent the multi-view information of the 3D model, but the redundant visual information across views introduces a new challenge in model representation. In this paper, we focus on multi-view 3D model data and propose a novel growing graph convolution network (GGCN) to handle the principal views selection problem, which can maintain the performance of 3D model representation while effectively reducing computation time. The proposed method comprises two modules: 1) Principal views selection module: we describe the 3D model using only the selected views, which effectively removes redundant information and reduces computational complexity. 2) Growing GCN module: we propose an effective growing GCN model that focuses on gathering nodes that are less related to each other in order to ensure the quality of multi-view fusion. This indirectly retains the structural information while reducing redundant information. As the graph grows, the GGCN model gradually adds view information to compensate for insufficient characterization and to guarantee the final performance. More specifically, the two modules guide each other, which improves the principal views selection module and indirectly increases the final recognition accuracy. To evaluate the effectiveness of the proposed method, we test classification accuracy and retrieval performance on the ModelNet40 and ShapeNet datasets. The experimental results demonstrate the superiority of the proposed method.
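The idea of gathering views that are less related to each other can be illustrated with a small sketch. This is not the GGCN model itself, which the abstract does not specify in detail; it is a plain greedy selection over hypothetical per-view feature vectors, using cosine similarity as an assumed measure of how related two views are. The function names `cosine` and `grow_view_set` and the toy descriptors are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def grow_view_set(features, k, seed=0):
    """Greedily grow a list of up to k view indices, starting from `seed`.

    At each step, add the view whose highest similarity to any already
    selected view is lowest, i.e. the view least related to the current set.
    This is an illustrative stand-in, not the paper's learned selection.
    """
    selected = [seed]
    while len(selected) < min(k, len(features)):
        best_idx, best_score = None, float("inf")
        for i, feat in enumerate(features):
            if i in selected:
                continue
            # How strongly is candidate i related to the current set?
            score = max(cosine(feat, features[j]) for j in selected)
            if score < best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
    return selected


# Toy example: four hypothetical 2-D view descriptors.
views = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
chosen = grow_view_set(views, k=3)
print(chosen)
```

In this toy case the near-duplicate of the seed view (index 1) is the one left out, which mirrors the stated goal of reducing redundant view information while keeping views that add new information.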

Notes

  1. https://github.com/tjuliangqi/opengl_render_img_in_a_specific_view

Acknowledgments

This work was supported in part by the National Key Research and Development Program of China (2020YFB1711704), the National Natural Science Foundation of China (61872267), the Tianjin New Generation Artificial Intelligence Major Program (19ZXZNGX00110), and the Tianjin Science Foundation for Young Scientists (19JCQNJC00500). We especially thank Ruidong Chen and Ruixin Ma for their contributions during the revision. Ruidong Chen wrote the response to the reviewers and revised the draft. Ruixin Ma conducted the experiments during the revision, including visualizing the experimental results.

Author information

Corresponding author

Correspondence to Weizhi Nie.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liang, Q., Li, Q., Nie, W. et al. Principal views selection based on growing graph convolution network for multi-view 3D model recognition. Appl Intell 53, 5320–5336 (2023). https://doi.org/10.1007/s10489-022-03775-y

  • DOI: https://doi.org/10.1007/s10489-022-03775-y
