
3D shape recognition based on multi-modal information fusion

Multimedia Tools and Applications

Abstract

The classification and retrieval of 3D models are widely used in multimedia and computer vision. With the rapid development of computer graphics, different algorithms, each designed for a different representation of 3D models, have achieved leading performance. Advances in deep learning have also encouraged a variety of deep models for 3D feature representation. For multi-view, point cloud, and PANORAMA-view representations, different models have shown significant performance on 3D shape classification. However, there is not yet a method that uses the fused information of multiple modalities for 3D shape classification. We propose a novel multi-modal information fusion method for 3D shape classification, which fully utilizes the advantages of the different modalities to predict the class label. More specifically, the proposed method can effectively fuse additional modalities, and it is easy to apply in other similar applications. We evaluated our framework on the popular ModelNet40 dataset for the 3D shape classification task. A series of experimental results and comparisons with state-of-the-art methods demonstrate the validity of our approach.
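The abstract does not state how the modalities are fused. The sketch below illustrates one common option only, late fusion by a weighted average of per-modality class probabilities, and should not be read as the method proposed in the paper. The function names (`softmax`, `fuse_predictions`, `predict_label`), the modality keys, and the score values are all hypothetical.

```python
import math
from typing import Dict, List, Optional


def softmax(scores: List[float]) -> List[float]:
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(
    scores_by_modality: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Late fusion (an assumption, not the paper's rule): weighted average
    of the class probabilities produced by each modality's classifier."""
    if not scores_by_modality:
        raise ValueError("at least one modality is required")
    if weights is None:
        # Equal weighting when no per-modality weights are given.
        weights = {name: 1.0 for name in scores_by_modality}
    total_weight = sum(weights[name] for name in scores_by_modality)
    num_classes = len(next(iter(scores_by_modality.values())))
    fused = [0.0] * num_classes
    for name, scores in scores_by_modality.items():
        if len(scores) != num_classes:
            raise ValueError(f"modality {name!r} has a different class count")
        for i, prob in enumerate(softmax(scores)):
            fused[i] += weights[name] * prob / total_weight
    return fused


def predict_label(fused: List[float]) -> int:
    """Return the index of the class with the highest fused probability."""
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical raw scores for three classes from three modality networks.
scores = {
    "multi_view": [2.0, 0.5, 0.1],
    "point_cloud": [0.2, 1.5, 0.3],
    "panorama": [1.8, 0.4, 0.2],
}
fused = fuse_predictions(scores)
label = predict_label(fused)
```

Adding a further modality in this sketch only requires one more entry in the dictionary, which mirrors the abstract's claim that more modal information can be fused, although the paper's actual fusion mechanism may be quite different.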




Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (61472275, 61170239, 61303208, 61502337).

Author information


Corresponding author

Correspondence to Mengmeng Xiao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Liang, Q., Xiao, M. & Song, D. 3D shape recognition based on multi-modal information fusion. Multimed Tools Appl 80, 16173–16184 (2021). https://doi.org/10.1007/s11042-019-08552-7


  • DOI: https://doi.org/10.1007/s11042-019-08552-7
