Abstract
The classification and retrieval of 3D models are widely used in multimedia and computer vision. With the rapid development of computer graphics, different algorithms have been developed for different representations of 3D models, and each has achieved strong performance on its own representation. Advances in deep learning have also encouraged a variety of deep models for 3D feature representation. For the multi-view, point cloud, and PANORAMA-view representations, different models have shown strong performance on 3D shape classification. However, existing methods do not consider fusing information from multiple modalities for 3D shape classification. We propose a novel multi-modal information fusion method for 3D shape classification that exploits the advantages of the different modalities to predict the class label. The proposed method can effectively fuse information from additional modalities and is easy to apply to other similar applications. We evaluate our framework on the popular ModelNet40 dataset for the 3D shape classification task. A series of experimental results and comparisons with state-of-the-art methods demonstrate the validity of our approach.
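The abstract does not say how the modalities are combined, so the following is only an illustrative sketch of one common option, score-level (late) fusion, and not the authors' method. It assumes each modality-specific classifier already outputs a probability vector over the classes. The function names, the modality keys, the weights, and the toy three-class scores are all hypothetical.

```python
from typing import Dict, List, Optional


def fuse_scores(
    scores: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Weighted average of per-modality class probabilities (late fusion).

    Assumption: every modality reports probabilities over the same classes,
    in the same order. This is a generic technique, not the paper's method.
    """
    if not scores:
        raise ValueError("need at least one modality")
    if weights is None:
        # Default: every modality contributes equally.
        weights = {name: 1.0 for name in scores}

    num_classes = len(next(iter(scores.values())))
    total_weight = sum(weights[name] for name in scores)

    fused = [0.0] * num_classes
    for name, probs in scores.items():
        if len(probs) != num_classes:
            raise ValueError(
                f"{name} has {len(probs)} classes, expected {num_classes}"
            )
        for i, p in enumerate(probs):
            fused[i] += weights[name] * p / total_weight
    return fused


def predict(
    scores: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical scores for a toy 3-class problem (not ModelNet40 results).
example = {
    "multi_view": [0.6, 0.3, 0.1],
    "point_cloud": [0.2, 0.5, 0.3],
    "panorama": [0.5, 0.4, 0.1],
}
```

Calling `predict(example)` averages the three vectors with equal weight. Passing a `weights` dictionary lets one modality count for more, which is one simple way a further modality could be added without retraining the others.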
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (61472275, 61170239, 61303208, 61502337).
Cite this article
Liang, Q., Xiao, M. & Song, D. 3D shape recognition based on multi-modal information fusion. Multimed Tools Appl 80, 16173–16184 (2021). https://doi.org/10.1007/s11042-019-08552-7
DOI: https://doi.org/10.1007/s11042-019-08552-7