Abstract
With the rapid development of 3D technologies, 3D model retrieval has become an active research topic. The key to 3D model retrieval is extracting robust features to represent each 3D model. To improve retrieval effectiveness, this paper proposes a feature extraction model based on convolutional neural networks (CNNs). First, we extract a set of 2D images from each 3D model to represent the object. A SIFT detector is used to detect interest points in each 2D image and to extract the surrounding patches, which represent the local information of each 3D model. X-means clustering is used to generate the CNN filters. Second, a single CNN layer learns low-level features, which are then given as inputs to multiple recursive neural networks (RNNs) that compose higher-order features. The RNNs generate the final feature representation of each 2D image. Finally, nearest-neighbor matching is used to compute the similarity between different 3D models and thereby address the retrieval problem. Extensive comparison experiments were conducted on the popular ETH and MV-RED 3D model datasets. The results demonstrate the superiority of the proposed method.
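The final retrieval step can be illustrated with a minimal sketch. It assumes that each 3D model is already described by a list of per-view feature vectors (standing in for the RNN outputs) and that the model-to-model distance is the average, over query views, of the Euclidean distance to the nearest candidate view. The abstract does not specify this aggregation rule, so it is an assumption made for illustration, as are the names `euclidean`, `model_distance`, and `retrieve` and the toy data.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def model_distance(query_views: List[Vector], candidate_views: List[Vector]) -> float:
    """Assumed model-to-model distance (not taken from the paper):
    for each query view, find the nearest candidate view, then average."""
    if not query_views or not candidate_views:
        raise ValueError("each model needs at least one view feature")
    nearest = [min(euclidean(q, c) for c in candidate_views) for q in query_views]
    return sum(nearest) / len(nearest)


def retrieve(query_views: List[Vector], gallery: Dict[str, List[Vector]]) -> List[str]:
    """Rank gallery model ids from most to least similar to the query."""
    return sorted(gallery, key=lambda model_id: model_distance(query_views, gallery[model_id]))


# Toy 2D features standing in for learned view descriptors (hypothetical data).
gallery = {
    "cup": [[0.9, 0.1], [1.0, 0.0]],
    "car": [[0.0, 1.0], [0.1, 0.9]],
}
query = [[0.95, 0.05]]
ranking = retrieve(query, gallery)
```

With real descriptors the same ranking function applies unchanged; only the feature extraction that fills `gallery` and `query` would differ.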






Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (61472275, 61170239, 61303208), the Tianjin Research Program of Application Foundation and Advanced Technology (15JCYBJC16200), and the Grant of Elite Scholar Program of Tianjin University (2014XRG-0046).
Cite this article
Nie, W., Cao, Q., Liu, A. et al. Convolutional deep learning for 3D object retrieval. Multimedia Systems 23, 325–332 (2017). https://doi.org/10.1007/s00530-015-0485-2
DOI: https://doi.org/10.1007/s00530-015-0485-2