Abstract
Recent advances in low-cost 3D sensors and mobile devices have made virtual 3D models and other 3D data increasingly accessible, and 3D model retrieval is becoming an indispensable function for modern search engines. Building an effective retrieval model is a core problem in computer vision. As 3D data has continued to improve, a large number of methods have been proposed for this problem, most of them addressing feature extraction and object matching. However, most of these methods cannot fully exploit the information in 3D representations. To address this problem, we propose a novel multi-layer deep network in this paper. First, multiple rendered images are extracted from a 3D object and combined into one representative view, which is the actual input of the network. Then, the novel multi-layer network structure is trained and tested on these representative views, yielding a feature learning model that captures both the local and the global information of a 3D object. Finally, a simple Euclidean metric is used to compute the similarity between two 3D models and complete the retrieval. Extensive experiments demonstrate the superiority of our approach.
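The final retrieval step can be illustrated with a minimal sketch. It assumes each 3D model has already been reduced to a fixed-length feature vector by some network; the function name `rank_by_euclidean`, the model identifiers, and the 3-dimensional toy descriptors are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence, Tuple


def rank_by_euclidean(
    query: Sequence[float],
    gallery: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (model_id, distance) pairs sorted from most to least similar."""
    scored = [
        (model_id, math.dist(query, feature))  # Euclidean (L2) distance
        for model_id, feature in gallery.items()
    ]
    # A smaller Euclidean distance is treated as a higher similarity.
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical toy descriptors; real learned features would be much longer.
gallery = {
    "chair_01": [0.9, 0.1, 0.0],
    "chair_02": [0.8, 0.2, 0.1],
    "table_01": [0.1, 0.9, 0.3],
}
ranking = rank_by_euclidean([1.0, 0.0, 0.0], gallery)
for model_id, distance in ranking:
    print(f"{model_id}: {distance:.4f}")
```

In this sketch the retrieval result is simply the gallery sorted by distance to the query descriptor, so the quality of the ranking depends entirely on how well the learned features separate object categories.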



Acknowledgments
This work was partially supported by the National Natural Science Foundation of China (No. 61502337).
Cite this article
Nie, W., Xiang, S. & Liu, A. Multi-scale CNNs for 3D model retrieval. Multimed Tools Appl 77, 22953–22963 (2018). https://doi.org/10.1007/s11042-018-5641-1
DOI: https://doi.org/10.1007/s11042-018-5641-1