Abstract
3D object retrieval is an active research field in computer vision and multimedia analysis. Because the appearance features and viewpoints of 3D objects vary widely, the distributions of the training set and the test set differ, which makes the task well suited to transfer learning or cross-domain learning. In transfer learning or cross-domain learning, feature extraction is critical, and the extracted features should be robust across domains. In this work, we therefore focus on feature extraction for 3D objects. Many feature representations and object retrieval approaches have been proposed so far. Among them, view-based deep learning retrieval methods achieve state-of-the-art performance, but existing methods simply use a deep neural network to extract features from each view and directly obtain view-level shape descriptors, without exploiting the spatial relationship between the views. To mine the spatial relationship among different views and obtain more discriminative 3D shape descriptors, we propose 3D object retrieval based on non-local graph neural networks (NGNN). Specifically, a residual network (ResNet) is first used as the backbone, and a non-local structure is then embedded in the ResNet to learn the intrinsic relationship between the views. Finally, a view pooling layer fuses the information from the different views to obtain a discriminative feature for the 3D object. Experimental results on two public datasets, MVRED and NTU 3D, show that the non-local graph network is effective at exploring the latent relationship among different views, and that NGNN significantly outperforms state-of-the-art approaches, with improvements of 12.4%-22.7% on ANMRR.
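The fusion pipeline described above (per-view features, a non-local step that relates every view to every other view, then view pooling) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each view has already been reduced to a feature vector by some backbone, that the non-local step uses dot-product similarity with a softmax over views and identity projections in place of learned ones, and that view pooling is an element-wise maximum. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def non_local_over_views(views):
    """Relate each view to all views (assumed dot-product non-local step).

    views: list of V feature vectors, each a list of D floats.
    Returns V vectors: each input plus a similarity-weighted sum of all views.
    """
    out = []
    for x_i in views:
        # Similarity of view i to every view j, normalised over j.
        weights = softmax([dot(x_i, x_j) for x_j in views])
        # Aggregate information from all views, weighted by similarity.
        agg = [
            sum(w * x_j[d] for w, x_j in zip(weights, views))
            for d in range(len(x_i))
        ]
        # Residual connection keeps the original view feature.
        out.append([a + b for a, b in zip(x_i, agg)])
    return out


def view_pool(views):
    """Element-wise max across views (assumed pooling choice)."""
    return [max(column) for column in zip(*views)]


def shape_descriptor(views):
    """Fuse per-view features into one descriptor for the 3D object."""
    return view_pool(non_local_over_views(views))
```

For example, `shape_descriptor([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])` returns a single 2-dimensional descriptor for an object rendered from three views. Under these assumptions the result does not depend on the order of the views, because both the weighted sum and the maximum are symmetric in the view index.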
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (No. 61872270, No. 61572357, No. 61971309); the National Key R&D Program of China (No. 2019YFBB1404700); Jinan 20 projects in universities (No. 2018GXRC014); Young creative team in universities of Shandong Province (No. 2020KJN012); the Opening Foundation of Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, China; the Tianjin Municipal Natural Science Foundation (No. 18JCYBJC85500); and the NSF project of Tianjin (No. 17JCYBJC15600).
About this article
Cite this article
Li, Ym., Gao, Z., Tao, Yb. et al. 3D Object retrieval based on non-local graph neural networks. Multimed Tools Appl 79, 34011–34027 (2020). https://doi.org/10.1007/s11042-020-09248-z
DOI: https://doi.org/10.1007/s11042-020-09248-z