
3D Object retrieval based on non-local graph neural networks

Multimedia Tools and Applications

Abstract

3D object retrieval is an active research area in computer vision and multimedia analysis. Because the appearance features and viewpoints of 3D objects vary widely, the distributions of the training and test sets differ, which makes the task well suited to transfer learning or cross-domain learning. In transfer or cross-domain learning, feature extraction is critical and must be robust across domains. In this work, we therefore focus on feature extraction for 3D objects. Many feature representations and object retrieval approaches have been proposed so far. Among them, view-based deep learning retrieval methods achieve state-of-the-art performance, but existing methods simply use a deep neural network to extract features from each view and obtain view-level shape descriptors directly, without exploiting the spatial relationships between views. To mine the spatial relationships among different views and obtain more discriminative 3D shape descriptors, we propose 3D object retrieval based on non-local graph neural networks (NGNN). Specifically, a residual network is first used as the backbone, and a non-local structure is then embedded in the ResNet to learn the intrinsic relationships between views. Finally, a view pooling layer further fuses the information from the different views to obtain a discriminative feature for the 3D object. Experimental results on two public datasets, MVRED and NTU 3D, show that the non-local graph network is very effective at exploring the latent relationships among different views, and that NGNN significantly outperforms state-of-the-art approaches, with improvements ranging from 12.4% to 22.7% on ANMRR.
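The pipeline described in the abstract (per-view features, a non-local operation relating the views, then view pooling) can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the embedded-Gaussian (softmax) form of the non-local operation, its application to one feature vector per view rather than to feature maps inside the ResNet, element-wise max as the view pooling, and all names and sizes (`non_local_over_views`, `view_pool`, `w_theta`, 12 views, 64 dimensions). Random vectors stand in for backbone features.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def non_local_over_views(feats, w_theta, w_phi, w_g, w_out):
    """Relate every view to every other view (assumed embedded-Gaussian form).

    feats: (V, D) array, one D-dimensional feature per view.
    Returns a (V, D) array: the input plus a residual context term.
    """
    theta = feats @ w_theta                      # (V, K) query embedding
    phi = feats @ w_phi                          # (V, K) key embedding
    g = feats @ w_g                              # (V, K) value embedding
    affinity = softmax(theta @ phi.T, axis=-1)   # (V, V) view-to-view weights
    context = affinity @ g                       # (V, K) weighted mix of views
    return feats + context @ w_out               # residual connection


def view_pool(feats):
    """Fuse view features into one descriptor (assumed element-wise max)."""
    return feats.max(axis=0)


# Hypothetical sizes: 12 views, 64-dim features, 32-dim embeddings.
rng = np.random.default_rng(0)
V, D, K = 12, 64, 32
feats = rng.standard_normal((V, D))              # stand-in for backbone features
w_theta = rng.standard_normal((D, K)) * 0.1
w_phi = rng.standard_normal((D, K)) * 0.1
w_g = rng.standard_normal((D, K)) * 0.1
w_out = rng.standard_normal((K, D)) * 0.1

refined = non_local_over_views(feats, w_theta, w_phi, w_g, w_out)
descriptor = view_pool(refined)                  # one vector per 3D object
```

In this sketch each row of `affinity` sums to one, so every refined view feature is the original feature plus a weighted mixture of all views. Two objects would then be compared by a distance between their `descriptor` vectors.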



Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (No. 61872270, No. 61572357, No. 61971309); the National Key R&D Program of China (No. 2019YFBB1404700); Jinan 20 projects in universities (No. 2018GXRC014); Young creative team in universities of Shandong Province (No. 2020KJN012); the Opening Foundation of Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, China; the Tianjin Municipal Natural Science Foundation (No. 18JCYBJC85500); and the NSF project of Tianjin (No. 17JCYBJC15600).

Author information

Corresponding author

Correspondence to Zan Gao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Li, Ym., Gao, Z., Tao, Yb. et al. 3D Object retrieval based on non-local graph neural networks. Multimed Tools Appl 79, 34011–34027 (2020). https://doi.org/10.1007/s11042-020-09248-z


  • DOI: https://doi.org/10.1007/s11042-020-09248-z
