3D Object retrieval based on non-local graph neural networks

Li, Yin-min; Gao, Zan; Tao, Ya-bin; Wang, Li-li; Xue, Yan-bing

doi:10.1007/s11042-020-09248-z

Published: 11 July 2020

Volume 79, pages 34011–34027, (2020)

Multimedia Tools and Applications

Yin-min Li^1,2, Zan Gao^2, Ya-bin Tao^3, Li-li Wang^4 & Yan-bing Xue^1

Abstract

3D object retrieval is an active research field in computer vision and multimedia analysis. Because the appearance features and viewpoints of 3D objects vary widely, the distributions of the training set and the test set differ, which makes the task well suited to transfer learning or cross-domain learning. In transfer learning or cross-domain learning, feature extraction is critical, and the extracted features should be robust across domains. In this work, we therefore focus on feature extraction for 3D objects. Many feature representations and object retrieval approaches have been proposed so far. Among them, view-based deep learning retrieval methods achieve state-of-the-art performance, but existing methods simply use a deep neural network to extract features from each view and directly obtain view-level shape descriptors, without utilizing the spatial relationships between the views. To mine the spatial relationships among different views and obtain more discriminative 3D shape descriptors, we propose 3D object retrieval based on non-local graph neural networks (NGNN). Specifically, a residual network is first used as the backbone, and a non-local structure is then embedded in the ResNet to learn the intrinsic relationships between the views. Finally, a view pooling layer is employed to further fuse the information from different views and obtain a discriminative feature for the 3D object. Experimental results on two public datasets, MVRED and NTU 3D, show that the non-local graph network is very effective at exploring the latent relationships among different views, and that NGNN significantly outperforms state-of-the-art approaches, with improvements of 12.4% to 22.7% on ANMRR.
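The pipeline described in the abstract (per-view features, a non-local structure relating the views, then view pooling) can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes the commonly used embedded-Gaussian form of the non-local operation (softmax over pairwise affinities plus a residual connection), applies it to one feature vector per view rather than inside a ResNet, and uses element-wise max as the view pooling. The function names, weight matrices and dimensions below are all hypothetical.

```python
import numpy as np

def non_local_over_views(x, w_theta, w_phi, w_g, w_z):
    """Relate every view to every other view (assumed embedded-Gaussian form).

    x: (V, D) array, one D-dimensional feature vector per rendered view.
    w_theta, w_phi, w_g: (D, K) projection matrices; w_z: (K, D).
    Returns a (V, D) array of view features enriched with cross-view context.
    """
    theta = x @ w_theta                      # (V, K) query embedding
    phi = x @ w_phi                          # (V, K) key embedding
    g = x @ w_g                              # (V, K) value embedding
    scores = theta @ phi.T                   # (V, V) pairwise view affinities
    scores -= scores.max(axis=1, keepdims=True)   # stabilise the softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)  # each row sums to 1 over views
    y = attn @ g                             # (V, K) affinity-weighted sum
    return x + y @ w_z                       # residual connection

def view_pool(x):
    """Fuse (V, D) view features into one (D,) descriptor by element-wise max."""
    return x.max(axis=0)

# Hypothetical sizes: 12 views, 8-dim features, 4-dim embedding.
rng = np.random.default_rng(0)
V, D, K = 12, 8, 4
views = rng.standard_normal((V, D))
w_theta, w_phi, w_g = (rng.standard_normal((D, K)) for _ in range(3))
w_z = rng.standard_normal((K, D))

descriptor = view_pool(non_local_over_views(views, w_theta, w_phi, w_g, w_z))
print(descriptor.shape)  # (8,)
```

In this sketch the pooled descriptor does not depend on the order in which the views are supplied, and setting `w_z` to zeros reduces the block to an identity mapping, which is a common way to initialise such a block when inserting it into a pretrained backbone.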

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (No.61872270, No.61572357, No.61971309), the National Key R&D Program of China (No.2019YFBB1404700), Jinan 20 projects in universities (No.2018GXRC014), the Young creative team in universities of Shandong Province (No.2020KJN012), the Opening Foundation of Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, China, the Tianjin Municipal Natural Science Foundation (No.18JCYBJC85500), and the NSF project of Tianjin (No. 17JCYBJC15600).

Author information

Authors and Affiliations

Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Key Laboratory of Computer Vision and System, Ministry of Education, Tianjin University of Technology, Tianjin, 300384, China
Yin-min Li & Yan-bing Xue
Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan, 250014, People’s Republic of China
Yin-min Li & Zan Gao
Jiangxi Vocational Technical College of Industry Trade, Nanchang, 330038, People’s Republic of China
Ya-bin Tao
China Unicom Yantai branch, Yantai, 264006, People’s Republic of China
Li-li Wang

Corresponding author

Correspondence to Zan Gao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, Ym., Gao, Z., Tao, Yb. et al. 3D Object retrieval based on non-local graph neural networks. Multimed Tools Appl 79, 34011–34027 (2020). https://doi.org/10.1007/s11042-020-09248-z

Received: 22 June 2019
Revised: 07 June 2020
Accepted: 24 June 2020
Published: 11 July 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s11042-020-09248-z
