Neurocomputing

Volume 275, 31 January 2018, Pages 1-9

3D model retrieval using constructive-learning for cross-model correlation

https://doi.org/10.1016/j.neucom.2017.01.030

Abstract

With the advance of 3D technology and digital image processing techniques, 3D models have found a great number of applications, such as virtual reality, computer-aided design, and entertainment. Under such circumstances, much research attention has been devoted to 3D model retrieval in recent decades. Although extensive research efforts have been dedicated to this task, it remains difficult to explore the correlation among 3D models, which is the key issue in 3D model retrieval. In this paper, we design and implement a constructive-learning for cross-model correlation algorithm for 3D model retrieval. In this method, we first extract view features from multi-views of 3D models. To exploit the cross-model correlation, we formulate the correlation of 3D models in a hypergraph structure, where both the vertex correlation and the edge correlation are simultaneously learned in a constructive-learning process. Then, the correlation of each model to the query can be used for retrieval. To justify the performance of the proposed algorithm, we have implemented the method and tested it on two datasets. We have compared it with recent state-of-the-art methods, and the results show the superior performance of our proposed method.

Introduction

In the recent decade, 3D technology has made rapid progress, and digital image processing techniques have also been developed extensively. With these advances in both hardware and software, a great number of applications have employed 3D models, such as virtual reality, entertainment, computer-aided design [1], molecular biology [2], and other applications [3]. 3D movies, tele-medicine and 3D games have become very popular in recent years. All these applications lead to a booming increase of 3D models [4], [5], [6], which makes it urgent to conduct effective 3D model retrieval from large-scale datasets [7], [8], [9], [10]. In recent decades, multimedia information retrieval [11], [12], [13], [14], [15], [16] has attracted much attention. In this 3D era, 3D model retrieval [17], [18], [19] becomes even more important, and its importance can be illustrated by the example of industrial design. Previous studies show that only 20% of designs are completely new, while the other 80% can be combined or revised from existing designs. Therefore, an accurate 3D model retrieval method can significantly improve industrial design performance and reduce cost.

In recent decades, much research attention has been devoted to 3D model retrieval, making it a hot research topic. The task can be defined as follows: given a query 3D model, the objective of 3D model retrieval is to measure the similarity/distance between each database model and the query and to return the most relevant models. Therefore, how to calculate the distance/similarity between two 3D models is the key issue in the 3D model retrieval task. Based on how 3D models are represented, existing methods [20], [21] can be mainly classified into two types: model-based 3D model retrieval methods [22], [23], [24] and view-based 3D model retrieval methods [25], [26], [27], [28], [29], [30].
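
To make the task definition concrete, the following minimal Python sketch ranks database models by Euclidean distance between feature vectors; the random vectors stand in for real descriptors, and retrieval quality then depends entirely on how features and distances are actually computed, which is what the remainder of the paper addresses.

```python
import numpy as np

def rank_by_distance(query_feat, db_feats):
    """Rank database models by Euclidean distance to the query feature.

    query_feat: (d,) feature vector of the query model (hypothetical).
    db_feats:   (n, d) feature matrix, one row per database model.
    Returns indices of database models, most similar first.
    """
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    return np.argsort(dists)

# Toy usage with random features standing in for real view descriptors.
rng = np.random.default_rng(0)
db = rng.normal(size=(549, 128))   # e.g., 549 models (as in NTU), 128-D features
query = rng.normal(size=128)
ranking = rank_by_distance(query, db)
print(ranking[:5])                 # five nearest database models
```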

In model-based 3D model retrieval methods, each model is described by a corresponding virtual 3D representation, such as point cloud data or mesh data. In this type of method, features are extracted directly from the 3D model and the comparison is conducted by feature matching. Typical representative features include low-level features [23], [31], [32], [22], [33], [34], [24] and high-level features. Low-level features mainly employ direct descriptions of 3D model information, such as the distribution of the surface [23], geometric moments [31], and volumetric information of the 3D model. High-level features represent 3D models at a context level, such as skeletons [35]. The advantage of model-based methods comes from the direct representation of 3D model information. However, the main drawback of these methods is the mandatory requirement of explicit 3D model data. In many practical applications, the 3D models are not explicitly available, which limits the application of model-based methods.
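
As an illustration of a low-level model-based feature, the sketch below computes a D2-style shape distribution in the spirit of the shape distributions of Osada et al. (cited in the references): distances between randomly sampled point pairs are collected into a normalized histogram. It is a simplified sketch that samples from a point cloud rather than uniformly over mesh faces, not a faithful reimplementation.

```python
import numpy as np

def d2_shape_distribution(points, n_pairs=10000, n_bins=64):
    """D2-style shape distribution: histogram of distances between randomly
    sampled pairs of surface points (point-cloud approximation).

    points: (n, 3) array of surface points of one 3D model.
    Returns a normalized n_bins-dimensional descriptor.
    """
    rng = np.random.default_rng(0)
    i = rng.integers(0, len(points), size=n_pairs)
    j = rng.integers(0, len(points), size=n_pairs)
    d = np.linalg.norm(points[i] - points[j], axis=1)
    hist, _ = np.histogram(d / d.max(), bins=n_bins, range=(0.0, 1.0))
    return hist / hist.sum()

# Two models can then be compared by a histogram distance, e.g. L1:
# dist = np.abs(d2_shape_distribution(p1) - d2_shape_distribution(p2)).sum()
```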

With the development of cameras and image processing methods, it has become much easier to acquire multiple views of 3D models, which has led to progress in 3D model retrieval methods using multiple views [25], [27], [36], [37]. In these methods, a group of views captured from different directions by real or virtual cameras is used for 3D model representation. Different from model-based methods, view-based methods do not need the virtual model information, which makes them easier to apply in a variety of applications. In view-based 3D model retrieval methods, a group of views is first generated and then visual features are extracted from these views. The comparison between two 3D models is based on the matching between their two sets of multi-views. Although there has been much work on view-based 3D model retrieval, it remains difficult to explore the correlation among 3D models, which is the key issue in 3D model comparison. Recently, hypergraph-based methods have been introduced into 3D model retrieval, in which the correlation among 3D models is modeled by a hypergraph. Although these methods have shown better performance compared with earlier methods, they only build the initial-level model relevance, which is not optimal for reflecting the underlying correlation among 3D models.
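
Before any learning, the simplest way to compare two view-based representations is a direct set-to-set matching of per-view features. The sketch below implements a Hausdorff-style baseline (average of minimal cross-view distances); it is only an illustrative baseline, not the correlation learning proposed in this paper.

```python
import numpy as np

def view_set_distance(views_a, views_b):
    """Simple many-to-many matching of two multi-view feature sets: for every
    view of model A find its closest view of model B and vice versa, and
    average these minimal distances.

    views_a: (na, d) view features of model A; views_b: (nb, d) of model B.
    """
    diff = views_a[:, None, :] - views_b[None, :, :]
    pair = np.linalg.norm(diff, axis=2)            # (na, nb) pairwise distances
    return 0.5 * (pair.min(axis=1).mean() + pair.min(axis=0).mean())

# Example: 60 views per model (as in the NTU setting), 128-D view features.
rng = np.random.default_rng(1)
a = rng.normal(size=(60, 128))
b = rng.normal(size=(60, 128))
print(view_set_distance(a, b))
```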

Under such circumstances, it is important to jointly explore the high-order correlation among 3D models and the relationship among links on the hypergraph, which enables a deeper investigation of the data modelling. A hypergraph is a generalization of a graph in which each edge (hyperedge) can link two or more vertices; this flexible structure makes it well suited for modelling high-order relationships. Hypergraph-based data modelling [38], [39], [40], [41] has been employed in plenty of computer vision tasks, such as image retrieval [42], [43], [44], model segmentation [45], and hyperspectral image classification. For model recognition, Xia et al. [46] presented a class-specific hypergraph (CSHG) to jointly employ local SIFT features and global geometric constraints; in this work, a selected category of models with multiple appearance instances was modeled by a hypergraph. Huang et al. [43] proposed to employ a hypergraph structure to formulate the relationship among images, and transductive learning was conducted to retrieve images; in this method, each vertex denotes one image and the visual feature-based distance is used for edge construction. Zhu et al. [47] presented a multimodal hypergraph learning method for landmark analysis, in which the edges were generated based on the visual features of landmark images. It is noted that the initial edges may not be optimal for data representation. We also note that the edge weights are simply set in the learning objective function, indicating that the correlation among edges has not been taken into consideration.
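
The sketch below shows one common way such a hypergraph is built in retrieval settings: each model spawns a hyperedge linking it with its k nearest neighbors under a pairwise distance, with Gaussian affinities as incidence weights. The construction and the bandwidth heuristic are illustrative assumptions, not necessarily the exact choices of the cited works or of this paper.

```python
import numpy as np

def knn_hypergraph(dist, k=5, sigma=None):
    """Build a hypergraph incidence matrix from a pairwise distance matrix.

    One hyperedge per model: the edge links the model (its centroid) with its
    k nearest neighbors. Entries hold Gaussian affinities exp(-d^2 / sigma^2).

    dist: (n, n) symmetric pairwise distance matrix.
    Returns H of shape (n_vertices, n_edges) = (n, n).
    """
    n = dist.shape[0]
    if sigma is None:
        sigma = dist.mean()                   # common heuristic bandwidth
    H = np.zeros((n, n))
    for e in range(n):                        # hyperedge centered at model e
        nbrs = np.argsort(dist[e])[:k + 1]    # the model itself plus k neighbors
        H[nbrs, e] = np.exp(-(dist[e, nbrs] ** 2) / (sigma ** 2))
    return H
```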

In our task, to handle the issue of high-order correlation among 3D models, we design and implement a constructive-learning for cross-model correlation algorithm for 3D model retrieval. In this method, we first extract view features from multi-views of 3D models. To exploit the cross-model correlation, we formulate the correlation of 3D models in a hypergraph structure. More specifically, each vertex on the hypergraph denotes one 3D model, and the corresponding edges on the hypergraph are built based on the feature-based distances among 3D models. On this hypergraph structure, both the vertex correlation and the edge correlation are simultaneously learned in a constructive-learning process. Then, the correlation of each model to the query can be used for retrieval. To justify the performance of our proposed algorithm, we have conducted experiments on two datasets, the National Taiwan University dataset and the ETH-80 dataset. We have compared our method with recent state-of-the-art methods, and the results show the superior performance of the proposed method.
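
For intuition, the sketch below gives the standard transductive ranking on a hypergraph incidence matrix (conventional hypergraph learning), where the query is the only labelled vertex and relevance propagates through the hyperedges. It keeps the hyperedge weights fixed, which is exactly the limitation discussed above; the proposed constructive-learning additionally learns the edge correlation, so this baseline is shown only to convey the setting, not as the paper's algorithm.

```python
import numpy as np

def hypergraph_ranking(H, query_idx, alpha=0.9, w=None):
    """Baseline transductive ranking on a hypergraph.

    Solves f = (I - alpha * Theta)^{-1} y with
    Theta = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2},
    where Dv and De are vertex and hyperedge degree matrices and W holds the
    (fixed) hyperedge weights.
    """
    n, m = H.shape
    w = np.ones(m) if w is None else w
    dv = H @ w                                  # vertex degrees
    de = H.sum(axis=0)                          # hyperedge degrees
    inv_dv = np.diag(1.0 / np.sqrt(np.maximum(dv, 1e-12)))
    inv_de = np.diag(1.0 / np.maximum(de, 1e-12))
    theta = inv_dv @ H @ np.diag(w) @ inv_de @ H.T @ inv_dv
    y = np.zeros(n)
    y[query_idx] = 1.0                          # the query is the labelled vertex
    f = np.linalg.solve(np.eye(n) - alpha * theta, y)
    return np.argsort(-f)                       # models ranked by relevance to the query
```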

The main contributions of our work are two-fold:

  • 1.

    We propose a constructive-learning for cross-model correlation method targeting the task of 3D model retrieval. This method is able to take both the model correlation and the correlation of model connections into consideration simultaneously, and thereby achieves better performance on 3D model comparison.

  • 2.

    To measure the performance of the proposed constructive-learning method, we have conducted experiments on two datasets. The experimental results and comparisons with existing methods show the superior performance of the proposed method.

The rest of this paper is organized as follows. Section 2 reviews related work on 3D model retrieval. Section 3 introduces the proposed method, and Section 4 provides detailed experimental results and comparisons with state-of-the-art methods. We finally conclude the paper in Section 5.

Section snippets

Related work

In this part, we first introduce the view-extraction methods, and then provide the view-based model matching methods. Besides the direct view extraction using real and virtual cameras for 3D models, generating synthetic views for 3D models is also important. Papadakis et al. [48] introduced a panoramic view method, called panoramic object representation for accurate model attributing (PANORAMA). Different from PANORAMA, a spatial structure circular descriptor (SSCD) was presented in [26], which

Constructive-learning for cross-model correlation

We introduce our proposed constructive-learning method for cross-model correlation, targeting the task of 3D model retrieval. Our method comprises three stages, i.e., pairwise 3D model distance measurement, cross-model structure construction, and constructive-learning for cross-model correlation, which are introduced in detail in this section.
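
Assuming the illustrative helpers sketched in the introduction (view_set_distance, knn_hypergraph, hypergraph_ranking), the skeleton below shows how the three stages compose into a retrieval pipeline. It mirrors the stage structure only and substitutes a fixed-weight baseline for the constructive-learning stage, which is not reproduced here.

```python
import numpy as np

def retrieve(query_views, db_view_sets, k=5, alpha=0.9):
    """Three-stage pipeline skeleton built from the earlier sketches.

    query_views:  (nv, d) view features of the query model.
    db_view_sets: list of (nv_i, d) view-feature arrays, one per database model.
    Returns database model indices ranked from most to least relevant.
    """
    # Stage 1: pairwise 3D model distance measurement (query + database models).
    all_sets = [query_views] + list(db_view_sets)
    n = len(all_sets)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = view_set_distance(all_sets[i], all_sets[j])

    # Stage 2: cross-model structure construction (hypergraph over all models).
    H = knn_hypergraph(dist, k=k)

    # Stage 3: correlation to the query (fixed-weight baseline standing in for
    # the constructive-learning step); the query is vertex 0.
    ranking = hypergraph_ranking(H, query_idx=0, alpha=alpha)
    return [r - 1 for r in ranking if r != 0]
```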

Experimental settings

In our experiments, two public 3D model benchmarks are employed: the National Taiwan University 3D Model database (NTU) [49] and the ETH-80 3D model dataset (ETH) [62]. The NTU dataset contains 549 3D models from different categories, such as aqua, boat, bed, and bomb. The 3D models in the NTU dataset contain model data, and the multi-views are captured from 60 evenly distributed directions; thus, for each 3D model, there are 60 images. The ETH dataset is composed of 80 models belonging

Conclusion

Retrieving 3D models has been an important task in the research community. In this paper, targeting the exploration of high-order 3D model correlation for accurate 3D model retrieval, we have proposed a constructive-learning for cross-model correlation algorithm. In this method, we first extract view features from multi-views of 3D models, and then the correlation among 3D models is formulated by a hypergraph. On this hypergraph structure, both the vertex correlation and the edge correlation are simultaneously learned in a constructive-learning process.

References (64)

  • E. Paquet et al., Description of shape information for 2D and 3D objects, Signal Process. Image Commun. (2000)
  • H. Chen et al., Efficient recognition of highly similar 3D objects in range images, IEEE Trans. Pattern Anal. Mach. Intell. (2009)
  • J.L. Shih et al., A new 3D model retrieval approach based on the elevation descriptor, Pattern Recognit. (2007)
  • W.-Z. Nie et al., 3D object retrieval based on sparse coding in weak supervision, J. Vis. Commun. Image Represent. (2016)
  • W.Y. Kim et al., A region-based shape descriptor using Zernike moments, Signal Process. Image Commun. (2000)
  • P. Daras et al., Three-dimensional shape-structure comparison method for protein classification, IEEE/ACM Trans. Comput. Biol. Bioinform. (2006)
  • A. del Bimbo et al., Content-based retrieval of 3D models, ACM Trans. Multimed. Comput. Commun. Appl. (2006)
  • B. Bustos et al., Feature-based similarity search in 3D object databases, ACM Comput. Surv. (2005)
  • Y. Gao et al., View-based 3D object retrieval: challenges and approaches, IEEE Multimed. (2014)
  • H. Zeng, H. Wang, S. Li, W. Zeng, Non-rigid 3D model retrieval based on weighted bags-of-phrases and LDA, in: Chinese...
  • T. Furuya, R. Ohbuchi, Accurate aggregation of local features by using k-sparse autoencoder for 3D model retrieval, in:...
  • Z. Yasseen et al., View selection for sketch-based 3D model retrieval using visual part shape description, Vis. Comput. (2016)
  • C.-T. Tu et al., Free-hand sketches for 3D model retrieval using cascaded LSDA, Multimed. Tools Appl. (2016)
  • G. Ding et al., Large-scale cross-modality search via collective matrix factorization hashing, IEEE Trans. Image Process. (2016)
  • Z. Lin et al., Cross-view retrieval via probability-based semantics-preserving hashing, IEEE Trans. Cybern. PP (2016)
  • H. Haj Mohamed et al., Algorithm BoSS (bag-of-salient local spectrums) for non-rigid and partial 3D object retrieval, Neurocomputing (2015)
  • H. Haj Mohamed et al., 3D model retrieval with weighted locality-constrained group sparse coding, Neurocomputing (2015)
  • J.W.H. Tangelder et al., A survey of content based 3D shape retrieval methods, Multimed. Tools Appl. (2008)
  • Y. Yang et al., Content-based 3D model retrieval: a survey, IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. (2007)
  • A.E. Johnson et al., Using spin images for efficient object recognition in cluttered 3D scenes, IEEE Trans. Pattern Anal. Mach. Intell. (1999)
  • R. Osada et al., Shape distributions, ACM Trans. Graph. (2002)
  • T. Filali Ansary et al., 3D search engine using adaptive views clustering, IEEE Trans. Multimed. (2007)

Jianbai Yang was born in Heilongjiang in 1986. He received his BS degree from China University of Petroleum, Qingdao, China, in 2009. He is now a Ph.D. candidate at the University of Chinese Academy of Sciences. His current research interests include image processing and computer vision.

Jian Zhao was born in Jilin in 1967. She received her BS degree from Jilin University of Technology, Changchun, China, in 1991 and her MS degree from the Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun, China, in 2002. She is a Professor at the Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences. Her research interests include image processing and computer vision.

Qiang Sun was born in Heilongjiang in 1971. He received his Ph.D. degree in optical engineering from Nankai University, Tianjin, in 2003. He is currently a Professor with the Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun, China. His current research interests include reliability analysis of electro-mechanical products, optical design, and infrared optics.
