3D model retrieval using weighted bipartite graph matching

https://doi.org/10.1016/j.image.2010.10.006

Abstract

In this paper, we propose a view-based 3D model retrieval algorithm in which a many-to-many matching method, weighted bipartite graph matching, is employed for the comparison between two 3D models. In this work, each 3D model is represented by a set of 2D views. Representative views are first selected from the query model and assigned initial weights, which are then updated based on the relationships among the representative views. A weighted bipartite graph is built with these selected 2D views, and the matching result is used to measure the similarity between two 3D models. Experimental results and comparisons with existing methods show the effectiveness of the proposed algorithm.

Introduction

Databases of 3D models have been growing rapidly in recent years, and 3D models are widely used in CAD, virtual reality, medicine, and entertainment. Effective and efficient 3D model retrieval algorithms are therefore required in a wide range of applications. 3D model retrieval [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12] has received great attention and has become a very active research domain.

Early 3D model retrieval methods [13], [14], [15], [16] employed low-level features and high-level structure-based methods to describe 3D models. More recently, view-based 3D model descriptors [17], [18], [19], [20] have been introduced. These descriptors represent a 3D model by a set of 2D views, and 3D model comparison is performed by matching the 2D views.

State-of-the-art view-based 3D object retrieval methods depend heavily on how the views are acquired. The light field descriptor (LFD) [17] was computed from 10 silhouettes obtained from the vertices of a dodecahedron over a hemisphere. This image set described the spatial structure of the model from different views. In LFD, Zernike moments and Fourier descriptors of each image were employed as features, and the similarity between two 3D models was defined as the best match between their LFDs. The elevation descriptor (ED) [21] was a global spatial information descriptor that represented a 3D model by spatial information from six directions; it was invariant to translation, rotation, and scaling of 3D models. The comparison between two EDs was based on the distance between the two groups of six elevation views. In [22], five circular camera arrays, four vertical and one horizontal, were employed to acquire representative views of 3D models. Each group of views (acquired by one circle of cameras) was modelled as a Markov chain (MC). In the MC framework, 3D model comparison included two stages, comparison at the view-set level and comparison at the model level, and retrieval amounted to finding the maximum a posteriori (MAP) model in the 3D database given the query model.
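
As a rough illustration of this view-matching idea (not the LFD reference implementation), the sketch below assumes that per-view feature vectors, e.g. Zernike moments concatenated with Fourier descriptors, have already been extracted, and that `rotations` enumerates candidate alignments between the two camera systems as index permutations; the names are hypothetical.

```python
import numpy as np

def lfd_distance(views_a, views_b, rotations):
    """Sketch of LFD-style matching: each model is an (n_views, d) array of
    per-view features, and the dissimilarity is the minimal total
    view-to-view distance over the tested camera-system rotations."""
    best = np.inf
    for perm in rotations:
        # total L1 distance between corresponding views under this alignment
        cost = np.abs(views_a - views_b[perm]).sum()
        best = min(best, cost)
    return best
```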

In the compact multi-view descriptor (CMVD) [23], cameras were set at the 18 vertices of a 32-hedron to capture multiple views. These views were uniformly distributed, and both binary images and depth images were taken. The comparison between 3D models was then based on feature matching between selected views using 2D features such as the 2D Polar-Fourier transform, 2D Zernike moments, and 2D Krawtchouk moments. For a query object, the tested object was rotated to find the direction that best matched the query, and the minimal sum of view distances under the selected rotation was used as the distance between the two objects.
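
The per-view distance in such multi-descriptor schemes can be illustrated as below; this is an assumption-laden sketch (illustrative names, equal default weights), not the CMVD implementation. Here `view_a_feats` and `view_b_feats` would each hold, say, Polar-Fourier, Zernike, and Krawtchouk moment vectors for one view.

```python
import numpy as np

def multi_descriptor_view_distance(view_a_feats, view_b_feats,
                                   weights=(1.0, 1.0, 1.0)):
    """Combine several 2D descriptors of a view pair into one distance as a
    weighted sum of per-descriptor L1 distances. The model-level distance
    would then be the minimal sum of such view distances over rotations."""
    return sum(w * np.abs(a - b).sum()
               for w, a, b in zip(weights, view_a_feats, view_b_feats))
```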

Adaptive views clustering (AVC) [24] was a Bayesian 3D object search engine in which 320 initial views were captured from each 3D model. A two-unit icosahedron centered at the origin was subdivided twice using the Loop subdivision scheme to obtain a 320-faceted polyhedron, and the initial views were captured from these directions. X-means clustering with the Bayesian information criterion was used to cluster the 320 initial views and select representative views. To retrieve 3D models from the database, AVC found the 3D model with the highest posterior probability given the query object. In [25], seven representative views, from three principal and four secondary directions, were acquired to index objects, and a contour-based feature was extracted from each view for multi-view matching. In [26], query views were re-weighted using relevance feedback information through a multi-bipartite graph reinforcement model, in which the weights of the query views were generated by propagating information from the labelled retrieval results.
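
The view-clustering step can be sketched as follows, assuming per-view features are given as an array; plain k-means scored with a BIC-like criterion stands in for X-means here, so the code is illustrative rather than the AVC implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_representative_views(view_feats, k_candidates=range(2, 41)):
    """Cluster per-view features and keep one representative view per
    cluster. A BIC-like score over plain KMeans stands in for X-means."""
    n = view_feats.shape[0]
    best_bic, best_model = np.inf, None
    for k in k_candidates:
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(view_feats)
        # spherical-Gaussian approximation: fit term + complexity penalty
        sigma2 = km.inertia_ / max(n - k, 1)
        bic = n * np.log(sigma2 + 1e-12) + k * np.log(n)
        if bic < best_bic:
            best_bic, best_model = bic, km
    # representative view of each cluster: the view nearest its centroid
    reps = [int(np.argmin(np.linalg.norm(view_feats - c, axis=1)))
            for c in best_model.cluster_centers_]
    return view_feats[reps], best_model.labels_
```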

Some methods employed generated views to represent 3D models. Panoramic object representation for accurate model attributing (PANORAMA) [27] employed panoramic views that capture the position of the model's surface as well as its orientation as the 3D model descriptor. The panoramic view of a 3D model was obtained by projecting the model onto the lateral surface of a cylinder aligned with one of the object's three principal axes and centered at the centroid of the object. The spatial structure circular descriptor (SSCD) [28] preserved the global spatial structure of 3D models and was invariant to rotation and scaling. All spatial information of a 3D model was represented by an SSCD, which consisted of several SSCD images. In SSCD, a minimal bounding sphere of the 3D model was computed, and all points on the model surface were projected onto the bounding sphere, with attribute values attached to each point to represent the surface spatial information. The bounding sphere was further projected onto a circular region of a plane, which preserved the spatial structure of the original 3D model. Each SSCD image used this circular image to describe the surface information of the 3D model, with each spatial part of the model represented by one part of the SSCD. Histogram information was employed as the feature of SSCDs to compare two 3D models.
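
A toy version of such a cylindrical projection is sketched below, assuming the model is given as surface point samples already centred at the centroid and aligned with a principal axis; it produces a coarse panoramic depth image and is not the PANORAMA implementation.

```python
import numpy as np

def panoramic_depth_view(points, n_theta=360, n_height=180):
    """Map centred, axis-aligned surface points onto the lateral surface of
    a cylinder; each cell keeps the largest radial distance seen, giving a
    coarse panoramic depth image."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    theta = np.arctan2(y, x)                      # angle around the z-axis
    radius = np.hypot(x, y)                       # distance from the axis
    ti = ((theta + np.pi) / (2 * np.pi) * (n_theta - 1)).astype(int)
    zi = ((z - z.min()) / (np.ptp(z) + 1e-12) * (n_height - 1)).astype(int)
    pano = np.zeros((n_height, n_theta))
    np.maximum.at(pano, (zi, ti), radius)         # keep the outermost surface
    return pano
```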

The bag-of-visual-features (BoVF) method [29] was recently employed in view-based 3D model retrieval. In [29], local SIFT features [30] were extracted from each view and quantized into visual words using a pre-trained visual vocabulary obtained by k-means clustering of local features. The local features from multiple views were then accumulated into a single histogram serving as the feature vector of the 3D model, and the Kullback–Leibler divergence (KLD) was employed as the distance measure between two 3D objects. More BoVF-related methods [31], [32], [33] have been proposed in recent years. A bag-of-region-words (BoRW) representation [34] was introduced to add region information to the BoVF method, taking the spatial information of view patches into account.
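
A minimal sketch of the BoVF signature and the KLD distance follows, assuming SIFT-like local descriptors and a pre-trained vocabulary of k-means centroids are already available; the function names are illustrative.

```python
import numpy as np

def bovf_histogram(descriptors, vocabulary):
    """Assign each local descriptor (stacked from all views of one model) to
    its nearest visual word and return the normalised word-count histogram
    as the model's feature vector."""
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

def kl_divergence(p, q, eps=1e-10):
    """Kullback-Leibler divergence used as the (asymmetric) distance
    between two BoVF histograms."""
    p, q = p + eps, q + eps
    return float(np.sum(p * np.log(p / q)))
```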

As shown by existing work, 3D model retrieval involves two main stages: 3D model representation and 3D model matching. Most existing work has focused on representation. For matching, view-based 3D model retrieval relies on the comparison between two groups of 2D views, so it can be modelled as a many-to-many matching task. In this work, we propose a view-based 3D model retrieval algorithm in which a many-to-many matching method, weighted bipartite graph matching (WBGM), is employed for the comparison between two 3D models. Each 3D model is represented by a set of 2D views. Representative views are first selected, and the corresponding initial weights are assigned and then updated using the relationships among the representative views. A weighted bipartite graph is built with these selected 2D views, and the proportional max-weighted bipartite matching method [35] is employed to find the best match in the graph. The matching result is used as the similarity between the two 3D models. Experimental results and comparisons with existing methods show the effectiveness of the proposed algorithm.
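
The matching stage can be sketched as follows. This is a simplified illustration, not the exact formulation of the paper: a Gaussian kernel on view-feature distances and a product of view weights are assumed for the edge weights, and SciPy's Hungarian-algorithm solver stands in for the proportional max-weighted bipartite matching of [35].

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def wbgm_similarity(query_views, target_views, query_weights, target_weights):
    """Build a weighted bipartite graph between two sets of representative
    views and score the two models by the total weight of the best matching."""
    # pairwise view dissimilarity and a Gaussian-kernel similarity (assumed)
    d = np.linalg.norm(query_views[:, None, :] - target_views[None, :, :], axis=-1)
    sim = np.exp(-d ** 2 / (2 * np.median(d) ** 2 + 1e-12))
    # edge weight: view similarity scaled by both view weights (assumed form)
    w = sim * np.outer(query_weights, target_weights)
    row, col = linear_sum_assignment(w, maximize=True)  # max-weight matching
    return float(w[row, col].sum())                     # model-level similarity
```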

The remainder of this paper is organized as follows. The proposed 3D model retrieval algorithm using WBGM is presented in Section 2. Experimental results and discussions are shown in Section 3. Conclusions are given in Section 4.

Section snippets

3D model retrieval using weighted bipartite graph matching

In this section, the proposed 3D model retrieval algorithm using WBGM is presented in detail. The framework is introduced first, followed by the detailed algorithm.

Database

In our experiments, the NTU (National Taiwan University) 3D model database [17] is selected as the testing database. The NTU database contains 10,911 3D models in total, and 300 of them are chosen as the testing set. These 300 models cover 30 classes, each containing 10 models. Some example 3D models from the NTU database are shown in Fig. 2.

In our experiments, virtual cameras are employed to capture initial views for 3D objects using 3D process

Conclusions and future works

In this paper, we have presented a view-based 3D model retrieval algorithm using WBGM. The proposed WBGM first selects representative views and updates the weight values for each representative view. To compare two 3D models, a weighted bipartite graph is constructed, and the matching on this weighted bipartite graph is employed to measure the similarity between the two 3D models. The proposed WBGM-based 3D model retrieval algorithm has been tested on the NTU database with different camera

Acknowledgments

This work was supported by the National Basic Research Project (No. 2010CB731800) and the Project of NSFC (No. 61035002 and U0935001).

References (36)

  • B. Bustos et al., Feature-based similarity search in 3D object databases, ACM Computing Surveys (2005)
  • F. Rothganger et al., 3D object modeling and recognition using local affine-invariant image descriptors and multi-view spatial constraints, International Journal of Computer Vision (2006)
  • Y. Liu, X.-L. Wang, H.-Y. Wang, H. Zha, H. Qin, Learning robust similarity measures for 3D partial shape retrieval, ...
  • R. Ohbuchi, T. Shimizu, Ranking on semantic manifold for shape-based 3D model retrieval, in: Proceedings of ACM ...
  • C.B. Akgül et al., Similarity learning for 3D object retrieval using relevance feedback and risk minimization, International Journal of Computer Vision (2010)
  • A. Makadia et al., Spherical correlation of visual representations for 3D model retrieval, International Journal of Computer Vision (2010)
  • A. Ferreira et al., Thesaurus-based 3D object retrieval with part-in-whole matching, International Journal of Computer Vision (2010)
  • D. Vranic, An improvement of rotation invariant 3D-shape descriptor based on functions on concentric spheres, in: ...