3D model retrieval using weighted bipartite graph matching

https://doi.org/10.1016/j.image.2010.10.006

Abstract

In this paper, we propose a view-based 3D model retrieval algorithm in which a many-to-many matching method, weighted bipartite graph matching, is employed for the comparison between two 3D models. In this work, each 3D model is represented by a set of 2D views. Representative views are first selected from the query model and assigned initial weights, which are then updated based on the relationships among the representative views. A weighted bipartite graph is built with these selected 2D views, and the matching result is used to measure the similarity between two 3D models. Experimental results and comparisons with existing methods show the effectiveness of the proposed algorithm.

Introduction

Databases of 3D models have been growing rapidly in recent years, and 3D models are widely used in CAD, virtual reality, medicine, and entertainment. Effective and efficient 3D model retrieval algorithms are therefore required in a wide range of applications. 3D model retrieval [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12] has received great attention and has become a very active research domain.

Early 3D model retrieval methods [13], [14], [15], [16] employed low-level features and high-level structure-based methods to describe 3D models. More recently, view-based 3D model descriptors [17], [18], [19], [20] have been introduced. These descriptors represent a 3D model by a set of 2D views, and 3D model comparison is performed by matching the 2D views.

State-of-the-art view-based 3D object retrieval methods depend heavily on how the views are acquired. The light field descriptor (LFD) [17] was computed from 10 silhouettes obtained from the vertices of a dodecahedron over a hemisphere. This image set described the spatial structure of the model from different views. In LFD, Zernike moments and Fourier descriptors of each image were employed as features, and the similarity between two 3D models was defined as the best match between their LFDs. The elevation descriptor (ED) [21] was a global spatial information descriptor that represented a 3D model by spatial information from six directions; it was invariant to translation, rotation, and scaling of 3D models. The comparison between two EDs was based on the distance between the two groups of six elevation views. In [22], five circular camera arrays, four vertical and one horizontal, were employed to acquire representative views of 3D models. Each group of views (acquired by one circle of cameras) was modelled as a Markov chain (MC). In the MC framework, 3D model comparison included two stages, comparison at the view-set level and comparison at the model level, and retrieval amounted to finding the maximum a posteriori (MAP) model in the 3D database given the query model.
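
As a rough illustration of this view-matching idea (not the LFD reference implementation), the sketch below assumes that per-view feature vectors, e.g. Zernike moments concatenated with Fourier descriptors, have already been extracted, and that `rotations` enumerates candidate alignments between the two camera systems as index permutations; the names are hypothetical.

```python
import numpy as np

def lfd_distance(views_a, views_b, rotations):
    """Sketch of LFD-style matching: each model is an (n_views, d) array of
    per-view features, and the dissimilarity is the minimal total
    view-to-view distance over the tested camera-system rotations."""
    best = np.inf
    for perm in rotations:
        # total L1 distance between corresponding views under this alignment
        cost = np.abs(views_a - views_b[perm]).sum()
        best = min(best, cost)
    return best
```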

In the compact multi-view descriptor (CMVD) [23], cameras were set at the 18 vertices of a 32-hedron to capture multiple views. These views were uniformly distributed, and both binary images and depth images were taken. The comparison between 3D models was then based on feature matching between selected views using 2D features such as the 2D Polar-Fourier transform, 2D Zernike moments, and 2D Krawtchouk moments. For a query object, the tested object was rotated to find the direction that best matched the query, and the minimal sum of view distances under the selected rotation was used as the distance between the two objects.
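
The per-view distance in such multi-descriptor schemes can be illustrated as below; this is an assumption-laden sketch (illustrative names, equal default weights), not the CMVD implementation. Here `view_a_feats` and `view_b_feats` would each hold, say, Polar-Fourier, Zernike, and Krawtchouk moment vectors for one view.

```python
import numpy as np

def multi_descriptor_view_distance(view_a_feats, view_b_feats,
                                   weights=(1.0, 1.0, 1.0)):
    """Combine several 2D descriptors of a view pair into one distance as a
    weighted sum of per-descriptor L1 distances. The model-level distance
    would then be the minimal sum of such view distances over rotations."""
    return sum(w * np.abs(a - b).sum()
               for w, a, b in zip(weights, view_a_feats, view_b_feats))
```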

Adaptive views clustering (AVC) [24] was a Bayesian 3D object search engine in which 320 initial views were captured from each 3D model. A two-unit icosahedron centered at the origin was subdivided twice using the Loop subdivision scheme to obtain a 320-faceted polyhedron, and the initial views were captured from these directions. X-means clustering with the Bayesian information criterion was used to cluster the 320 initial views and select representative views. To retrieve 3D models from the database, AVC found the 3D model with the highest posterior probability given the query object. In [25], seven representative views, from three principal and four secondary directions, were acquired to index objects, and a contour-based feature was extracted from each view for multi-view matching. In [26], query views were re-weighted using relevance feedback information through a multi-bipartite graph reinforcement model, in which the weights of the query views were generated by propagating information from the labelled retrieval results.
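
The view-clustering step can be sketched as follows, assuming per-view features are given as an array; plain k-means scored with a BIC-like criterion stands in for X-means here, so the code is illustrative rather than the AVC implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_representative_views(view_feats, k_candidates=range(2, 41)):
    """Cluster per-view features and keep one representative view per
    cluster. A BIC-like score over plain KMeans stands in for X-means."""
    n = view_feats.shape[0]
    best_bic, best_model = np.inf, None
    for k in k_candidates:
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(view_feats)
        # spherical-Gaussian approximation: fit term + complexity penalty
        sigma2 = km.inertia_ / max(n - k, 1)
        bic = n * np.log(sigma2 + 1e-12) + k * np.log(n)
        if bic < best_bic:
            best_bic, best_model = bic, km
    # representative view of each cluster: the view nearest its centroid
    reps = [int(np.argmin(np.linalg.norm(view_feats - c, axis=1)))
            for c in best_model.cluster_centers_]
    return view_feats[reps], best_model.labels_
```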

Some methods employed generated views to represent 3D models. Panoramic object representation for accurate model attributing (PANORAMA) [27] employed panoramic views that capture the position of the model's surface as well as its orientation as the 3D model descriptor. The panoramic view of a 3D model was obtained by projecting the model onto the lateral surface of a cylinder aligned with one of the object's three principal axes and centered at the centroid of the object. The spatial structure circular descriptor (SSCD) [28] preserved the global spatial structure of 3D models and was invariant to rotation and scaling. All spatial information of a 3D model was represented by an SSCD, which consisted of several SSCD images. In SSCD, a minimal bounding sphere of the 3D model was computed, and all points on the model surface were projected onto the bounding sphere, with attribute values attached to each point to represent the surface spatial information. The bounding sphere was further projected onto a circular region of a plane, which preserved the spatial structure of the original 3D model. Each SSCD image used this circular image to describe the surface information of the 3D model, with each spatial part of the model represented by one part of the SSCD. Histogram information was employed as the feature of SSCDs to compare two 3D models.
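
A toy version of such a cylindrical projection is sketched below, assuming the model is given as surface point samples already centred at the centroid and aligned with a principal axis; it produces a coarse panoramic depth image and is not the PANORAMA implementation.

```python
import numpy as np

def panoramic_depth_view(points, n_theta=360, n_height=180):
    """Map centred, axis-aligned surface points onto the lateral surface of
    a cylinder; each cell keeps the largest radial distance seen, giving a
    coarse panoramic depth image."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    theta = np.arctan2(y, x)                      # angle around the z-axis
    radius = np.hypot(x, y)                       # distance from the axis
    ti = ((theta + np.pi) / (2 * np.pi) * (n_theta - 1)).astype(int)
    zi = ((z - z.min()) / (np.ptp(z) + 1e-12) * (n_height - 1)).astype(int)
    pano = np.zeros((n_height, n_theta))
    np.maximum.at(pano, (zi, ti), radius)         # keep the outermost surface
    return pano
```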

The bag-of-visual-features (BoVF) method [29] was recently employed in view-based 3D model retrieval. In [29], local SIFT features [30] were extracted from each view and quantized into visual words using a pre-trained visual vocabulary obtained by k-means clustering of local features. The local features from multiple views were then accumulated into a single histogram serving as the feature vector of the 3D model, and the Kullback–Leibler divergence (KLD) was employed as the distance measure between two 3D objects. More BoVF-related methods [31], [32], [33] have been proposed in recent years. A bag-of-region-words (BoRW) representation [34] was introduced to add region information to the BoVF method, taking the spatial information of view patches into account.
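
A minimal sketch of the BoVF signature and the KLD distance follows, assuming SIFT-like local descriptors and a pre-trained vocabulary of k-means centroids are already available; the function names are illustrative.

```python
import numpy as np

def bovf_histogram(descriptors, vocabulary):
    """Assign each local descriptor (stacked from all views of one model) to
    its nearest visual word and return the normalised word-count histogram
    as the model's feature vector."""
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

def kl_divergence(p, q, eps=1e-10):
    """Kullback-Leibler divergence used as the (asymmetric) distance
    between two BoVF histograms."""
    p, q = p + eps, q + eps
    return float(np.sum(p * np.log(p / q)))
```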

As shown by existing work, 3D model retrieval involves two main stages: 3D model representation and 3D model matching. Most existing work has focused on representation. For matching, view-based 3D model retrieval relies on the comparison between two groups of 2D views, so it can be modelled as a many-to-many matching task. In this work, we propose a view-based 3D model retrieval algorithm in which a many-to-many matching method, weighted bipartite graph matching (WBGM), is employed for the comparison between two 3D models. Each 3D model is represented by a set of 2D views. Representative views are first selected, and the corresponding initial weights are assigned and then updated using the relationships among the representative views. A weighted bipartite graph is built with these selected 2D views, and the proportional max-weighted bipartite matching method [35] is employed to find the best match in the graph. The matching result is used as the similarity between the two 3D models. Experimental results and comparisons with existing methods show the effectiveness of the proposed algorithm.
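
The matching stage can be sketched as follows. This is a simplified illustration, not the exact formulation of the paper: a Gaussian kernel on view-feature distances and a product of view weights are assumed for the edge weights, and SciPy's Hungarian-algorithm solver stands in for the proportional max-weighted bipartite matching of [35].

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def wbgm_similarity(query_views, target_views, query_weights, target_weights):
    """Build a weighted bipartite graph between two sets of representative
    views and score the two models by the total weight of the best matching."""
    # pairwise view dissimilarity and a Gaussian-kernel similarity (assumed)
    d = np.linalg.norm(query_views[:, None, :] - target_views[None, :, :], axis=-1)
    sim = np.exp(-d ** 2 / (2 * np.median(d) ** 2 + 1e-12))
    # edge weight: view similarity scaled by both view weights (assumed form)
    w = sim * np.outer(query_weights, target_weights)
    row, col = linear_sum_assignment(w, maximize=True)  # max-weight matching
    return float(w[row, col].sum())                     # model-level similarity
```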

The remainder of this paper is organized as follows. The proposed 3D model retrieval algorithm using WBGM is presented in Section 2. Experimental results and discussions are shown in Section 3. Conclusions are given in Section 4.

Section snippets

3D model retrieval using weighted bipartite graph matching

In this section, the proposed 3D model retrieval algorithm using WBGM is presented in detail. The framework is introduced first, followed by the detailed algorithm.

Database

In our experiments, the NTU (National Taiwan University) 3D model database [17] is selected as the testing database. The NTU database contains 10,911 3D models in total, and 300 of them are chosen as the testing set. These 300 models cover 30 classes, each containing 10 models. Some example 3D models from the NTU database are shown in Fig. 2.

In our experiments, virtual cameras are employed to capture initial views for 3D objects using 3D process

Conclusions and future works

In this paper, we have presented a view-based 3D model retrieval algorithm using WBGM. The proposed WBGM first selects representative views and updates the weight values for each representative view. To compare two 3D models, a weighted bipartite graph is constructed, and the matching on this weighted bipartite graph is employed to measure the similarity between the two 3D models. The proposed WBGM-based 3D model retrieval algorithm has been tested on the NTU database with different camera

Acknowledgments

This work was supported by the National Basic Research Project (No. 2010CB731800) and the Project of NSFC (No. 61035002 and U0935001).

References (36)

  • B. Bustos et al., Feature-based similarity search in 3D object databases, ACM Computing Surveys (2005)
  • F. Rothganger et al., 3D object modeling and recognition using local affine-invariant image descriptors and multi-view spatial constraints, International Journal of Computer Vision (2006)
  • Y. Liu, X.-L. Wang, H.-Y. Wang, H. Zha, H. Qin, Learning robust similarity measures for 3D partial shape retrieval, ...
  • R. Ohbuchi, T. Shimizu, Ranking on semantic manifold for shape-based 3D model retrieval, in: Proceedings of ACM ...
  • C.B. Akgül et al., Similarity learning for 3D object retrieval using relevance feedback and risk minimization, International Journal of Computer Vision (2010)
  • A. Makadia et al., Spherical correlation of visual representations for 3D model retrieval, International Journal of Computer Vision (2010)
  • A. Ferreira et al., Thesaurus-based 3D object retrieval with part-in-whole matching, International Journal of Computer Vision (2010)
  • D. Vranic, An improvement of rotation invariant 3D-shape descriptor based on functions on concentric spheres, in: ...