Elsevier

Signal Processing

Volume 120, March 2016, Pages 702-713

Fast view-based 3D model retrieval via unsupervised multiple feature fusion and online projection learning

https://doi.org/10.1016/j.sigpro.2014.11.020

Highlights

  • An Unsupervised Multiple Feature Fusion (UMFF) algorithm is proposed to fuse multiple features for 3D model representation.

  • The 1-graph method is used to learn a robust and datum-adaptive graph that preserves local geometric structure information.

  • An efficient Online Projection Learning (OPL) algorithm is designed to solve the out-of-sample problem.

Abstract

Since each visual feature reflects only one characteristic of a 3-dimensional (3D) model, and different visual features have diverse discriminative power in model representation, it is beneficial to fuse multiple visual features in 3D model retrieval. To this end, we propose a fast view-based 3D model retrieval framework in this article. The framework comprises two parts: the first is an Unsupervised Multiple Feature Fusion algorithm (UMFF), which learns a compact yet discriminative feature representation from the original multiple visual features; the second is an efficient Online Projection Learning algorithm (OPL), which quickly maps the multiple visual features of a newly added model to its corresponding low-dimensional feature representation. Within this framework, many existing ranking algorithms, such as the simple distance-based ranking method, can be directly adopted to sort all 3D models in the database using the learned feature representation and return the top-ranked models to the user. Extensive experiments on two public 3D model databases demonstrate the efficiency and effectiveness of the proposed approach over its competitors. The proposed framework not only dramatically improves retrieval performance but also reduces the computational cost of handling newly added models.

Introduction

With the rapid development of computer hardware and computer graphics techniques, especially the modeling and rendering techniques, more and more 3-dimensional (3D) models have been created and used in a wide range of applications [1], [2]. Several public large-scale 3D model databases [3], [4] are also available on the internet. To manage and reuse the abundant 3D models, efficient and effective 3D model retrieval techniques and systems become crucially important.

In some early research, keyword-based methods were applied to retrieve, from the database, 3D models whose tags are similar to the query. These methods require the user to label all 3D models in the database in advance, which is tedious and time-consuming. Moreover, due to the high ambiguity and subjectivity of keywords, it is difficult to choose a correct and meaningful keyword as the tag for a given 3D model, and it is hard to measure the similarity between two visually or geometrically similar 3D models tagged with different keywords. Consequently, these methods are inefficient and inaccurate in many cases.

To overcome the shortcomings of keyword-based methods, a growing number of researchers have focused their attention on content-based 3D model retrieval techniques [5], [6], [7], [8]. In [9], the authors have shown that content-based methods consistently outperform keyword-based methods. Since efficient feature representation is the cornerstone of these content-based methods, a great deal of effort has been devoted to extracting or constructing various visual and geometric features, including low-level features (e.g., geometric moment [10] and surface features [11]) and high-level structure features (e.g., the MATE feature [12]), from 3D models as feature representations. However, many existing methods tend to use only one type of visual or geometric feature representation to characterize the 3D model, which results in poor performance. Recently, more and more researchers have found that correctly combining multiple features can significantly improve an algorithm's performance in a large variety of areas [13], [14], [15], [16], [17], [18], [19]. In these works, how to combine/fuse multiple features becomes an important problem. An intuitive way is to concatenate multiple features into a new high-dimensional feature vector, which is very simple but lacks physical meaning. It would be smarter to take both the complementary information between different feature representations and the intrinsic geometric structure of each feature representation into account when fusing multiple features. More importantly, how to deal with newly added samples, known as the out-of-sample problem, is another open problem. However, not much work has been done to solve these problems well, especially for the task of 3D model retrieval.
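The intuitive concatenation strategy mentioned above can be sketched in a few lines; the feature names and dimensions below are purely illustrative (not the actual descriptors used in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-model visual features (names and sizes are illustrative):
zernike = rng.random(49)    # e.g., Zernike moments of rendered views
fourier = rng.random(128)   # e.g., Fourier descriptors of silhouettes
hist    = rng.random(64)    # e.g., a depth-buffer histogram

# Naive fusion: concatenate everything into one high-dimensional vector.
concatenated = np.concatenate([zernike, fourier, hist])
print(concatenated.shape)   # (241,)
```

The resulting vector is simple to build but, as noted above, treats all features identically and ignores both complementarity between descriptors and the geometric structure within each one.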

To address the above issues, we propose a fast view-based 3D model retrieval framework. The framework comprises two parts: an Unsupervised Multiple Feature Fusion algorithm (UMFF) and an efficient Online Projection Learning algorithm (OPL). We first use UMFF to learn a compact yet discriminative feature representation from the original multiple features. Then, OPL is designed to reduce the high computational cost of solving the objective function of UMFF for newly added models by learning a projection matrix. Once the projection matrix is obtained, the multiple high-dimensional features of a newly added 3D model can be quickly transformed into the new compact yet discriminative feature representation. The out-of-sample problem is thus overcome via OPL in this work. In the proposed framework, many ranking algorithms, such as the distance-based ranking method, can be directly used to sort the 3D models and return the top-ranked models to the user.
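Once a projection matrix is available, the online stage reduces to a matrix multiplication followed by distance-based ranking. The minimal sketch below uses a random matrix W standing in for the one OPL would learn, with illustrative dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, n = 200, 32, 100           # original dim, fused dim, database size (illustrative)
W = rng.standard_normal((d, k))  # stand-in for the projection matrix OPL would learn
X = rng.standard_normal((n, d))  # concatenated multi-feature vectors of database models
Z = X @ W                        # low-dimensional representations, computed offline

def retrieve(query_features, top=5):
    """Project a newly added model's features and rank database models by distance."""
    z = query_features @ W                 # fast online projection
    dists = np.linalg.norm(Z - z, axis=1)  # Euclidean distance to every database model
    return np.argsort(dists)[:top]         # indices of the top-ranked models

ranked = retrieve(rng.standard_normal(d))
```

The per-query cost is one d-by-k multiplication plus n distance computations in k dimensions, which is what makes the online stage fast compared with re-solving the fusion objective for each new model.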

The rest of this article is organized as follows. Section 2 reviews related work on 3D model retrieval. The details of our proposed framework are introduced in Section 3. Section 4 presents the experimental results on two public 3D model databases. Finally, Section 5 concludes our work and sketches several directions for future work.

Section snippets

Related work

Content-based 3D model retrieval methods can be roughly divided into two categories: 3D-model-based and view-based [8], [20]. In 3D-model-based methods, the model representation includes low-level features like the geometric moment [10] and surface distribution [11], and high-level structure-based features [12]. All of these feature representations are extracted directly from the 3D model data. This kind of method requires less computational cost for feature extraction, but it

Framework overview

Fig. 1 illustrates our proposed framework for 3D model retrieval via Unsupervised Multiple Feature Fusion and Online Projection Learning. The framework comprises two parts. The first part is an Unsupervised Multiple Feature Fusion algorithm (UMFF), which is used to learn a low-dimensional and discriminative feature representation from the original multiple visual features of the 3D models in the database. With the help of UMFF, the feature size can be greatly reduced and the feature quality can

Experimental setup

To evaluate the performance of our proposed framework, two publicly available 3D model databases are used in our experiments. The first is the National Taiwan University 3D Model Benchmark (NTU) [5], which contains 549 3D models clustered into 47 classes. The second is the Princeton Shape Benchmark (PSB) [3], containing two sets of classified 3D models: the training set, which includes 907 models belonging to 90 classes, and the test set with 907 models
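A common way to score retrieval on such class-labeled databases is leave-one-out nearest-neighbor accuracy: each model is used as a query against the rest, and a hit is counted when its nearest neighbor shares its class. The sketch below uses toy random data (not the actual NTU or PSB models) to illustrate the computation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for a labeled database: 60 models in 4 classes, 32-D fused features,
# with a class-dependent offset so the classes are loosely separable.
labels = rng.integers(0, 4, size=60)
feats = rng.standard_normal((60, 32)) + labels[:, None]

def nearest_neighbor_accuracy(feats, labels):
    """Leave-one-out: fraction of models whose nearest neighbor shares their class."""
    dists = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)   # exclude each query itself
    nn = np.argmin(dists, axis=1)     # index of each model's nearest neighbor
    return float(np.mean(labels[nn] == labels))

acc = nearest_neighbor_accuracy(feats, labels)
```

The same routine can be run on any learned representation, so it gives a direct way to compare the fused low-dimensional features against the raw concatenated ones.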

Conclusion

In this paper, we have presented a fast view-based 3D model retrieval framework comprising an Unsupervised Multiple Feature Fusion (UMFF) algorithm and an Online Projection Learning (OPL) algorithm. UMFF exploits the complementary information between different visual features and preserves the geometric structure embedded in the data. OPL significantly reduces the computational cost of dealing with newly added 3D models. Experiments on two public 3D model databases suggest that our framework is

Acknowledgments

This research is supported by the National High Technology Research and Development Program (2012AA011502), the National Key Technology R&D Program (2013BAH59F00), the Zhejiang Provincial Natural Science Foundation of China (LY13F020001), Zhejiang Province Public Technology Applied Research Projects (No. 2014C33090).

References (55)

  • R. Fang, A. Godil, X. Li, A. Wagan, A New Shape Benchmark for 3D Object Retrieval, vol. 5358 of ISVC '08, Springer,...
  • D. Chen et al.

    On visual similarity based 3D model retrieval

    Comput. Gr. Forum

    (2003)
  • A.D. Bimbo et al.

    Content-based retrieval of 3D models

    ACM Trans. Multimed. Comput. Commun. Appl. (TOMCCAP)

    (2006)
  • Y. Yang et al.

    Content-based 3-D model retrieval: a survey

    IEEE Trans. Syst. Man Cybern., Part C: Appl. Rev.

    (2007)
  • Y. Gao et al.

    View-based 3-D object retrieval: challenges and approaches

    IEEE MultiMed.

    (2014)
  • P. Min, M. Kazhdan, T. Funkhouser, A comparison of text and shape matching for retrieval of online 3D models, in:...
  • M. Novotni, R. Klein, A geometric approach to 3D object comparison, in: International Conference on Shape Modeling and...
  • R. Osada et al.

    Shape distributions

    ACM Trans. Gr.

    (2002)
  • B. Leng et al.

    MATE: a visual-based 3D shape descriptor

    Chin. J. Electron.

    (2009)
  • I. Atmosukarto, W.K. Leow, Z. Huang, Feature combination and relevance feedback for 3D model retrieval, in: The 11th...
  • P. Daras, A. Axenopoulos, A compact multi-view descriptor for 3D object retrieval, in: Seventh International Workshop...
  • M. Wang et al.

    Unified video annotation via multigraph learning

    IEEE Trans. Circuits Syst. Video Technol.

    (2009)
  • Y. Feng, J. Xiao, Y. Zhuang, X. Liu, Adaptive unsupervised multi-view feature selection for visual concept recognition,...
  • M. Wang et al.

    Multimodal graph-based reranking for web image search

    IEEE Trans. Image Process.

    (2012)
  • Y. Yang et al.

    Multi-feature fusion via hierarchical regression for multimedia analysis

    IEEE Trans. Multimed.

    (2013)
  • Z. Ma, Y. Yang, N. Sebe, A.G. Hauptmann, Multiple features but few labels? A symbiotic solution exemplified for video...
  • M. Wang et al.

    View-based discriminative probabilistic modeling for 3d object retrieval and recognition

    IEEE Trans. Image Process.

    (2013)