Elsevier

Signal Processing

Volume 120, March 2016, Pages 702-713

Fast view-based 3D model retrieval via unsupervised multiple feature fusion and online projection learning

https://doi.org/10.1016/j.sigpro.2014.11.020

Highlights

  • An Unsupervised Multiple Feature Fusion (UMFF) algorithm is proposed to fuse multiple features for 3D model representation.

  • The 1-graph method is used to learn a robust and datum-adaptive graph that preserves local geometric structure information.

  • An efficient Online Projection Learning (OPL) algorithm is designed to solve the out-of-sample problem.

Abstract

Since each visual feature reflects only one characteristic of a 3-dimensional (3D) model, and different visual features have diverse discriminative power in model representation, it is beneficial to fuse multiple visual features in 3D model retrieval. To this end, we propose a fast view-based 3D model retrieval framework in this article. The framework comprises two parts: the first is an Unsupervised Multiple Feature Fusion algorithm (UMFF), which learns a compact yet discriminative feature representation from the original multiple visual features; the second is an efficient Online Projection Learning algorithm (OPL), which quickly maps the multiple visual features of a newly added model to its corresponding low-dimensional feature representation. Within this framework, many existing ranking algorithms, such as the simple distance-based ranking method, can be directly adopted to sort all 3D models in the database using the learned feature representation and return the top-ranked models to the user. Extensive experiments on two public 3D model databases demonstrate the efficiency and effectiveness of the proposed approach over its competitors. The proposed framework not only dramatically improves retrieval performance but also reduces the computational cost of handling newly added models.

Introduction

With the rapid development of computer hardware and computer graphics techniques, especially the modeling and rendering techniques, more and more 3-dimensional (3D) models have been created and used in a wide range of applications [1], [2]. Several public large-scale 3D model databases [3], [4] are also available on the internet. To manage and reuse the abundant 3D models, efficient and effective 3D model retrieval techniques and systems become crucially important.

In some early research, keyword-based methods were applied to retrieve, from the database, 3D models whose tags are similar to the query. These methods require the user to label all 3D models in the database in advance, which is tedious and time-consuming. Moreover, due to the high ambiguity and subjectivity of keywords, it is difficult to choose a correct and meaningful keyword as the tag for a given 3D model, and it is hard to measure the similarity between two visually or geometrically similar 3D models tagged with different keywords. Consequently, these methods are inefficient and inaccurate in many cases.

To overcome the shortcomings of keyword-based methods, a growing number of researchers have focused their attention on content-based 3D model retrieval techniques [5], [6], [7], [8]. In [9], the authors have shown that content-based methods consistently outperform keyword-based methods. Since efficient feature representation is the cornerstone of these content-based methods, a great deal of effort has been devoted to extracting or constructing various visual and geometric features, including low-level features (e.g., geometric moment [10] and surface features [11]) and high-level structure features (e.g., the MATE feature [12]), from 3D models as feature representations. However, many existing methods tend to use only one type of visual or geometric feature representation to characterize the 3D model, which results in poor performance. Recently, more and more researchers have found that correctly combining multiple features can significantly improve an algorithm's performance in a large variety of areas [13], [14], [15], [16], [17], [18], [19]. In these works, how to combine/fuse multiple features becomes an important problem. An intuitive way is to concatenate multiple features into a new high-dimensional feature vector, which is very simple but lacks physical meaning. It would be smarter to take both the complementary information between different feature representations and the intrinsic geometric structure of each feature representation into account when fusing multiple features. More importantly, how to deal with newly added samples, known as the out-of-sample problem, is another open problem. However, not much work has been done to solve these problems well, especially for the task of 3D model retrieval.
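The intuitive concatenation strategy mentioned above can be sketched in a few lines; the feature names and dimensions below are purely illustrative (not the actual descriptors used in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-model visual features (names and sizes are illustrative):
zernike = rng.random(49)    # e.g., Zernike moments of rendered views
fourier = rng.random(128)   # e.g., Fourier descriptors of silhouettes
hist    = rng.random(64)    # e.g., a depth-buffer histogram

# Naive fusion: concatenate everything into one high-dimensional vector.
concatenated = np.concatenate([zernike, fourier, hist])
print(concatenated.shape)   # (241,)
```

The resulting vector is simple to build but, as noted above, treats all features identically and ignores both complementarity between descriptors and the geometric structure within each one.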

To address the above issues, we propose a fast view-based 3D model retrieval framework. The framework comprises two parts: an Unsupervised Multiple Feature Fusion algorithm (UMFF) and an efficient Online Projection Learning algorithm (OPL). We first use UMFF to learn a compact yet discriminative feature representation from the original multiple features. Then, OPL is designed to reduce the high computational cost of solving the objective function of UMFF for newly added models by learning a projection matrix. Once the projection matrix is obtained, the multiple high-dimensional features of a newly added 3D model can be quickly transformed into the new compact yet discriminative feature representation. The out-of-sample problem is thus overcome via OPL in this work. In the proposed framework, many ranking algorithms, such as the distance-based ranking method, can be directly used to sort the 3D models and return the top-ranked models to the user.
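Once a projection matrix is available, the online stage reduces to a matrix multiplication followed by distance-based ranking. The minimal sketch below uses a random matrix W standing in for the one OPL would learn, with illustrative dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, n = 200, 32, 100           # original dim, fused dim, database size (illustrative)
W = rng.standard_normal((d, k))  # stand-in for the projection matrix OPL would learn
X = rng.standard_normal((n, d))  # concatenated multi-feature vectors of database models
Z = X @ W                        # low-dimensional representations, computed offline

def retrieve(query_features, top=5):
    """Project a newly added model's features and rank database models by distance."""
    z = query_features @ W                 # fast online projection
    dists = np.linalg.norm(Z - z, axis=1)  # Euclidean distance to every database model
    return np.argsort(dists)[:top]         # indices of the top-ranked models

ranked = retrieve(rng.standard_normal(d))
```

The per-query cost is one d-by-k multiplication plus n distance computations in k dimensions, which is what makes the online stage fast compared with re-solving the fusion objective for each new model.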

The rest of this article is organized as follows. Section 2 reviews related work on 3D model retrieval. The details of our proposed framework are introduced in Section 3. Section 4 presents the experimental results on two public 3D model databases. Finally, Section 5 concludes our work and sketches several directions for future work.

Section snippets

Related work

Content-based 3D model retrieval methods can be roughly divided into two categories: 3D-model-based and view-based [8], [20]. In 3D-model-based methods, the model representation includes low-level features like the geometric moment [10] and surface distribution [11], and high-level structure-based features [12]. All of these feature representations are extracted directly from the 3D model data. This kind of method requires less computational cost for feature extraction, but it

Framework overview

Fig. 1 illustrates our proposed framework for 3D model retrieval via Unsupervised Multiple Feature Fusion and Online Projection Learning. The framework comprises two parts. The first part is an Unsupervised Multiple Feature Fusion algorithm (UMFF), which is used to learn a low-dimensional and discriminative feature representation from the original multiple visual features of the 3D models in the database. With the help of UMFF, the feature size can be greatly reduced and the feature quality can

Experimental setup

To evaluate the performance of our proposed framework, two publicly available 3D model databases are used in our experiments. The first is the National Taiwan University 3D Model Benchmark (NTU) [5], which contains 549 3D models clustered into 47 classes. The second is the Princeton Shape Benchmark (PSB) [3], containing two sets of classified 3D models: the training set, which includes 907 models belonging to 90 classes, and the test set with 907 models
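A common way to score retrieval on such class-labeled databases is leave-one-out nearest-neighbor accuracy: each model is used as a query against the rest, and a hit is counted when its nearest neighbor shares its class. The sketch below uses toy random data (not the actual NTU or PSB models) to illustrate the computation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for a labeled database: 60 models in 4 classes, 32-D fused features,
# with a class-dependent offset so the classes are loosely separable.
labels = rng.integers(0, 4, size=60)
feats = rng.standard_normal((60, 32)) + labels[:, None]

def nearest_neighbor_accuracy(feats, labels):
    """Leave-one-out: fraction of models whose nearest neighbor shares their class."""
    dists = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)   # exclude each query itself
    nn = np.argmin(dists, axis=1)     # index of each model's nearest neighbor
    return float(np.mean(labels[nn] == labels))

acc = nearest_neighbor_accuracy(feats, labels)
```

The same routine can be run on any learned representation, so it gives a direct way to compare the fused low-dimensional features against the raw concatenated ones.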

Conclusion

In this paper, we have presented a fast view-based 3D model retrieval framework comprising an Unsupervised Multiple Feature Fusion (UMFF) algorithm and an Online Projection Learning (OPL) algorithm. UMFF exploits the complementary information between different visual features and preserves the geometric structure embedded in the data. OPL significantly reduces the computational cost of dealing with newly added 3D models. Experiments on two public 3D model databases suggest that our framework is

Acknowledgments

This research is supported by the National High Technology Research and Development Program (2012AA011502), the National Key Technology R&D Program (2013BAH59F00), the Zhejiang Provincial Natural Science Foundation of China (LY13F020001), Zhejiang Province Public Technology Applied Research Projects (No. 2014C33090).

References (55)

  • R. Fang, A. Godil, X. Li, A. Wagan, A New Shape Benchmark for 3D Object Retrieval, vol. 5358 of ISVC '08, Springer,...
  • D. Chen et al.

    On visual similarity based 3D model retrieval

    Comput. Gr. Forum

    (2003)
  • A.D. Bimbo et al.

    Content-based retrieval of 3D models

    ACM Trans. Multimed. Comput. Commun. Appl. (TOMCCAP)

    (2006)
  • Y. Yang et al.

    Content-based 3-D model retrieval: a survey

    IEEE Trans. Syst. Man Cybern., Part C: Appl. Rev.

    (2007)
  • Y. Gao et al.

    View-based 3-D object retrieval: challenges and approaches

    IEEE MultiMed.

    (2014)
  • P. Min, M. Kazhdan, T. Funkhouser, A comparison of text and shape matching for retrieval of online 3D models, in:...
  • M. Novotni, R. Klein, A geometric approach to 3D object comparison, in: International Conference on Shape Modeling and...
  • R. Osada et al.

    Shape distributions

    ACM Trans. Gr.

    (2002)
  • B. Leng et al.

    MATE: a visual-based 3D shape descriptor

    Chin. J. Electron.

    (2009)
  • I. Atmosukarto, W.K. Leow, Z. Huang, Feature combination and relevance feedback for 3D model retrieval, in: The 11th...
  • P. Daras, A. Axenopoulos, A compact multi-view descriptor for 3D object retrieval, in: Seventh International Workshop...
  • M. Wang et al.

    Unified video annotation via multigraph learning

    IEEE Trans. Circuits Syst. Video Technol.

    (2009)
  • Y. Feng, J. Xiao, Y. Zhuang, X. Liu, Adaptive unsupervised multi-view feature selection for visual concept recognition,...
  • M. Wang et al.

    Multimodal graph-based reranking for web image search

    IEEE Trans. Image Process.

    (2012)
  • Y. Yang et al.

    Multi-feature fusion via hierarchical regression for multimedia analysis

    IEEE Trans. Multimed.

    (2013)
  • Z. Ma, Y. Yang, N. Sebe, A.G. Hauptmann, Multiple features but few labels? A symbiotic solution exemplified for video...
  • M. Wang et al.

    View-based discriminative probabilistic modeling for 3d object retrieval and recognition

    IEEE Trans. Image Process.

    (2013)