A Correlation-Based Semantic Model for Text Search

Sun, Jing; Wang, Bin; Yang, Xiaochun

doi:10.1007/978-3-319-08010-9_75

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8485)

Included in the following conference series:

International Conference on Web-Age Information Management

Abstract

With the exponential growth of text on the Internet, text search has become a crucial problem in many fields. Most traditional text search approaches rely on a “bag of words” text representation built from frequency statistics. However, these approaches ignore the semantic correlation between words in the text, which may lead to inaccurate ranking of the search results. In this paper, we propose a new Wikipedia-based similar-text search approach in which the words in the texts and in the query text can be semantically correlated through Wikipedia. We propose a new text representation model and a new text similarity metric. Finally, experiments on a real dataset demonstrate the high precision, recall, and efficiency of our approach.
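
The limitation of frequency-based “bag of words” matching described in the abstract can be illustrated with a minimal sketch. This is not the paper's model: the whitespace tokenizer, the cosine measure, and the example strings are all assumptions chosen for illustration. It only shows that two texts with related meaning but no shared words receive a similarity of zero under a plain term-frequency representation.

```python
from collections import Counter
import math


def bag_of_words(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    # Dot product over the terms of `a`; Counter returns 0 for missing terms.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative texts (assumed, not taken from the paper's dataset).
query = bag_of_words("car repair")
doc_related = bag_of_words("automobile maintenance")  # related meaning, no shared words
doc_literal = bag_of_words("car wash")                # shares one word with the query

sim_related = cosine_similarity(query, doc_related)
sim_literal = cosine_similarity(query, doc_literal)
```

Here `sim_related` is zero even though the two texts are close in meaning, while `sim_literal` is positive only because of the shared word. A correlation-aware representation, such as the Wikipedia-based one the abstract proposes, is meant to close this gap.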

The work is partially supported by the National Natural Science Foundation of China (Nos. 61322208, 61272178, 61129002), the Doctoral Fund of Ministry of Education of China (No. 20110042110028), and the Fundamental Research Funds for the Central Universities (Nos. N120504001, N110804002).

Author information

Authors and Affiliations

College of Information Science and Engineering, Northeastern University, Liaoning, 110819, China
Jing Sun, Bin Wang & Xiaochun Yang

Editor information

Editors and Affiliations

School of Computing, University of Utah, 50 S. Central Campus Drive, Salt Lake City, UT, 84112, USA
Feifei Li
Department of Computer Science, Tsinghua University, 100084, Beijing, China
Guoliang Li
POSTECH, Republic of Korea
Seung-won Hwang
Shanghai Key Laboratory of Scalable Computing and Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, China
Bin Yao
Advanced Digital Sciences Center (ADSC), 138632, Singapore, Singapore
Zhenjie Zhang

About this paper

Cite this paper

Sun, J., Wang, B., Yang, X. (2014). A Correlation-Based Semantic Model for Text Search. In: Li, F., Li, G., Hwang, Sw., Yao, B., Zhang, Z. (eds) Web-Age Information Management. WAIM 2014. Lecture Notes in Computer Science, vol 8485. Springer, Cham. https://doi.org/10.1007/978-3-319-08010-9_75

DOI: https://doi.org/10.1007/978-3-319-08010-9_75
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08009-3
Online ISBN: 978-3-319-08010-9
eBook Packages: Computer Science (R0)