New Techniques for Relevant Word Ranking and Extraction

Ventura, João; Ferreira da Silva, Joaquim

doi:10.1007/978-3-540-77002-2_58

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4874)

Included in the following conference series:

Portuguese Conference on Artificial Intelligence

Abstract

In this paper we first propose two new metrics to rank the relevance of words in a text. The metrics are purely statistical and language independent, and are based on the analysis of each word’s neighborhood. Typically, a relevant word is more strongly connected to some of its neighbors than to others. We also present a new technique based on syllable analysis and show that, although it can serve as a metric by itself, it can also improve the quality of the proposed methods, as well as greatly improve the quality of other existing methods (such as Tf-idf). Finally, based on the rankings previously obtained and using another neighborhood analysis, we present a new method to decide on the relevance of words on a yes/no basis.
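
The neighborhood intuition stated in the abstract can be illustrated with a small sketch. The abstract does not give the paper's actual metrics, so the score below is only an assumption made for illustration: it measures how much of a word's immediate right-hand neighborhood is taken up by its single most frequent successor. The names `successor_counts`, `concentration`, `rank_words` and the `min_count` parameter are all hypothetical.

```python
from collections import Counter, defaultdict


def successor_counts(tokens):
    """Map each word to a Counter of the words that immediately follow it."""
    followers = defaultdict(Counter)
    for left, right in zip(tokens, tokens[1:]):
        followers[left][right] += 1
    return followers


def concentration(counter):
    """Share of a word's successor occurrences taken by its top successor.

    Hypothetical score, not the paper's metric: 1.0 means the word is always
    followed by the same neighbor; lower values mean its neighbors are more
    evenly spread.
    """
    total = sum(counter.values())
    return max(counter.values()) / total if total else 0.0


def rank_words(tokens, min_count=2):
    """Rank words by successor concentration, highest first.

    Words seen fewer than `min_count` times with a successor are skipped,
    since a single occurrence trivially scores 1.0. Ties are broken
    alphabetically.
    """
    followers = successor_counts(tokens)
    scores = {
        word: concentration(counts)
        for word, counts in followers.items()
        if sum(counts.values()) >= min_count
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    text = ("neural network models use neural network layers "
            "and the cat and the dog and a bird")
    for word, score in rank_words(text.split()):
        print(f"{word}\t{score:.2f}")
```

In this toy input, "neural" is always followed by "network" and so ranks first, while "the" spreads its occurrences over different successors and ranks lower. A faithful implementation would need the definitions from the full paper.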

Author information

Authors and Affiliations

DI/FCT Universidade Nova de Lisboa, Quinta da Torre, 2829-516 Caparica, Portugal
João Ventura & Joaquim Ferreira da Silva

Editor information

José Neves, Manuel Filipe Santos, José Manuel Machado

About this paper

Cite this paper

Ventura, J., Ferreira da Silva, J. (2007). New Techniques for Relevant Word Ranking and Extraction. In: Neves, J., Santos, M.F., Machado, J.M. (eds) Progress in Artificial Intelligence. EPIA 2007. Lecture Notes in Computer Science, vol 4874. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77002-2_58

DOI: https://doi.org/10.1007/978-3-540-77002-2_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77000-8
Online ISBN: 978-3-540-77002-2
eBook Packages: Computer Science, Computer Science (R0)
