New Techniques for Relevant Word Ranking and Extraction

  • Conference paper
Progress in Artificial Intelligence (EPIA 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4874)

Abstract

In this paper we first propose two new metrics to rank the relevance of words in a text. The metrics are purely statistical and language-independent, and are based on the analysis of each word’s neighborhood. Typically, a relevant word is more strongly connected to some of its neighbors than to others. We also present a new technique based on syllable analysis and show that, although it can serve as a metric by itself, it also improves the quality of the proposed methods and greatly improves the quality of other methods (such as Tf-idf). Finally, based on the rankings previously obtained and using another neighborhood analysis, we present a new method to decide on the relevance of words on a yes/no basis.
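
As a rough illustration of the two ideas the abstract contrasts, the sketch below computes the standard Tf-idf baseline and a simple neighborhood-based score. The abstract does not define the paper's metrics, so `neighbor_concentration` is a hypothetical stand-in and not the authors' method: it only measures how strongly a word's adjacent-word occurrences concentrate on a single neighbor. The function names, the `min_count` parameter, and the choice of natural logarithm are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tf_idf(docs):
    """Standard Tf-idf per document; docs is a list of token lists."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append(
            {w: (c / total) * math.log(n_docs / df[w]) for w, c in counts.items()}
        )
    return scores


def neighbor_concentration(tokens, min_count=2):
    """Hypothetical illustration (not the paper's metric).

    For each word seen at least min_count times, return the share of its
    adjacent-word occurrences (left and right) taken by its single most
    frequent neighbor. Values near 1 mean the word clings to one neighbor.
    """
    neighbors = defaultdict(Counter)
    for left, right in zip(tokens, tokens[1:]):
        neighbors[left][right] += 1
        neighbors[right][left] += 1
    freq = Counter(tokens)
    return {
        w: max(c.values()) / sum(c.values())
        for w, c in neighbors.items()
        if freq[w] >= min_count
    }


docs = [["apple", "banana", "apple"], ["banana", "cherry"]]
tfidf_scores = tf_idf(docs)

tokens = "new york is big new york is old".split()
concentration = neighbor_concentration(tokens)
```

Unlike Tf-idf, the neighborhood score needs only a single text, which is one reason purely local statistics of this kind can be language-independent. Words below `min_count` are dropped because a word seen once trivially concentrates on its only neighbors.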

References

  1. Luhn, H.P.: The Automatic Creation of Literature Abstracts. IBM Journal of Research and Development 2, 159–165 (1958)

  2. Zhou, H., Slater, G.W.: A metric to search for relevant words. Physica A: Statistical Mechanics and its Applications 329(1-2), 309–327 (2003)

  3. Silva, J.F., Mexia, J.T., Coelho, C.A., Lopes, G.P.: Multilingual document clustering, topic extraction and data transformation. In: Brazdil, P.B., Jorge, A.M. (eds.) EPIA 2001. LNCS (LNAI), vol. 2258, pp. 74–87. Springer, Heidelberg (2001)

  4. Ortuño, M., Carpena, P., Bernaola-Galván, P., Muñoz, E., Somoza, A.M.: Europhys. Lett. 57(5), 759–764 (2002)

  5. Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Information Processing & Management 24(5), 513–523 (1988)

Editor information

José Neves, Manuel Filipe Santos, José Manuel Machado

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ventura, J., Ferreira da Silva, J. (2007). New Techniques for Relevant Word Ranking and Extraction. In: Neves, J., Santos, M.F., Machado, J.M. (eds) Progress in Artificial Intelligence. EPIA 2007. Lecture Notes in Computer Science, vol 4874. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77002-2_58

  • DOI: https://doi.org/10.1007/978-3-540-77002-2_58

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77000-8

  • Online ISBN: 978-3-540-77002-2

  • eBook Packages: Computer Science, Computer Science (R0)
