Abstract
This paper outlines the adaptation of an automatic keyword extraction algorithm to the Portuguese language. Keywords make it possible to summarize the contents of documents in a compact form, and may also be used as an efficient measure of similarity between texts. This work focuses on extracting keywords from theses in several fields of knowledge. To identify the keywords, the KEA algorithm was used together with a stemming technique specific to Portuguese and a manually created list of stopwords. The results are shown to be good enough for practical use and comparable to those reported for the English language.
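The pipeline named in the abstract (stopword filtering, stemming, KEA-style candidate scoring) can be illustrated with a minimal sketch. It is not the authors' implementation. The stopword set is a small illustrative subset, `toy_stem` is a hypothetical suffix stripper standing in for a real Portuguese stemmer, and all function names are assumptions. The two features computed, TF×IDF and relative first occurrence, are the ones KEA uses; the IDF here is smoothed for simplicity, and the Naive Bayes step that KEA applies to these features is omitted.

```python
import math
import re

# Illustrative subset only; the paper's manually created list is not reproduced here.
STOPWORDS = {"a", "o", "as", "os", "de", "da", "do", "das", "dos",
             "e", "em", "para", "por", "um", "uma", "que", "com"}

# Hypothetical suffix list; a stand-in for a real Portuguese stemmer.
SUFFIXES = ("ções", "ção", "mente", "es", "s")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase the text and return runs of letters (accented letters included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def candidates(text, max_len=3):
    """Map each stemmed candidate phrase to (count, index of first occurrence).

    A candidate is a run of up to max_len tokens that neither starts
    nor ends with a stopword.
    """
    tokens = tokenize(text)
    found = {}
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            phrase = tokens[start:end]
            if phrase[0] in STOPWORDS or phrase[-1] in STOPWORDS:
                continue
            key = " ".join(toy_stem(w) for w in phrase)
            count, first = found.get(key, (0, start))
            found[key] = (count + 1, first)
    return found, len(tokens)


def features(text, corpus):
    """Return {phrase: (tf_idf, relative_first_occurrence)} for one document."""
    found, n_tokens = candidates(text)
    doc_sets = [set(candidates(doc)[0]) for doc in corpus]
    result = {}
    for key, (count, first) in found.items():
        df = sum(1 for phrases in doc_sets if key in phrases)
        tf = count / n_tokens
        idf = math.log((len(corpus) + 1) / (df + 1))  # smoothed, an assumption
        result[key] = (tf * idf, first / n_tokens)
    return result
```

In this sketch, a phrase that occurs often in one document but rarely in the rest of the corpus gets a high TF×IDF, and a phrase that first appears early gets a low first-occurrence value. A trained classifier would then combine both features to rank candidates.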
References
Cunha, C., Cintra, L.F.L.: Nova Gramática do Português Contemporâneo, 3rd edn. Nova Fronteira, Rio de Janeiro (2001)
Dias, M.A.L.: Automatic Extraction of Keywords for the Portuguese Language Applied to Theses in the Engineering Field. Master's thesis (in Portuguese, to be published)
Orengo, V.M., Huyck, C.R.: A Stemming Algorithm for the Portuguese Language. In: Proceedings of the SPIRE Conference. Laguna de San Raphael: [s.n.] (2001)
Witten, I.H., et al.: KEA: Practical automatic keyphrase extraction. In: Proceedings of the Fourth ACM Conference on Digital Libraries. [S.l.]: [s.n.] (1999)
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dias, M.A.L., de Gomensoro Malheiros, M. (2006). Automatic Extraction of Keywords for the Portuguese Language. In: Vieira, R., Quaresma, P., Nunes, M.d.G.V., Mamede, N.J., Oliveira, C., Dias, M.C. (eds) Computational Processing of the Portuguese Language. PROPOR 2006. Lecture Notes in Computer Science, vol 3960. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11751984_22
DOI: https://doi.org/10.1007/11751984_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34045-4
Online ISBN: 978-3-540-34046-1
eBook Packages: Computer Science, Computer Science (R0)