Automatic Keyword Extraction Using Domain Knowledge

Hulth, Anette; Karlgren, Jussi; Jonsson, Anna; Boström, Henrik; Asker, Lars

doi:10.1007/3-540-44686-9_47

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2004)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

Documents can be assigned keywords by frequency analysis of the terms found in the document text, which arguably is the primary source of knowledge about the document itself. By including a hierarchically organised domain-specific thesaurus as a second knowledge source, the quality of such keywords was improved considerably, as measured by match to previously manually assigned keywords. In the presented experiment, the combination of the evidence from frequency analysis and the hierarchically organised thesaurus was done using inductive logic programming.
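
As a rough illustration of the two knowledge sources named in the abstract, the sketch below counts term frequencies in a document and propagates those counts up a small thesaurus hierarchy. It is not the authors' method: the paper combines the evidence with inductive logic programming, whereas this sketch simply sums counts. The thesaurus contents, the `BROADER` mapping, and all function names are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical toy thesaurus (NOT from the paper): term -> broader term.
BROADER = {
    "inflation": "economics",
    "taxation": "economics",
    "economics": "social sciences",
}


def term_frequencies(text):
    """Count lowercase alphabetic tokens in the document text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def ancestors(term):
    """Yield every broader term above `term` in the toy hierarchy."""
    while term in BROADER:
        term = BROADER[term]
        yield term


def score_candidates(text):
    """Score thesaurus terms by their own frequency plus the frequency
    of narrower terms found in the text (simple additive assumption)."""
    known_terms = set(BROADER) | set(BROADER.values())
    scores = Counter()
    for term, count in term_frequencies(text).items():
        if term in known_terms:
            scores[term] += count
            for parent in ancestors(term):
                scores[parent] += count
    return scores


if __name__ == "__main__":
    doc = "Inflation rose while taxation fell. Inflation worries persist."
    for term, score in score_candidates(doc).most_common():
        print(term, score)
```

In this toy setting a broader term such as "economics" can outrank any single term that literally occurs in the text, which is one way a hierarchy can add evidence that frequency analysis alone does not provide.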

Author information

Authors and Affiliations

Dept. of Computer and Systems Sciences, Stockholm University, Electrum 230, SE-164 40, Kista, Sweden
Anette Hulth, Anna Jonsson, Henrik Boström & Lars Asker
Swedish Institute of Computer Science, Box 1263, SE-164 29, Kista, Sweden
Jussi Karlgren
Virtual Genetics Laboratory AB, SE-171 77, Stockholm, Sweden
Henrik Boström & Lars Asker

Editor information

Editors and Affiliations

CIC (Centro de Investigación en Computación), IPN (Instituto Politécnico Nacional), Unidad Profesional “Adolfo López Mateos”, Av. Juan de Dios Bátiz s/n esq. M. Othón de Mendizábal, Col. Nuevo Vallejo, CP 07738, México, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Hulth, A., Karlgren, J., Jonsson, A., Boström, H., Asker, L. (2001). Automatic Keyword Extraction Using Domain Knowledge. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2001. Lecture Notes in Computer Science, vol 2004. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44686-9_47

DOI: https://doi.org/10.1007/3-540-44686-9_47
Published: 16 March 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41687-6
Online ISBN: 978-3-540-44686-6
eBook Packages: Springer Book Archive
