Abstract
Documents can be assigned keywords by frequency analysis of the terms found in the document text, which arguably is the primary source of knowledge about the document itself. By including a hierarchically organised domain-specific thesaurus as a second knowledge source, the quality of such keywords was improved considerably, as measured by match to previously manually assigned keywords. In the presented experiment, the combination of the evidence from frequency analysis and the hierarchically organised thesaurus was done using inductive logic programming.
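As a rough illustration of the idea, the sketch below ranks candidate keywords by term frequency and lets a small hierarchical thesaurus pass part of each term's count up to its broader terms. It is not the method used in the paper: the experiment combined the two sources of evidence with inductive logic programming, whereas this sketch substitutes a fixed weighted sum. The thesaurus entries, the stopword list, the `thesaurus_weight` parameter, and all function names are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical toy thesaurus (narrower term -> broader term); not from the paper.
BROADER = {
    "boosting": "learning",
    "learning": "algorithms",
    "retrieval": "information",
}

# Hypothetical minimal stopword list, for illustration only.
STOPWORDS = {"the", "a", "of", "and", "is", "for", "in", "to", "by"}


def term_frequencies(text):
    """Count lowercase alphabetic tokens, ignoring stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def ancestors(term):
    """Yield the broader terms of `term`, nearest first (cycle-safe)."""
    seen = set()
    while term in BROADER and term not in seen:
        seen.add(term)
        term = BROADER[term]
        yield term


def score_terms(text, thesaurus_weight=0.5):
    """Combine frequency evidence with thesaurus evidence by a weighted sum.

    Assumption: every broader term receives the same fraction
    (`thesaurus_weight`) of a narrower term's count, at any depth.
    """
    scores = Counter()
    for term, count in term_frequencies(text).items():
        scores[term] += count
        for parent in ancestors(term):
            scores[parent] += thesaurus_weight * count
    return scores


def keywords(text, k=3):
    """Return the k best-scoring terms; ties are broken alphabetically."""
    ranked = sorted(score_terms(text).items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]
```

Note that a broader term such as "algorithms" can be proposed as a keyword even when it never occurs in the text, which is the kind of evidence a frequency count alone cannot supply.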
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hulth, A., Karlgren, J., Jonsson, A., Boström, H., Asker, L. (2001). Automatic Keyword Extraction Using Domain Knowledge. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2001. Lecture Notes in Computer Science, vol 2004. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44686-9_47
DOI: https://doi.org/10.1007/3-540-44686-9_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41687-6
Online ISBN: 978-3-540-44686-6
eBook Packages: Springer Book Archive