Abstract
In this article we investigate the possibility of extracting Library of Congress Subject Headings from texts. The large number of ambiguous terms turns out to be a problem. Disambiguation of subject headings appears to have the potential to improve the extraction results.
Notes
1. We use the namespace lcsh for http://id.loc.gov/authorities/subjects/
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Aga, R.T., Wartena, C., Franke-Maier, M. (2016). Automatic Recognition and Disambiguation of Library of Congress Subject Headings. In: Fuhr, N., Kovács, L., Risse, T., Nejdl, W. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2016. Lecture Notes in Computer Science, vol 9819. Springer, Cham. https://doi.org/10.1007/978-3-319-43997-6_40
DOI: https://doi.org/10.1007/978-3-319-43997-6_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43996-9
Online ISBN: 978-3-319-43997-6
eBook Packages: Computer Science (R0)