Automatic Recognition and Disambiguation of Library of Congress Subject Headings

Aga, Rosa Tsegaye; Wartena, Christian; Franke-Maier, Michael

doi:10.1007/978-3-319-43997-6_40

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9819)

Included in the following conference series:

International Conference on Theory and Practice of Digital Libraries

Abstract

In this article we investigate the possibilities for extracting Library of Congress Subject Headings from texts. The large number of ambiguous terms turns out to be a problem. Disambiguating subject headings appears to have the potential to improve the extraction results.

Notes

1. We use the namespace lcsh for http://id.loc.gov/authorities/subjects/
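
The prefix convention in the note can be illustrated with a minimal sketch. Only the namespace URI comes from the note; the helper names `expand_lcsh` and `shorten_lcsh` and the identifier `sh00000000` are hypothetical placeholders, not part of the paper.

```python
# Namespace URI as given in the note; everything else is an illustrative assumption.
LCSH_NAMESPACE = "http://id.loc.gov/authorities/subjects/"
LCSH_PREFIX = "lcsh"


def expand_lcsh(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'lcsh:<id>' to a full URI."""
    prefix, sep, local_id = prefixed_name.partition(":")
    if sep != ":" or prefix != LCSH_PREFIX or not local_id:
        raise ValueError(f"not an lcsh-prefixed name: {prefixed_name!r}")
    return LCSH_NAMESPACE + local_id


def shorten_lcsh(uri: str) -> str:
    """Abbreviate a full URI in the lcsh namespace to 'lcsh:<id>'."""
    if not uri.startswith(LCSH_NAMESPACE) or uri == LCSH_NAMESPACE:
        raise ValueError(f"not a URI in the lcsh namespace: {uri!r}")
    return f"{LCSH_PREFIX}:{uri[len(LCSH_NAMESPACE):]}"


if __name__ == "__main__":
    # 'sh00000000' is a made-up placeholder, not a real subject heading identifier.
    full_uri = expand_lcsh("lcsh:sh00000000")
    print(full_uri)
    print(shorten_lcsh(full_uri))
```

Both helpers raise `ValueError` on input outside the lcsh namespace, so a malformed identifier is reported rather than silently passed through.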

Author information

Authors and Affiliations

Hochschule Hannover - University of Applied Sciences and Arts, Hannover, Germany
Rosa Tsegaye Aga & Christian Wartena
Freie Universität Berlin, Universitätsbibliothek, Berlin, Germany
Michael Franke-Maier

Corresponding author

Correspondence to Christian Wartena.

Editor information

Editors and Affiliations

Universität Duisburg-Essen, Duisburg, Germany
Norbert Fuhr
Hungarian Academy of Science, Budapest, Hungary
László Kovács
Leibniz Universität Hannover, Hannover, Germany
Thomas Risse
Leibniz Universität Hannover, Hannover, Germany
Wolfgang Nejdl

About this paper

Cite this paper

Aga, R.T., Wartena, C., Franke-Maier, M. (2016). Automatic Recognition and Disambiguation of Library of Congress Subject Headings. In: Fuhr, N., Kovács, L., Risse, T., Nejdl, W. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2016. Lecture Notes in Computer Science, vol 9819. Springer, Cham. https://doi.org/10.1007/978-3-319-43997-6_40

DOI: https://doi.org/10.1007/978-3-319-43997-6_40
Published: 10 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43996-9
Online ISBN: 978-3-319-43997-6
eBook Packages: Computer Science, Computer Science (R0)