Automatic Recognition and Disambiguation of Library of Congress Subject Headings

  • Conference paper
Research and Advanced Technology for Digital Libraries (TPDL 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9819)

Abstract

In this article we investigate the possibility of extracting Library of Congress Subject Headings from texts. The large number of ambiguous terms turns out to be a problem. Disambiguation of subject headings appears to have the potential to improve the extraction results.

Notes

  1. We use the namespace lcsh for http://id.loc.gov/authorities/subjects/
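
As a rough illustration of what the namespace convention in this note amounts to, the sketch below expands a prefixed name into a full URI and back. Only the lcsh prefix and its base URI come from the note; the function names `expand` and `compact`, and the identifier `sh00000000`, are hypothetical placeholders and not taken from the paper.

```python
# Prefix table: only the "lcsh" entry is taken from the note above.
PREFIXES = {"lcsh": "http://id.loc.gov/authorities/subjects/"}


def expand(name: str, prefixes: dict[str, str] = PREFIXES) -> str:
    """Expand 'prefix:local' into a full URI.

    Names whose prefix is not in the table are returned unchanged.
    """
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name


def compact(uri: str, prefixes: dict[str, str] = PREFIXES) -> str:
    """Shorten a full URI to 'prefix:local' when a known base URI matches."""
    for prefix, base in prefixes.items():
        if uri.startswith(base):
            return f"{prefix}:{uri[len(base):]}"
    return uri


if __name__ == "__main__":
    # "sh00000000" is a made-up placeholder, not a real subject heading ID.
    full = expand("lcsh:sh00000000")
    print(full)
    print(compact(full))
```

Returning unknown names unchanged is a design choice of this sketch, so that full URIs can be passed through the same function without special handling.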

Author information

Corresponding author

Correspondence to Christian Wartena.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Aga, R.T., Wartena, C., Franke-Maier, M. (2016). Automatic Recognition and Disambiguation of Library of Congress Subject Headings. In: Fuhr, N., Kovács, L., Risse, T., Nejdl, W. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2016. Lecture Notes in Computer Science, vol 9819. Springer, Cham. https://doi.org/10.1007/978-3-319-43997-6_40

  • DOI: https://doi.org/10.1007/978-3-319-43997-6_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43996-9

  • Online ISBN: 978-3-319-43997-6

  • eBook Packages: Computer Science (R0)
