Evaluating Learning Language Representations

  • Conference paper
Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2015)

Abstract

Machine learning offers significant benefits for systems that process and understand natural language: (a) lower maintenance and upkeep costs than when using manually-constructed resources, (b) easier portability to new domains, tasks, or languages, and (c) robust and timely adaptation to situation-specific settings. However, the behaviour of an adaptive system is less predictable than when using an edited, stable resource, which makes quality control a continuous issue. This paper proposes an evaluation benchmark for measuring the quality, coverage, and stability of a natural language system as it learns word meaning. Inspired by existing tests for human vocabulary learning, we outline measures for the quality of semantic word representations, such as when learning word embeddings or other distributed representations. These measures highlight differences between the types of underlying learning processes as systems ingest progressively more data.
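To make the notion of stability concrete, the sketch below compares a word's nearest neighbours in two snapshots of a learning system, taken before and after it ingests more data. This is an illustrative assumption, not the measure defined in the paper: the function names, the Jaccard overlap score, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_neighbours(word, vectors, k):
    """Return the set of the k words most similar to `word` in one snapshot."""
    scored = [
        (cosine(vectors[word], vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    # Sort by descending similarity; break ties alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return {other for _, other in scored[:k]}


def neighbour_stability(word, before, after, k=2):
    """Jaccard overlap of a word's k nearest neighbours across two snapshots.

    Hypothetical stability score: 1.0 means the neighbourhood is unchanged,
    0.0 means it was replaced entirely.
    """
    a = nearest_neighbours(word, before, k)
    b = nearest_neighbours(word, after, k)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy snapshots (invented data): "bus" drifts towards "cat" after more training.
before = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0], "bus": [0.1, 0.9]}
after = {"cat": [1.0, 0.0], "dog": [0.8, 0.2], "car": [0.0, 1.0], "bus": [0.9, 0.1]}

print(neighbour_stability("cat", before, after, k=1))
print(neighbour_stability("cat", before, after, k=2))
```

Tracking a score of this kind over successive snapshots is one way to see whether a representation settles or keeps shifting as data accumulates. The choice of k and of the similarity function would need to be fixed by the benchmark itself.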

Correspondence to David Jurgens.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Karlgren, J. et al. (2015). Evaluating Learning Language Representations. In: Mothe, J., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2015. Lecture Notes in Computer Science, vol 9283. Springer, Cham. https://doi.org/10.1007/978-3-319-24027-5_25

  • DOI: https://doi.org/10.1007/978-3-319-24027-5_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24026-8

  • Online ISBN: 978-3-319-24027-5

  • eBook Packages: Computer Science, Computer Science (R0)
