Abstract
This paper studies the quality and orthogonality of the information sources used by methods for computing word semantics. The quality of the methods is measured on several hand-crafted comparison datasets. Orthogonality is estimated by measuring the performance increase obtained when two information sources are linearly interpolated with optimal interpolation parameters. The experiments yield both expected and contradictory results and offer deeper insight into the information sources of particular methods.
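The orthogonality estimate described above can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of Spearman correlation as the performance measure, the grid search over the interpolation weight, and all function names (`interpolate`, `best_interpolation`, `orthogonality_gain`) are assumptions made for illustration. The two score lists are also assumed to be on comparable scales before they are mixed.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))


def interpolate(a, b, lam):
    """Linear interpolation of two score lists: lam * a + (1 - lam) * b."""
    return [lam * s + (1 - lam) * t for s, t in zip(a, b)]


def best_interpolation(a, b, gold, steps=100):
    """Grid-search the weight in [0, 1] that maximises Spearman vs. gold."""
    best_lam, best_score = 0.0, spearman(b, gold)
    for i in range(steps + 1):
        lam = i / steps
        score = spearman(interpolate(a, b, lam), gold)
        if score > best_score:
            best_lam, best_score = lam, score
    return best_lam, best_score


def orthogonality_gain(a, b, gold, steps=100):
    """Improvement of the best mixture over the better single source."""
    lam, combined = best_interpolation(a, b, gold, steps)
    baseline = max(spearman(a, gold), spearman(b, gold))
    return lam, combined - baseline


if __name__ == "__main__":
    # Hypothetical toy scores for four word pairs (not data from the paper).
    gold = [1, 2, 3, 4]
    source_a = [1, 2, 4, 3]
    source_b = [2, 1, 3, 4]
    print(orthogonality_gain(source_a, source_b, gold))
```

Because the grid includes the weights 0 and 1, the reported gain is never negative. A gain near zero suggests that the two sources carry largely the same information, while a larger gain suggests that they complement each other.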
Notes
- 1. The distance from the most abstract synset (the common root).
- 2. A quote by John Rupert Firth.
Acknowledgements
This work was supported by grant no. SGS-2013-029 Advanced computing and information systems, by the European Regional Development Fund (ERDF) and by project “NTIS – New Technologies for Information Society”, European Centre of Excellence, CZ.1.05/1.1.00/02.0090. Access to the MetaCentrum computing facilities, provided under the programme “Projects of Large Infrastructure for Research, Development, and Innovations” LM2010005 and funded by the Ministry of Education, Youth, and Sports of the Czech Republic, is highly appreciated. Access to the CERIT-SC computing and storage facilities, provided under the programme Center CERIT Scientific Cloud, part of the Operational Program Research and Development for Innovations, reg. no. CZ.1.05/3.2.00/08.0144, is acknowledged.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Konopík, M., Pražák, O. (2015). Information Sources of Word Semantics Methods. In: Ronzhin, A., Potapova, R., Fakotakis, N. (eds) Speech and Computer. SPECOM 2015. Lecture Notes in Computer Science, vol 9319. Springer, Cham. https://doi.org/10.1007/978-3-319-23132-7_30
DOI: https://doi.org/10.1007/978-3-319-23132-7_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23131-0
Online ISBN: 978-3-319-23132-7
eBook Packages: Computer Science, Computer Science (R0)