Abstract
Vector-space word representations have recently achieved notable success in improving performance across a variety of NLP tasks. However, existing methods for learning word embeddings rely only on monolingual corpora. Inspired by transfer learning, we propose a novel method for obtaining word embeddings via language transfer: to obtain word embeddings for one language (the target language), we instead train models on a corpus of a different language (the source language), and then use the resulting source-language embeddings to represent target-language words. We evaluate the embeddings produced by the proposed method on word similarity tasks across several benchmark datasets. The results show that our method is surprisingly effective, outperforming competitive baselines by a large margin. A further benefit of our method is that collecting a new corpus for the target language may be unnecessary.
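The abstract does not specify how target-language words are mapped onto source-language embeddings, so the following is only a minimal sketch of one plausible reading of the transfer scheme: train simple co-occurrence embeddings on a toy source-language corpus, then let target-language words inherit the embedding of a source-language translation via a hypothetical bilingual dictionary (`dictionary` below is an illustrative assumption, not the paper's method).

```python
import numpy as np

def train_embeddings(corpus, window=2):
    """Build toy co-occurrence count embeddings from a tokenized corpus."""
    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    M = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if i != j:
                    M[idx[w], idx[sent[j]]] += 1.0
    # L2-normalize rows so cosine similarity reduces to a dot product
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return {w: M[idx[w]] / norms[idx[w]] for w in vocab}

# Toy English "source language" corpus
source_corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
]
source_emb = train_embeddings(source_corpus)

# Hypothetical bilingual dictionary: Spanish (target) -> English (source)
dictionary = {"gato": "cat", "perro": "dog"}

# Target-language words inherit the embedding of their source translation
target_emb = {t: source_emb[s] for t, s in dictionary.items()}

sim = float(target_emb["gato"] @ target_emb["perro"])
print(f"cosine(gato, perro) = {sim:.3f}")
```

Because "cat" and "dog" occur in near-identical contexts in this toy corpus, the transferred embeddings for "gato" and "perro" come out highly similar, which is the behavior a word similarity benchmark would reward.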
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
Cite this paper
Li, C., Xu, B., Wu, G., Wang, X., Ge, W., Li, Y. (2014). Obtaining Better Word Representations via Language Transfer. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8403. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54906-9_11
DOI: https://doi.org/10.1007/978-3-642-54906-9_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54905-2
Online ISBN: 978-3-642-54906-9