Improving Word Embeddings via Combining with Complementary Languages

Li, Changliang; Xu, Bo; Wu, Gaowei; Zhuang, Tao; Wang, Xiuying; Ge, Wendong

doi:10.1007/978-3-319-06483-3_31

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8436)

Abstract

Word embeddings have recently demonstrated outstanding results across various NLP tasks. However, most existing methods for learning word embeddings use a monolingual corpus and do not exploit the linguistic relationships among languages. In this paper, we introduce a novel method, CCL (Combination with Complementary Languages), to improve word embeddings. Under this method, each word embedding is replaced by its center word embedding, which is obtained by combining it with the corresponding word embeddings in other languages. We apply our method to several baseline models and evaluate the quality of the resulting word embeddings on the word similarity task across two benchmark datasets. Despite its simplicity, the results show that our method is surprisingly effective in capturing semantic information and outperforms the baselines by a large margin, by up to 20 points of Spearman rank correlation (ρ×100).
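The abstract does not say how the embeddings are combined, so the following is only a minimal sketch under stated assumptions: the vectors for a word and its translations are assumed to live in one shared space already, and the "center" is assumed to be their element-wise mean. The function names and the toy numbers are hypothetical and are not taken from the paper. The sketch also shows how a word similarity evaluation is commonly scored, by taking the Spearman rank correlation between model cosine similarities and human ratings.

```python
from math import sqrt


def center_embedding(vectors):
    """Element-wise mean of equally sized vectors (assumed combination rule)."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must share one dimensionality")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical toy data: each word maps to its vectors in several languages,
# all assumed to be expressed in the same space.
multilingual = {
    "cat": [[0.9, 0.1], [0.8, 0.3]],
    "dog": [[0.7, 0.2], [0.9, 0.4]],
    "car": [[0.1, 0.9], [0.2, 0.7]],
}
centers = {word: center_embedding(vs) for word, vs in multilingual.items()}

# Hypothetical human similarity ratings for three word pairs.
pairs = [("cat", "dog"), ("cat", "car"), ("dog", "car")]
human = [9.0, 2.0, 3.0]
model = [cosine(centers[a], centers[b]) for a, b in pairs]
rho_times_100 = 100 * spearman(model, human)
```

Rank correlation only compares the ordering of the pairs, which is why it suits this task: human ratings and cosine similarities sit on different scales.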

Author information

Authors and Affiliations

Institute of Automation, Chinese Academy of Sciences, Beijing, P.R. China
Changliang Li, Bo Xu, Gaowei Wu, Tao Zhuang, Xiuying Wang & Wendong Ge

Editor information

Editors and Affiliations

Faculty of Medicine and School of Electrical Engineering and Computer Science, Department of Epidemiology & Community Medicine, University of Ottawa, 451 Smyth Road, Room 3105, K1H 8M5, Ottawa, ON, Canada
Marina Sokolova
Cheriton School of Computer Science, University of Waterloo, 200 University Avenue West, N2L 3G1, Waterloo, ON, Canada
Peter van Beek

Cite this paper

Li, C., Xu, B., Wu, G., Zhuang, T., Wang, X., Ge, W. (2014). Improving Word Embeddings via Combining with Complementary Languages. In: Sokolova, M., van Beek, P. (eds) Advances in Artificial Intelligence. Canadian AI 2014. Lecture Notes in Computer Science, vol 8436. Springer, Cham. https://doi.org/10.1007/978-3-319-06483-3_31

DOI: https://doi.org/10.1007/978-3-319-06483-3_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06482-6
Online ISBN: 978-3-319-06483-3
eBook Packages: Computer Science, Computer Science (R0)
