Abstract
This paper hypothesizes that better word embeddings can be learned by representing words and subwords with vectors of different lengths. To investigate the impact of subword vector length on word embeddings, we propose a model based on the Subword Information Skip-gram model. Experiments on two datasets across two tasks show that the proposed model outperforms six baselines, which confirms the hypothesis. We also observe that, within a specific range, a higher dimensionality of subword vectors always improves the quality of word embeddings.
References
Bojanowski, P., Grave, E., Joulin, A., Mikolov, T.: Enriching word vectors with subword information. In: ACL, pp. 135–146 (2017)
Luong, T., Socher, R., Manning, C.D.: Better word representations with recursive neural networks for morphology. In: CoNLL, pp. 104–113 (2013)
Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space (2013)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: NIPS, pp. 3111–3119 (2013)
Qiu, S., Cui, Q., Bian, J., Gao, B., Liu, T.Y.: Co-learning of word representations and morpheme representations. In: COLING, pp. 141–150 (2014)
Acknowledgements
We thank the reviewers for their valuable comments. This research is supported by the National Natural Science Foundation of China (Nos. U1836109 and 61772289).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Cai, X., Luo, Y., Zhang, Y., Yuan, X. (2019). On the Impact of the Length of Subword Vectors on Word Embeddings. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science, vol 11448. Springer, Cham. https://doi.org/10.1007/978-3-030-18590-9_74
DOI: https://doi.org/10.1007/978-3-030-18590-9_74
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18589-3
Online ISBN: 978-3-030-18590-9
eBook Packages: Computer Science, Computer Science (R0)