Abstract
In traditional word2vec, the hierarchical softmax algorithm builds a Huffman tree over the whole vocabulary, so that each word pair is trained in logarithmic time. However, because the Huffman tree ignores how words in the corpus relate to one another, it degrades the quality of both the language model and the trained word vectors. In this paper, we replace the original Huffman-tree construction with a purely data-driven method for building the binary tree. The new construction exploits the semantic and syntactic relationships among words to cluster them hierarchically; these relationships are captured by the word vectors obtained from an initial round of Huffman-tree training. Our method substantially improves the performance of the resulting word vectors on semantic and syntactic tasks.
This work was supported by the National Key Research and Development Program of China (No. 2016YFB0700500), the National Science Foundation of China (Nos. 61572075 and 61702036), the Fundamental Research Funds for the Central Universities (No. FRF-TP-17-012A1), and the China Postdoctoral Science Foundation (No. 2017M620619).
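The two-pass idea in the abstract can be sketched in a few lines. The sketch below is illustrative, not the paper's implementation, and makes assumptions beyond the abstract: it uses gensim's Word2Vec (with hierarchical softmax enabled) for the initial Huffman-tree pass, and SciPy's agglomerative clustering with Ward linkage as the hierarchical clustering criterion; the paper's actual clustering method may differ.

```python
# Minimal sketch of the data-driven tree rebuild described above.
# Assumptions (not from the paper): gensim's Word2Vec for the initial
# hierarchical-softmax pass, and SciPy's Ward linkage as the
# hierarchical clustering criterion.
from gensim.models import Word2Vec
from scipy.cluster.hierarchy import linkage

sentences = [["the", "quick", "brown", "fox"],
             ["the", "lazy", "dog"]]  # toy corpus

# Pass 1: standard word2vec with hierarchical softmax, i.e. the
# frequency-based Huffman tree (hs=1, negative sampling disabled).
model = Word2Vec(sentences, vector_size=50, min_count=1, hs=1, negative=0)

words = model.wv.index_to_key      # vocabulary, most frequent first
vectors = model.wv[words]          # (|V|, 50) matrix of trained vectors

# Pass 2: hierarchically cluster the trained vectors; the resulting
# binary merge tree can replace the Huffman tree, so that semantically
# and syntactically related words share root-to-leaf paths.
tree = linkage(vectors, method="ward")  # (|V|-1, 4) merge table
```

The merge table returned by linkage encodes a binary tree over the vocabulary; assigning each word the root-to-leaf path from this tree, instead of its Huffman code, would give the rebuilt tree for a second hierarchical-softmax training pass.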