Abstract
In this paper, we propose a novel model that exploits topic relevance to enhance word embedding learning. We leverage a hidden topic-bigram model to build topic relevance matrices, then learn Topic-Bigram Word Embeddings (TBWE) by aggregating the context together with the corresponding topic-bigram information. The topic relevance weights are updated simultaneously with the word embeddings during training. To verify the validity and accuracy of the model, we conduct experiments on a word analogy task and a word similarity task. The results show that the TBWE model achieves better performance on both tasks.
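As a rough illustration of the idea described above, the following sketch trains skip-gram-style embeddings where each center–context update is scaled by a learnable topic-pair relevance weight, and that weight is updated jointly with the embeddings. This is a minimal hypothetical reconstruction, not the paper's actual TBWE formulation; all names (`relevance`, `train_step`) and the specific gradient form are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
V, D, K = 10, 8, 3                       # vocab size, embedding dim, topic count

emb = rng.normal(scale=0.1, size=(V, D))  # center-word embeddings
ctx = rng.normal(scale=0.1, size=(V, D))  # context-word embeddings
relevance = np.full((K, K), 1.0 / K)      # topic-bigram relevance matrix


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_step(center, context, t_center, t_context, label, lr=0.05):
    """One logistic update; the topic pair's relevance weight scales the score."""
    u = emb[center].copy()                # copy: rows are views into emb/ctx
    v = ctx[context].copy()
    w = relevance[t_center, t_context]
    dot = u @ v
    g = label - sigmoid(w * dot)          # gradient of logistic loss
    emb[center] = u + lr * g * w * v      # embeddings move, scaled by w
    ctx[context] = v + lr * g * w * u
    relevance[t_center, t_context] = w + lr * g * dot  # joint relevance update


# Repeatedly present one positive (center, context) pair under one topic bigram.
w0 = relevance[0, 1]
before = sigmoid(w0 * (emb[1] @ ctx[2]))
for _ in range(50):
    train_step(center=1, context=2, t_center=0, t_context=1, label=1.0)
after = sigmoid(relevance[0, 1] * (emb[1] @ ctx[2]))
```

After these updates the model's score for the observed pair should rise, with both the embeddings and the topic-relevance weight contributing, mirroring the simultaneous update described in the abstract.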
Acknowledgments
This work is supported by the National Key Research and Development Program of China under grants 2016QY01W0202 and 2016YFB0800402, the National Natural Science Foundation of China under grants 61572221, U1401258, 61433006, 61502185 and 61772219, the Major Projects of the National Social Science Foundation under grant 16ZDA092, the Science and Technology Support Program of Hubei Province under grant 2015AAA013, the Science and Technology Program of Guangdong Province under grant 2014B010111007, and the Guangxi High-Level Innovation Team in Higher Education Institutions (Innovation Team of ASEAN Digital Cloud Big Data Security and Mining Technology).
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Yang, Q., Li, R., Li, Y., Liu, Q. (2018). Topic-Bigram Enhanced Word Embedding Model. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science(), vol 11303. Springer, Cham. https://doi.org/10.1007/978-3-030-04182-3_7
Print ISBN: 978-3-030-04181-6
Online ISBN: 978-3-030-04182-3