Topic-Bigram Enhanced Word Embedding Model

Conference paper in Neural Information Processing (ICONIP 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11303)

Abstract

In this paper, we propose a novel model that exploits topic relevance to enhance word embedding learning. We leverage a hidden topic-bigram model to build topic relevance matrices, then learn Topic-Bigram Word Embeddings (TBWE) by aggregating the context together with the corresponding topic-bigram information. The topic relevance weights are updated simultaneously with the word embeddings during training. To verify the validity and accuracy of the model, we conduct experiments on a word analogy task and a word similarity task. The results show that the TBWE model achieves better performance on both tasks.
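
To make the abstract concrete, below is a minimal sketch of the general idea, not the authors' exact formulation. It assumes a skip-gram-with-negative-sampling objective in which each (target, context) pair is weighted by a learned topic-bigram relevance score R[z_t, z_c], where z_t and z_c are topic assignments taken from a pre-trained topic model; all names, dimensions, and update details here are hypothetical illustrations.

    import numpy as np

    rng = np.random.default_rng(0)

    V, K, D = 1000, 20, 100               # vocab size, topic count, embedding dim
    W_in  = rng.normal(0, 0.01, (V, D))   # target-word embeddings
    W_out = rng.normal(0, 0.01, (V, D))   # context-word embeddings
    R = np.ones((K, K))                   # topic-bigram relevance weights (learned)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sgns_step(target, context, z_t, z_c, negatives, lr=0.025):
        """One weighted skip-gram-with-negative-sampling update.

        z_t and z_c are the topic assignments of the target and context
        words; the positive-pair gradient is scaled by R[z_t, z_c], and R
        itself takes a gradient step from the same log-likelihood term, so
        relevance weights and embeddings are updated simultaneously.
        """
        v_t = W_in[target]
        w = R[z_t, z_c]
        score = sigmoid(v_t @ W_out[context])
        g = w * (1.0 - score)                 # gradient of w * log sigma(v_t . v_c)
        grad_t = g * W_out[context]
        W_out[context] += lr * g * v_t
        # The gradient of the weighted term w.r.t. w is log sigma(v_t . v_c);
        # a real implementation would normalise or regularise R to keep the
        # weights bounded.
        R[z_t, z_c] += lr * np.log(score + 1e-10)
        for neg in negatives:                 # negative samples, unweighted here
            s = sigmoid(v_t @ W_out[neg])
            grad_t -= s * W_out[neg]
            W_out[neg] -= lr * s * v_t
        W_in[target] += lr * grad_t

    # Toy call with arbitrary word indices and topic assignments:
    sgns_step(target=5, context=17, z_t=3, z_c=7, negatives=[42, 99, 123])

Giving R a gradient step from the same objective mirrors the abstract's claim that relevance weights and embeddings are trained jointly; how the weights are constrained is a detail the abstract does not specify.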


Notes

  1. http://www.psych.ualberta.ca/~westburylab/downloads/westburylab.wikicorp.download.html.
  2. https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient.
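
Footnote 2 refers to the metric used for the word similarity task: benchmark word pairs are scored by cosine similarity between their embeddings and compared against human ratings with Spearman's rank correlation. A minimal Python sketch of that standard evaluation, assuming a hypothetical vectors dictionary of trained TBWE embeddings:

    import numpy as np
    from scipy.stats import spearmanr

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def similarity_rho(pairs, vectors):
        """pairs: (word1, word2, human_score) triples; vectors: word -> np.ndarray.
        Returns Spearman's rho between model cosine scores and human ratings."""
        model = [cosine(vectors[w1], vectors[w2]) for w1, w2, _ in pairs]
        human = [h for _, _, h in pairs]
        rho, _ = spearmanr(model, human)
        return rho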

Acknowledgments

This work is supported by the National Key Research and Development Program of China under grants 2016QY01W0202 and 2016YFB0800402, the National Natural Science Foundation of China under grants 61572221, U1401258, 61433006, 61502185 and 61772219, the Major Projects of the National Social Science Foundation under grant 16ZDA092, the Science and Technology Support Program of Hubei Province under grant 2015AAA013, the Science and Technology Program of Guangdong Province under grant 2014B010111007, and the Guangxi High-Level Innovation Team in Higher Education Institutions (Innovation Team of ASEAN Digital Cloud Big Data Security and Mining Technology).

Author information

Correspondence to Ruixuan Li.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Yang, Q., Li, R., Li, Y., Liu, Q. (2018). Topic-Bigram Enhanced Word Embedding Model. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science (LNTCS), vol 11303. Springer, Cham. https://doi.org/10.1007/978-3-030-04182-3_7

  • DOI: https://doi.org/10.1007/978-3-030-04182-3_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04181-6

  • Online ISBN: 978-3-030-04182-3

  • eBook Packages: Computer Science, Computer Science (R0)
