Graph Attention Network for Word Embeddings

  • Conference paper
Artificial Intelligence and Security (ICAIS 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12737)

Abstract

Word embedding approaches have attracted extensive attention and are widely used in many natural language processing (NLP) tasks. Word embeddings reflect relatedness between words in a vector space. However, current word embedding approaches commonly do not deeply explore the context-specific information of words over the whole corpus. In this paper, we propose to use a graph attention network for word embeddings. We build a single large word graph for a corpus based on word order and then learn a word embedding graph attention network (WEGAT) over this graph. WEGAT is initialized with one-hot word representations, and we use the masked language model (MLM) as the supervised training task. In addition, text classification experiments show that, under the same classifier, the word embeddings produced by WEGAT achieve higher accuracy than those of current methods.
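
The chapter does not include code, so the following is a minimal, hypothetical PyTorch sketch of the pipeline described above: a word graph built from word order with a sliding co-occurrence window, one-hot node features, a single-head graph attention layer in the style of Velickovic et al. [23], and an MLM-style masked-word objective. The names build_word_graph and GATLayer, the window size, the hidden dimension, and the masking scheme are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def build_word_graph(corpus, window=3):
    """Connect words that co-occur within `window` positions in a sentence
    (hypothetical sliding-window construction based on word order)."""
    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    adj = torch.eye(len(vocab))                      # self-loops
    for sent in corpus:
        for i, w in enumerate(sent):
            for j in range(i + 1, min(i + window, len(sent))):
                adj[idx[w], idx[sent[j]]] = 1.0
                adj[idx[sent[j]], idx[w]] = 1.0
    return vocab, adj

class GATLayer(nn.Module):
    """Minimal single-head graph attention layer (after Velickovic et al. [23])."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x, adj):
        h = self.W(x)                                # (N, out_dim)
        n = h.size(0)
        hi = h.unsqueeze(1).expand(n, n, -1)         # h_i repeated along columns
        hj = h.unsqueeze(0).expand(n, n, -1)         # h_j repeated along rows
        e = F.leaky_relu(self.a(torch.cat([hi, hj], dim=-1)).squeeze(-1))
        e = e.masked_fill(adj == 0, float('-inf'))   # attend only to neighbours
        alpha = torch.softmax(e, dim=-1)             # attention coefficients
        return alpha @ h                             # aggregated word embeddings

# Toy corpus, one-hot initialisation, and an MLM-style masked-word objective.
corpus = [["graph", "attention", "network"], ["word", "embeddings", "network"]]
vocab, adj = build_word_graph(corpus)
x = torch.eye(len(vocab))                            # one-hot word features
gat = GATLayer(len(vocab), 16)
decoder = nn.Linear(16, len(vocab))                  # predict the masked word id
mask_id = 2                                          # pretend this word is masked
x_masked = x.clone()
x_masked[mask_id] = 0.0                              # zero out the masked word
logits = decoder(gat(x_masked, adj))
loss = F.cross_entropy(logits[mask_id:mask_id + 1], torch.tensor([mask_id]))
loss.backward()                                      # one MLM-style training step
```

In the paper's setting, the node states produced by such a layer would serve as the word embeddings that are fed to a downstream text classifier.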

References

  1. Zhang, Y., Jin, R., Zhou, Z.-H.: Understanding bag-of-words model: a statistical framework. Int. J. Mach. Learn. Cybern. 1(1–4), 43–52 (2010)

  2. Firth, J.R.: A synopsis of linguistic theory, 1930–1955. In: Studies in Linguistic Analysis. Philological Society, Oxford (1957)

  3. Harris, Z.S.: Distributional structure. Word 10(2–3), 146–162 (1954)

  4. Pennington, J., Socher, R., Manning, C.D.: Glove: global vectors for word representation. In: Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543 (2014)

  5. Baroni, M., Dinu, G., Kruszewski, G.: Don’t count, predict! A systematic comparison of context counting vs. context-predicting semantic vectors. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, vol. 1, pp. 238–247 (2014)

  6. Dhillon, P.S., Foster, D.P., Ungar, L.H.: Eigen words: spectral word embeddings. J. Mach. Learn. Res. 16, 3035–3078 (2015)

  7. Bengio, Y., Ducharme, R., Vincent, P., Jauvin, C.: A neural probabilistic language model. J. Mach. Learn. Res. 3, 1137–1155 (2003)

  8. Turian, J., Ratinov, L., Bengio, Y.: Word representations: a simple and general method for semi-supervised learning. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 384–394 (2010)

  9. Xu, W., Rudnicky, A.: Can artificial neural networks learn language models? In: Sixth International Conference on Spoken Language Processing (2000)

  10. Mnih, A., Hinton, G.: Three new graphical models for statistical language modelling. In: Proceedings of the 24th International Conference on Machine Learning, pp. 641–648 (2007)

  11. Salton, G., Wong, A., Yang, C.S.: A vector space model for automatic indexing. Commun. ACM 18(11), 613–620 (1975)

  12. Deerwester, S., Dumais, S.T., Furnas, G.W., et al.: Indexing by latent semantic analysis. J. Am. Soc. Inf. Sci. 41(6), 391–407 (1990)

  13. Lee, D., Seung, H.S.: Learning the parts of objects by non-negative matrix factorization. Nature 401(6755), 788–791 (1999)

  14. Bengio, Y., Ducharme, R., Vincent, P., et al.: A neural probabilistic language model. J. Mach. Learn. Res. 3(6), 1137–1155 (2003)

  15. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. In: International Conference on Learning Representations Workshop Track (2013)

  16. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)

  17. Pennington, J., Socher, R., Manning, C.D.: Glove: global vectors for word representation. In: Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543 (2014)

  18. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

  19. Jin, W., Srihari, R.K.: Graph-based text representation and knowledge discovery. In: 2007 ACM Symposium on Applied Computing, pp. 807–811 (2007)

  20. Chau, R., Tsoi, A.C., Hagenbuchner, M., et al.: A concept link graph for text structure mining. In: 32nd Australasian Computer Science Conference (2009)

  21. Yao, L., Mao, C., Luo, Y.: Graph convolutional networks for text classification (2018)

  22. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: ICLR (2017)

  23. Velickovic, P., Cucurull, G., Casanova, A., et al.: Graph attention networks (2017)

Acknowledgement

We express our heartfelt thanks to Velickovic, P., Cucurull, G., Casanova, A., et al. for providing their open-source code.

Funding

Our research is funded by the Fundamental Research Funds for the Central Universities (3072020CFQ0602, 3072020CF0604, 3072020CFP0601) and the 2019 Industrial Internet Innovation and Development Engineering project (KY1060020002, KY10600200008).

Ethics declarations

The authors declare that they have no conflicts of interest to report regarding the present study.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Long, Y., Xu, H., Qi, P., Zhang, L., Li, J. (2021). Graph Attention Network for Word Embeddings. In: Sun, X., Zhang, X., Xia, Z., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2021. Lecture Notes in Computer Science, vol. 12737. Springer, Cham. https://doi.org/10.1007/978-3-030-78612-0_16

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-78612-0_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-78611-3

  • Online ISBN: 978-3-030-78612-0

  • eBook Packages: Computer Science, Computer Science (R0)
