Abstract
Word embedding approaches have attracted extensive attention and are widely used in many natural language processing (NLP) tasks. Word embeddings reflect the relatedness between words in a vector space. However, current word embedding approaches generally do not deeply exploit the context-specific information of a word across the whole corpus. In this paper, we propose to learn word embeddings with a graph attention network. We build a single large word graph for a corpus based on word order and then learn a word embeddings graph attention network (WEGAT) over this graph. WEGAT is initialized with one-hot word representations, and we use the masked language model (MLM) objective as the supervised task. Text classification experiments further show that, with the same classifier, the word embeddings produced by WEGAT achieve higher accuracy than those of current methods.
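As a rough illustration of the pipeline the abstract describes, the sketch below builds a word graph from word order (words that co-occur within a small sliding window are connected) and passes one-hot word features through a single graph-attention layer; training with the masked language model objective is omitted. The helper names (build_word_graph, GATLayer), the window size, and the embedding dimension are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): word graph from word order + one GAT layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

def build_word_graph(tokens, vocab, window=2):
    """Adjacency matrix with an edge between words co-occurring within `window` positions."""
    n = len(vocab)
    adj = torch.eye(n)  # self-loops so every word attends to itself
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            a, b = vocab[w], vocab[tokens[j]]
            adj[a, b] = adj[b, a] = 1.0
    return adj

class GATLayer(nn.Module):
    """Single-head graph attention layer (Velickovic et al.), dense adjacency."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x, adj):
        h = self.W(x)                                   # (n, out_dim)
        n = h.size(0)
        # attention logits e_ij = a([h_i || h_j]) for every word pair
        e = self.a(torch.cat([h.repeat_interleave(n, 0),
                              h.repeat(n, 1)], dim=1)).view(n, n)
        e = F.leaky_relu(e, 0.2)
        e = e.masked_fill(adj == 0, float('-inf'))      # attend only to graph neighbours
        alpha = torch.softmax(e, dim=1)
        return F.elu(alpha @ h)                         # updated word embeddings

# Toy usage: one-hot initialization, as in the abstract.
tokens = "graph attention network for word embeddings".split()
vocab = {w: i for i, w in enumerate(dict.fromkeys(tokens))}
adj = build_word_graph(tokens, vocab)
x = torch.eye(len(vocab))
emb = GATLayer(len(vocab), 16)(x, adj)
print(emb.shape)  # torch.Size([6, 16])
```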
References
Zhang, Y., Jin, R., Zhou, Z.-H.: Understanding bag-of-words model: a statistical framework. Int. J. Mach. Learn. Cybern. 1(1–4), 43–52 (2010)
Firth, J.R.: A synopsis of linguistic theory, 1930–1955. In: Studies in Linguistic Analysis. Philological Society, Oxford (1957)
Harris, Z.S.: Distributional structure. Word 10(2–3), 146–162 (1954)
Pennington, J., Socher, R., Manning, C.D.: Glove: global vectors for word representation. In: Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543 (2014)
Baroni, M., Dinu, G., Kruszewski, G.: Don’t count, predict! A systematic comparison of context counting vs. context-predicting semantic vectors. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, vol. 1, pp. 238–247 (2014)
Dhillon, P.S., Foster, D.P., Ungar, L.H.: Eigen words: spectral word embeddings. J. Mach. Learn. Res. 16, 3035–3078 (2015)
Bengio, Y., Ducharme, R., Vincent, P., Jauvin, C.: A neural probabilistic language model. J. Mach. Learn. Res. 3, 1137–1155 (2003)
Turian, J., Ratinov, L., Bengio, Y.: Word representations: a simple and general method for semi-supervised learning. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 384–394 (2010)
Xu, W., Rudnicky, A.: Can artificial neural networks learn language models? In: Sixth International Conference on Spoken Language Processing (2000)
Mnih, A., Hinton, G.: Three new graphical models for statistical language modelling. In: Proceedings of the 24th International Conference on Machine Learning, pp. 641–648 (2007)
Salton, G., Wong, A., Yang, C.S.: A vector space model for automatic indexing. Commun. ACM 18(11), 613–620 (1975)
Deerwester, S., Dumais, S.T., Furnas, G.W., et al.: Indexing by latent semantic analysis. J. Am. Soc. Inf. Sci. 41(6), 391–407 (1990)
Lee, D., Seung, H.S.: Learning the parts of objects by non-negative matrix factorization. Nature 401(6755), 788–791 (1999)
Bengio, Y., Ducharme, R., Vincent, P., et al.: A neural probabilistic language model. J. Mach. Learn. Res. 3(6), 1137–1155 (2003)
Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. In: International Conference on Learning Representations Workshop Track (2013)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
Pennington, J., Socher, R., Manning, C.D.: Glove: global vectors for word representation. In: Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543 (2014)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Jin, W., Srihari, R.K.: Graph-based text representation and knowledge discovery. In: 2007 ACM Symposium on Applied Computing, pp. 807–811 (2007)
Chau, R., Tsoi, A.C., Hagenbuchner, M., et al.: A concept link graph for text structure mining. In: 32nd Australasian Computer Science Conference (2009)
Yao, L., Mao, C., Luo, Y.: Graph convolutional networks for text classification (2018)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: ICLR (2017)
Velickovic, P., Cucurull, G., Casanova, A., et al.: Graph attention networks (2017)
Acknowledgement
We express our heartfelt thanks to Velickovic, P., Cucurull, G., Casanova, A., et al. for making their implementation open source.
Funding
Our research is funded by the Fundamental Research Funds for the Central Universities (3072020CFQ0602, 3072020CF0604, 3072020CFP0601) and the 2019 Industrial Internet Innovation and Development Engineering (KY1060020002, KY10600200008).
Ethics declarations
The authors declare that they have no conflicts of interest to report regarding the present study.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Long, Y., Xu, H., Qi, P., Zhang, L., Li, J. (2021). Graph Attention Network for Word Embeddings. In: Sun, X., Zhang, X., Xia, Z., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2021. Lecture Notes in Computer Science, vol. 12737. Springer, Cham. https://doi.org/10.1007/978-3-030-78612-0_16
DOI: https://doi.org/10.1007/978-3-030-78612-0_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-78611-3
Online ISBN: 978-3-030-78612-0