Abstract
Widely used word embedding techniques represent each word with a fixed vector, with no notion of context or dynamics. This paper proposes a deep neural network, CoDyWor, that models the context of words so that the same word receives different vector representations in different contexts. First, each layer of the model captures contextual information about each word of the input sentence from a different angle, such as syntactic or semantic information. Then, a multi-layer attention mechanism assigns a different weight to each layer of the model. Finally, the information from all layers is integrated into a dynamic, context-dependent word vector. Comparing against other models on public datasets, the proposed model improves accuracy on the logical reasoning task by 2.0%, F1 on the named entity recognition task by 0.47%, and F1 on the reading comprehension task by 2.96%. The experimental results demonstrate that this word representation technique improves on existing word representations.
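The core mechanism the abstract describes, learning a weight for each layer's contextual output and summing the layers into one dynamic word vector, can be sketched as follows. This is not the authors' implementation but a minimal ELMo-style scalar mix written in PyTorch under assumed shapes; the names LayerAttentionMix, layer_logits, and gamma are illustrative only.

# Minimal sketch (not the authors' code): combine per-layer contextual
# representations into a single dynamic word vector via learned layer weights.
import torch
import torch.nn as nn

class LayerAttentionMix(nn.Module):
    """Weights L layer outputs (each [batch, seq_len, dim]) and sums them."""
    def __init__(self, num_layers: int):
        super().__init__()
        # One scalar logit per layer, normalized with softmax in forward().
        self.layer_logits = nn.Parameter(torch.zeros(num_layers))
        # Global scale applied to the mixed representation.
        self.gamma = nn.Parameter(torch.ones(1))

    def forward(self, layer_outputs: list) -> torch.Tensor:
        weights = torch.softmax(self.layer_logits, dim=0)         # [L]
        stacked = torch.stack(layer_outputs, dim=0)               # [L, B, T, D]
        mixed = (weights.view(-1, 1, 1, 1) * stacked).sum(dim=0)  # [B, T, D]
        return self.gamma * mixed

# Usage: three hypothetical encoder layers, each capturing different
# (e.g. syntactic vs. semantic) information about the same tokens.
if __name__ == "__main__":
    batch, seq_len, dim = 2, 5, 8
    layers = [torch.randn(batch, seq_len, dim) for _ in range(3)]
    mix = LayerAttentionMix(num_layers=3)
    dynamic_vectors = mix(layers)  # one context-dependent vector per token
    print(dynamic_vectors.shape)   # torch.Size([2, 5, 8])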
Acknowledgements
The work was partially supported by the China Postdoctoral Science Foundation under Grant No. 2019M653400; the Sichuan Science and Technology Program under Grant Nos. 2018GZ0253, 2019YFS0236, 2018GZ0182, 2018GZ0093 and 2018GZDZX0039.
Cite this paper
Yuan, X., Xiong, X., Ju, S., Xie, Z., Wang, J.: A dynamic word representation model based on deep context. In: Tang, J., Kan, M.-Y., Zhao, D., Li, S., Zan, H. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol. 11839. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32236-6_60