Abstract
This article discusses algorithms and data representation methods for knowledge graphs. The proposed algorithms automate the extraction and processing of data from users’ requests within the mixed learning process and reduce the role of an expert in preparing the question-answering data sets needed to train dialogue system models. The results show that the knowledge graph enrichment method increases the number of links and improves the accuracy of vector representation models.
Acknowledgments
This work was supported by the Government of the Russian Federation (Grant 08-08).
Copyright information
© 2020 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Romanov, A., Volchek, D., Mouromtsev, D. (2020). Data Extraction and Preprocessing for Automated Question Answering Based on Knowledge Graphs. In: Rocha, Á., Adeli, H., Reis, L., Costanzo, S., Orovic, I., Moreira, F. (eds) Trends and Innovations in Information Systems and Technologies. WorldCIST 2020. Advances in Intelligent Systems and Computing, vol 1159. Springer, Cham. https://doi.org/10.1007/978-3-030-45688-7_27
DOI: https://doi.org/10.1007/978-3-030-45688-7_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-45687-0
Online ISBN: 978-3-030-45688-7
eBook Packages: Intelligent Technologies and Robotics (R0)