Data Extraction and Preprocessing for Automated Question Answering Based on Knowledge Graphs

  • Conference paper
Trends and Innovations in Information Systems and Technologies (WorldCIST 2020)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1159)

Abstract

This article discusses algorithms and methods of data representation for knowledge graphs. The proposed algorithms automate the extraction and processing of data from users’ requests within the mixed learning process and reduce the role of an expert in preparing the question-answering data sets needed to train dialogue-system models. The results show that enriching the knowledge graph increases the number of links and improves the accuracy of vector representation models.

Acknowledgments

This work was supported by the Government of the Russian Federation (Grant 08-08).

Author information

Corresponding author

Correspondence to Aleksei Romanov.

Copyright information

© 2020 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Romanov, A., Volchek, D., Mouromtsev, D. (2020). Data Extraction and Preprocessing for Automated Question Answering Based on Knowledge Graphs. In: Rocha, Á., Adeli, H., Reis, L., Costanzo, S., Orovic, I., Moreira, F. (eds) Trends and Innovations in Information Systems and Technologies. WorldCIST 2020. Advances in Intelligent Systems and Computing, vol 1159. Springer, Cham. https://doi.org/10.1007/978-3-030-45688-7_27
