Building a Vietnamese question answering system based on knowledge graph and distributed CNN

  • Original Article
Neural Computing and Applications

Abstract

Question answering systems (QAS) can be applied in many settings, such as schools, hospitals, banks, and e-commerce websites. Users expect a smart QAS that can stand in for human staff, and many studies have therefore sought to build, develop, and improve QAS. However, QAS for low-resource languages such as Vietnamese remain very limited. In this paper, we present a method for building a Vietnamese QAS. Apart from the Vietnamese-specific language processing steps, most of our solutions can also be applied to other languages. We build the QAS on a knowledge graph (KG) and a convolutional neural network (CNN). The KG provides knowledge and inference ability for the QAS. The CNN classifies natural-language questions so that the correct answer to a given question can be identified. We also use a distributed architecture to train the CNN model. In addition, we propose a solution that speeds up the search for answers in a large KG by partitioning and indexing the KG with the DM-Tree structure. Finally, we present experimental and evaluation results for our model, measured with common metrics, to demonstrate the effectiveness of our solution.
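The overall flow described in the abstract (classify the question, then look up the answer in an indexed KG) can be illustrated with a minimal sketch. This is not the authors' implementation: a keyword rule stands in for the CNN classifier, a plain dictionary keyed by (subject, relation) stands in for the DM-Tree index, and the relation names, keyword table, and sample triples are all hypothetical.

```python
import unicodedata
from collections import defaultdict

# Toy KG as (subject, relation, object) triples. Relation names are hypothetical.
TRIPLES = [
    ("Hà Nội", "capital_of", "Việt Nam"),
    ("Hà Nội", "located_in", "Việt Nam"),
    ("Huế", "located_in", "Việt Nam"),
]

# Hypothetical keyword table. It stands in for the CNN question classifier.
KEYWORD_TO_RELATION = {"thủ đô": "capital_of", "ở đâu": "located_in"}


def norm(text):
    """Normalize Unicode and case so Vietnamese diacritics compare reliably."""
    return unicodedata.normalize("NFC", text).lower()


def build_index(triples):
    """Index objects by (subject, relation); a simple stand-in for a KG index."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[(subject, relation)].append(obj)
    return index


def classify_question(question):
    """Map a question to a relation label (placeholder for the CNN)."""
    q = norm(question)
    for keyword, relation in KEYWORD_TO_RELATION.items():
        if norm(keyword) in q:
            return relation
    return None


def find_entity(question, index):
    """Return the first indexed subject mentioned in the question, if any."""
    q = norm(question)
    for subject, _relation in index:
        if norm(subject) in q:
            return subject
    return None


def answer(question, index):
    """Classify the question, find its entity, and look up matching objects."""
    relation = classify_question(question)
    entity = find_entity(question, index)
    if relation is None or entity is None:
        return []
    return index.get((entity, relation), [])


INDEX = build_index(TRIPLES)
print(answer("Hà Nội là thủ đô của nước nào?", INDEX))
```

In this sketch, keying the index by (subject, relation) turns each lookup into a single dictionary access instead of a scan over all triples, which is the same motivation the abstract gives for partitioning and indexing a large KG.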

Notes

  1. https://vi.wikipedia.org

  2. https://github.com/vncorenlp/VnCoreNLP

  3. CoNLL2Triple is a tool we developed to convert CoNLL files into triple files.

  4. https://github.com/vncorenlp/VnCoreNLP/blob/master/TagsetDescription.md

  5. https://en.wikipedia.org/wiki/Hà_Nội

  6. https://scikit-learn.org/stable/modules/model_evaluation.html#classification-metrics

Acknowledgements

This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCMC) under grant number DS2020-26-01.

Author information

Corresponding author

Correspondence to Phuc Do.

Ethics declarations

Conflict of Interest

We have no conflict of interest for this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Phan, T., Do, P. Building a Vietnamese question answering system based on knowledge graph and distributed CNN. Neural Comput & Applic 33, 14887–14907 (2021). https://doi.org/10.1007/s00521-021-06126-z

  • DOI: https://doi.org/10.1007/s00521-021-06126-z
