Abstract
Question answering systems (QAS) can be applied in many settings, such as schools, hospitals, banks, and e-commerce websites. Users expect a smart QAS that can stand in for human staff, and many studies have therefore aimed to build, develop, and improve such systems. However, QAS for low-resource languages such as Vietnamese remain very limited. In this paper, we present a method for building a Vietnamese QAS. Except for the Vietnamese-specific language processing steps, most of our solutions can also be applied to other languages. We build the QAS on a knowledge graph (KG) and a convolutional neural network (CNN). The KG provides knowledge and inference ability for the QAS. The CNN classifies natural-language questions so that the correct answer to a given question can be identified. We also use a distributed architecture to train the CNN model. In addition, we propose a solution that speeds up answer search in a large KG by partitioning and indexing the KG with the DM-Tree structure. Finally, we present experimental results and evaluate our model using common metrics to demonstrate the effectiveness of our solution.
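The pipeline summarized above (classify the question, then look up the answer in an indexed KG) can be illustrated with a minimal Python sketch. This is not the implementation described in the paper: a keyword rule stands in for the CNN classifier, a plain dictionary index stands in for the DM-Tree, and the triples, relation names, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy knowledge graph as (subject, relation, object) triples.
TRIPLES = [
    ("Hà Nội", "capital_of", "Việt Nam"),
    ("Hà Nội", "located_in", "Việt Nam"),
    ("Huế", "located_in", "Việt Nam"),
]

# Hypothetical keyword rules standing in for the CNN question classifier.
KEYWORD_TO_RELATION = {
    "capital": "capital_of",
    "where": "located_in",
}


def build_index(triples):
    """Index triples by (subject, relation) so a lookup avoids a full scan.

    A dictionary is used here only as a stand-in for the DM-Tree index.
    """
    index = defaultdict(list)
    for subj, rel, obj in triples:
        index[(subj, rel)].append(obj)
    return index


def classify_question(question):
    """Return the relation a question asks about, or None if no rule matches."""
    lowered = question.lower()
    for keyword, relation in KEYWORD_TO_RELATION.items():
        if keyword in lowered:
            return relation
    return None


def answer(question, entity, index):
    """Classify the question, then fetch matching objects from the index."""
    relation = classify_question(question)
    if relation is None:
        return []
    return index.get((entity, relation), [])


if __name__ == "__main__":
    kg_index = build_index(TRIPLES)
    print(answer("Hà Nội is the capital of which country?", "Hà Nội", kg_index))
```

Separating classification from lookup mirrors the division of labor stated in the abstract: the classifier decides which relation is being asked about, and the index decides how quickly the matching triples are found.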














Notes
CoNLL2Triple is a tool we developed to convert CoNLL files into triple files.
https://en.wikipedia.org/wiki/Hà_Nội
Acknowledgements
This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCMC) under grant number DS2020-26-01.
Ethics declarations
Conflict of Interest
We have no conflict of interest for this paper.
Cite this article
Phan, T., Do, P. Building a Vietnamese question answering system based on knowledge graph and distributed CNN. Neural Comput & Applic 33, 14887–14907 (2021). https://doi.org/10.1007/s00521-021-06126-z
DOI: https://doi.org/10.1007/s00521-021-06126-z