Abstract
In this paper we introduce LODQA, an approach for open-domain Question Answering over Linked Open Data. We confine ourselves to three kinds of questions: factoid, confirmation, and definition questions. With LODQA it is feasible to answer questions over 400 million entities of any domain without using any training data, since we simultaneously exploit 400 Linked datasets. In particular, we exploit LODsyndesis, a suite of services (based on semantics-aware indexes) that supports cross-dataset reasoning over hundreds of Linked datasets and 2 billion triples. The proposed Question Answering process follows an information extraction approach and comprises several steps: question cleaning; heuristic-based question type identification; entity recognition, linking, and disambiguation using Linked Data-based methods and pure NLP methods (specifically DBpedia Spotlight and Stanford CoreNLP); WordNet-based question expansion for tackling the lexical gap between the input question and the underlying sources; and triple scoring for producing the final answer. We discuss the benefits of this approach in terms of answerable questions and answer verification, and we investigate, through experimental results, how each of these steps affects the effectiveness and the efficiency of question answering.
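To make the steps listed above more concrete, the following is a minimal Python sketch of how question cleaning, heuristic question type identification, question expansion, and triple scoring could fit together. It is not the LODQA implementation: the regular expressions, the stopword list, the hand-written SYNONYMS table (a stand-in for WordNet), the toy TRIPLES (a stand-in for triples retrieved from LODsyndesis), and all function names are assumptions made purely for illustration. Entity recognition, linking, and disambiguation are omitted.

```python
import re

# Hypothetical stand-in for WordNet: a tiny hand-written synonym table.
SYNONYMS = {
    "born": {"birth", "birthplace"},
    "died": {"death", "deathplace"},
}

# Hypothetical stopword list used for question cleaning.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "are", "was", "were",
             "who", "what", "where", "when", "which", "did", "does"}

# Toy triples (subject, predicate, object); a stand-in for retrieved data.
TRIPLES = [
    ("Aristotle", "birthPlace", "Stagira"),
    ("Aristotle", "deathPlace", "Chalcis"),
    ("Plato", "birthPlace", "Athens"),
]


def clean_question(question):
    """Lowercase the question, strip punctuation, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    return [t for t in tokens if t not in STOPWORDS]


def question_type(question):
    """Classify a question with simple surface heuristics (illustrative only)."""
    q = question.strip().lower()
    # Questions that open with an auxiliary verb expect a yes/no answer.
    if re.match(r"^(is|are|was|were|do|does|did|has|have)\b", q):
        return "confirmation"
    # Short "What is X?" questions are treated as requests for a definition.
    m = re.match(r"^what (is|are) (.+?)\??$", q)
    if m and len(m.group(2).split()) <= 3:
        return "definition"
    return "factoid"


def expand(tokens):
    """Add synonyms of each token to narrow the lexical gap."""
    expanded = set(tokens)
    for token in tokens:
        expanded |= SYNONYMS.get(token, set())
    return expanded


def score_triple(triple, keywords):
    """Count question keywords that occur in the subject or predicate."""
    subject, predicate, _ = triple
    words = set(re.findall(r"[a-z0-9]+", f"{subject} {predicate}".lower()))
    return len(words & keywords)


def answer(question, triples):
    """Return the question type and the object of the best-scoring triple."""
    keywords = expand(clean_question(question))
    best = max(triples, key=lambda t: score_triple(t, keywords))
    return question_type(question), best[2]


print(answer("Where was Aristotle born?", TRIPLES))
```

In this sketch the expansion step is what lets the word "born" in the question match the predicate "birthPlace". A fuller treatment would handle confirmation and definition questions differently when producing the final answer; here only the factoid path is shown.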
Acknowledgements
The research work was supported by the Hellenic Foundation for Research and Innovation (HFRI) and the General Secretariat for Research and Technology (GSRT), under the HFRI PhD Fellowship grant (GA. No. 166).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Dimitrakis, E., Sgontzos, K., Mountantonakis, M., Tzitzikas, Y. (2020). Enabling Efficient Question Answering over Hundreds of Linked Datasets. In: Flouris, G., Laurent, D., Plexousakis, D., Spyratos, N., Tanaka, Y. (eds) Information Search, Integration, and Personalization. ISIP 2019. Communications in Computer and Information Science, vol 1197. Springer, Cham. https://doi.org/10.1007/978-3-030-44900-1_1
DOI: https://doi.org/10.1007/978-3-030-44900-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44899-8
Online ISBN: 978-3-030-44900-1
eBook Packages: Computer Science (R0)