Abstract
We present a study on Knowledge Graph Question Answering in Materials Science (KGQA4MAT), with a focus on metal-organic frameworks (MOFs). A knowledge graph for metal-organic frameworks (MOF-KG) has been constructed by integrating structured data, metadata, and knowledge extracted from the literature. We aim to develop a natural language (NL) interface that lets domain experts query the MOF-KG. As a first step, we built a benchmark of 161 complex questions involving comparison, aggregation, and intricate graph structures. Each question has been rephrased into three additional variations, giving 644 questions and 161 KG queries in total. We then developed a systematic approach for using ChatGPT to translate natural language questions into formal KG queries, and experimented with different prompt strategies. Combining an ontology, few-shot examples, and chain-of-thought explanations in the prompt gave the best F1-score of 0.89. We also applied this method to the well-known QALD-9 dataset, achieving performance on par with state-of-the-art techniques. The results indicate that the approach is applicable to MOF research and potentially to other scientific fields.
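The prompt strategy summarized above (ontology, few-shot examples, chain-of-thought explanation) and the F1 evaluation can be illustrated with a minimal sketch. This is not the authors' implementation: the `Example` class, the `build_prompt` and `answer_f1` helpers, the prompt layout, and the set-based per-question F1 are all assumptions made for illustration, and the query language is deliberately left unspecified.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One few-shot demonstration (hypothetical structure, not from the paper)."""
    question: str
    reasoning: str  # chain-of-thought explanation for the demonstration
    query: str      # formal KG query; the query language is left unspecified


def build_prompt(ontology: str, examples: list[Example], question: str) -> str:
    """Assemble ontology, few-shot examples with explanations, and the new question."""
    parts = ["Ontology of the knowledge graph:", ontology.strip(), ""]
    for i, ex in enumerate(examples, start=1):
        parts += [
            f"Example {i}",
            f"Question: {ex.question}",
            f"Reasoning: {ex.reasoning}",
            f"Query: {ex.query}",
            "",
        ]
    # End with an open "Reasoning:" slot so the model explains before answering.
    parts += [f"Question: {question}", "Reasoning:"]
    return "\n".join(parts)


def answer_f1(predicted: set[str], gold: set[str]) -> float:
    """Set-based F1 between predicted and gold answers for one question (assumed metric)."""
    if not predicted and not gold:
        return 1.0
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)
```

The string returned by `build_prompt` would be sent to the language model, the returned query executed against the knowledge graph, and the resulting answers compared with the gold answers using `answer_f1`; how the reported scores were actually aggregated across questions is not stated in the abstract.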
Acknowledgments
This project is partially supported by the Drexel Office of Faculty Affairs’ 2022 Faculty Summer Research Award #284213 and by the U.S. National Science Foundation Office of Advanced Cyberinfrastructure (OAC) Grants #1940239, #1940307, and #2118201.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
An, Y. et al. (2024). Knowledge Graph Question Answering for Materials Science (KGQA4MAT). In: Garoufallou, E., Sartori, F. (eds) Metadata and Semantic Research. MTSR 2023. Communications in Computer and Information Science, vol 2048. Springer, Cham. https://doi.org/10.1007/978-3-031-65990-4_2
DOI: https://doi.org/10.1007/978-3-031-65990-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-65989-8
Online ISBN: 978-3-031-65990-4
eBook Packages: Computer Science (R0)