Knowledge Graph Question Answering for Materials Science (KGQA4MAT)

An, Yuan; Greenberg, Jane; Uribe-Romo, Fernando J.; Gómez-Gualdrón, Diego A.; Langlois, Kyle; Furst, Jacob; Kalinowski, Alex; Zhao, Xintong; Hu, Xiaohua

doi:10.1007/978-3-031-65990-4_2

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2048)

Included in the following conference series:

Research Conference on Metadata and Semantics Research

Abstract

We present a study on Knowledge Graph Question Answering in Materials Science (KGQA4MAT), with a focus on metal-organic frameworks (MOFs). A knowledge graph for metal-organic frameworks (MOF-KG) has been constructed by integrating structured data, metadata, and knowledge extracted from the literature. We aim to develop a natural language (NL) interface for domain experts to query the MOF-KG. A first step is our benchmark, which consists of 161 complex questions involving comparison, aggregation, and intricate graph structures. Each question has been rephrased into three additional variations, totaling 644 questions and 161 KG queries. We then developed a systematic approach for utilizing ChatGPT to translate natural language questions into formal KG queries. We experimented with different prompt strategies. The results indicated that using an ontology, providing few-shot examples, and offering chain-of-thought explanations yielded the top F1-score of 0.89. We also applied this method to the well-known QALD-9 dataset, achieving performance on par with state-of-the-art techniques. The results indicate the applicability of this approach to MOF research and potentially other scientific foci.
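The prompt strategy named in the abstract (an ontology, few-shot examples, and chain-of-thought explanations) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: the `Example` type, the `build_prompt` and `f1_score` helpers, the prompt layout, the Cypher-like query text, and the toy schema. The set-based F1 shown is one common way to score returned answers and may differ from the evaluation the authors used.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One few-shot demonstration (hypothetical structure)."""
    question: str
    reasoning: str  # chain-of-thought explanation for the demonstration
    query: str      # formal KG query; the query language here is an assumption


def build_prompt(ontology: str, examples: list[Example], question: str) -> str:
    """Assemble ontology + few-shot demonstrations + the new question."""
    parts = ["Ontology of the knowledge graph:", ontology.strip(), ""]
    for ex in examples:
        parts += [
            f"Question: {ex.question}",
            f"Reasoning: {ex.reasoning}",
            f"Query: {ex.query}",
            "",
        ]
    # End with an open "Reasoning:" slot so the model explains before answering.
    parts += [f"Question: {question}", "Reasoning:"]
    return "\n".join(parts)


def f1_score(predicted: set[str], gold: set[str]) -> float:
    """Set-based F1 between predicted and gold answers for one question."""
    if not predicted and not gold:
        return 1.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy schema and demonstration, invented for this sketch.
    schema = "(:MOF)-[:HAS_METAL]->(:Metal)"
    demo = Example(
        question="Which MOFs contain copper?",
        reasoning="Find MOF nodes linked to the Metal node named Copper.",
        query="MATCH (m:MOF)-[:HAS_METAL]->(:Metal {name: 'Copper'}) RETURN m",
    )
    print(build_prompt(schema, [demo], "How many MOFs contain zinc?"))
    print(f1_score({"a", "b"}, {"b", "c"}))
```

The resulting string would be sent to a chat model, and the query it returns would be run against the knowledge graph; both steps are omitted here because they depend on services the abstract does not specify.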

Acknowledgments

This project is partially supported by the Drexel Office of Faculty Affairs’ 2022 Faculty Summer Research awards #284213, and the U.S. National Science Foundation Office of Advanced Cyberinfrastructure (OAC) Grants #1940239, #1940307, and #2118201.

Author information

Authors and Affiliations

College of Computing and Informatics, Drexel University, Philadelphia, PA, USA
Yuan An, Jane Greenberg, Alex Kalinowski, Xintong Zhao & Xiaohua Hu
Department of Chemistry, University of Central Florida, Orlando, FL, USA
Fernando J. Uribe-Romo, Kyle Langlois & Jacob Furst
Chemical and Biological Engineering, Colorado School of Mines, Golden, CO, USA
Diego A. Gómez-Gualdrón

Corresponding author

Correspondence to Yuan An.

Editor information

Editors and Affiliations

Department of Library Science, Archives and Information Systems, School of Social Sciences, International Hellenic University, Thessaloniki, Greece
Emmanouel Garoufallou
Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milan, Milano, Italy
Fabio Sartori

About this paper

Cite this paper

An, Y. et al. (2024). Knowledge Graph Question Answering for Materials Science (KGQA4MAT). In: Garoufallou, E., Sartori, F. (eds) Metadata and Semantic Research. MTSR 2023. Communications in Computer and Information Science, vol 2048. Springer, Cham. https://doi.org/10.1007/978-3-031-65990-4_2

DOI: https://doi.org/10.1007/978-3-031-65990-4_2
Published: 31 July 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-65989-8
Online ISBN: 978-3-031-65990-4
eBook Packages: Computer Science, Computer Science (R0)
