Knowledge Graph Question Answering for Materials Science (KGQA4MAT)

  • Conference paper
Metadata and Semantic Research (MTSR 2023)

Abstract

We present a study on Knowledge Graph Question Answering in Materials Science (KGQA4MAT), with a focus on metal-organic frameworks (MOFs). A knowledge graph for metal-organic frameworks (MOF-KG) has been constructed by integrating structured data, metadata, and knowledge extracted from the literature. We aim to develop a natural language (NL) interface for domain experts to query the MOF-KG. As a first step, we built a benchmark of 161 complex questions involving comparison, aggregation, and intricate graph structures. Each question has been rephrased into three additional variations, for a total of 644 questions and 161 KG queries. We then developed a systematic approach for utilizing ChatGPT to translate natural language questions into formal KG queries and experimented with different prompt strategies. The experiments indicated that using an ontology, providing few-shot examples, and offering chain-of-thought explanations resulted in the top F1-score of 0.89. We also applied this method to the well-known QALD-9 dataset, achieving performance on par with state-of-the-art techniques. The results indicate the applicability of this approach to MOF research and potentially to other scientific domains.
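The best-performing prompt strategy named in the abstract combines three ingredients: the ontology, few-shot examples, and chain-of-thought explanations. The following is a minimal sketch of how such a prompt could be assembled as plain text. It is not the authors' implementation: the `Example` class, the `build_prompt` function, the toy ontology, and the Cypher-style sample query are all hypothetical and serve only to illustrate the structure. The abstract does not specify the query language or the prompt wording.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Example:
    """One few-shot demonstration (hypothetical structure)."""
    question: str   # natural language question
    reasoning: str  # chain-of-thought explanation leading to the query
    query: str      # target KG query for the question


def build_prompt(ontology: str, examples: list[Example], question: str) -> str:
    """Assemble an ontology + few-shot + chain-of-thought prompt."""
    parts = [
        "Translate the question into a knowledge graph query.",
        "Use only the classes and relations in the ontology below.",
        "",
        "Ontology:",
        ontology.strip(),
        "",
    ]
    # Each demonstration shows the reasoning before the final query,
    # so the model is nudged to explain its steps for the new question.
    for i, ex in enumerate(examples, start=1):
        parts += [
            f"Example {i}",
            f"Question: {ex.question}",
            f"Reasoning: {ex.reasoning}",
            f"Query: {ex.query}",
            "",
        ]
    # End on an open "Reasoning:" cue for the question to be translated.
    parts += [f"Question: {question}", "Reasoning:"]
    return "\n".join(parts)


# Toy ontology and example, invented for illustration only.
ONTOLOGY = """
(:MOF)-[:HAS_METAL]->(:Metal)
(:MOF) properties: name, surface_area
"""

examples = [
    Example(
        question="How many MOFs contain zinc?",
        reasoning="Match MOF nodes linked to the Metal named Zn, then count them.",
        query="MATCH (m:MOF)-[:HAS_METAL]->(:Metal {name: 'Zn'}) RETURN count(m)",
    )
]

prompt = build_prompt(ONTOLOGY, examples, "Which MOF has the largest surface area?")
print(prompt)
```

The resulting string would be sent to the language model, and the text following the final "Query:" in its reply would be taken as the candidate KG query. Dropping the ontology section, the examples, or the reasoning lines yields the weaker prompt variants that such an experiment would compare against.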

Notes

  1. https://github.com/KGQA/leaderboard

  2. https://www.ccdc.cam.ac.uk/free-products/csd-mof-collection/

  3. https://zenodo.org/record/3370144#.ZDvtN3bMIdV

  4. https://globalscience.berkeley.edu/database

  5. https://doi.org/10.6084/m9.figshare.16902652.v3

  6. https://github.com/kgqa4mat/KGQA4MAT

  7. https://dbpedia.org/sparql

  8. https://github.com/KGQA/leaderboard/blob/gh-pages/dbpedia/qald.md#qald-9

  9. http://wikidata.dbpedia.org/services-resources/ontology

Acknowledgments

This project is partially supported by the Drexel Office of Faculty Affairs’ 2022 Faculty Summer Research awards #284213, and the U.S. National Science Foundation Office of Advanced Cyberinfrastructure (OAC) Grants #1940239, #1940307, and #2118201.

Corresponding author

Correspondence to Yuan An.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

An, Y. et al. (2024). Knowledge Graph Question Answering for Materials Science (KGQA4MAT). In: Garoufallou, E., Sartori, F. (eds) Metadata and Semantic Research. MTSR 2023. Communications in Computer and Information Science, vol 2048. Springer, Cham. https://doi.org/10.1007/978-3-031-65990-4_2

  • DOI: https://doi.org/10.1007/978-3-031-65990-4_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-65989-8

  • Online ISBN: 978-3-031-65990-4

  • eBook Packages: Computer Science (R0)
