Abstract
Ontology Matching (OM) is a critical task in knowledge integration, where aligning heterogeneous ontologies facilitates data interoperability and knowledge sharing. Traditional OM systems often rely on expert knowledge or predictive models, and the potential of Large Language Models (LLMs) for the task has seen limited exploration. We present the LLMs4OM framework, a novel approach to evaluating the effectiveness of LLMs in OM tasks. The framework uses two modules, one for retrieval and one for matching, enhanced by zero-shot prompting across three ontology representations: concept, concept-parent, and concept-children. Through comprehensive evaluations on 20 OM datasets from various domains, we demonstrate that LLMs, under the LLMs4OM framework, can match and even surpass the performance of traditional OM systems, particularly in complex matching scenarios. Our results highlight the potential of LLMs to contribute significantly to the field of OM.
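As a rough illustration of the retrieve-then-match idea described in the abstract, the following is a minimal sketch and not the LLMs4OM implementation. Everything named here is an assumption made for illustration: the `Concept` class, the token-overlap (Jaccard) retriever standing in for whatever retrieval model the framework actually uses, the prompt wording, and the `fake_llm` placeholder that stands in for a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Concept:
    """Hypothetical container for one ontology concept and its neighbours."""
    label: str
    parents: List[str] = field(default_factory=list)
    children: List[str] = field(default_factory=list)


def represent(c: Concept, mode: str) -> str:
    """Render a concept under one of the three representations in the abstract."""
    if mode == "concept":
        return c.label
    if mode == "concept-parent":
        return " ".join([c.label] + c.parents)
    if mode == "concept-children":
        return " ".join([c.label] + c.children)
    raise ValueError(f"unknown representation: {mode}")


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a toy stand-in for a real retriever (assumption)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def retrieve(source: Concept, targets: List[Concept], mode: str, k: int = 2) -> List[Concept]:
    """Retrieval module: keep the k target concepts most similar to the source."""
    query = represent(source, mode)
    ranked = sorted(targets, key=lambda t: jaccard(query, represent(t, mode)), reverse=True)
    return ranked[:k]


def build_prompt(source: Concept, candidate: Concept, mode: str) -> str:
    """Zero-shot prompt: the task and the two concepts, with no worked examples."""
    return (
        "Do the following two ontology concepts refer to the same thing? "
        "Answer yes or no.\n"
        f"Concept A: {represent(source, mode)}\n"
        f"Concept B: {represent(candidate, mode)}\n"
        "Answer:"
    )


def match(source: Concept, targets: List[Concept], mode: str,
          llm: Callable[[str], str], k: int = 2) -> List[Concept]:
    """Matching module: ask the LLM about each retrieved candidate."""
    return [
        c for c in retrieve(source, targets, mode, k)
        if llm(build_prompt(source, c, mode)).strip().lower().startswith("yes")
    ]


def fake_llm(prompt: str) -> str:
    """Placeholder for a real LLM: says yes only if both texts are equal ignoring case."""
    lines = prompt.splitlines()
    a = lines[1].split(": ", 1)[1].lower()
    b = lines[2].split(": ", 1)[1].lower()
    return "yes" if a == b else "no"


# Invented toy data, used only to exercise the sketch.
source = Concept("Heart", parents=["Organ"], children=["Left ventricle"])
targets = [
    Concept("heart", parents=["Organ"]),
    Concept("Heart valve", parents=["Organ part"]),
    Concept("Liver", parents=["Organ"]),
]
matches = match(source, targets, mode="concept", llm=fake_llm)
```

Splitting the work this way means the expensive matching step only sees a short candidate list per source concept, and switching `mode` changes how much hierarchical context both modules see without altering the pipeline.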
Acknowledgments
We thank Nenad Krdzavac for valuable insights on a previous draft of this paper. This work was supported by the German BMBF project SCINEXT (ID 01lS22070), the European Research Council for ScienceGRAPH (GA ID: 819536), and German DFG for NFDI4DataScience (no. 460234259).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Babaei Giglou, H., D’Souza, J., Engel, F., Auer, S. (2025). LLMs4OM: Matching Ontologies with Large Language Models. In: Meroño Peñuela, A., et al. The Semantic Web: ESWC 2024 Satellite Events. ESWC 2024. Lecture Notes in Computer Science, vol 15344. Springer, Cham. https://doi.org/10.1007/978-3-031-78952-6_3
DOI: https://doi.org/10.1007/978-3-031-78952-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78951-9
Online ISBN: 978-3-031-78952-6
eBook Packages: Computer Science, Computer Science (R0)