Abstract
Recent advances in transformer-based models have sparked research interest in their ability to learn to perform reasoning tasks. However, most of the contexts used for this purpose are in practice very simple: they are generated from short (fragments of) first-order logic sentences with only a few logical operators and quantifiers. In this work, we construct the natural language dataset DELTA\(_D\) using the description logic language \(\mathcal{ALCQ}\). DELTA\(_D\) contains 384K examples and scales along two dimensions: (i) reasoning depth and (ii) linguistic complexity. In this way, we systematically investigate the reasoning ability of a supervised fine-tuned DeBERTa-based model and of two large language models (GPT-3.5, GPT-4) with few-shot prompting. Our results demonstrate that the DeBERTa-based model can master the reasoning task and that the performance of the GPT models can improve significantly even when a small number of samples is provided (9 shots). We open-source our code and datasets.
A. Poulis and E. Tsalapati—This work was performed while the authors were with the AI Team, Dept. of Informatics and Telecommunications, National and Kapodistrian University of Athens.
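To make the two dimensions concrete, here is an illustrative \(\mathcal{ALCQ}\) axiom of our own (it is not an entry of DELTA\(_D\)); the qualified cardinality restriction, the \(\mathcal{Q}\) in \(\mathcal{ALCQ}\), is the kind of construct that goes beyond the few operators and quantifiers mentioned above:

\[ \mathit{Gardener} \sqcap \exists\, \mathit{owns}.\mathit{Dog} \;\sqsubseteq\; {\geq}2\; \mathit{grows}.\mathit{Flower} \]

One possible English verbalisation is "Every gardener who owns a dog grows at least two flowers"; contexts built from several such sentences can then be scaled in both reasoning depth and linguistic complexity.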
Notes
- 3.
Hence, the Delta-closure can be computed only for KBs for which the InferredOntologyGenerator can compute all axioms and facts within t; a minimal sketch of this time-bounded computation is given after these notes.
- 4.
DELTA\(_D\) also contains the justification for each answer, to be used in future work or by the research community for other downstream tasks, such as proof generation.
- 5.
The complete Appendix can be accessed at https://github.com/angelosps/DELTA/blob/master/Appendix.pdf.
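For note 3, the following minimal Java sketch shows how such a time-bounded closure computation might look with the OWL API (version 4 or later) and the HermiT reasoner; the file name kb.owl, the 5000 ms budget, and the particular axiom generators are assumptions made for illustration, not details taken from the paper.

import java.io.File;
import java.util.List;
import org.semanticweb.HermiT.ReasonerFactory;
import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.OWLAxiom;
import org.semanticweb.owlapi.model.OWLOntology;
import org.semanticweb.owlapi.model.OWLOntologyManager;
import org.semanticweb.owlapi.reasoner.OWLReasoner;
import org.semanticweb.owlapi.reasoner.SimpleConfiguration;
import org.semanticweb.owlapi.reasoner.TimeOutException;
import org.semanticweb.owlapi.util.InferredAxiomGenerator;
import org.semanticweb.owlapi.util.InferredClassAssertionAxiomGenerator;
import org.semanticweb.owlapi.util.InferredOntologyGenerator;
import org.semanticweb.owlapi.util.InferredSubClassAxiomGenerator;

public class DeltaClosureSketch {
    public static void main(String[] args) throws Exception {
        OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
        // Hypothetical input KB file; the paper does not name one.
        OWLOntology kb = manager.loadOntologyFromOntologyDocument(new File("kb.owl"));

        // Hard time budget t (here assumed to be 5000 ms): reasoning calls that
        // exceed it throw TimeOutException.
        OWLReasoner reasoner = new ReasonerFactory().createReasoner(kb, new SimpleConfiguration(5000L));

        // Entailments to materialise: subsumptions between named classes and class assertions (facts).
        List<InferredAxiomGenerator<? extends OWLAxiom>> generators = List.of(
                new InferredSubClassAxiomGenerator(),
                new InferredClassAssertionAxiomGenerator());

        OWLOntology closure = manager.createOntology();
        try {
            // Fill 'closure' with all axioms and facts derivable within the budget.
            new InferredOntologyGenerator(reasoner, generators)
                    .fillOntology(manager.getOWLDataFactory(), closure);
            System.out.println("Closure axioms: " + closure.getAxiomCount());
        } catch (TimeOutException e) {
            // The KB exceeded t, so its Delta-closure cannot be computed and it is skipped.
            System.out.println("Reasoning timed out; KB skipped.");
        } finally {
            reasoner.dispose();
        }
    }
}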
Acknowledgments
This work has been partially supported by project MIS 5154714 of the National Recovery and Resilience Plan Greece 2.0 funded by the European Union under the NextGenerationEU Program.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Poulis, A., Tsalapati, E., Koubarakis, M. (2025). Transformers in the Service of Description Logic-Based Contexts. In: Alam, M., Rospocher, M., van Erp, M., Hollink, L., Gesese, G.A. (eds) Knowledge Engineering and Knowledge Management. EKAW 2024. Lecture Notes in Computer Science, vol. 15370. Springer, Cham. https://doi.org/10.1007/978-3-031-77792-9_20
DOI: https://doi.org/10.1007/978-3-031-77792-9_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-77791-2
Online ISBN: 978-3-031-77792-9