Abstract
Even sophisticated Text-to-SQL methods still produce errors such as schema-linking errors, join errors, nested-query errors, and group-by errors. To mitigate these, it is crucial to filter out unnecessary tables and columns so that the language model focuses on the relevant ones. Previous methods either rank tables and columns by relevance or directly identify the necessary elements, but these approaches suffer from long training times, high GPT-4 token costs, or poor schema-linking performance. We propose a two-step schema linking method: first, generate an initial SQL query using the full database schema; then, extract the tables and columns that the query references to form a concise schema. Tested with Code Llama and GPT-4 on the Spider dataset, the method achieves the best performance among the mainstream methods compared, reducing errors and improving the efficiency of SQL generation.
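The second step of the method (reducing the full schema to the tables and columns named in an initial SQL query) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `extract_schema`, the dictionary layout of the schema, and the example database are all hypothetical, and the initial query is written by hand where the method would use a language model to generate it. The sketch matches identifiers by simple tokenization, so it assumes schema names do not also occur inside string literals in the query.

```python
import re
from typing import Dict, List


def extract_schema(sql: str, full_schema: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only the tables and columns of full_schema that are named in sql.

    Hypothetical helper for illustration; matching is case-insensitive and
    based on identifier tokens, not on a full SQL parse.
    """
    # Collect every identifier-like token in the query (keywords included).
    tokens = {t.lower() for t in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", sql)}

    concise: Dict[str, List[str]] = {}
    for table, columns in full_schema.items():
        if table.lower() not in tokens:
            continue  # table is never mentioned, so drop it entirely
        # Keep only the columns of this table that the query mentions.
        concise[table] = [c for c in columns if c.lower() in tokens]
    return concise


# Assumed example schema (table name -> column names), for illustration only.
full_schema = {
    "singer": ["singer_id", "name", "age"],
    "concert": ["concert_id", "concert_name", "year"],
    "singer_in_concert": ["concert_id", "singer_id"],
    "stadium": ["stadium_id", "name", "capacity"],
}

# Stand-in for the initial query that step one would generate from the full schema.
initial_sql = (
    "SELECT T1.name FROM singer AS T1 "
    "JOIN singer_in_concert AS T2 ON T1.singer_id = T2.singer_id "
    "WHERE T1.age > 30"
)

concise_schema = extract_schema(initial_sql, full_schema)
print(concise_schema)
```

In this example the concise schema keeps only `singer` and `singer_in_concert`, with the columns the query touches; `concert` and `stadium` are dropped. A second generation pass would then be prompted with this smaller schema. A production version would more plausibly use a real SQL parser to resolve aliases and ignore literals.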
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yang, S. et al. (2024). SQL-to-Schema Enhances Schema Linking in Text-to-SQL. In: Strauss, C., Amagasa, T., Manco, G., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2024. Lecture Notes in Computer Science, vol 14910. Springer, Cham. https://doi.org/10.1007/978-3-031-68309-1_11
DOI: https://doi.org/10.1007/978-3-031-68309-1_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-68308-4
Online ISBN: 978-3-031-68309-1
eBook Packages: Computer Science, Computer Science (R0)