Abstract
The goal of the Text-to-SQL (NL2SQL) task is to map natural language queries to equivalent structured query language statements. On the WikiSQL dataset, state-of-the-art models decouple the NL2SQL task into subtasks and build a dedicated decoder for each subtask. This approach has two drawbacks: the resulting model is overly complex, and its ability to learn the dependencies between the subtasks is limited. To address these problems, this paper introduces the sharing mechanism of multi-task learning into the NL2SQL task by letting the different subtasks share a single decoder. Sharing one decoder across subtasks reduces the complexity of the model and, at the same time, allows the subtasks to share knowledge during training, so the model can better learn the dependencies between them. We also design a re-weighted loss to balance the differing complexity of the SELECT and WHERE clauses. We evaluate the proposed method on the WikiSQL dataset; the experimental results show that the model's accuracy surpasses the state of the art on WikiSQL without execution-guided decoding.
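The re-weighted loss described above can be illustrated with a minimal sketch. The subtask names, probability layout, and weight values below are illustrative assumptions, not the paper's actual implementation; the idea shown is simply a weighted sum of per-subtask cross-entropy terms, with larger weights assigned to the harder WHERE subtasks:

```python
import math

# Hypothetical subtask decomposition of a WikiSQL query: SELECT column,
# SELECT aggregation, and three WHERE-clause subtasks (names assumed).
SUBTASKS = ["sel_col", "sel_agg", "where_col", "where_op", "where_val"]

def cross_entropy(probs, gold_idx):
    """Negative log-likelihood of the gold label under predicted probs."""
    return -math.log(max(probs[gold_idx], 1e-12))

def reweighted_loss(subtask_probs, gold, weights):
    """Weighted sum of per-subtask cross-entropy losses.

    A shared decoder produces predictions for every subtask; the weights
    let training emphasize the more complex WHERE subtasks relative to
    the simpler SELECT ones (weight values here are illustrative).
    """
    return sum(
        weights[t] * cross_entropy(subtask_probs[t], gold[t])
        for t in SUBTASKS
    )

# Example usage with toy predictions over 3 candidate labels per subtask.
probs = {t: [0.7, 0.2, 0.1] for t in SUBTASKS}
gold = {t: 0 for t in SUBTASKS}
weights = {"sel_col": 1.0, "sel_agg": 1.0,
           "where_col": 2.0, "where_op": 2.0, "where_val": 2.0}
loss = reweighted_loss(probs, gold, weights)
```

Because all five toy subtasks predict the gold label with probability 0.7, the combined loss is just the weight total (1 + 1 + 2 + 2 + 2 = 8) times −log 0.7.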
Cite this article
Wei, C., Huang, S. & Li, R. Enhance text-to-SQL model performance with information sharing and reweight loss. Multimed Tools Appl 81, 15205–15217 (2022). https://doi.org/10.1007/s11042-022-12573-0