An Integrated Interactive Framework for Natural Language to SQL Translation

  • Conference paper
Web Information Systems Engineering – WISE 2023 (WISE 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14306)

Abstract

Numerous web applications rely on databases, yet the traditional database interface often proves inconvenient for effective data utilization. It is crucial to address the significant demand from a vast number of end users who seek the ability to input their requirements and obtain query results effortlessly. Natural Language (NL) Interfaces to Databases (NLIDBs) with interactive query mechanisms make databases accessible to end users and simultaneously retain user confidence in the results. This paper proposes an approach called IKnow-SQL for building interactive NLIDBs. IKnow-SQL introduces a unified framework for translation models to improve accuracy and increase interactivity. Specifically, IKnow-SQL first employs an underlying translation model to parse the semantics of a given NL query. By evaluating the model behavior, IKnow-SQL then recognizes the parts of the model output that may require human intervention. Next, IKnow-SQL presents clarifying questions to solicit and memorize user feedback until a polished result is obtained. Extensive experiments are performed to study IKnow-SQL on the public benchmark. The results show that the translation models can be effectively improved using IKnow-SQL with less user feedback.
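
The loop outlined in the abstract (translate, flag the parts that may need human intervention, ask a clarifying question, remember the answer) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the IKnow-SQL implementation: the `Prediction` record, the fixed confidence threshold, the `translate` and `ask_user` callbacks, and the dictionary used as feedback memory are all hypothetical names introduced here, since the abstract does not say how model behavior is evaluated.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Prediction:
    """One decoded SQL component with the model's confidence (hypothetical)."""
    slot: str          # e.g. "select_column"
    value: str         # e.g. "name"
    confidence: float  # probability the model assigned to `value`


def uncertain_slots(preds: List[Prediction], threshold: float = 0.8) -> List[Prediction]:
    """Flag components whose confidence falls below an assumed threshold."""
    return [p for p in preds if p.confidence < threshold]


def interactive_translate(
    question: str,
    translate: Callable[[str], List[Prediction]],
    ask_user: Callable[[Prediction], str],
    memory: Dict[Tuple[str, str], str],
) -> Dict[str, str]:
    """Translate a question, then repair low-confidence components.

    `memory` maps (question, slot) to a value the user already confirmed,
    so the same clarifying question is not asked twice.
    """
    preds = translate(question)
    result = {p.slot: p.value for p in preds}
    for p in uncertain_slots(preds):
        key = (question, p.slot)
        if key not in memory:
            memory[key] = ask_user(p)  # clarifying question to the user
        result[p.slot] = memory[key]
    return result
```

In this sketch a second call with the same question reuses the stored answer instead of asking again, which is one simple way to read "memorize user feedback".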

Notes

  1. In accordance with [11], model predictions are the weighted sum of target embeddings over output probabilities.

  2. We use the terms exact-match accuracy and translation accuracy interchangeably.
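
Note 1 can be made concrete with a small numeric sketch. The function below computes a probability-weighted sum of target embeddings, which is our reading of the note; the function name, the plain-list representation, and the toy vectors are assumptions for illustration and are not taken from the paper or from [11].

```python
from typing import List, Sequence


def expected_embedding(
    probs: Sequence[float],
    embeddings: Sequence[Sequence[float]],
) -> List[float]:
    """Return sum_i probs[i] * embeddings[i], one weight per target embedding."""
    if len(probs) != len(embeddings):
        raise ValueError("one probability per target embedding is required")
    dim = len(embeddings[0])
    out = [0.0] * dim
    for p, emb in zip(probs, embeddings):
        for j in range(dim):
            out[j] += p * emb[j]
    return out


# Toy example: three target tokens with 2-dimensional embeddings.
probs = [0.5, 0.25, 0.25]
embeddings = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
print(expected_embedding(probs, embeddings))  # [1.0, 0.75]
```

Unlike picking the single most probable token, this "soft" prediction varies smoothly with the output probabilities.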

References

  1. Androutsopoulos, I., et al.: Natural language interfaces to databases - an introduction. Nat. Lang. Eng. 1(1), 29–81 (1995)

  2. Baik, C., et al.: Bridging the semantic gap with SQL query logs in natural language interfaces to databases. In: ICDE (2019)

  3. Bogin, B., et al.: Representing schema structure with graph neural networks for text-to-SQL parsing. In: ACL (2019)

  4. Cao, R., et al.: LGESQL: line graph enhanced text-to-SQL model with mixed local and non-local relations. In: ACL (2021)

  5. Castaldo, N., Daniel, F., Matera, M., Zaccaria, V.: Conversational data exploration. In: Bakaev, M., Frasincar, F., Ko, I.-Y. (eds.) ICWE 2019. LNCS, vol. 11496, pp. 490–497. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-19274-7_34

  6. Chaurasia, S., et al.: Dialog for language to code. In: IJCNLP, pp. 175–180 (2017)

  7. Desolda, G., et al.: Rapid prototyping of chatbots for data exploration. In: BCNC, pp. 5–10 (2021)

  8. Fan, Y., et al.: Gar: a generate-and-rank approach for natural language to SQL translation. In: ICDE (2023)

  9. Feng, L., Lu, H.: Integrating database and world wide web technologies. WWWJ 1(2), 73–86 (1998)

  10. Gal, Y., Ghahramani, Z.: Dropout as a Bayesian approximation: representing model uncertainty in deep learning. In: ICML, vol. 48, pp. 1050–1059 (2016)

  11. Goyal, K., et al.: Differentiable scheduled sampling for credit assignment. In: ACL, pp. 366–371 (2017)

  12. Grave, E., et al.: Improving neural language models with a continuous cache. In: ICLR (2017)

  13. Guo, J., et al.: Towards complex text-to-SQL in cross-domain database with intermediate representation. In: ACL (2019)

  14. Gur, I., et al.: DialSQL: dialogue based structured query generation. In: ACL (2018)

  15. He, H., et al.: Towards deeper understanding of the search interfaces of the deep web. WWWJ 2, 133–155 (2007)

  16. He, L., et al.: Human-in-the-loop parsing. In: EMNLP, pp. 2337–2342 (2016)

  17. Hendrycks, D., Gimpel, K.: A baseline for detecting misclassified and out-of-distribution examples in neural networks. In: ICLR (2017)

  18. Li, F., Jagadish, H.V.: Constructing an interactive natural language interface for relational databases. PVLDB 8(1), 73–84 (2014)

  19. Li, J., et al.: Graphix-T5: mixing pre-trained transformers with graph-aware layers for text-to-SQL parsing. In: AAAI, pp. 13076–13084 (2023)

  20. Lin, X.V., et al.: Bridging textual and tabular data for cross-domain text-to-SQL semantic parsing. In: EMNLP (2020)

  21. Nakatsuji, M., et al.: Knowledge-aware response selection with semantics underlying multi-turn open-domain conversations. In: World Wide Web, pp. 1–16 (2023)

  22. Niehues, J., et al.: Modeling confidence in sequence-to-sequence models. In: INLG, pp. 575–583 (2019)

  23. OpenAI: GPT-4 technical report. CoRR (2023)

  24. Pourreza, M., Rafiei, D.: DIN-SQL: decomposed in-context learning of text-to-SQL with self-correction. CoRR (2023)

  25. Saha, D., et al.: ATHENA: an ontology-driven system for natural language querying over relational data stores. PVLDB 9(12), 1209–1220 (2016)

  26. Scholak, T., et al.: PICARD: parsing incrementally for constrained auto-regressive decoding from language models. In: EMNLP (2021)

  27. Sen, J., et al.: ATHENA++: natural language querying for complex nested SQL queries. PVLDB 13(11), 2747–2759 (2020)

  28. Shi, P., et al.: Learning contextual representations for semantic parsing with generation-augmented pre-training. In: AAAI (2021)

  29. Simitsis, A., et al.: Précis: from unstructured keywords as queries to structured databases as answers. VLDB J. 17(1), 117–149 (2008)

  30. Su, Y., et al.: Natural language interfaces with fine-grained user interaction: a case study on web APIs. In: SIGIR, pp. 855–864 (2018)

  31. Sukhbaatar, S., et al.: End-to-end memory networks. In: NeurIPS, pp. 2440–2448 (2015)

  32. Touvron, H., et al.: LLaMA: open and efficient foundation language models. CoRR (2023)

  33. Wang, B., et al.: RAT-SQL: relation-aware schema encoding and linking for text-to-SQL parsers. In: ACL (2020)

  34. Xu, X., et al.: SQLNet: generating structured queries from natural language without reinforcement learning. CoRR (2017)

  35. Yao, Z., et al.: Interactive semantic parsing for if-then recipes via hierarchical reinforcement learning. In: AAAI, pp. 2547–2554 (2019)

  36. Yao, Z., et al.: Model-based interactive semantic parsing: a unified framework and a text-to-SQL case study. In: EMNLP, pp. 5446–5457 (2019)

  37. Yu, T., et al.: TypeSQL: knowledge-based type-aware neural text-to-SQL generation. In: NAACL (2018)

  38. Yu, T., et al.: CoSQL: a conversational text-to-SQL challenge towards cross-domain natural language interfaces to databases. In: EMNLP (2019)

  39. Zettlemoyer, L.S., Collins, M.: Learning to map sentences to logical form: structured classification with probabilistic categorial grammars. In: UAI (2005)

  40. Zhang, R., et al.: Editing-based SQL query generation for cross-domain context-dependent questions. In: EMNLP, pp. 5337–5348 (2019)

Author information

Corresponding author

Correspondence to Yuankai Fan.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Fan, Y. et al. (2023). An Integrated Interactive Framework for Natural Language to SQL Translation. In: Zhang, F., Wang, H., Barhamgi, M., Chen, L., Zhou, R. (eds) Web Information Systems Engineering – WISE 2023. WISE 2023. Lecture Notes in Computer Science, vol 14306. Springer, Singapore. https://doi.org/10.1007/978-981-99-7254-8_50

  • DOI: https://doi.org/10.1007/978-981-99-7254-8_50

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7253-1

  • Online ISBN: 978-981-99-7254-8

  • eBook Packages: Computer Science, Computer Science (R0)
