A Natural Language and Interactive End-to-End Querying and Reporting System

Authors: Salil Rajeev Joshi, Bharath Venkatesh, Dawn Thomas, Yue Jiao, Shourya Roy (all American Express AI Labs)

CoDS COMAD 2020: Proceedings of the 7th ACM IKDD CoDS and 25th COMAD, January 2020, Pages 261–267. https://doi.org/10.1145/3371158.3371198

Published: 15 January 2020

ABSTRACT

Natural language query understanding for unstructured textual sources has seen significant progress over the last couple of decades. For structured data, while the ecosystem has evolved with regard to data storage and retrieval mechanisms, the query language has remained predominantly SQL (or SQL-like). To make querying structured data more natural, recent research has emphasized Natural Language Interface to Database (NLIDB) systems. Building on the rise of deep learning, state-of-the-art NLIDB solutions on large, standard parallel benchmarks (viz., WikiSQL and Spider) rely primarily on attention-based sequence-to-sequence models.

Building industry-grade NLIDB solutions that make the big data ecosystem accessible through a truly natural and unstructured querying mechanism presents several challenges. These range from the lack of parallel corpora, diversity in the underlying data schemas, and wide variability in the nature of queries to context and dialog management in interactive systems. In this paper, we present Query Enterprise Data (QED), an end-to-end system aimed at making enterprise descriptive analytics and reporting easier and more natural. We describe in detail how we addressed the challenges above, present novel features such as handling incomplete queries incrementally, and highlight the role of an assistive user interface in providing a better user experience. Finally, we conclude with observations and lessons learnt from transferring a research solution into an industry-grade practical deployment.

Published in
CoDS COMAD 2020: Proceedings of the 7th ACM IKDD CoDS and 25th COMAD
January 2020
399 pages
ISBN: 9781450377386
DOI: 10.1145/3371158
General Chairs: Vasudeva Varma, Subbarao Kambhampati
Program Chairs: Arnab Bhattacharya, Sriraam Natarajan
Publications Chair: Rishiraj Saha Roy
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Information Retrieval
NL2SQL
NLIDB
Natural Language Understanding
SQL
Semantic Parsing
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
CoDS COMAD 2020 Paper Acceptance Rate: 78 of 275 submissions, 28%. Overall Acceptance Rate: 197 of 680 submissions, 29%.