Abstract
Retrieving information from relational databases using a natural language query is a challenging task. Typically, the natural language query is translated into an approximate SQL query or another formal query language. However, this translation requires knowledge of the database structure, semantic relationships and natural language constructs, and it must also handle ambiguities caused by overlapping column names and column values. We present a machine-learning-based natural language search system that queries databases without requiring any knowledge of Structured Query Language (SQL) for the underlying database. The proposed system, Cascaded Conditional Random Fields, is an extension of Conditional Random Fields, an undirected graphical model. Unlike traditional Conditional Random Field models, it uses efficient labelling schemes to improve the quality of search results. The system uses text indexing techniques as well as database constraint relationships to identify hidden semantic relationships present in the data. It is implemented and evaluated on two real-life datasets.
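To make the idea of cascaded labelling concrete, the following is a minimal sketch, not the authors' implementation. It assumes a two-stage cascade in which a first linear-chain model tags each query token as a column mention, a value mention, or other, and a second model uses those tags as features to assign each token to a column. The toy schema (COLUMN_NAMES, VALUE_INDEX), the label sets, and the hand-set scores are all hypothetical; a real CRF would learn its feature weights from training data.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical toy schema, standing in for a text index over the database.
COLUMN_NAMES = {"salary", "city", "name"}
VALUE_INDEX: Dict[str, str] = {"london": "city", "alice": "name"}

# Score function signature: (tokens, position, previous_label, label) -> score
ScoreFn = Callable[[Sequence[str], int, str, str], float]


def viterbi(tokens: Sequence[str], labels: List[str], score: ScoreFn) -> List[str]:
    """Return the highest-scoring label sequence under a linear-chain model."""
    if not tokens:
        return []
    # best[i][y]: best score of any labelling of tokens[:i + 1] that ends in y
    best = [{y: score(tokens, 0, "START", y) for y in labels}]
    back = []
    for i in range(1, len(tokens)):
        row, ptr = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + score(tokens, i, p, y))
            row[y] = best[-1][prev] + score(tokens, i, prev, y)
            ptr[y] = prev
        best.append(row)
        back.append(ptr)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def stage1_score(tokens: Sequence[str], i: int, prev: str, y: str) -> float:
    """Stage 1 (assumed): is the token a column mention, a value, or other?"""
    tok = tokens[i].lower()
    s = 0.0
    if y == "COL" and tok in COLUMN_NAMES:
        s += 2.0
    if y == "VAL" and tok in VALUE_INDEX:
        s += 2.0
    if y == "O" and tok not in COLUMN_NAMES and tok not in VALUE_INDEX:
        s += 1.0
    if prev == "COL" and y == "VAL":  # transition feature: "city London"
        s += 0.5
    return s


def make_stage2_score(stage1_labels: Sequence[str]) -> ScoreFn:
    """Stage 2 (assumed): map each token to a column, using stage-1 labels."""
    def score(tokens: Sequence[str], i: int, prev: str, y: str) -> float:
        tok = tokens[i].lower()
        kind = stage1_labels[i]
        if kind == "COL" and y == tok:
            return 2.0
        if kind == "VAL" and VALUE_INDEX.get(tok) == y:
            return 2.0
        if kind == "O" and y == "NONE":
            return 1.0
        return 0.0
    return score


def label_query(query: str):
    """Run both stages and return (stage-1 labels, stage-2 labels)."""
    tokens = query.split()
    first = viterbi(tokens, ["COL", "VAL", "O"], stage1_score)
    second = viterbi(tokens, sorted(COLUMN_NAMES) + ["NONE"], make_stage2_score(first))
    return first, second


stage1, stage2 = label_query("show salary of employees in London")
print(stage1)
print(stage2)
```

In this sketch the second stage never sees the raw lookup tables in isolation; it conditions on the first stage's decisions, which is the sense of "cascaded" assumed here. The value "London" is resolved to the hypothetical column "city" without the user naming that column.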
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Indukuri, K.V., Krishnamoorthy, S., Krishna, P.R. (2010). Natural Language Querying over Databases Using Cascaded CRFs. In: Catania, B., Ivanović, M., Thalheim, B. (eds) Advances in Databases and Information Systems. ADBIS 2010. Lecture Notes in Computer Science, vol 6295. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15576-5_47
DOI: https://doi.org/10.1007/978-3-642-15576-5_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15575-8
Online ISBN: 978-3-642-15576-5
eBook Packages: Computer Science (R0)