Abstract
Given an output table T that is the result of some unknown query on a database D, Query Reverse Engineering (QRE) computes one or more target queries Q such that the result of Q on D is T. A fundamental challenge in QRE is how to compute target queries efficiently given the large search space. In this paper, we focus on the QRE problem for PJ\(^+\) queries, a class that is more expressive than project-join (PJ) queries because it supports antijoins as well as inner joins. To enhance efficiency, we propose a novel query-centric approach consisting of table partitioning, precomputation, and indexing techniques. Our experimental study demonstrates that our approach significantly outperforms the state-of-the-art solution, by an average factor of 120.
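To make the problem statement concrete, the following is a minimal sketch of the QRE acceptance condition (the result of Q on D is T) for one inner-join candidate and one antijoin candidate. It is not the paper's algorithm: the toy tables, the column names, the helper functions, and the use of set semantics for comparing results are all assumptions made for illustration.

```python
# Toy database D (hypothetical): each table is a list of rows stored as dicts.
D = {
    "customer": [
        {"cid": 1, "name": "Ann"},
        {"cid": 2, "name": "Bob"},
        {"cid": 3, "name": "Cy"},
    ],
    "orders": [
        {"oid": 10, "cid": 1},
        {"oid": 11, "cid": 1},
    ],
}


def inner_join(left, right, key):
    """Rows of left combined with every right row that agrees on key."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def antijoin(left, right, key):
    """Rows of left that have no matching row in right on key."""
    right_keys = {r[key] for r in right}
    return [l for l in left if l[key] not in right_keys]


def project(rows, cols):
    """Projection under set semantics (an assumption of this sketch)."""
    return {tuple(row[c] for c in cols) for row in rows}


def q_join(db):
    # PJ candidate: names of customers with at least one order.
    return project(inner_join(db["customer"], db["orders"], "cid"), ["name"])


def q_antijoin(db):
    # PJ+ candidate: names of customers with no orders.
    return project(antijoin(db["customer"], db["orders"], "cid"), ["name"])


def is_target(query, db, output_table):
    """A candidate is a target query if its result on db equals the output."""
    return query(db) == output_table


# Given output table T, produced by some unknown query on D.
T = {("Bob",), ("Cy",)}

candidates = {"join": q_join, "antijoin": q_antijoin}
matches = [name for name, q in candidates.items() if is_target(q, D, T)]
print(matches)  # ['antijoin']
```

Here only the antijoin candidate reproduces T, so no plain project-join query over these two tables would be accepted. Enumerating and checking candidates one by one as above is exactly what becomes expensive as the search space grows.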
Notes
1. In contrast, our approach took 3s to reverse engineer this query (Sect. 6).
2. Although our experiments focus on queries with foreign-key joins (similar to all competing approaches [8, 20]), our approach can be easily extended to reverse engineer PJ queries with non-foreign-key join predicates. The main extension is to explicitly annotate the database schema graph with additional join edges.
3. Note that this example is not related to the example in Fig. 2.
4. We did not compare against FastQRE [8] for two reasons. First, FastQRE supports only CPJ queries, which are even more restrictive than PJ queries. Second, the code for FastQRE is not available, and its non-trivial implementation requires modifying a database system engine to use its query optimizer's cost model for ranking candidate queries.
5. The code for STAR was based on a version obtained from the authors of [20].
References
Abouzied, A., Angluin, D., Papadimitriou, C., Hellerstein, J.M., Silberschatz, A.: Learning and verifying quantified Boolean queries by example. In: PODS (2013)
Arenas, M., Diaz, G.I.: The exact complexity of the first-order logic definability problem. ACM TODS 41(2), 13:1–13:14 (2016)
Bonifati, A., Ciucanu, R., Staworko, S.: Learning join queries from user examples. ACM TODS 40, 1–38 (2016)
Das Sarma, A., Parameswaran, A., Garcia-Molina, H., Widom, J.: Synthesizing view definitions from data. In: ICDT (2010)
Gao, Y., Liu, Q., Chen, G., Zheng, B., Zhou, L.: Answering why-not questions on reverse top-k queries. PVLDB 8, 738–749 (2015)
He, Z., Lo, E.: Answering why-not questions on top-k queries. In: ICDE (2012)
He, Z., Lo, E.: Answering why-not questions on top-k queries. TKDE 26, 1300–1315 (2014)
Kalashnikov, D.V., Lakshmanan, L.V., Srivastava, D.: FastQRE: fast query reverse engineering. In: SIGMOD (2018)
Li, H., Chan, C.Y., Maier, D.: Query from examples: an iterative, data-driven approach to query construction. PVLDB 8, 2158–2169 (2015)
Li, M., Chan, C.Y.: Efficient query reverse engineering using table fragments. Technical report (2019)
Liu, Q., Gao, Y., Chen, G., Zheng, B., Zhou, L.: Answering why-not and why questions on reverse top-k queries. VLDB J. 25, 867–892 (2016)
Luo, Y., Fletcher, G.H.L., Hidders, J., Wu, Y., Bra, P.D.: External memory k-bisimulation reduction of big graphs. In: ACM CIKM, pp. 919–928 (2013)
Panev, K., Michel, S., Milchevski, E., Pal, K.: Exploring databases via reverse engineering ranking queries with paleo. PVLDB 13, 1525–1528 (2016)
Psallidas, F., Ding, B., Chakrabarti, K., Chaudhuri, S.: S4: top-k spreadsheet-style search for query discovery. In: SIGMOD (2015)
Shen, Y., Chakrabarti, K., Chaudhuri, S., Ding, B., Novik, L.: Discovering queries based on example tuples. In: SIGMOD (2014)
Tan, W.C., Zhang, M., Elmeleegy, H., Srivastava, D.: Reverse engineering aggregation queries. PVLDB 10, 1394–1405 (2017)
Tran, Q.T., Chan, C.Y.: How to conquer why-not questions. In: SIGMOD (2010)
Tran, Q.T., Chan, C.Y., Parthasarathy, S.: Query by output. In: SIGMOD (2009)
Weiss, Y.Y., Cohen, S.: Reverse engineering SPJ-queries from examples. In: PODS (2017)
Zhang, M., Elmeleegy, H., Procopiuc, C.M., Srivastava, D.: Reverse engineering complex join queries. In: SIGMOD (2013)
Acknowledgements
We would like to thank Meihui Zhang for sharing the code of STAR. This research is supported in part by MOE Grant R-252-000-A53-114.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, M., Chan, CY. (2020). Efficient Query Reverse Engineering Using Table Fragments. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12114. Springer, Cham. https://doi.org/10.1007/978-3-030-59419-0_25
DOI: https://doi.org/10.1007/978-3-030-59419-0_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59418-3
Online ISBN: 978-3-030-59419-0
eBook Packages: Computer Science (R0)