Skip to main content

Efficient Query Reverse Engineering Using Table Fragments

  • Conference paper
  • First Online:
Database Systems for Advanced Applications (DASFAA 2020)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 12114))

Included in the following conference series:

  • 2253 Accesses

Abstract

Given an output table T that is the result of some unknown query on a database D, Query Reverse Engineering (QRE) computes one or more target query Q such that the result of Q on D is T. A fundamental challenge in QRE is how to efficiently compute target queries given its large search space. In this paper, we focus on the QRE problem for PJ\(^+\) queries, which is a more expressive class of queries than project-join queries by supporting antijoins as well as inner joins. To enhance efficiency, we propose a novel query-centric approach consisting of table partitioning, precomputation, and indexing techniques. Our experimental study demonstrates that our approach significantly outperforms the state-of-the-art solution by an average improvement factor of 120.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    In contrast, our approach took 3s to reverse engineer this query (Sect. 6).

  2. 2.

    Although our experiments focus on queries with foreign-key joins (similar to all competing approaches  [8, 20]), our approach can be easily extended to reverse engineer PJ queries with non-foreign key join predicates. The main extension is to explicitly annotate the database schema graph with additional join edges.

  3. 3.

    Note that this example is not related to the example in Fig. 2.

  4. 4.

    We did not compare against FastQRE  [8] for two reasons. First, FastQRE supports only CPJ queries which are even more restrictive than PJ queries. Second, the code for FastQRE is not available, and its non-trivial implementation requires modification to a database system engine to utilize its query optimizer’s cost model for ranking candidate queries.

  5. 5.

    The code for STAR was based on a version obtained from the authors of [20].

References

  1. Abouzied, A., Angluin, D., Papadimitriou, C., Hellerstein, J.M., Silberschatz, A.: Learning and verifying quantified Boolean queries by example. In: PODS (2013)

    Google Scholar 

  2. Arenas, M., Diaz, G.I.: The exact complexity of the first-order logic definability problem. ACM TODS 41(2), 13:1–13:14 (2016)

    Article  MathSciNet  Google Scholar 

  3. Bonifati, A., Ciucanu, R., Staworko, S.: Learning join queries from user examples. ACM TODS 40, 1–38 (2016)

    Article  MathSciNet  Google Scholar 

  4. Das Sarma, A., Parameswaran, A., Garcia-Molina, H., Widom, J.: Synthesizing view definitions from data. In: ICDT (2010)

    Google Scholar 

  5. Gao, Y., Liu, Q., Chen, G., Zheng, B., Zhou, L.: Answering why-not questions on reverse top-k queries. PVLDB 8, 738–749 (2015)

    Google Scholar 

  6. He, Z., Lo, E.: Answering why-not questions on top-k queries. In: ICDE (2012)

    Google Scholar 

  7. He, Z., Lo, E.: Answering why-not questions on top-k queries. TKDE 26, 1300–1315 (2014)

    Google Scholar 

  8. Kalashnikov, D.V., Lakshmanan, L.V., Srivastava, D.: FastQRE: fast query reverse engineering. In: SIGMOD (2018)

    Google Scholar 

  9. Li, H., Chan, C.Y., Maier, D.: Query from examples: an iterative, data-driven approach to query construction. PVLDB 8, 2158–2169 (2015)

    Google Scholar 

  10. Li, M., Chan, C.Y.: Efficient query reverse engineering using table fragments. Technical report (2019)

    Google Scholar 

  11. Liu, Q., Gao, Y., Chen, G., Zheng, B., Zhou, L.: Answering why-not and why questions on reverse top-k queries. VLDB J. 25, 867–892 (2016)

    Article  Google Scholar 

  12. Luo, Y., Fletcher, G.H.L., Hidders, J., Wu, Y., Bra, P.D.: External memory k-bisimulation reduction of big graphs. In: ACM CIKM, pp. 919–928 (2013)

    Google Scholar 

  13. Panev, K., Michel, S., Milchevski, E., Pal, K.: Exploring databases via reverse engineering ranking queries with paleo. PVLDB 13, 1525–1528 (2016)

    Google Scholar 

  14. Psallidas, F., Ding, B., Chakrabarti, K., Chaudhuri, S.: S4: top-k spreadsheet-style search for query discovery. In: SIGMOD (2015)

    Google Scholar 

  15. Shen, Y., Chakrabarti, K., Chaudhuri, S., Ding, B., Novik, L.: Discovering queries based on example tuples. In: SIGMOD (2014)

    Google Scholar 

  16. Tan, W.C., Zhang, M., Elmeleegy, H., Srivastava, D.: Reverse engineering aggregation queries. PVLDB 10, 1394–1405 (2017)

    Google Scholar 

  17. Tran, Q.T., Chan, C.Y.: How to conquer why-not questions. In: SIGMOD (2010)

    Google Scholar 

  18. Tran, Q.T., Chan, C.Y., Parthasarathy, S.: Query by output. In: SIGMOD (2009)

    Google Scholar 

  19. Weiss, Y.Y., Cohen, S.: Reverse engineering SPJ-queries from examples. In: PODS (2017)

    Google Scholar 

  20. Zhang, M., Elmeleegy, H., Procopiuc, C.M., Srivastava, D.: Reverse engineering complex join queries. In: SIGMOD (2013)

    Google Scholar 

Download references

Acknowledgements

We would like to thank Meihui Zhang for sharing the code of STAR. This research is supported in part by MOE Grant R-252-000-A53-114.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Chee-Yong Chan .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Li, M., Chan, CY. (2020). Efficient Query Reverse Engineering Using Table Fragments. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science(), vol 12114. Springer, Cham. https://doi.org/10.1007/978-3-030-59419-0_25

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-59419-0_25

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59418-3

  • Online ISBN: 978-3-030-59419-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics