Efficient order dependency detection

  • Regular Paper
The VLDB Journal

Abstract

Order dependencies (ODs) describe a relationship of order between lists of attributes in a relational table. ODs help in understanding the semantics of datasets and of the applications that produce them. In query optimization, they can suggest query rewrites. In addition, the existence of an OD in a table can hint at which integrity constraints are valid for the domain of the data at hand. This work is the first to describe the discovery problem for order dependencies in a principled manner: it characterizes the search space, develops and proves pruning rules, and presents the algorithm Order, which finds all order dependencies in a given table. Order traverses the lattice of permutations of attributes in a level-wise, bottom-up manner. In a comprehensive evaluation, we show that it is efficient even on a variety of large datasets.
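To make the notion of an OD concrete, the following minimal Python sketch checks whether one attribute list orders another on a small table, and enumerates candidate attribute lists level by level. It assumes the lexicographic definition commonly used in the OD literature (whenever a tuple precedes or ties with another on the left-hand list, it must also precede or tie with it on the right-hand list), which this abstract does not spell out. The names `holds_od`, `attribute_lists`, and the sample table are hypothetical, and the sketch is not the Order algorithm: it applies none of the pruning rules the paper develops.

```python
from itertools import permutations
from typing import Any, Dict, Iterator, List, Sequence, Tuple

Row = Dict[str, Any]


def project(row: Row, attrs: Sequence[str]) -> Tuple[Any, ...]:
    """Project a row onto an attribute list, preserving the list order."""
    return tuple(row[a] for a in attrs)


def holds_od(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if sorting by `lhs` also sorts the rows by `rhs`.

    Assumed definition: for all tuples s, t: s <= t on lhs implies s <= t on rhs,
    with lists compared lexicographically.
    """
    ordered = sorted(rows, key=lambda r: (project(r, lhs), project(r, rhs)))
    for prev, cur in zip(ordered, ordered[1:]):
        px, cx = project(prev, lhs), project(cur, lhs)
        py, cy = project(prev, rhs), project(cur, rhs)
        if px == cx and py != cy:
            return False  # tie on lhs but different rhs values (a "split")
        if py > cy:
            return False  # lhs ascends while rhs descends (a "swap")
    return True


def attribute_lists(attributes: Sequence[str], max_level: int) -> Iterator[Tuple[str, ...]]:
    """Yield attribute lists bottom-up: all lists of length 1, then length 2, ..."""
    for level in range(1, max_level + 1):
        yield from permutations(attributes, level)


# Hypothetical sample table, for illustration only.
rows = [
    {"month": 1, "quarter": 1, "sales": 30},
    {"month": 4, "quarter": 2, "sales": 10},
    {"month": 5, "quarter": 2, "sales": 20},
    {"month": 11, "quarter": 4, "sales": 5},
]

month_orders_quarter = holds_od(rows, ["month"], ["quarter"])
quarter_orders_month = holds_od(rows, ["quarter"], ["month"])
```

On this sample, `month` orders `quarter`, but `quarter` does not order `month`, because months 4 and 5 tie on the quarter. A naive discovery procedure would run `holds_od` on every pair of lists produced by `attribute_lists`; the number of such lists grows factorially with the number of attributes, which is why pruning matters.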

Notes

  1. Figures 7 and 8 do not analogously show the number of ODs; there are too few, namely 3 and 0, respectively.

Acknowledgments

We thank Ziawasch Abedjan for his numerous helpful comments, which improved this work.

Author information

Corresponding author

Correspondence to Felix Naumann.

About this article

Cite this article

Langer, P., Naumann, F. Efficient order dependency detection. The VLDB Journal 25, 223–241 (2016). https://doi.org/10.1007/s00778-015-0412-3

  • DOI: https://doi.org/10.1007/s00778-015-0412-3
