BFASTDC: A Bitwise Algorithm for Mining Denial Constraints

  • Conference paper
Database and Expert Systems Applications (DEXA 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11029)

Abstract

Integrity constraints (ICs) support many data management tasks. However, some types of ICs can express semantic rules that other ICs cannot, and vice versa. Denial constraints (DCs) address this expressiveness issue because they generalize important types of ICs, such as functional dependencies (FDs), conditional FDs, and check constraints. Automatic DC discovery is therefore essential to avoid the expensive and error-prone task of designing DCs manually. FASTDC is an algorithm that serves this purpose, but it is highly sensitive to the number of records in the dataset. This paper presents BFASTDC, a bitwise version of FASTDC that uses logical operations to form the auxiliary data structures from which DCs are mined. Our experimental study shows that BFASTDC can be more than one order of magnitude faster than FASTDC.
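To make the idea of a DC and of bitwise predicate evaluation concrete, the following is a minimal sketch and not the BFASTDC algorithm itself. It assumes a toy relation with hypothetical columns `zip` and `city`, maps every ordered pair of distinct tuples to one bit position, and records in a Python integer which pairs satisfy each predicate. A DC such as ¬(t.zip = t'.zip ∧ t.city ≠ t'.city), which expresses the FD zip → city, is then checked by AND-ing the predicate bit vectors. The helper names `predicate_bits` and `violating_pairs` are illustrative and do not come from the paper.

```python
from itertools import permutations
import operator

# Toy relation; the column names and values are illustrative only.
rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "New York"},
    {"zip": "60601", "city": "Chicago"},
    {"zip": "60601", "city": "Boston"},  # conflicts with the previous row
]

# All ordered pairs of distinct tuples; pair number k is mapped to bit k.
pairs = list(permutations(range(len(rows)), 2))


def predicate_bits(column, op):
    """Return an int whose bit k is 1 iff op(rows[i][column], rows[j][column])
    holds for pair k = (i, j)."""
    bits = 0
    for k, (i, j) in enumerate(pairs):
        if op(rows[i][column], rows[j][column]):
            bits |= 1 << k
    return bits


def violating_pairs(predicate_bitsets):
    """AND the predicate bitsets. A set bit marks a tuple pair that satisfies
    every predicate at once, which is exactly a violation of the DC."""
    acc = (1 << len(pairs)) - 1  # start with all bits set
    for b in predicate_bitsets:
        acc &= b
    return [pairs[k] for k in range(len(pairs)) if (acc >> k) & 1]


same_zip = predicate_bits("zip", operator.eq)
diff_city = predicate_bits("city", operator.ne)

# DC: not (t.zip = t'.zip and t.city != t'.city), i.e. the FD zip -> city.
violations = violating_pairs([same_zip, diff_city])
print(violations)
```

In this sketch the DC holds on the relation exactly when `violations` is empty. Here the last two rows share a zip code but differ in city, so the DC is violated by that pair in both orders.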

Notes

  1. We have adapted binary search for procedure \(\textsc{Predecessors}(A_j,k)\).

  2. Available at: http://da.qcri.org/dc/.

Author information

Corresponding author

Correspondence to Eduardo H. M. Pena.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Pena, E.H.M., de Almeida, E.C. (2018). BFASTDC: A Bitwise Algorithm for Mining Denial Constraints. In: Hartmann, S., Ma, H., Hameurlain, A., Pernul, G., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2018. Lecture Notes in Computer Science, vol 11029. Springer, Cham. https://doi.org/10.1007/978-3-319-98809-2_4

  • DOI: https://doi.org/10.1007/978-3-319-98809-2_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98808-5

  • Online ISBN: 978-3-319-98809-2

  • eBook Packages: Computer Science, Computer Science (R0)
