Skip to main content
Log in

Possible Keys and Functional Dependencies

  • Original Article
  • Published:
Journal on Data Semantics

Abstract

We study the combined class of possible keys and functional dependencies over Codd tables under NOT NULL constraints. These constraints can express significant application semantics of data that is compliant with the industry standard SQL. Three major contributions are made. Firstly, the PTIME-complete implication problem is characterized axiomatically by a finite set of Horn rules, algorithmically by a linear time decision procedure, and logically by goal and definite clauses under S-3 logic. Secondly, we establish structural and computational properties of Armstrong tables for the combined class. This includes an algorithm that is guaranteed to compute an Armstrong table of size at most quadratic in that of a minimum-sized table. Thirdly, we conduct an empirical evaluation of a GUI-based implementation of this algorithm. We conclude that Armstrong tables are an effective tool for recognizing domain semantics and should be exploited as early as possible during the acquisition of possible keys and functional dependencies.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10

Similar content being viewed by others

Notes

  1. www.dropbox.com/s/hdh7nvcxl7ys3jo/SQLSampler.zip

References

  1. Alattar M, Sali A (2019) Keys in relational databases with nulls and bounded domains. In: Advances in databases and information systems—23rd European conference, ADBIS 2019, Bled, Slovenia, September 8–11, 2019, Proceedings. pp 33–50

  2. Alattar M, Sali A (2020) Functional dependencies in incomplete databases with limited domains. In: Foundations of information and knowledge systems: 11th international symposium, FoIKS 2020, Dortmund, Germany, February 17–21, 2020, Proceedings. pp 1–21

  3. Alattar M, Sali A (2020) Strongly possible keys for SQL. J Data Semant 9(2–3):85–99

    Article  Google Scholar 

  4. Alattar M, Sali A (2021) Toward an axiomatization of strongly possible functional dependencies. Vietnam J Comput Sci 8(1):133–151

    Article  Google Scholar 

  5. Armstrong WW (1974) Dependency structures of database relationships. Inf Process 74:580–583

    Google Scholar 

  6. Atzeni P, Morfuni N (1986) Functional dependencies and constraints on null values in database relations. Inf Control 70(1):1–31

    Article  MathSciNet  MATH  Google Scholar 

  7. Beeri C, Dowd M, Fagin R, Statman R (1984) On the structure of Armstrong relations for functional dependencies. J ACM 31(1):30–46

    Article  MathSciNet  MATH  Google Scholar 

  8. Börger E, Grädel E, Gurevich Y (1997) The classical decision problem. Perspectives in mathematical logic. Springer, Berlin

    Book  MATH  Google Scholar 

  9. Codd EF (1970) A relational model of data for large shared data banks. Commun ACM 13(6):377–387

    Article  MATH  Google Scholar 

  10. Codd EF (1979) Extending the database relational model to capture more meaning. ACM Trans Database Syst 4(4):397–434

    Article  Google Scholar 

  11. Date CJ (1982) Null values in database management. In: Proceedings of the second British national conference on databases (BNCOD-2), University of Bristol, UK, 7–9 July 1982. pp 147–166

  12. Date CJ, Darwen H (1997) A guide to SQL standard, 4th edn. Addison-Wesley, Boston

    Google Scholar 

  13. Demetrovics J, Füredi Z, Katona GOH (1985) Minimum matrix representation of closure operations. Discrete Appl Math 11(2):115–128

    Article  MathSciNet  MATH  Google Scholar 

  14. Demetrovics J, Katona GOH (1981) Extremal combinatorial problems in relational data base. In: Fundamentals of computation theory, FCT’81, proceedings of the 1981 international FCT-conference, Szeged, Hungary, August 24–28, 1981. pp 110–119

  15. Demetrovics J, Katona GOH (1993) A survey of some combinatorial results concerning functional dependencies in database relations. Ann Math Artif Intell 7(1–4):63–82

    Article  MathSciNet  MATH  Google Scholar 

  16. Diederich J, Milton J (1988) New methods and fast algorithms for database normalization. ACM Trans Database Syst 13(3):339–365

    Article  MathSciNet  Google Scholar 

  17. Dowling WF, Gallier JH (1984) Linear-time algorithms for testing the satisfiability of propositional horn formulae. J Log Program 1(3):267–284

    Article  MathSciNet  MATH  Google Scholar 

  18. Fagin R (1977) Functional dependencies in a relational data base and propositional logic. IBM J Res Dev 21(6):543–544

    Article  MathSciNet  MATH  Google Scholar 

  19. Fagin R (1977) Multivalued dependencies and a new normal form for relational databases. ACM Trans Database Syst 2(3):262–278

    Article  Google Scholar 

  20. Fagin R (1982) Armstrong databases. Tech. Rep. RJ3440(40926), IBM Research Laboratory, San Jose, California, USA

  21. Ferrarotti F, Hartmann S, Le V, Link S (2011) Codd table representations under weak possible world semantics. In: Hameurlain A, Liddle SW, Schewe KD, Zhou X (eds) DEXA 2011, Part I, LNCS, vol 6860. Springer, Heidelberg, pp 125–139

    Google Scholar 

  22. Gottlob G, Pichler R, Wei F (2010) Tractable database design through bounded treewidth. Inf Syst 35(3):278–298

    Article  Google Scholar 

  23. Grant J (2008) Null values in SQL. SIGMOD Rec 37(3):23–25

    Article  Google Scholar 

  24. Guagliardo P, Libkin L (2017) Correctness of SQL queries on databases with nulls. SIGMOD Rec 46(3):5–16

    Article  Google Scholar 

  25. Hall N, Köhler H, Link S, Prade H, Zhou X (2015) Cardinality constraints on qualitatively uncertain data. Data Knowl Eng 99:126–150

    Article  Google Scholar 

  26. Hannula M, Link S (2018) Automated reasoning about key sets. In: Automated reasoning—9th international joint conference, IJCAR 2018, held as part of the federated logic conference, FloC 2018, Oxford, UK, July 14–17, 2018, Proceedings. pp 47–63

  27. Hartmann S (2001) On the implication problem for cardinality constraints and functional dependencies. Ann Math Artif Intell 33(2–4):253–307

    Article  MathSciNet  MATH  Google Scholar 

  28. Hartmann S, Kirchberg M, Link S (2012) Design by example for SQL table definitions with functional dependencies. VLDB J 21(1):121–144

    Article  Google Scholar 

  29. Hartmann S, Link S (2010) When data dependencies over SQL tables meet the Logics of Paradox and S-3. In: Proceedings to the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems (PODS). pp 317–326

  30. Hartmann S, Link S (2012) The implication problem of data dependencies over SQL table definitions. ACM Trans Database Syst 37(2):13

    Article  Google Scholar 

  31. Imielinski T, Lipski Jr W (1983) Incomplete information and dependencies in relational databases. In: SIGMOD’83, proceedings of annual meeting, San Jose, California, USA, May 23–26, 1983. pp 178–184

  32. Imielinski T, Lipski W Jr (1984) Incomplete information in relational databases. J ACM 31(4):761–791

    Article  MathSciNet  MATH  Google Scholar 

  33. Jones ND, Laaser WT (1976) Complete problems for deterministic polynomial time. Theor Comput Sci 3(1):105–117

    Article  MathSciNet  MATH  Google Scholar 

  34. Jr WL (1979) On semantic issues connected with incomplete information databases. ACM Trans Database Syst 4(3):262–296

  35. Katona GOH (1992) Combinatorial and algebraic results for database relations. In: Database theory—ICDT’92, 4th international conference, Berlin, Germany, October 14–16, 1992, Proceedings. pp 1–20

  36. Köhler H, Leck U, Link S, Zhou X (2016) Possible and certain keys for SQL. VLDB J 25(4):571–596

    Article  Google Scholar 

  37. Köhler H, Link S (2010) Armstrong axioms and Boyce–Codd-heath normal form under bag semantics. Inf Process Lett 110(16):717–724

    Article  MathSciNet  MATH  Google Scholar 

  38. Köhler H, Link S (2016) SQL schema design: foundations, normal forms, and normalization. In: Proceedings of the 2016 international conference on management of data, SIGMOD conference 2016, San Francisco, CA, USA, June 26–July 01, 2016. pp 267–279

  39. Köhler H, Link S (2018) SQL schema design: foundations, normal forms, and normalization. Inf Syst 76:88–113

    Article  Google Scholar 

  40. Köhler H, Link S, Zhou X (2015) Possible and certain SQL keys. PVLDB 8(11):1118–1129

    Google Scholar 

  41. Langeveldt WD, Link S (2010) Empirical evidence for the usefulness of Armstrong relations in the acquisition of meaningful functional dependencies. Inf Syst 35(3):352–374

    Article  Google Scholar 

  42. Le V, Link S, Ferrarotti F (2013) Effective recognition and visualization of semantic requirements by perfect SQL samples. In: Ng W, Storey VC, Trujillo J (eds) Conceptual modeling—32th international conference, ER 2013, Hong-Kong, China, November 11–13, 2013. Proceedings, lecture notes in computer science, vol 8217. Springer, pp 227–240

  43. Le V, Link S, Ferrarotti F (2015) Empirical evidence for the usefulness of Armstrong tables in the acquisition of semantically meaningful SQL constraints. Data Knowl Eng 98:74–103

    Article  Google Scholar 

  44. Levene M, Loizou G (1998) Axiomatisation of functional dependencies in incomplete relations. Theor Comput Sci 206(1–2):283–300

    Article  MathSciNet  MATH  Google Scholar 

  45. Liddle SW, Embley DW, Woodfield SN (1993) Cardinality constraints in semantic data models. Data Knowl Eng 11(3):235–270

    Article  MATH  Google Scholar 

  46. Lien E (1982) On the equivalence of database models. J ACM 29(2):333–362

    Article  MATH  Google Scholar 

  47. Link S (2008) Charting the completeness frontier of inference systems for multivalued dependencies. Acta Inf 45(7–8):565–591

    Article  MathSciNet  MATH  Google Scholar 

  48. Link S (2008) On the implication of multivalued dependencies in partial database relations. Int J Found Comput Sci 19(3):691–715

    Article  MathSciNet  MATH  Google Scholar 

  49. Link S (2012) Characterisations of multivalued dependency implication over undetermined universes. J Comput Syst Sci 78(4):1026–1044

    Article  MathSciNet  MATH  Google Scholar 

  50. Link S, Wei Z (2021) Logical schema design that quantifies update inefficiency and join efficiency. In: SIGMOD ’21: international conference on management of data, virtual event, China, June 20–25, 2021. pp 1169–1181

  51. Maier D (1983) The theory of relational databases. Computer Science Press, Rockville

    MATH  Google Scholar 

  52. Mannila H, Räihä KJ (1986) Design by example: an application of Armstrong relations. J Comput Syst Sci 33(2):126–141

    Article  MathSciNet  MATH  Google Scholar 

  53. Nguyen C (1987) Ein algebraischer Zugang zu mehrwertigen Logiken und Fuzzy-Logiken und ihre Anwendung auf relationale Datenbanken (mit unvollständigen Informationen). Ph.D. thesis, Dresden University of Technology, Germany

  54. Papenbrock T, Ehrlich J, Marten J, Neubert T, Rudolph J, Schönberg M, Zwiener J, Naumann F (2015) Functional dependency discovery: an experimental evaluation of seven algorithms. PVLDB 8(10):1082–1093

    Google Scholar 

  55. Roblot T, Hannula M, Link S (2018) Probabilistic cardinality constraints—validation, reasoning, and semantic summaries. VLDB J 27(6):771–795

    Article  Google Scholar 

  56. Schaerf M, Cadoli M (1995) Tractable reasoning via approximation. Artif Intell 74:249–310

    Article  MathSciNet  MATH  Google Scholar 

  57. Thalheim B (1989) On semantic issues connected with keys in relational databases permitting null values. Elektron Informationsverarb Kybern 25(1–2):11–20

    MathSciNet  Google Scholar 

  58. Thalheim B (1991) Dependencies in relational databases. Teubner, Stuttgart

    Book  MATH  Google Scholar 

  59. Thalheim B (1992) Fundamentals of cardinality constraints. In: Pernul G, Min Tjoa A (eds) Entity-relationship approach—ER’92, 11th international conference on the entity-relationship approach, Karlsruhe, Germany, October 7–9, 1992, Proceedings, vol 645. Lecture notes in computer science. Springer, Heidelberg, pp 7–23

  60. Thalheim B, Schewe K (2010) NULL ’value’ algebras and logics. In: Information modelling and knowledge bases XXII, 20th European–Japanese conference on information modelling and knowledge bases (EJC 2010), Jyväskylä, Finland, 31 May–4 June 2010. pp 354–367

  61. Tichler K (2004) Extremal theorems for databases. Ann Math Artif Intell 40(1–2):165–182

    Article  MathSciNet  MATH  Google Scholar 

  62. Vincent M, Liu J, Liu C (2004) Strong functional dependencies and their application to normal forms in XML. ACM Trans Database Syst 29(3):445–462

    Article  Google Scholar 

  63. Wei Z, Leck U, Link S (2019) Discovery and ranking of embedded uniqueness constraints. Proc VLDB Endow 12(13):2339–2352

    Article  Google Scholar 

  64. Wei Z, Link S (2019) Embedded functional dependencies and data-completeness tailored database design. PVLDB 12(11):1458–1470

    Google Scholar 

  65. Wei Z, Link S (2021) Embedded functional dependencies and data-completeness tailored database design. ACM Trans Database Syst 46(2):7:1–7:6

    Article  MathSciNet  Google Scholar 

  66. Zaniolo C (1984) Database relations with null values. J Comput Syst Sci 28(1):142–166

    Article  MathSciNet  MATH  Google Scholar 

Download references

Acknowledgements

This research is supported by the Marsden Fund Council from New Zealand Government funding, and by the Natural Science Foundation of China (Grant No. 61472263).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Sebastian Link.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Koehler, H., Le, V. & Link, S. Possible Keys and Functional Dependencies. J Data Semant 10, 327–366 (2021). https://doi.org/10.1007/s13740-021-00135-w

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13740-021-00135-w

Keywords

Navigation