DOI: 10.1145/2983323.2983801
research-article

Relational Database Schema Design for Uncertain Data

Published: 24 October 2016

Abstract

We investigate the impact of uncertainty on relational database schema design. Uncertainty is modeled qualitatively by assigning to tuples a degree of possibility with which they occur, and assigning to functional dependencies a degree of certainty which says to which tuples they apply. A design theory is developed for possibilistic functional dependencies, including efficient axiomatic and algorithmic characterizations of their implication problem. Naturally, the possibility degrees of tuples result in a scale of different degrees of data redundancy. Scaled versions of the classical syntactic Boyce-Codd and Third Normal Forms are established and semantically justified in terms of avoiding data redundancy of different degrees. Classical decomposition and synthesis techniques are scaled as well. Therefore, possibilistic functional dependencies do not just enable designers to control the levels of data integrity and losslessness targeted but also to balance the classical trade-off between query and update efficiency. Extensive experiments confirm the efficiency of our framework and provide original insight into relational schema design.
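
To make the interplay between tuple possibility and dependency certainty concrete, the following is a minimal Python sketch. It is not taken from the paper. It assumes one plausible reading of the abstract: possibility degrees are encoded as integer ranks (1 is fully possible, larger ranks are less possible), the "world" of rank i contains all tuples of rank at most i, and a functional dependency is the more certain the larger the world in which it still holds. The names `world`, `satisfies_fd`, `certainty_rank`, and the `meetings` data are all hypothetical.

```python
def world(p_relation, rank):
    """Rows whose possibility rank is at most `rank` (at least that possible)."""
    return [row for row, r in p_relation if r <= rank]


def satisfies_fd(rows, lhs, rhs):
    """Classical check: rows that agree on `lhs` must also agree on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val on first sight of key, else returns the stored value
        if seen.setdefault(key, val) != val:
            return False
    return True


def certainty_rank(p_relation, lhs, rhs, k):
    """Assumed scale: 1 = most certain, k + 1 = no certainty at all.

    Worlds are nested, so once the dependency fails in one world it also
    fails in every larger one and the scan can stop.
    """
    largest = 0  # largest world rank in which the dependency still holds
    for rank in range(1, k + 1):
        if not satisfies_fd(world(p_relation, rank), lhs, rhs):
            break
        largest = rank
    return k + 1 - largest


# Hypothetical data with k = 2 possibility ranks.
meetings = [
    ({"project": "Eagle", "time": "Mon 9am", "manager": "Ann"}, 1),
    ({"project": "Eagle", "time": "Tue 2pm", "manager": "Ann"}, 1),
    ({"project": "Eagle", "time": "Wed 3pm", "manager": "Bob"}, 2),
]

rank_pm = certainty_rank(meetings, ["project"], ["manager"], 2)
rank_ptm = certainty_rank(meetings, ["project", "time"], ["manager"], 2)
rank_pt = certainty_rank(meetings, ["project"], ["time"], 2)
```

Under these assumptions, `project -> manager` holds only among the fully possible rows and so receives the weaker certainty rank, while `project, time -> manager` holds even when the less possible row is included. A designer could then choose which certainty level to normalize against.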

Published In

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
October 2016
2566 pages
ISBN:9781450340731
DOI:10.1145/2983323

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. axioms
  2. boyce-codd normal form
  3. data redundancy
  4. implication problem
  5. possibility theory
  6. third normal form

Qualifiers

  • Research-article

Conference

CIKM'16: ACM Conference on Information and Knowledge Management
October 24 - 28, 2016
Indianapolis, Indiana, USA

Acceptance Rates

CIKM '16 Paper Acceptance Rate 160 of 701 submissions, 23%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Cited By

  • (2023) An Effective Framework for Enhancing Query Answering in a Heterogeneous Data Lake. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 770-780. DOI: 10.1145/3539618.3591637. Online publication date: 19-Jul-2023
  • (2020) Fuzzy Extensions of Databases. Fuzzy Approaches for Soft Computing and Approximate Reasoning: Theories and Applications, pages 191-200. DOI: 10.1007/978-3-030-54341-9_17. Online publication date: 27-Oct-2020
  • (2019) Embedded functional dependencies and data-completeness tailored database design. Proceedings of the VLDB Endowment, 12(11), pages 1458-1470. DOI: 10.14778/3342263.3342626. Online publication date: 1-Jul-2019
  • (2019) A Fourth Normal Form for Uncertain Data. Advanced Information Systems Engineering, pages 295-311. DOI: 10.1007/978-3-030-21290-2_19. Online publication date: 29-May-2019
  • (2019) Possibilistic Logic: From Certainty-Qualified Statements to Two-Tiered Logics – A Prospective Survey. Logics in Artificial Intelligence, pages 3-20. DOI: 10.1007/978-3-030-19570-0_1. Online publication date: 6-May-2019
  • (2018) Handling Uncertainty in Relational Databases with Possibility Theory - A Survey of Different Modelings. Scalable Uncertainty Management, pages 396-404. DOI: 10.1007/978-3-030-00461-3_30. Online publication date: 11-Sep-2018
  • (2017) Probabilistic Keys. IEEE Transactions on Knowledge and Data Engineering, 29(3), pages 670-682. DOI: 10.1109/TKDE.2016.2633342. Online publication date: 1-Mar-2017
  • (2017) Contextual Keys. Conceptual Modeling, pages 266-279. DOI: 10.1007/978-3-319-69904-2_22. Online publication date: 21-Oct-2017