Abstract
Together with domain and entity integrity, referential integrity embodies the integrity principles of information systems. While relational databases address applications for data that is certain, modern applications require the handling of uncertain data. In particular, the veracity of big data and the complex integration of data from heterogeneous sources leave referential integrity vulnerable. We apply possibility theory to introduce the class of possibilistic inclusion dependencies. We show that our class inherits good computational properties from relational inclusion dependencies. In particular, we show that the associated implication problem is PSPACE-complete, but fixed-parameter tractable in the input arity. Combined with possibilistic keys and functional dependencies, our framework makes it possible to quantify the degree of trust in entities and relationships.
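As a rough illustration of the idea, the sketch below checks a classical inclusion dependency (every value combination in the referencing columns also appears in the referenced columns) separately in each "world" obtained by keeping only tuples at or above a possibility threshold. The encoding of tuples as `(row, degree)` pairs, the thresholding rule, and all names (`project`, `holds_ind`, `world`, `ind_by_world`, the `orders` and `customers` data) are assumptions made for illustration; this is not the chapter's formal definition of possibilistic inclusion dependencies.

```python
from typing import Dict, List, Sequence, Set, Tuple

Row = Dict[str, object]
PRow = Tuple[Row, float]  # hypothetical encoding: (tuple, possibility degree)


def project(rows: List[Row], attrs: Sequence[str]) -> Set[tuple]:
    """Set of value combinations that the rows take on the given attributes."""
    return {tuple(row[a] for a in attrs) for row in rows}


def holds_ind(child: List[Row], child_attrs: Sequence[str],
              parent: List[Row], parent_attrs: Sequence[str]) -> bool:
    """Classical inclusion dependency: child[child_attrs] is a subset of parent[parent_attrs]."""
    return project(child, child_attrs) <= project(parent, parent_attrs)


def world(prows: List[PRow], threshold: float) -> List[Row]:
    """Assumed thresholding rule: keep tuples whose degree is at least the threshold."""
    return [row for row, degree in prows if degree >= threshold]


def ind_by_world(child: List[PRow], child_attrs: Sequence[str],
                 parent: List[PRow], parent_attrs: Sequence[str],
                 thresholds: Sequence[float]) -> Dict[float, bool]:
    """Evaluate the classical inclusion dependency in the world of each threshold."""
    return {
        t: holds_ind(world(child, t), child_attrs, world(parent, t), parent_attrs)
        for t in thresholds
    }


# Illustrative data: orders reference customers through cust -> id.
customers: List[PRow] = [({"id": 1}, 1.0), ({"id": 2}, 1.0)]
orders: List[PRow] = [({"cust": 1}, 1.0), ({"cust": 3}, 0.5)]

result = ind_by_world(orders, ["cust"], customers, ["id"], [1.0, 0.5])
```

With this data the dependency holds among the fully possible tuples but fails once the less possible order for customer 3 is admitted, which is the kind of graded trust in a reference that the abstract describes.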
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Link, S., Wei, Z. (2021). Referential Integrity Under Uncertain Data. In: La Rosa, M., Sadiq, S., Teniente, E. (eds) Advanced Information Systems Engineering. CAiSE 2021. Lecture Notes in Computer Science, vol 12751. Springer, Cham. https://doi.org/10.1007/978-3-030-79382-1_16
DOI: https://doi.org/10.1007/978-3-030-79382-1_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79381-4
Online ISBN: 978-3-030-79382-1
eBook Packages: Computer Science, Computer Science (R0)