Referential Integrity Under Uncertain Data

  • Conference paper
Advanced Information Systems Engineering (CAiSE 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12751)

Abstract

Together with domain and entity integrity, referential integrity embodies the integrity principles of information systems. While relational databases address applications for data that is certain, modern applications require the handling of uncertain data. In particular, the veracity of big data and the complex integration of data from heterogeneous sources leave referential integrity vulnerable. We apply possibility theory to introduce the class of possibilistic inclusion dependencies. We show that our class inherits good computational properties from relational inclusion dependencies. In particular, we show that the associated implication problem is PSPACE-complete, but fixed-parameter tractable in the input arity. Combined with possibilistic keys and functional dependencies, our framework makes it possible to quantify the degree of trust in entities and relationships.
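
For orientation, the classical relational notion that the abstract builds on can be sketched as follows. A relational inclusion dependency R[X] ⊆ S[Y] holds when every X-projection of a tuple in R also occurs as a Y-projection of some tuple in S. The relation names, attribute names, and sample rows below are hypothetical, and the sketch deliberately does not model possibility degrees or the possibilistic definition, which this page does not state.

```python
from typing import Dict, List, Sequence, Set, Tuple

Row = Dict[str, object]


def project(rows: List[Row], attrs: Sequence[str]) -> Set[Tuple[object, ...]]:
    """Return the set of value tuples of `rows` restricted to `attrs`."""
    return {tuple(row[a] for a in attrs) for row in rows}


def satisfies_ind(r: List[Row], x: Sequence[str],
                  s: List[Row], y: Sequence[str]) -> bool:
    """Check the classical inclusion dependency R[X] ⊆ S[Y]."""
    if len(x) != len(y):
        raise ValueError("X and Y must have the same number of attributes")
    return project(r, x) <= project(s, y)


# Hypothetical sample data: every order must reference an existing customer.
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]
orders = [{"order_id": 10, "customer_id": 1},
          {"order_id": 11, "customer_id": 2}]

# An order whose customer_id has no match in customers violates the dependency.
dangling = orders + [{"order_id": 12, "customer_id": 99}]

ind_holds = satisfies_ind(orders, ["customer_id"], customers, ["id"])
ind_violated = not satisfies_ind(dangling, ["customer_id"], customers, ["id"])
```

A foreign key is the special case in which the referenced attributes Y form a key of S.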

Notes

  1. http://www-01.ibm.com/software/data/bigdata/.

Author information

Corresponding author

Correspondence to Sebastian Link.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Link, S., Wei, Z. (2021). Referential Integrity Under Uncertain Data. In: La Rosa, M., Sadiq, S., Teniente, E. (eds) Advanced Information Systems Engineering. CAiSE 2021. Lecture Notes in Computer Science, vol 12751. Springer, Cham. https://doi.org/10.1007/978-3-030-79382-1_16

  • DOI: https://doi.org/10.1007/978-3-030-79382-1_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-79381-4

  • Online ISBN: 978-3-030-79382-1

  • eBook Packages: Computer Science, Computer Science (R0)
