Abstract
The outcomes and quality of organizational decisions depend on the characteristics of the data available for making the decisions and on the value of the data in the decision-making process. Toward enabling management of these aspects of data in analytics, we introduce and investigate Data Readiness Level (DRL), a quantitative measure of the value of a piece of data at a given point in a processing flow. Our DRL proposal is a multidimensional measure that takes into account the relevance, completeness, and utility of data with respect to a given analysis task. This study provides a formalization of DRL in a structured-data scenario, and illustrates how knowledge of rules and facts, both within and outside the given data, can be used to identify those transformations of the data that improve its DRL.
Notes
1. Due to the page limit, the details can be found in the online version [2] of this paper.
References
Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley, San Diego (1995)
Alborzi, F., Chirkova, R., Doyle, J., Fathi, Y.: Determining query readiness for structured data. Technical Report (which is not a publication) TR-2015-6, NCSU (2015). http://www.csc.ncsu.edu/research/tech/reports.php
Buneman, P., Jung, A., Ohori, A.: Using power domains to generalize relational databases. TCS 91(1), 23–55 (1991)
Codd, E.F.: Extending the database relational model to capture more meaning. ACM TODS 4(4), 397–434 (1979)
Codd, E.F.: Understanding relations (installment #7). FDT - Bull. ACM SIGMOD 7(3), 23–28 (1975)
Dallachiesa, M., Ebaid, A., Eldawy, A., Elmagarmid, A., Ilyas, I.F., Ouzzani, M., Tang, N.: NADEEF: a commodity data cleaning system. In: ACM SIGMOD, pp. 541–552 (2013)
Date, C.J.: Database in Depth - Relational Theory for Practitioners. O'Reilly, Sebastopol (2005)
Deshpande, O., Lamba, D.S., Tourn, M., Das, S., Subramaniam, S., Rajaraman, A., Doan, A.: Building, maintaining, and using knowledge bases: a report from the trenches. In: ACM SIGMOD, pp. 1209–1220 (2013)
Eppler, M.J.: Managing Information Quality: Increasing the Value of Information in Knowledge-intensive Products and Processes. Springer, Berlin (2006)
Gardyn, E.: A Data Quality Handbook for a Data Warehouse. Infrastructure. IQ, 267–290 (1997)
Grahne, G.: The Problem of Incomplete Information in Relational Databases. Springer, Berlin (1991)
Heinrich, B., Helfert, M.: Analyzing data quality investments in CRM: a model-based approach. In: Eighth International Conference on Information Quality, pp. 80–95 (2003)
Heinrich, B., Klier, M., Kaiser, M.: A procedure to develop metrics for currency and its application in CRM. J. Data Inf. Qual. 1(1), 5 (2009)
Hinrichs, H.: Datenqualitätsmanagement in Data Warehouse-Systemen. Ph.D. thesis, Universität Oldenburg (2002)
Kulikowski, J.L.: Data quality assessment. In: Ferraggine, V.E., Doorn, J.H., Rivero, L.C. (eds.) Handbook of Research on Innovations in Database Technologies and Applications, pp. 378–384. Hershey, PA (2009)
Libkin, L.: A semantics-based approach to design of query languages for partial information. In: Thalheim, B., Libkin, L. (eds.) Semantics in Databases. LNCS, vol. 1358, pp. 170–208. Springer, Berlin (1995)
Libkin, L.: Incomplete data: what went wrong, and how to fix it. In: PODS, pp. 1–13. ACM (2014)
Pipino, L.L., Lee, Y.W., Wang, R.Y.: Data quality assessment. Commun. ACM 45(4), 211–218 (2002)
Reiter, R.: On closed world data bases. Logic Data Bases 33, 55–76 (1977)
Reiter, R.: Towards a logical reconstruction of relational database theory. Conceptual Model. 33, 191–233 (1982)
Teboul, J.: Managing Quality Dynamics. Prentice Hall, New York (1991)
Wand, Y., Wang, R.Y.: Anchoring data quality dimensions in ontological foundations. Commun. ACM 39(11), 86–95 (1996)
Wang, R.Y., Strong, D.M.: Beyond accuracy: what data quality means to data consumers. J. Manage. Inf. Syst. 12(4), 5–34 (1996)
Acknowledgment
This material is based upon work supported in whole or in part with funding from the Laboratory for Analytic Sciences (LAS). Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the LAS and/or any agency or entity of the United States Government.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Alborzi, F., Chirkova, R., Doyle, J., Fathi, Y. (2015). Determining Query Readiness for Structured Data. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2015. Lecture Notes in Computer Science, vol 9263. Springer, Cham. https://doi.org/10.1007/978-3-319-22729-0_1
DOI: https://doi.org/10.1007/978-3-319-22729-0_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22728-3
Online ISBN: 978-3-319-22729-0
eBook Packages: Computer Science, Computer Science (R0)