
Determining Query Readiness for Structured Data

  • Conference paper
Big Data Analytics and Knowledge Discovery (DaWaK 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9263)

Abstract

The outcomes and quality of organizational decisions depend on the characteristics of the data available for making the decisions and on the value of the data in the decision-making process. Toward enabling management of these aspects of data in analytics, we introduce and investigate Data Readiness Level (DRL), a quantitative measure of the value of a piece of data at a given point in a processing flow. Our DRL proposal is a multidimensional measure that takes into account the relevance, completeness, and utility of data with respect to a given analysis task. This study provides a formalization of DRL in a structured-data scenario, and illustrates how knowledge of rules and facts, both within and outside the given data, can be used to identify those transformations of the data that improve its DRL.
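As a rough illustration of the idea in the abstract, the sketch below scores a small table against an analysis task and then applies an external fact table to raise the score. It is not the paper's formalization of DRL: the function names (`relevance`, `completeness`, `apply_rule`), the two toy scoring formulas, and the sample data are all assumptions made for this example only.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]

def relevance(rows: List[Row], needed: List[str]) -> float:
    """Toy score: share of the task's attributes that occur in the data at all."""
    if not needed:
        return 0.0
    present = set()
    for r in rows:
        present.update(r.keys())
    return len(present & set(needed)) / len(needed)

def completeness(rows: List[Row], needed: List[str]) -> float:
    """Toy score: share of the needed cells that hold a non-null value."""
    total = len(rows) * len(needed)
    if total == 0:
        return 0.0
    filled = sum(1 for r in rows for a in needed if r.get(a) is not None)
    return filled / total

def apply_rule(rows: List[Row], facts: Dict[str, str], src: str, dst: str) -> List[Row]:
    """Fill missing dst values using known src -> dst facts from outside the data."""
    repaired = []
    for r in rows:
        if r.get(dst) is None:
            # facts.get returns None when no fact applies, leaving the gap as-is.
            r = {**r, dst: facts.get(r.get(src))}
        repaired.append(r)
    return repaired

# Hypothetical data: the analysis task needs both city and state.
rows = [
    {"city": "Raleigh", "state": "NC"},
    {"city": "Durham", "state": None},
    {"city": "Cary", "state": None},
    {"city": None, "state": "NC"},
]
needed = ["city", "state"]
facts = {"Durham": "NC", "Cary": "NC"}  # assumed external knowledge

repaired = apply_rule(rows, facts, src="city", dst="state")

print(relevance(rows, needed))         # 1.0
print(completeness(rows, needed))      # 0.625
print(completeness(repaired, needed))  # 0.875
```

In this toy setting the transformation leaves relevance unchanged and raises completeness from 5 of 8 needed cells to 7 of 8; a real readiness measure would also have to account for utility and for the correctness of the facts applied.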


Notes

  1. Due to the page limit, the details can be found in the online version [2] of this paper.

References

  1. Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley, San Diego (1995)

  2. Alborzi, F., Chirkova, R., Doyle, J., Fathi, Y.: Determining query readiness for structured data. Technical Report (which is not a publication) TR-2015-6, NCSU (2015). http://www.csc.ncsu.edu/research/tech/reports.php

  3. Buneman, P., Jung, A., Ohori, A.: Using power domains to generalize relational databases. TCS 91(1), 23–55 (1991)

  4. Codd, E.F.: Extending the database relational model to capture more meaning. ACM TODS 4(4), 397–434 (1979)

  5. Codd, E.F.: Understanding relations (installment #7). FDT - Bull. ACM SIGMOD 7(3), 23–28 (1975)

  6. Dallachiesa, M., Ebaid, A., Eldawy, A., Elmagarmid, A., Ilyas, I. F., Ouzzani, M., Tang, N.: NADEEF: a commodity data cleaning system. In: ACM SIGMOD, pp. 541–552 (2013)

  7. Date, C.J.: Database in Depth - Relational Theory for Practitioners. O'Reilly, Sebastopol (2005)

  8. Deshpande, O., Lamba, D.S., Tourn, M., Das, S., Subramaniam, S., Rajaraman, A., Doan, A.: Building, maintaining, and using knowledge bases: a report from the trenches. In: ACM SIGMOD, pp. 1209–1220 (2013)

  9. Eppler, M.J.: Managing Information Quality: Increasing the Value of Information in Knowledge-intensive Products and Processes. Springer, Berlin (2006)

  10. Gardyn, E.: A Data Quality Handbook for a Data Warehouse. Infrastructure. IQ, 267–290 (1997)

  11. Grahne, G.: The Problem of Incomplete Information in Relational Databases. Springer, Berlin (1991)

  12. Heinrich, B., Helfert, M.: Analyzing data quality investments in CRM: a model-based approach. In: Eighth International Conference on Information Quality, pp. 80–95 (2003)

  13. Heinrich, B., Klier, M., Kaiser, M.: A procedure to develop metrics for currency and its application in CRM. J. Data Inf. Qual. 1(1), 5 (2009)

  14. Hinrichs, H.: Datenqualitätsmanagement in Data Warehouse-Systemen. Ph.D. thesis, Universität Oldenburg (2002)

  15. Kulikowski, J.L.: Data quality assessment. In: Ferraggine, V.E., Doorn, J.H., Rivero, L.C. (eds.) Handbook of Research on Innovations in Database Technologies and Applications, pp. 378–384. Hershey, PA (2009)

  16. Libkin, L.: A semantics-based approach to design of query languages for partial information. In: Thalheim, B., Libkin, L. (eds.) Semantics in Databases. LNCS, vol. 1358, pp. 170–208. Springer, Berlin (1995)

  17. Libkin, L.: Incomplete data: what went wrong, and how to fix it. In: PODS, pp. 1–13. ACM (2014)

  18. Pipino, L.L., Lee, Y.W., Wang, R.Y.: Data quality assessment. Commun. ACM 45(4), 211–218 (2002)

  19. Reiter, R.: On closed world data bases. Logic Data Bases 33, 55–76 (1977)

  20. Reiter, R.: Towards a logical reconstruction of relational database theory. Conceptual Model. 33, 191–233 (1982)

  21. Teboul, J.: Managing Quality Dynamics. Prentice Hall, New York (1991)

  22. Wand, Y., Wang, R.Y.: Anchoring data quality dimensions in ontological foundations. Commun. ACM 39(11), 86–95 (1996)

  23. Wang, R.Y., Strong, D.M.: Beyond accuracy: what data quality means to data consumers. J. Manage. Inf. Syst. 12(4), 5–34 (1996)

Acknowledgment

This material is based upon work supported in whole or in part with funding from the Laboratory for Analytic Sciences (LAS). Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the LAS and/or any agency or entity of the United States Government.

Corresponding author

Correspondence to Farid Alborzi.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Alborzi, F., Chirkova, R., Doyle, J., Fathi, Y. (2015). Determining Query Readiness for Structured Data. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2015. Lecture Notes in Computer Science, vol 9263. Springer, Cham. https://doi.org/10.1007/978-3-319-22729-0_1

  • DOI: https://doi.org/10.1007/978-3-319-22729-0_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22728-3

  • Online ISBN: 978-3-319-22729-0

  • eBook Packages: Computer Science, Computer Science (R0)
