Data Currency Assessment Through Data Mining

  • Conference paper
Advances in Conceptual Modeling (ER 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9382)

Abstract

Applying Data Mining (DM) techniques to data quality (DQ), often called Data Quality Mining (DQM), offers a wide range of possibilities for DQ assessment. The goal of this work is to propose a mechanism for data currency assessment using statistics and DM techniques. The proposed approach consists of estimating the validity period of the entities using a training set and then evaluating the probability that the last known data value of each entity is still current. The proposed scheme helps keep a database up to date in two ways: it can warn when a data value is becoming obsolete, and it can inform the data manager of the best frequency for updating data.
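
As a rough illustration of the two steps described in the abstract, the sketch below estimates a mean validity period from a training set and then scores the currency of a value by its age. The abstract does not say which statistical model the authors use, so the exponential decay model here is purely an assumption, and the function names (`estimate_mean_validity`, `currency_probability`, `suggested_update_interval`) are hypothetical.

```python
import math
from statistics import mean


def estimate_mean_validity(training_periods):
    """Return the average validity period observed in the training set.

    Each element is how long one past value stayed valid, in any
    consistent time unit (days, months, ...).
    """
    if not training_periods:
        raise ValueError("training set must not be empty")
    return mean(training_periods)


def currency_probability(age, mean_validity):
    """Probability that a value of the given age is still current.

    ASSUMPTION: validity periods are exponentially distributed with the
    estimated mean; the paper's abstract does not specify a distribution.
    """
    if age < 0 or mean_validity <= 0:
        raise ValueError("age must be >= 0 and mean_validity must be > 0")
    return math.exp(-age / mean_validity)


def suggested_update_interval(mean_validity, min_probability):
    """Largest age at which the currency probability is still >= min_probability.

    Obtained by solving exp(-t / mean_validity) = min_probability for t.
    """
    if not 0 < min_probability <= 1:
        raise ValueError("min_probability must be in (0, 1]")
    return -mean_validity * math.log(min_probability)


if __name__ == "__main__":
    # Hypothetical training data: months each past value remained valid.
    periods = [2, 4, 6]
    m = estimate_mean_validity(periods)
    print("mean validity:", m)
    print("P(current) at age 3:", currency_probability(3, m))
    print("refresh every:", suggested_update_interval(m, 0.5))
```

Under this assumed model, the first function corresponds to learning from the training set, the second to warning that a value is becoming obsolete (a low probability), and the third to choosing an update frequency for a chosen probability threshold.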

Author information

Corresponding author

Correspondence to Sergio Pio Alvarez.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Pio Alvarez, S., Marotta, A., Tansini, L. (2015). Data Currency Assessment Through Data Mining. In: Jeusfeld, M., Karlapalem, K. (eds) Advances in Conceptual Modeling. ER 2015. Lecture Notes in Computer Science, vol 9382. Springer, Cham. https://doi.org/10.1007/978-3-319-25747-1_27

  • DOI: https://doi.org/10.1007/978-3-319-25747-1_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25746-4

  • Online ISBN: 978-3-319-25747-1

  • eBook Packages: Computer Science, Computer Science (R0)
