Abstract
The application of Data Mining (DM) techniques to data quality (DQ), often called Data Quality Mining (DQM), offers a wide range of possibilities for DQ assessment. The goal of this work is to propose a mechanism for data currency assessment using statistics and DM techniques. The proposed approach consists of estimating the validity period for the entities using a training set and then evaluating the probability that the last known data value for each entity is still current. The proposed scheme helps keep a database up to date in two ways: it can warn when a data value is becoming obsolete, and it can inform the data manager of the best frequency for updating data.
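The two steps named in the abstract can be illustrated with a minimal sketch. The abstract does not say which estimator the authors use, so everything here is an assumption made for illustration: validity periods are taken as the gaps between consecutive value changes in a training history, the probability of currency is the empirical share of those periods that exceed the age of the last known value, and the function names (`validity_periods`, `currency_probability`, `refresh_deadline`) are hypothetical.

```python
from bisect import bisect_right


def validity_periods(change_times):
    """Gaps between consecutive value changes of one entity (training data)."""
    return [later - earlier for earlier, later in zip(change_times, change_times[1:])]


def currency_probability(periods, age):
    """Empirical estimate of P(validity period > age).

    Assumption: the share of observed validity periods that are strictly
    longer than `age` stands in for the probability that a value of that
    age is still current. This is not necessarily the paper's estimator.
    """
    if not periods:
        raise ValueError("need at least one observed validity period")
    ordered = sorted(periods)
    longer = len(ordered) - bisect_right(ordered, age)
    return longer / len(ordered)


def refresh_deadline(periods, min_probability):
    """Smallest observed age at which the estimate drops below the threshold.

    Returns None if the estimate never falls below `min_probability`.
    """
    for age in sorted(set(periods)):
        if currency_probability(periods, age) < min_probability:
            return age
    return None


# Hypothetical training history: times at which an entity's value changed.
history = [0, 2, 6, 10, 16, 24]
periods = validity_periods(history)

print(periods)
print(currency_probability(periods, age=3))
print(refresh_deadline(periods, min_probability=0.5))
```

Under these assumptions, `currency_probability` supports the warning use case (flag a value once the estimate falls below a chosen threshold), while `refresh_deadline` gives a rough upper bound on how long to wait between updates.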
© 2015 Springer International Publishing Switzerland
Cite this paper
Pio Alvarez, S., Marotta, A., Tansini, L. (2015). Data Currency Assessment Through Data Mining. In: Jeusfeld, M., Karlapalem, K. (eds) Advances in Conceptual Modeling. ER 2015. Lecture Notes in Computer Science, vol 9382. Springer, Cham. https://doi.org/10.1007/978-3-319-25747-1_27
DOI: https://doi.org/10.1007/978-3-319-25747-1_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25746-4
Online ISBN: 978-3-319-25747-1
eBook Packages: Computer Science (R0)