Introduction and Summary
In recent years a powerful combination of database technologies, data mining techniques (see Data Mining), and analytics software has created vast new opportunities for data analysts and statisticians. For example, corporations have duly stored the results of their customer transactions in corporate databases for over a generation. There are, quite literally, millions of records. Massively parallel engines can examine these data in heretofore unimagined ways. The potential to understand customer profitability, to better understand customers’ past needs and predict future ones, and to use those insights to develop new product niches is enormous.
Yet all is not well in the world of data analytics. Unlocking the mysteries data have to offer is difficult at best. And putting the discoveries to work can be even harder. One major reason is poor quality data. Bad data camouflage the hidden nuggets in data or, worse, send an analysis in the wrong direction...
© 2011 Springer-Verlag Berlin Heidelberg
Cite this entry
Guess, F.G., Redman, T.C. (2011). Data Quality (Poor Quality Data: The Fly in the Data Analytics Ointment). In: Lovric, M. (ed.) International Encyclopedia of Statistical Science. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04898-2_21
DOI: https://doi.org/10.1007/978-3-642-04898-2_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04897-5
Online ISBN: 978-3-642-04898-2
eBook Packages: Mathematics and Statistics; Reference Module Computer Science and Engineering