Definition
Exploratory Data Analysis (EDA) is an approach to data analysis that employs a number of different techniques to:
1. Look at data to see what it seems to say,
2. Uncover underlying structures,
3. Isolate important variables,
4. Detect outliers and other anomalies,
5. Suggest suitable models for conventional statistics.
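As an illustration of the outlier-detection goal, the following minimal sketch flags values that fall outside the fences Q1 - 1.5·IQR and Q3 + 1.5·IQR, the rule of thumb that underlies the whiskers of a box plot. The function names, the multiplier default `k=1.5`, the choice of the "inclusive" quartile method, and the sample values are assumptions made for this example and do not come from the entry.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # Quartile conventions differ between tools; "inclusive" is one choice.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Return the values that lie strictly outside the fences."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if v < lower or v > upper]


# Made-up sample, for illustration only.
sample = [12, 14, 14, 15, 16, 17, 18, 19, 21, 58]
print(tukey_fences(sample))
print(flag_outliers(sample))
```

A flagged value is a candidate for closer inspection, not automatically an error; in an exploratory setting the next step is usually to look at where the value came from.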
Key Points
The term “Exploratory Data Analysis” was introduced by John W. Tukey, who shows in [2] how simple graphical and quantitative techniques can be used to explore data open-mindedly.
Typical graphical techniques are:
1. Plotting the raw data (e.g., stem-and-leaf diagrams, histograms, scatter plots)
2. Plotting simple statistics (e.g., mean plots, box plots, residual plots)
3. Positioning (multiple) plots to amplify cognition
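A stem-and-leaf diagram, the first raw-data plot named above, can be produced as plain text. The sketch below is one possible rendering for non-negative integers, using the tens as stems and the units as leaves; the function name, the output layout, and the sample values are assumptions chosen for this example.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the text lines of a stem-and-leaf display.

    Assumes non-negative integers: stem = value // 10, leaf = value % 10.
    """
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)

    lines = []
    # Empty stems are printed too, so that gaps in the data stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}")
    return lines


# Made-up sample, for illustration only.
sample = [12, 14, 14, 15, 16, 17, 18, 19, 21, 58]
for line in stem_and_leaf(sample):
    print(line)
```

Unlike a histogram, the display keeps every individual value readable, and the empty stems between the bulk of the data and the largest value make an isolated observation easy to spot.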
Typical quantitative techniques are:
1. Interval estimation
2. Measures of location or of scale
3. Shapes of distributions
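The three quantitative techniques can be sketched together in a single summary function. This is a hedged illustration, not a prescribed procedure: the function name `summarize`, the normal-approximation interval with `z=1.96` (a t quantile would be more appropriate for small samples), the moment-based skewness coefficient as the shape measure, and the sample values are all assumptions made for this example.

```python
import math
import statistics


def summarize(values, z=1.96):
    """Return location, scale, interval, and shape summaries of a sample."""
    n = len(values)
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

    # Interval estimation: approximate interval for the mean, assuming
    # the normal approximation is acceptable for this sample.
    half_width = z * stdev / math.sqrt(n)

    # Shape: moment-based skewness (positive means a longer right tail).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0

    return {
        "mean": mean,                      # location
        "median": statistics.median(values),
        "stdev": stdev,                    # scale
        "iqr": q3 - q1,
        "mean_interval": (mean - half_width, mean + half_width),
        "skewness": skewness,
    }


# Made-up sample, for illustration only.
sample = [12, 14, 14, 15, 16, 17, 18, 19, 21, 58]
for name, value in summarize(sample).items():
    print(name, value)
```

Comparing the mean with the median, and the standard deviation with the interquartile range, is a quick way to see how strongly a few extreme values influence the non-robust measures.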
Exploratory data analysis can help to improve the results of statistical hypothesis testing by forcing one to...
Recommended Reading
1. Berry M.J.A. and Linoff G.S. Mastering Data Mining. Wiley, New York, 2000.
2. Tukey J.W. Exploratory Data Analysis. Addison-Wesley, Reading, MA, 1977.
© 2009 Springer Science+Business Media, LLC
About this entry
Cite this entry
Hinterberger, H. (2009). Exploratory Data Analysis. In: LIU, L., ÖZSU, M.T. (eds) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_1384
DOI: https://doi.org/10.1007/978-0-387-39940-9_1384
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-35544-3
Online ISBN: 978-0-387-39940-9
eBook Packages: Computer Science, Reference Module Computer Science and Engineering