Definition
Exploratory data analysis (EDA) is an approach to data analysis that employs a number of different techniques to:
1. Look at data to see what it seems to say.
2. Uncover underlying structures.
3. Isolate important variables.
4. Detect outliers and other anomalies.
5. Suggest suitable models for conventional statistics.
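As an illustration of the fourth goal, the following minimal sketch flags outliers with the common 1.5 × IQR rule (often called Tukey's fences). The function names, the multiplier default, and the sample values are assumptions chosen for illustration; they are not taken from the entry.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    # "inclusive" treats the data as the full population of interest;
    # other quartile conventions give slightly different fences.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the values that fall outside the fences, in input order."""
    low, high = tukey_fences(values, k)
    return [v for v in values if v < low or v > high]


# Hypothetical sample with one obviously unusual observation.
sample = [12, 13, 14, 15, 15, 16, 17, 18, 19, 95]
fences = tukey_fences(sample)
outliers = find_outliers(sample)
print(fences, outliers)
```

A flagged value is a prompt for closer inspection, not proof of an error; whether to keep or drop it remains a judgment call.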
Key Points
The term “exploratory data analysis” was introduced by John W. Tukey, who shows in [2] how simple graphical and quantitative techniques can be used to explore data with an open mind.
Typical graphical techniques are:
1. Plotting the raw data (e.g., stem-and-leaf diagrams, histograms, scatter plots)
2. Plotting simple statistics (e.g., mean plots, box plots, residual plots)
3. Positioning (multiple) plots to amplify cognition
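Of the raw-data plots listed above, the stem-and-leaf diagram is simple enough to build as plain text. The sketch below is one possible construction, assuming non-negative integers with the tens as stems and the units as leaves; the function name and the sample values are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf display for non-negative integers."""
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # tens digit(s) as stem, units digit as leaf
        stems[stem].append(leaf)

    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}".rstrip())
    return lines


# Hypothetical sample data.
ages = [23, 25, 31, 34, 34, 38, 42, 47, 51]
display = stem_and_leaf(ages)
print("\n".join(display))
```

The display keeps every raw value readable while still showing the shape of the distribution, which is why it sits between a sorted listing and a histogram.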
Typical quantitative techniques are:
1. Interval estimation
2. Measures of location or of scale
3. Shapes of distributions
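The three quantitative techniques above can be sketched together in a single summary function. This is a minimal illustration under stated assumptions: the interval is a normal-approximation confidence interval for the mean (a t-based interval would be more appropriate for small samples), skewness is the simple moment-based coefficient, and the function name, dictionary keys, and sample values are hypothetical.

```python
import math
import statistics


def summarize(values, confidence=0.95):
    """Summarize location, scale, shape, and an interval estimate for the mean."""
    n = len(values)
    mean = statistics.fmean(values)
    median = statistics.median(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

    # Moment-based skewness: third central moment over variance^(3/2).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0

    # Normal-approximation interval for the mean (an assumption of this sketch).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev / math.sqrt(n)

    return {
        "mean": mean,
        "median": median,
        "stdev": stdev,
        "iqr": q3 - q1,
        "skewness": skewness,
        "mean_interval": (mean - half_width, mean + half_width),
    }


# Hypothetical right-skewed sample.
summary = summarize([2, 4, 4, 5, 7, 9, 30])
print(summary)
```

Comparing the mean with the median, and the standard deviation with the interquartile range, gives a quick indication of how strongly a few extreme values influence the summary.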
Exploratory data analysis can help to improve the results of statistical hypothesis testing by forcing one to...
Recommended Reading
1. Berry MJA, Linoff GS. Mastering data mining. New York: Wiley; 2000.
2. Tukey JW. Exploratory data analysis. Reading: Addison Wesley; 1977.
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Hinterberger, H. (2018). Exploratory Data Analysis. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_1384
DOI: https://doi.org/10.1007/978-1-4614-8265-9_1384
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering