Abstract
Visualisation is usually one of the first steps in any data analysis problem. Visualisations offer an intuitive way to discover inconsistencies, outliers, dependencies, interesting patterns and peculiarities in the data. However, modern computer technology has made a vast number of visualisation techniques available. Even if only simple scatterplots of pairs of variables are considered, high-dimensional data yield too many scatterplots to inspect each one visually. In this paper, we propose a system architecture called AVEDA (Automatic Visual Exploratory Data Analysis), which computes a large number of visualisations, selects those that might contain special patterns and shows only these interesting visualisations to the user. The selection is based on statistical tests and statistical measures.
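The idea of filtering scatterplots by statistical tests can be illustrated with a minimal sketch. It is not the AVEDA implementation: the choice of test (Pearson correlation with a Fisher z approximation), the use of Holm's step-down correction for multiple testing, the function names and the synthetic data are all assumptions made for illustration. With 6 variables there are already 6·5/2 = 15 candidate scatterplots, and the sketch keeps only the pairs whose linear dependence is significant after correction.

```python
import itertools
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_p_value(r, n):
    """Two-sided p-value for H0 'no correlation' (Fisher z approximation)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def holm_reject(p_values, alpha=0.05):
    """Holm's step-down procedure; returns one reject flag per p-value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] > alpha / (m - rank):
            break  # this and all larger p-values are not rejected
        reject[i] = True
    return reject


def interesting_pairs(columns, alpha=0.05):
    """Return variable pairs whose scatterplot shows a significant linear trend."""
    names = list(columns)
    pairs = list(itertools.combinations(names, 2))
    n = len(columns[names[0]])
    p_values = [
        correlation_p_value(pearson_r(columns[a], columns[b]), n)
        for a, b in pairs
    ]
    flags = holm_reject(p_values, alpha)
    return [pair for pair, keep in zip(pairs, flags) if keep]


# Synthetic demo data (hypothetical): only x0 and x1 are related by construction.
rng = random.Random(0)
data = {f"x{i}": [rng.gauss(0, 1) for _ in range(200)] for i in range(6)}
data["x1"] = [v + 0.3 * rng.gauss(0, 1) for v in data["x0"]]

selected = interesting_pairs(data)
print(len(list(itertools.combinations(data, 2))), "candidate scatterplots")
print("kept:", selected)
```

A correlation test only detects linear dependence; other kinds of pattern, such as outliers or clusters, would each need their own test or measure, with the same multiple-testing correction applied across all candidate plots.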
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tschumitschew, K., Klawonn, F. (2009). AVEDA: Statistical Tests for Finding Interesting Visualisations. In: Velásquez, J.D., Ríos, S.A., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based and Intelligent Information and Engineering Systems. KES 2009. Lecture Notes in Computer Science, vol 5711. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04595-0_29
DOI: https://doi.org/10.1007/978-3-642-04595-0_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04594-3
Online ISBN: 978-3-642-04595-0
eBook Packages: Computer Science, Computer Science (R0)