Abstract
Boxplots are well-known exploratory charts used to extract meaningful information from batches of data at a glance. Their strength lies in their ability to summarize data while retaining the key information, which is also a desirable property of symbolic variables. In this paper, boxplots are presented as a new kind of symbolic variable. In addition, two different approaches to measuring distances between boxplot variables are proposed. The usefulness of these distances is illustrated by means of a hierarchical clustering of boxplot data.
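To make the idea concrete, the following is a minimal sketch of clustering batches that are each described by a boxplot. It rests on assumptions that are not taken from the paper: each boxplot is reduced to its five-number summary (minimum, lower quartile, median, upper quartile, maximum), the distance is a plain Euclidean distance between summaries used as a stand-in for the two distances the paper proposes, and the clustering is a naive single-linkage procedure. The names `five_number_summary`, `boxplot_distance`, and `single_linkage` are hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import quantiles


def five_number_summary(batch):
    """Describe a batch by (min, Q1, median, Q3, max), ignoring outliers."""
    q1, median, q3 = quantiles(batch, n=4, method="inclusive")
    return (min(batch), q1, median, q3, max(batch))


def boxplot_distance(a, b):
    """Stand-in distance (assumption, not the paper's definition):
    Euclidean distance between two five-number summaries."""
    return dist(a, b)


def single_linkage(boxplots, n_clusters):
    """Naive agglomerative clustering of named boxplots.

    Repeatedly merges the two clusters whose closest members are
    nearest to each other, until n_clusters clusters remain.
    """
    def linkage(pair):
        left, right = pair
        return min(
            boxplot_distance(boxplots[i], boxplots[j])
            for i in left
            for j in right
        )

    clusters = [[name] for name in boxplots]
    while len(clusters) > n_clusters:
        left, right = min(combinations(clusters, 2), key=linkage)
        clusters.remove(left)
        clusters.remove(right)
        clusters.append(left + right)
    return clusters


# Illustrative data only: three small batches, two of them similar.
batches = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 3, 4, 5, 6],
    "c": [10, 20, 30, 40, 50],
}
boxplots = {name: five_number_summary(b) for name, b in batches.items()}
print(single_linkage(boxplots, n_clusters=2))
```

Batches "a" and "b" have nearly identical summaries and are merged first, while "c" stays on its own. Swapping in a different `boxplot_distance` is the only change needed to try another dissimilarity measure.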
Copyright information
© 2006 Springer-Verlag Berlin · Heidelberg
About this paper
Cite this paper
Arroyo, J., Maté, C., Roque, A.MS. (2006). Hierarchical Clustering for Boxplot Variables. In: Batagelj, V., Bock, HH., Ferligoj, A., Žiberna, A. (eds) Data Science and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-34416-0_7
DOI: https://doi.org/10.1007/3-540-34416-0_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34415-5
Online ISBN: 978-3-540-34416-2
eBook Packages: Mathematics and Statistics (R0)