Classification of Multivariate Objects Using Interval Quantile Classes

Journal of Classification

Abstract

The paper proposes a method for clustering interval data, applied to social and economic objects characterized by many interval-valued variables. This multivariate approach is based on an original concept of interval quantiles, constructed using a special definition derived from the notion of the Hausdorff distance. To improve the quality of classification, the resulting interval quantile classes can subsequently be aggregated into larger merged classes. The efficiency of our method can be assessed using specially defined indices of entropy and volume coefficients. The volume coefficients replace the classical concept of area, which is not applicable in this case.
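
For orientation, the Hausdorff distance between two closed bounded intervals reduces to the larger of the two endpoint differences. The sketch below illustrates only that standard fact; the function name `hausdorff_interval` and the tuple representation of an interval are assumptions made for this example, and it does not reproduce the paper's definition of interval quantiles, which the abstract does not state.

```python
from typing import Tuple

# Hypothetical representation: an interval is a (lower, upper) pair.
Interval = Tuple[float, float]


def hausdorff_interval(a: Interval, b: Interval) -> float:
    """Hausdorff distance between closed intervals [a_lo, a_hi] and [b_lo, b_hi].

    For closed bounded intervals this equals
    max(|a_lo - b_lo|, |a_hi - b_hi|).
    """
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    if a_lo > a_hi or b_lo > b_hi:
        raise ValueError("interval lower bound must not exceed upper bound")
    return max(abs(a_lo - b_lo), abs(a_hi - b_hi))


if __name__ == "__main__":
    # Lower bounds differ by 1, upper bounds differ by 3.
    print(hausdorff_interval((1.0, 3.0), (2.0, 6.0)))
```

How such one-dimensional distances are combined across many interval variables is a separate modelling choice and is left open here.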

Author information

Corresponding author

Correspondence to Andrzej Młodak.

Additional information

Dedicated to the Memory of Prof. Dr. Wiesław Wagner and Dr. Piotr Błażczak.

I would like to express my gratitude to Mrs. Paula Brito, Associate Professor in Statistics and Data Analysis at the Faculty of Economics (Group of Mathematics and Informatics) of the University of Porto, as well as to three anonymous referees, for their careful reading of my paper and for their very detailed and useful comments and suggestions.

About this article

Cite this article

Młodak, A. Classification of Multivariate Objects Using Interval Quantile Classes. J Classif 28, 327–362 (2011). https://doi.org/10.1007/s00357-011-9088-6

  • DOI: https://doi.org/10.1007/s00357-011-9088-6
