Abstract
Systems for assessing the classification complexity of a dataset have received increasing attention in pattern recognition research. These systems typically aim to quantify the overall complexity of a domain, with the goal of comparing different datasets. In this work, we propose a method for partitioning a dataset into regions of different classification complexity, so as to highlight the sources of complexity within the dataset. Experiments carried out on relevant datasets support the effectiveness of the proposed method.
Notes
Of course, a similar definition can also be given for the dataset \(\mathbf {D}\).
Acknowledgments
Emanuele Tamponi gratefully acknowledges the Sardinia Regional Government for the financial support of his PhD scholarship (P.O.R. Sardegna F.S.E. Operational Programme of the Autonomous Region of Sardinia, European Social Fund 2007–2013 - Axis IV Human Resources, Objective l.3, Line of Activity l.3.1).
Cite this article
Armano, G., Tamponi, E. Experimenting multiresolution analysis for identifying regions of different classification complexity. Pattern Anal Applic 19, 129–137 (2016). https://doi.org/10.1007/s10044-014-0446-y