Abstract
In multivariate classification problems, 2D visualization methods can be very useful for understanding the data properties, provided that they transform the n-dimensional data into a set of 2D patterns that are similar to the original data from the classification point of view. Similarity here means that a classifier behaves similarly on the original n-dimensional patterns and on their 2D mapped counterparts, i.e., its performance on the mapped patterns should not be much lower than on the original ones. We propose several simple and efficient mapping methods that allow classification problems to be visualized in 2D. To preserve the structure of the original classification problem, the mappings minimize different class overlap measures, combined with different functions (linear, quadratic and polynomial of several degrees) from \({\mathbb {R}}^n\) to \({\mathbb {R}}^2\). They can also map new (out-of-sample) data points, not used while learning the mapping, into \({\mathbb {R}}^2\). This is one of the main benefits of the proposed methods, since few supervised mappings offer this capability. For 71 data sets of the UCI repository, we compare the SVM performance on the original and on the 2D mapped patterns. The comparison also includes 34 other popular supervised and unsupervised dimensionality reduction methods, some of them used for the first time in classification. One of the proposed methods, Polynomial Kernel Discriminant Analysis of degree 2 (PKDA2), outperforms the remaining mappings. Compared to the original n-dimensional patterns, PKDA2 achieves 82% of the performance (measured by the Cohen kappa), matching or raising the performance for 26.8% of the data sets. For 36.6% of the data sets, the performance is reduced by less than 10%, and it drops by more than 20% for only 22.5% of the data sets. This small reduction shows that the 2D maps created by PKDA2 faithfully represent the original data, largely preserving their classifiability. Besides, PKDA is very fast, with running times of the same order as LDA. The MATLAB code is available.
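The abstract describes PKDA2 only at a high level. Its mechanics follow the pattern of kernel (generalized) discriminant analysis with a degree-2 polynomial kernel: solve a generalized eigenproblem on the centered kernel matrix, keep the two leading discriminant directions, and obtain out-of-sample projections through kernel evaluations against the training set. The MATLAB sketch below illustrates that idea under explicit assumptions: a GDA-style between/total scatter formulation, the inhomogeneous kernel \((x^\top x' + 1)^2\), and a small ridge term for numerical stability. The function names pkda2_fit and pkda2_map are illustrative only; this is not the authors' released code.

```matlab
% Minimal sketch of degree-2 Polynomial Kernel Discriminant Analysis.
% Assumptions: GDA-style formulation, numeric class labels in y.
function [Z, model] = pkda2_fit(X, y)
    % X: N-by-n training patterns, y: N-by-1 class labels
    N = size(X, 1);
    K = (X * X' + 1).^2;              % inhomogeneous polynomial kernel, degree 2
    J = eye(N) - ones(N) / N;         % centering matrix
    Kc = J * K * J;                   % centered kernel matrix
    classes = unique(y);
    W = zeros(N);                     % block-constant class membership matrix
    for c = 1:numel(classes)
        idx = (y == classes(c));
        W(idx, idx) = 1 / sum(idx);
    end
    M = Kc * W * Kc;                  % between-class scatter in feature space
    T = Kc * Kc + 1e-6 * eye(N);      % total scatter, ridge-regularized
    [A, D] = eig(M, T);               % generalized eigenproblem
    [~, ord] = sort(diag(D), 'descend');
    A = real(A(:, ord(1:2)));         % two leading discriminant directions
    Z = Kc * A;                       % 2D embedding of the training patterns
    model = struct('X', X, 'A', A, 'J', J, 'colmean', mean(K, 1));
end

% Out-of-sample projection: new points enter only through kernel
% evaluations against the training set, so no retraining is needed.
function Zt = pkda2_map(model, Xt)
    Nt = size(Xt, 1);
    Kt = (Xt * model.X' + 1).^2;                         % test-vs-train kernel
    Ktc = (Kt - ones(Nt, 1) * model.colmean) * model.J;  % center as in training
    Zt = Ktc * model.A;                                  % 2D coordinates
end
```

Typical usage would be `[Ztr, model] = pkda2_fit(Xtr, ytr)` followed by `Zte = pkda2_map(model, Xte)`; the second call realizes the out-of-sample property the abstract highlights, since the test patterns never participate in the eigenproblem.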
Funding
This work was funded by the Erasmus Mundus Action 2 programme, Strand 1, Lot 2, PEACE II, under Project Code 2013-2443/001-001.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Alawadi, S., Fernández-Delgado, M., Mera, D. et al. Polynomial Kernel Discriminant Analysis for 2D visualization of classification problems. Neural Comput & Applic 31, 3515–3531 (2019). https://doi.org/10.1007/s00521-017-3290-3