Quasiconformal kernel common locality discriminant analysis with application to breast cancer diagnosis
Introduction
Dimensionality reduction (DR) is a widely used approach to feature extraction, with applications in computer vision, pattern recognition, gene expression analysis, paleontology, and other fields. Current DR methods are categorized into supervised methods, such as linear discriminant analysis (LDA); unsupervised methods, such as principal component analysis (PCA); and semi-supervised methods proposed in previous works [37], [35], [24].
In recent years, kernel-based methods have attracted attention in pattern recognition [9], [10], [22]. Many kernel-based nonlinear methods have been proposed, for example, kernel principal component analysis (KPCA) [26], [27], kernel discriminant analysis (KDA) [28], [18], improved kernel-based learning methods [22], [34], and the support vector machine (SVM) [30].
Many researchers have recently focused on manifold learning, which is derived from PCA. PCA is generalized to principal curves [5] and principal surfaces [3]. Principal curves are essentially equivalent to self-organizing maps (SOM) [19], [25], [1]. As an extended version of SOM, visualization-induced SOM (ViSOM) preserves distance information on maps along with topologies [38], [23]. ViSOM represents a discrete principal curve or surface and produces a smooth and graded mesh in the data space that captures the nonlinear manifold of data [36]. Other nonlinear manifold algorithms have been developed, such as Isomap [31], locally linear embedding (LLE) [26], [9], locality preserving projection (LPP) [6], and its extension, class-wise locality preserving projection (CLPP), which was proposed in our previous work [9].
In this paper, we present a novel dimensionality reduction method, Quasiconformal Kernel Common Locality Discriminant Analysis (QKCLDA), which applies the quasiconformal kernel. The kernel structure is adaptively adjusted according to the distribution of the input data. QKCLDA preserves the local and discriminative relationships of the data for classification. The rest of this paper is organized as follows. In Section 2, we analyze the popular algorithms. In Section 3, we describe common locality discriminant analysis (CLDA), and in Section 4, we extend CLDA with the quasiconformal kernel. The application to breast cancer diagnosis is presented in Section 5. Comprehensive evaluations are presented in Section 6 (experimental evaluation of the feasibility of QKCLDA) and Section 7 (performance evaluation on breast cancer diagnosis). Conclusions are drawn in Section 8.
Section snippets
Analyses and reviews of KDA, LPP and KPCA
In this section we review and analyze the kernel discriminant analysis (KDA), locality preserving projection (LPP) and kernel principal component analysis (KPCA).
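As a rough illustration of one of the reviewed methods, the following is a minimal KPCA sketch on training data only. It assumes a Gaussian (RBF) base kernel; the function names, the `gamma` width, and the random data are illustrative choices and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X; gamma is an assumed width."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kpca(X, n_components, gamma=1.0):
    """Project the training samples onto the leading kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    # Center the kernel matrix, which centers the data in feature space.
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # eigh returns eigenvalues in ascending order; pick the largest ones.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    lam, alpha = eigvals[idx], eigvecs[:, idx]
    # Rescale so that each feature-space component has unit norm.
    alpha = alpha / np.sqrt(np.maximum(lam, 1e-12))
    return Kc @ alpha

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))   # synthetic data, for illustration only
Z = kpca(X, n_components=2)    # 20 samples projected onto 2 components
```

The projected columns are zero-mean and mutually uncorrelated, which is the property KPCA shares with linear PCA.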
Common locality discriminant analysis
In this section, we describe CLDA, which preserves the local and discriminative relationships of the data. CLDA consists of two stages that calculate W_lpp and W_com through Eqs. (14) and (15), where I is the unit matrix and S is the similarity matrix of the different samples.
Given the original sample x, the first stage gives y = W_lpp^T x. The CLDA-based vector is z = W_com^T y, where W_com is solved as follows: where I is the unit
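The two-stage mapping described here composes into a single linear projection. The sketch below only illustrates the shapes and the composition: the dimensions are hypothetical and the two matrices are random placeholders, not solutions of Eqs. (14) and (15), which are not reproduced in this excerpt.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, k = 6, 4, 2   # hypothetical sizes: input, after stage 1, after stage 2

# Random placeholders for illustration only; in CLDA these would come
# from the two optimization stages.
W_lpp = rng.normal(size=(d, m))
W_com = rng.normal(size=(m, k))

x = rng.normal(size=d)   # one original sample

y = W_lpp.T @ x          # stage 1: y = W_lpp^T x
z = W_com.T @ y          # stage 2: z = W_com^T y

# Equivalent single projection: z = (W_lpp W_com)^T x
W = W_lpp @ W_com
z_direct = W.T @ x
```

Precomputing the product `W_lpp @ W_com` once would let new samples be projected in a single matrix-vector multiplication.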
Quasiconformal kernel common locality discriminant analysis
CLDA is kernelized to develop the kernel common locality discriminant analysis (KCLDA). KCLDA performs well in nonlinear feature extraction but still suffers from the kernel parameter selection problem, which is a widespread issue in kernel learning. We applied a quasiconformal kernel to KCLDA to develop the Quasiconformal Kernel Common Locality Discriminant Analysis (QKCLDA). QKCLDA changes the data structure with self-adaptive parameters according to the input data for
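This excerpt does not give the kernel definition, so the sketch below assumes the form commonly used in the quasiconformal kernel literature: k~(u, v) = f(u) f(v) k(u, v), with factor function f(x) = b0 + sum_n b_n exp(-delta ||x - a_n||^2). The Gaussian base kernel, the expansion vectors, and all parameter values are hypothetical; in a self-adaptive method the coefficients b_n would be optimized from the data rather than fixed as they are here.

```python
import math

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def gaussian_kernel(u, v, gamma=0.5):
    """Base kernel k(u, v); the Gaussian form and gamma are assumptions."""
    return math.exp(-gamma * sq_dist(u, v))

def factor(x, anchors, coeffs, b0=1.0, delta=0.1):
    """f(x) = b0 + sum_n b_n * exp(-delta * ||x - a_n||^2).

    The value is positive whenever b0 > 0 and all coefficients are non-negative.
    """
    return b0 + sum(b * math.exp(-delta * sq_dist(x, a))
                    for a, b in zip(anchors, coeffs))

def quasiconformal_kernel(u, v, anchors, coeffs):
    """k~(u, v) = f(u) * f(v) * k(u, v)."""
    return factor(u, anchors, coeffs) * factor(v, anchors, coeffs) * gaussian_kernel(u, v)

anchors = [(0.0, 0.0), (1.0, 1.0)]   # hypothetical expansion vectors a_n
coeffs = [0.5, 0.2]                  # hypothetical coefficients b_n
u, v = (0.2, 0.1), (0.9, 1.2)

k_uv = quasiconformal_kernel(u, v, anchors, coeffs)
k_vu = quasiconformal_kernel(v, u, anchors, coeffs)
k_base = quasiconformal_kernel(u, v, anchors, [0.0, 0.0])  # reduces to k(u, v)
```

With all coefficients set to zero and b0 = 1, the factor is identically 1 and the quasiconformal kernel falls back to the base kernel, which makes the base kernel a natural starting point for any coefficient optimization.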
Application to breast cancer diagnosis
Mammogram-based breast cancer diagnosis is feasible in practice and has been widely studied in previous works. Cluster detection and classification are two popular approaches to breast cancer diagnosis. The previous work [4] presented a uniform scheme for microcalcification cluster detection that includes morphological analyses, fractal dimension analyses, and complete HOS tests. Excellent performance was reported in practical applications. In this paper, we emphasize a
Experimental evaluation on feasibility of QKCLDA
We evaluated QKCLDA in experiments on the ORL [29] and Yale [2] databases, using recognition accuracy as the evaluation metric. For comparison, we implemented other popular methods, including PCA, KPCA, LDA, wavelet linear discriminant analysis (Wavelet-LDA), KDA, discriminant common vector (DCV), Gabor discriminant common vector (Gabor-DCV), kernel discriminant common vector (KDCV) and KCLDA.
The ORL face database is composed of 400 grayscale images with 10 images for each of 40
Results from Wisconsin Diagnostic Breast Cancer (WDBC) database
We evaluated QKCLDA using the WDBC database [33], which contains 569 instances (357 benign samples and 212 malignant samples). Each instance represents the fine needle aspirate (FNA) test measurements for one diagnosis and has 32 attributes, of which the first two are a unique identification number and the diagnosis status (benign/malignant). The remaining 30 features are computed from ten base features, giving the mean, the standard error and the mean of the three largest values of each, for each cell nucleus,
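The attribute layout described above can be checked with a short sketch. The ten characteristic names follow the public WDBC dataset description; the column labels themselves (for example `radius_mean`) and the grouping order are illustrative assumptions, not names used in the paper.

```python
# Ten nucleus characteristics, as listed in the public WDBC description.
CHARACTERISTICS = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave_points", "symmetry", "fractal_dimension",
]
# Three statistics per characteristic: mean, standard error, and "worst"
# (the mean of the three largest values).
STATISTICS = ["mean", "se", "worst"]

# Hypothetical column labels, grouped by statistic.
feature_names = [f"{c}_{s}" for s in STATISTICS for c in CHARACTERISTICS]
columns = ["id", "diagnosis"] + feature_names

n_benign, n_malignant = 357, 212
n_total = n_benign + n_malignant

# Accuracy of a trivial classifier that always predicts "benign";
# useful as a floor when reading reported accuracies.
majority_baseline = n_benign / n_total
```

Because the classes are imbalanced, any reported accuracy is best read against the majority-class baseline computed in the last line.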
Conclusion
This paper presents the Quasiconformal Kernel Common Locality Discriminant Analysis (QKCLDA)-based feature extraction. QKCLDA preserves the local and discriminative relationships of the data. The data structure is able to adapt to the input data through the optimization of the quasiconformal kernel. QKCLDA is advantageous for obtaining classifications for breast cancer diagnoses. The experimental results using the WDBC and mini-MIAS databases show that QKCLDA effectively diagnoses breast cancer
Acknowledgments
This work is supported by the National Science Foundation of China under Grant No. 61001165, the Heilongjiang Provincial Natural Science Foundation of China under Grant No. QC2010066, the HIT Young Scholar Foundation of 985 Project, the Program for Interdisciplinary Basic Research of Science-Engineering-Medicine at the Harbin Institute of Technology, and the Fundamental Research Funds for the Central Universities (Grant No. HIT.BRETIII.201206).
References (38)
- et al., Imposing tree-based topologies onto self organizing maps, Information Sciences (2011)
- et al., A high performance edge detector based on fuzzy inference rules, Information Sciences (2007)
- et al., Kernel class-wise locality preserving projection, Information Sciences (2008)
- et al., Kernel self-optimized locality preserving discriminant analysis for feature extraction and recognition, Neurocomputing (2011)
- et al., Color discrimination enhancement for dichromats using self-organizing color transformation, Information Sciences (2009)
- et al., Adaptive quasiconformal kernel discriminant analysis, Neurocomputing (2008)
- et al., A hybrid artificial immune system and Self Organising Map for network intrusion detection, Information Sciences (2008)
- et al., A unified framework for semi-supervised dimensionality reduction, Pattern Recognition (2008)
- Kernel PCA for similarity invariant shape recognition, Neurocomputing (2007)
- Computer-derived nuclear features distinguish malignant from benign breast cytology, Human Pathology
- Data visualisation and manifold mapping using the ViSOM, Neural Networks
- Self-organizing learning array and its application to economic and financial problems, Information Sciences
- Eigenfaces vs. fisherfaces: recognition using class specific linear projection, IEEE Transactions on Pattern Analysis and Machine Intelligence
- A unified model for probabilistic principal surfaces, IEEE Transactions on Pattern Analysis and Machine Intelligence
- Principal curves, Journal of the American Statistical Association
- Face recognition using Laplacianfaces, IEEE Transactions on Pattern Analysis and Machine Intelligence