Information Sciences

Volume 223, 20 February 2013, Pages 256-269
Quasiconformal kernel common locality discriminant analysis with application to breast cancer diagnosis

https://doi.org/10.1016/j.ins.2012.10.016

Abstract

Dimensionality reduction (DR) is a popular method for recognition and classification in many areas, such as facial and medical imaging. In this paper, we propose a novel supervised DR method, namely Quasiconformal Kernel Common Locality Discriminant Analysis (QKCLDA). QKCLDA preserves the local and discriminative relationships of the data. Moreover, it adjusts the kernel structure according to the distribution of the input data and thus possesses a classification advantage over traditional kernel-based methods. In QKCLDA, the parameter of the quasiconformal kernel is calculated automatically by optimizing an objective function that maximizes the class discriminative ability. QKCLDA is applied to breast cancer diagnosis, and experiments on the Wisconsin Diagnostic Breast Cancer (WDBC) and mini-MIAS databases test its feasibility and performance in this task.

Introduction

Dimensionality reduction (DR) is the most popular approach for feature extraction. DR has wide applications in computer vision, pattern recognition, gene expression, paleontology, etc. Current DR methods are categorized into supervised methods, such as linear discriminant analysis (LDA), unsupervised methods, such as principal component analysis (PCA), and semi-supervised methods that have been proposed in previous works [37], [35], [24].

In recent years, kernel-based methods have attracted attention in the pattern recognition area [9], [10], [22]. Many such methods have been proposed recently, for example, kernel principal component analysis (KPCA) [26], [27], kernel discriminant analysis (KDA) [28], [18], improved kernel-based learning methods [22], [34], and the support vector machine (SVM) [30].

Recently, many researchers have focused more on manifold learning, which is derived from PCA. PCA is generalized to principal curves [5] and principal surfaces [3]. Principal curves are essentially equivalent to self-organizing maps (SOM) [19], [25], [1]. As an extended version of SOM, visualization-induced SOM (ViSOM) preserves distance information on maps along with topologies [38], [23]. ViSOM represents a discrete principal curve or surface and produces a smooth and graded mesh in the data space that captures the nonlinear manifold of data [36]. Other nonlinear manifold algorithms have been developed, such as Isomap [31], locally linear embedding (LLE) [26], [9], and locality preserving projection (LPP) [6], along with its extension, class-wise locality preserving projection (CLPP), which was proposed in our previous work [9].

In this paper, we present a novel dimensionality reduction method, Quasiconformal Kernel Common Locality Discriminant Analysis (QKCLDA), which applies the quasiconformal kernel. The kernel structure is adjusted adaptively according to the distribution of the input data. QKCLDA preserves the local and discriminative relationships of the data for classification. The rest of this paper is organized as follows. In Section 2, we analyze the popular algorithms. In Section 3, we describe common locality discriminant analysis (CLDA), and in Section 4, we extend CLDA with the quasiconformal kernel. The application to breast cancer diagnosis is presented in Section 5. Section 6 evaluates the feasibility of QKCLDA, and Section 7 evaluates its performance on breast cancer diagnosis. Conclusions are drawn in Section 8.

Analyses and reviews of KDA, LPP and KPCA

In this section we review and analyze the kernel discriminant analysis (KDA), locality preserving projection (LPP) and kernel principal component analysis (KPCA).

Common locality discriminant analysis

In this section, we describe CLDA, which preserves the local and discriminative relationships of the data. CLDA includes two stages, calculating $W_{lpp}$ and $W_{com}$ through the following Eqs. (14), (15):

$$\min_{W_{lpp}} \sum_{i,j} \left\| W_{lpp}^{T} x_i - W_{lpp}^{T} x_j \right\|^{2} S_{ij}, \quad \text{subject to } W_{lpp}^{T} W_{lpp} = I,$$

where $I$ is the unit matrix, and $S$ is the similarity matrix of the different samples.
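
The first stage can be illustrated with a minimal NumPy sketch. It relies on the standard identity $\sum_{i,j} \| W^{T} x_i - W^{T} x_j \|^{2} S_{ij} = 2\,\mathrm{tr}(W^{T} X L X^{T} W)$ with the graph Laplacian $L = D - S$, so that under $W^{T} W = I$ a minimizer is given by the eigenvectors of $X L X^{T}$ with the smallest eigenvalues. The function names, the column-wise data layout, and the heat-kernel similarity used in the usage note are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def lpp_orthogonal(X, S, n_components):
    """Hypothetical helper: minimize sum_ij ||W^T x_i - W^T x_j||^2 S_ij
    subject to W^T W = I.

    X: (d, n) array with samples as columns.
    S: (n, n) symmetric similarity matrix.
    """
    D = np.diag(S.sum(axis=1))          # degree matrix
    L = D - S                           # graph Laplacian
    M = X @ L @ X.T
    M = (M + M.T) / 2                   # symmetrize against round-off
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    return eigvecs[:, :n_components]    # directions with smallest eigenvalues

def lpp_objective(W, X, S):
    """Evaluate the objective directly from its pairwise definition."""
    Y = W.T @ X                                   # (k, n) projected samples
    diff = Y[:, :, None] - Y[:, None, :]          # (k, n, n) pairwise differences
    return float(((diff ** 2).sum(axis=0) * S).sum())
```

As a usage example, with `X` of shape `(5, 20)` and a similarity matrix such as `S = np.exp(-d2)` built from pairwise squared distances `d2`, calling `lpp_orthogonal(X, S, 2)` returns a `(5, 2)` matrix with orthonormal columns. Note that the classical LPP formulation uses a different constraint; the sketch follows the orthogonality constraint stated here.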

Given the original sample $x$, then $y = W_{lpp}^{T} x$. The CLDA-based vector $z = W_{com}^{T} y$, where $W_{com}$ is solved as follows:

$$\max_{W_{com}} W_{com}^{T} S_B W_{com}, \quad \text{subject to } W_{com}^{T} W_{com} = I,$$

where $I$ is the unit
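
A rough sketch of this second stage follows, under two assumptions that the truncated text does not confirm: that $S_B$ is the between-class scatter matrix of the projected samples $y$, and that the matrix objective is read as a trace. Under those assumptions, a maximizer with orthonormal columns is given by the leading eigenvectors of $S_B$. All function names are hypothetical.

```python
import numpy as np

def between_class_scatter(Y, labels):
    """Assumed definition of S_B: sum over classes of
    n_c * (mean_c - mean_all)(mean_c - mean_all)^T.

    Y: (k, n) array with projected samples as columns.
    labels: length-n sequence of class labels.
    """
    labels = np.asarray(labels)
    mean_all = Y.mean(axis=1, keepdims=True)
    S_B = np.zeros((Y.shape[0], Y.shape[0]))
    for c in np.unique(labels):
        Yc = Y[:, labels == c]
        diff = Yc.mean(axis=1, keepdims=True) - mean_all
        S_B += Yc.shape[1] * (diff @ diff.T)
    return S_B

def top_orthonormal_directions(S_B, n_components):
    """Maximize tr(W^T S_B W) subject to W^T W = I."""
    eigvals, eigvecs = np.linalg.eigh((S_B + S_B.T) / 2)  # ascending order
    return eigvecs[:, ::-1][:, :n_components]             # largest first
```

Given `W_com = top_orthonormal_directions(between_class_scatter(Y, labels), k)`, the feature vector for a projected sample `y` would then be `W_com.T @ y`.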

Quasiconformal kernel common locality discriminant analysis

CLDA is kernelized to develop the kernel common locality discriminant analysis (KCLDA). KCLDA performs well in nonlinear feature extraction but still suffers from kernel parameter selection issues. Kernel parameter selection is a widespread problem in kernel learning. We applied a quasiconformal kernel to KCLDA to develop the Quasiconformal Kernel Common Locality Discriminant Analysis (QKCLDA). QKCLDA changes the data structure with self-adaptive parameters according to the input data for
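
For orientation, a quasiconformal kernel is commonly written as $\tilde{k}(x, y) = q(x)\, q(y)\, k(x, y)$, where $k$ is a base kernel and $q$ is a positive factor function. The sketch below builds the rescaled Gram matrix under that general form. The Gaussian base kernel, the particular factor function $q(x) = b_0 + \sum_i b_i \exp(-\delta \| x - a_i \|^{2})$, the anchor points $a_i$, and all names are assumptions for illustration; the snippet above does not specify them, and the optimization of the coefficients that QKCLDA performs is not shown.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between rows of A (n, d) and rows of B (m, d)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def factor_function(X, anchors, b0, b, delta):
    """Assumed factor q(x) = b0 + sum_i b_i * exp(-delta * ||x - a_i||^2).

    X: (n, d) samples as rows; anchors: (m, d); b: (m,) coefficients.
    """
    return b0 + rbf_kernel(X, anchors, gamma=delta) @ b

def quasiconformal_gram(K, q):
    """Rescale a base Gram matrix: K_tilde[i, j] = q[i] * q[j] * K[i, j]."""
    return q[:, None] * K * q[None, :]
```

With hand-picked coefficients, `quasiconformal_gram(rbf_kernel(X, X), factor_function(X, X[:2], 1.0, np.array([0.5, 0.2]), 1.0))` yields a symmetric positive semidefinite matrix, since scaling rows and columns of a positive semidefinite matrix by the same factors preserves that property.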

Application to breast cancer diagnosis

Mammogram-based breast cancer diagnosis is feasible in practical applications and has been widely studied in previous works. Cluster detection and classification are two popular methods for breast cancer diagnosis. The previous work [4] presented a uniform scheme for microcalcification cluster detection, including morphological analyses, fractal dimension analyses, and complete HOS tests. Excellent performance was reported for practical applications. In this paper, we emphasize a

Experimental evaluation on feasibility of QKCLDA

We evaluated QKCLDA through experiments on the ORL [29] and Yale [2] databases, using recognition accuracy as the evaluation criterion. For comparison, we implemented other popular methods including PCA, KPCA, LDA, wavelet linear discriminant analysis (Wavelet-LDA), KDA, discriminant common vector (DCV), Gabor discriminant common vector (Gabor-DCV), kernel discriminant common vector (KDCV) and KCLDA.

The ORL face database is composed of 400 grayscale images with 10 images for each of 40

Results from Wisconsin Diagnostic Breast Cancer (WDBC) database

We evaluated QKCLDA using the WDBC database [33] of 569 instances (357 benign samples and 212 malignant samples). Each sample represents the FNA test measurements for one diagnosis. Each instance has 32 attributes, of which the first two are a unique identification number and the diagnosis status (benign/malignant). The remaining 30 attributes are computed from ten features, namely their means, standard errors and the means of the three largest values for each cell nucleus,
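
The attribute layout described above can be sketched as follows. The ten base feature names are those documented for the public WDBC data set; the column labels themselves (for example `radius_mean`) and the helper name are hypothetical, since the raw data file carries no header row.

```python
# Ten nuclear characteristics measured in the WDBC data set.
BASE_FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave_points", "symmetry",
    "fractal_dimension",
]

# Mean, standard error, and "worst" (mean of the three largest values).
STATISTICS = ["mean", "se", "worst"]

# Class counts stated in the text.
CLASS_COUNTS = {"benign": 357, "malignant": 212}

def wdbc_columns():
    """Return hypothetical labels for the 32 attributes:
    ID, diagnosis, then 10 features for each of the 3 statistics."""
    cols = ["id", "diagnosis"]
    for stat in STATISTICS:
        cols += [f"{name}_{stat}" for name in BASE_FEATURES]
    return cols
```

Only the 30 computed attributes would serve as input to a dimensionality reduction method; the identifier is dropped and the diagnosis is used as the class label.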

Conclusion

This paper presents the Quasiconformal Kernel Common Locality Discriminant Analysis (QKCLDA)-based feature extraction. QKCLDA preserves the local and discriminative relationships of the data. The data structure is able to adapt to the input data through the optimization of the quasiconformal kernel. QKCLDA is advantageous for obtaining classifications for breast cancer diagnoses. The experimental results using the WDBC and mini-MIAS databases show that QKCLDA effectively diagnoses breast cancer

Acknowledgments

This work is supported by the National Science Foundation of China under Grant No. 61001165, the Heilongjiang Provincial Natural Science Foundation of China under Grant No. QC2010066, the HIT Young Scholar Foundation of 985 Project, and the Program for Interdisciplinary Basic Research of Science-Engineering-Medicine at the Harbin Institute of Technology, and the Fundamental Research Funds for the Central Universities (Grant No. HIT.BRETIII.201206).

References (38)

  • W.H. Wolberg et al., Computer-derived nuclear features distinguish malignant from benign breast cytology, Human Pathology (1995)
  • Hujun Yin, Data visualisation and manifold mapping using the ViSOM, Neural Networks (2002)
  • Z. Zhu et al., Self-organizing learning array and its application to economic and financial problems, Information Sciences (2007)
  • P.N. Belhumeur et al., Eigenfaces vs. fisherfaces: recognition using class specific linear projection, IEEE Transactions on Pattern Analysis and Machine Intelligence (1997)
  • K.Y. Chang et al., A unified model for probabilistic principal surfaces, IEEE Transactions on Pattern Analysis and Machine Intelligence (2001)
  • W.M. Diyana, J. Larcher, R. Besar, A comparison of clustered microcalcifications automated detection methods in digital...
  • T. Hastie et al., Principal curves, Journal of the American Statistical Association (1989)
  • X. He, P. Niyogi, Locality preserving projections, in: Proc. Conf. Advances in Neural Information Processing Systems,...
  • X. He et al., Face recognition using Laplacianfaces, IEEE Transactions on Pattern Analysis and Machine Intelligence (2005)