Feature extraction and dimensionality reduction for mass spectrometry data
Section snippets
Background
Mass spectrometry is being used to generate protein profiles from human serum, and proteomic data obtained from mass spectrometry have attracted great interest for the detection of early-stage cancer. Surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS), in combination with advanced data mining algorithms, is used to detect protein patterns associated with diseases [1], [2], [3], [4], [5]. As a kind of MS-based protein chip technology, SELDI-TOF-MS has
Methods
In this research we develop a new application of the wavelet feature extraction method to mass spectrometry data. The high-frequency part of the wavelet decomposition (the detail coefficients) is extracted to characterize the features of the mass spectrometry data. The extracted features are then used to build the SVM classification model. Fig. 1 shows the general framework of the proposed method.
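As a rough illustration of the feature extraction step, the sketch below computes multilevel detail coefficients for a single spectrum. It rests on assumptions that are not taken from the text: the Haar wavelet is used only because it is the simplest orthogonal wavelet, the number of levels is arbitrary, the function names `haar_step` and `detail_coefficients` are hypothetical, and the toy intensity values are invented. The SVM training step is omitted.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail); each has half the input length.
    The Haar wavelet is an assumption made for this sketch.
    """
    values = list(signal)
    if len(values) % 2:
        values.append(values[-1])  # pad odd-length input by repeating the last value
    scale = 1.0 / math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) * scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) * scale for i in range(0, len(values), 2)]
    return approx, detail


def detail_coefficients(spectrum, levels):
    """Return the detail coefficients [d1, d2, ..., dL] of a multilevel decomposition.

    Level j is obtained by transforming the approximation left over from level j - 1.
    """
    details = []
    approx = list(spectrum)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return details


# Toy intensity values standing in for one spectrum (invented, not real data).
spectrum = [4.0, 4.0, 8.0, 2.0, 1.0, 1.0, 5.0, 9.0]
for level, coeffs in enumerate(detail_coefficients(spectrum, 3), start=1):
    print(level, len(coeffs), [round(c, 3) for c in coeffs])
```

Each level halves the number of coefficients, so keeping the detail coefficients of a single deeper level yields a much shorter feature vector than the raw spectrum. Such a vector could then be passed to any SVM implementation.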
Experiments and Results
In this study we use classification accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) to evaluate the performance of the proposed method. Let TP, TN, FP, and FN be the numbers of true positive (cancer), true negative (control), false positive and false negative samples, respectively. Sensitivity is defined as $TP/(TP+FN)$; specificity is defined as $TN/(TN+FP)$; positive predictive value is defined as $TP/(TP+FP)$; negative predictive value is defined as $TN/(TN+FN)$.
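The evaluation measures named above can be computed directly from the four counts. The sketch below uses the standard confusion-matrix definitions. The function name `classification_metrics` and the example counts are hypothetical and are not results from the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute standard evaluation measures from confusion-matrix counts.

    tp: cancer samples classified as cancer
    tn: control samples classified as control
    fp: control samples classified as cancer
    fn: cancer samples classified as control
    """
    def ratio(numerator, denominator):
        # Return NaN rather than raising when a denominator is zero.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical counts chosen only to exercise the function.
metrics = classification_metrics(tp=45, tn=40, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity depend only on the classifier's behaviour within each class, whereas PPV and NPV also depend on the proportion of cancer and control samples in the test set.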
Discussion and conclusions
In this paper we propose a feature extraction algorithm based on multilevel wavelet decomposition for high-dimensional mass spectra. A set of wavelet detail coefficients at different levels is used to reduce the dimensionality of the mass spectra and to characterize their transient changes, in order to detect the difference between cancer tissue and normal tissue.
Feature extraction based on wavelet detail coefficients is a novel application to mass spectrometry data. A set of orthogonal
Conflict of interest statement
None declared.
Acknowledgements
This work was supported by SRF for ROCS, SEM, and Natural Science Foundation of Shandong Province (Y2008G30), China.
References (21)
- et al. Use of proteomic patterns in serum to identify ovarian cancer. The Lancet (2002)
- et al. Genomics and proteomics: application of novel technology to early detection and prevention of cancer. Cancer Detection and Prevention (2002)
- et al. A data review and re-assessment of ovarian cancer serum proteomic profiling. BMC Bioinformatics (2003)
- et al. Clinical proteomics: translating benchside promise into bedside reality. Nature Reviews Drug Discovery (2002)
- et al. Proteomics for cancer biomarker discovery. Clinical Chemistry (2002)
- et al. Cancer proteomics: the state of the art. Disease Markers (2001)
- et al. Proteinchip surface enhanced laser desorption/ionization (SELDI) mass spectrometry: a novel protein biochip technology for detection of prostate cancer biomarkers in complex protein mixtures. Prostate Cancer and Prostatic Diseases (1999)
- et al. Development of a novel proteomic approach for the detection of transitional cell carcinoma of the bladder in urine. American Journal of Pathology (2001)
- et al. Probabilistic disease classification of expression-dependent proteomic data from mass spectrometry of human serum. Journal of Computational Biology (2003)
- et al. Lower dimensional representation of text data based on centroids and least squares. BIT (2003)