Optimal gene subset selection using the modified SFFS algorithm for tumor classification

Peng, Hongyi; Fu, Yinlian; Liu, Jinshan; Fang, Xiang; Jiang, Chunfu

doi:10.1007/s00521-012-1148-2

Review
Published: 06 September 2012

Volume 23, pages 1531–1538, (2013)

Neural Computing and Applications

Abstract

A reliable and precise classification of tumors is essential for successful cancer treatment, and gene selection is an important step toward improved diagnostics. This study proposes MSWM, a modified SFFS (sequential forward floating selection) algorithm based on the weighted Mahalanobis distance, to identify optimal informative gene subsets while taking the genes' joint discriminatory power into account. First, the one-dimensional weighted Mahalanobis distance is used to perform a preliminary selection of genes; the modified SFFS method, combined with the multidimensional weighted Mahalanobis distance, is then used to obtain the optimal informative gene subset for tumor classification. Finally, the k-nearest-neighbor and naive Bayes methods are used to classify tumors based on the gene subset selected by MSWM. To validate its effectiveness, the proposed MSWM method is applied to two different DNA microarray datasets. The empirical study shows that MSWM achieves better classification performance than the BWR (ratio of between-groups to within-groups sum of squares) and IVGA_I (independent variable group analysis I) methods. This suggests that the MSWM gene selection method is able to identify informative gene subsets that account for the genes' joint discriminatory power in tumor classification.
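The two-stage procedure described in the abstract (a one-dimensional filter followed by a floating search) can be illustrated with a minimal sketch. This is not the authors' MSWM implementation: the abstract does not define the weighting scheme or the modification to SFFS, so the sketch assumes a two-class problem, uses the ordinary (unweighted) squared Mahalanobis distance between class means with a pooled covariance as the criterion, and follows the textbook SFFS loop. The function names `separation`, `preselect`, and `sffs` are hypothetical.

```python
import numpy as np

def separation(X, y, subset):
    """Squared Mahalanobis distance between the two class means on `subset`.

    Assumption: unweighted, pooled within-class covariance; the paper's
    weighting is not given in the abstract.
    """
    cols = list(subset)
    a, b = X[y == 0][:, cols], X[y == 1][:, cols]
    diff = a.mean(axis=0) - b.mean(axis=0)
    pooled = ((len(a) - 1) * np.cov(a, rowvar=False)
              + (len(b) - 1) * np.cov(b, rowvar=False)) / (len(a) + len(b) - 2)
    # A small ridge keeps the pooled covariance invertible.
    pooled = np.atleast_2d(pooled) + 1e-6 * np.eye(len(cols))
    return float(diff @ np.linalg.solve(pooled, diff))

def preselect(X, y, m):
    """Stage 1: keep the m genes with the largest one-dimensional distance."""
    scores = [separation(X, y, [g]) for g in range(X.shape[1])]
    return [int(g) for g in np.argsort(scores)[::-1][:m]]

def sffs(X, y, candidates, k):
    """Stage 2: textbook sequential forward floating selection of k genes.

    Requires k <= len(candidates).
    """
    selected, best = [], {}  # best[n] = best criterion value seen at size n
    while len(selected) < k:
        # Forward step: add the candidate that maximizes the joint criterion.
        score, gene = max((separation(X, y, selected + [g]), g)
                          for g in candidates if g not in selected)
        selected = selected + [gene]
        best[len(selected)] = max(score, best.get(len(selected), score))
        # Floating step: drop the least useful gene while doing so beats
        # the best subset of that smaller size found so far.
        while len(selected) > 2:
            score, gene = max(
                (separation(X, y, [h for h in selected if h != g]), g)
                for g in selected)
            if score <= best[len(selected) - 1]:
                break
            selected = [h for h in selected if h != gene]
            best[len(selected)] = score
    return selected
```

With an expression matrix `X` (samples by genes) and a 0/1 label vector `y`, a call such as `sffs(X, y, preselect(X, y, 50), 10)` would return ten gene indices; the values 50 and 10 are arbitrary illustrations, not settings from the paper. The selected columns could then be passed to any k-nearest-neighbor or naive Bayes classifier.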

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant Nos. 31071528, 71101095 and 11171117, the National Natural Foundation of Guangdong Province, China under No. S2011010002371, and the Ministry of Education in China Project of Humanities and Social Science under No. 11YJCZH195.

Author information

Authors and Affiliations

Department of Applied Mathematics, South China Agricultural University, Guangzhou, 510642, China
Hongyi Peng, Yinlian Fu & Jinshan Liu
College of Food Science, South China Agricultural University, Guangzhou, 510642, China
Xiang Fang
College of Mathematics and Computational Science, Shenzhen University, Shenzhen, 518060, China
Chunfu Jiang

Corresponding author

Correspondence to Hongyi Peng.

About this article

Cite this article

Peng, H., Fu, Y., Liu, J. et al. Optimal gene subset selection using the modified SFFS algorithm for tumor classification. Neural Comput & Applic 23, 1531–1538 (2013). https://doi.org/10.1007/s00521-012-1148-2

Received: 15 April 2012
Accepted: 24 August 2012
Published: 06 September 2012
Issue Date: November 2013
DOI: https://doi.org/10.1007/s00521-012-1148-2
