Effective Gene Selection Method Using Bayesian Discriminant Based Criterion and Genetic Algorithms

Journal of Signal Processing Systems

Abstract

Microarray gene expression data usually contain a large number of genes, of which only a small fraction is informative for cancer diagnostic tests. This paper focuses on the effective identification of those informative genes. It uses a recently developed gene selection criterion based on the Bayesian discriminant, which measures the classification ability of a feature set and thereby supports high-quality gene selection. Beyond the cost function, the paper addresses the drawback of the conventional sequential forward search (SFS) method by designing a new genetic-algorithm-based search guided by the Bayesian discriminant criterion. The proposed strategies are evaluated on three cancer diagnosis problems using the classification results of three typical classifiers: a multilayer perceptron (MLP), a support vector machine (SVM), and a 3-nearest-neighbor rule classifier (3-NN). The results show that the proposed strategies improve gene selection performance substantially and are robust in all the investigated cases.
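To make the idea of a genetic-algorithm search over gene subsets concrete, the following is a minimal sketch and not the authors' method. The Bayesian discriminant criterion itself is not given in the abstract, so the fitness here is a stand-in: leave-one-out 3-NN accuracy minus a small subset-size penalty. The synthetic data, the function names (`make_data`, `loo_3nn_accuracy`, `fitness`, `ga_select`), and all parameter values are assumptions made purely for illustration.

```python
import random

def make_data(n_samples=40, n_genes=30, informative=(0, 1, 2), seed=0):
    """Synthetic two-class 'expression' data; only `informative` genes carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 3.0 * label  # shift class-1 samples on informative genes
        X.append(row)
        y.append(label)
    return X, y

def loo_3nn_accuracy(X, y, mask):
    """Leave-one-out 3-NN accuracy using only the genes where mask[j] == 1."""
    genes = [j for j, bit in enumerate(mask) if bit]
    if not genes:
        return 0.0
    correct = 0
    for i in range(len(X)):
        dists = sorted(
            (sum((X[i][j] - X[k][j]) ** 2 for j in genes), y[k])
            for k in range(len(X)) if k != i
        )
        votes = sum(label for _, label in dists[:3])  # class-1 votes among 3 neighbors
        correct += int((votes >= 2) == bool(y[i]))
    return correct / len(X)

def fitness(X, y, mask, size_penalty=0.005):
    """Stand-in criterion (assumption): accuracy minus a penalty per selected gene."""
    return loo_3nn_accuracy(X, y, mask) - size_penalty * sum(mask)

def ga_select(X, y, pop_size=20, generations=15, p_mut=0.05, seed=1):
    """Simple GA over binary gene masks: truncation selection, one-point
    crossover, bit-flip mutation, and elitism."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    pop = [[int(rng.random() < 0.1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(pop, key=lambda m: fitness(X, y, m))
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(X, y, m), reverse=True)
        parents = scored[: pop_size // 2]   # keep the fitter half as parents
        children = [scored[0][:]]           # elitism: carry over the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ int(rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = children
        best = max(pop + [best], key=lambda m: fitness(X, y, m))
    return best

if __name__ == "__main__":
    X, y = make_data()
    best = ga_select(X, y)
    print("selected genes:", [j for j, bit in enumerate(best) if bit])
    print("fitness:", round(fitness(X, y, best), 3))
```

Unlike SFS, which adds one gene at a time and never revisits earlier choices, a population of masks lets the search evaluate and recombine whole subsets. Swapping `fitness` for a different subset-level criterion leaves the rest of the sketch unchanged.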



Author information

Corresponding author

Correspondence to Tommy W. S. Chow.

About this article

Cite this article

Gan, Z., Chow, T.W.S. & Huang, D. Effective Gene Selection Method Using Bayesian Discriminant Based Criterion and Genetic Algorithms. J Sign Process Syst Sign Image 50, 293–304 (2008). https://doi.org/10.1007/s11265-007-0120-3

  • DOI: https://doi.org/10.1007/s11265-007-0120-3
