Identification of Informative Genes for Molecular Classification Using Probabilistic Model Building Genetic Algorithm

  • Conference paper
Genetic and Evolutionary Computation – GECCO 2004 (GECCO 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3102)

Abstract

DNA microarrays allow the expression levels of thousands of genes in an organism to be monitored and measured simultaneously. Systematic computational analysis of this vast amount of data provides insight into many aspects of biological processes. Recently, there has been growing interest in classifying patient samples based on these gene expression levels. The main challenge is that the number of genes is overwhelming relative to the number of available training samples in the data set; many of these genes are irrelevant for classification and have a negative effect on the accuracy of the classifier. The choice of genes affects several aspects of classification: accuracy, required learning time, cost, and the number of training samples needed. In this paper, we propose a new Probabilistic Model Building Genetic Algorithm (PMBGA) for the identification of informative genes for molecular classification and present our unbiased experimental results on three benchmark data sets.
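
The abstract does not describe the proposed algorithm in detail, so the following is only a minimal sketch of how a probabilistic model building GA can be applied to gene selection in general. It assumes a univariate model (one inclusion probability per gene), a leave-one-out nearest-centroid classifier as the fitness measure, and a small penalty on subset size; the function names, the parameters (pop_size, generations, size_penalty) and the toy data are all hypothetical and are not taken from the paper.

```python
import random

def nearest_centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the selected genes."""
    genes = [j for j, bit in enumerate(mask) if bit]
    if not genes:
        return 0.0  # an empty gene subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        # Build class centroids from every sample except the held-out one.
        sums, counts = {}, {}
        for k in range(len(X)):
            if k == i:
                continue
            s = sums.setdefault(y[k], [0.0] * len(genes))
            for idx, j in enumerate(genes):
                s[idx] += X[k][j]
            counts[y[k]] = counts.get(y[k], 0) + 1

        def dist(label):
            return sum((X[i][j] - sums[label][idx] / counts[label]) ** 2
                       for idx, j in enumerate(genes))

        correct += min(sums, key=dist) == y[i]
    return correct / len(X)

def fitness(X, y, mask, size_penalty=0.01):
    # Assumed trade-off: reward accuracy, lightly penalise large gene subsets.
    return nearest_centroid_accuracy(X, y, mask) - size_penalty * sum(mask)

def pmbga_select(X, y, pop_size=30, generations=20, seed=0):
    """Univariate sketch: sample gene masks, keep the best half, re-estimate marginals."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    probs = [0.5] * n_genes  # initial probability of including each gene
    best_mask, best_fit = None, float("-inf")
    for _ in range(generations):
        population = [[int(rng.random() < p) for p in probs] for _ in range(pop_size)]
        ranked = sorted(population, key=lambda m: fitness(X, y, m), reverse=True)
        top = ranked[: pop_size // 2]
        top_fit = fitness(X, y, top[0])
        if top_fit > best_fit:
            best_mask, best_fit = top[0], top_fit
        # Model-building step: marginal frequency of each gene among the selected
        # masks, clamped so that no gene is permanently fixed in or out.
        probs = [min(0.95, max(0.05, sum(m[j] for m in top) / len(top)))
                 for j in range(n_genes)]
    return best_mask, best_fit

# Toy data (hypothetical): gene 0 separates the two classes, genes 1-4 are constant.
X = [[0.0, 0, 0, 0, 0], [0.1, 0, 0, 0, 0], [0.2, 0, 0, 0, 0],
     [1.0, 0, 0, 0, 0], [1.1, 0, 0, 0, 0], [0.9, 0, 0, 0, 0]]
y = [0, 0, 0, 1, 1, 1]
mask, score = pmbga_select(X, y)
print(mask, score)
```

Unlike a conventional GA, this loop has no crossover or mutation operator: new candidate gene subsets are sampled from the probability vector, which is re-estimated each generation from the better half of the population. On real microarray data the fitness would need to be computed on training samples only, with a separate test set held back for the final accuracy estimate.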

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Paul, T.K., Iba, H. (2004). Identification of Informative Genes for Molecular Classification Using Probabilistic Model Building Genetic Algorithm. In: Deb, K. (eds) Genetic and Evolutionary Computation – GECCO 2004. GECCO 2004. Lecture Notes in Computer Science, vol 3102. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24854-5_42

  • DOI: https://doi.org/10.1007/978-3-540-24854-5_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22344-3

  • Online ISBN: 978-3-540-24854-5

  • eBook Packages: Springer Book Archive
