Identification of Multiple Gene Subsets Using Multi-objective Evolutionary Algorithms

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2632)

Abstract

In bioinformatics, identifying the gene subsets responsible for classifying available samples into two or more classes (for example, ‘malignant’ or ‘benign’) is an important task. The main difficulties in solving the resulting optimization problem are that only a few samples are available compared to the number of genes in each sample, and that the search space of solutions is exorbitantly large. Although a few applications of evolutionary algorithms (EAs) to this task exist, we treat the problem as a multi-objective optimization problem of simultaneously minimizing the gene subset size and the number of misclassified samples. Contrary to past studies, we have discovered that a small gene subset (of size four or five, for example) is enough to correctly classify 100% or nearly 100% of the samples in three cancer data sets (Leukemia, Lymphoma, and Colon). Besides a few variants of NSGA-II, in one implementation NSGA-II is modified to find multi-modal non-dominated solutions, discovering as many as 630 different three-gene combinations that provide 100% correct classification of the Leukemia data. To perform the identification task with more confidence, we have also introduced a threshold on the prediction strength. All simulation results show consistent gene subset identification on the three disease data sets and demonstrate the flexibility and efficacy of using a multi-objective EA for the gene identification task.
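
The two-objective formulation in the abstract (minimize gene subset size, minimize misclassified samples) can be illustrated with a minimal Python sketch of Pareto dominance. This is not the authors' NSGA-II implementation: the names `dominates` and `non_dominated` are hypothetical, and the candidate values are made up for illustration rather than taken from the paper.

```python
from typing import List, Tuple

# Assumed encoding: (gene subset size, number of misclassified samples).
Objectives = Tuple[int, int]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Objectives]) -> List[Objectives]:
    """Return the points not dominated by any other point (both objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates with made-up objective values (not results from the paper).
candidates = [(3, 2), (4, 0), (5, 0), (2, 5), (3, 4), (4, 0)]
front = non_dominated(candidates)
```

Here `(5, 0)` and `(3, 4)` are dropped because `(4, 0)` and `(3, 2)` dominate them. Both copies of `(4, 0)` are kept, since neither strictly improves on the other; this loosely mirrors the multi-modal idea in the abstract, where different gene subsets can share the same objective values.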

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Raji Reddy, A., Deb, K. (2003). Identification of Multiple Gene Subsets Using Multi-objective Evolutionary Algorithms. In: Fonseca, C.M., Fleming, P.J., Zitzler, E., Thiele, L., Deb, K. (eds) Evolutionary Multi-Criterion Optimization. EMO 2003. Lecture Notes in Computer Science, vol 2632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36970-8_44

  • DOI: https://doi.org/10.1007/3-540-36970-8_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-01869-8

  • Online ISBN: 978-3-540-36970-7

  • eBook Packages: Springer Book Archive
