Finding the Optimal Gene Order in Displaying Microarray Data

  • Conference paper
Genetic and Evolutionary Computation — GECCO 2003 (GECCO 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2724)

Abstract

Rapid advances in genome-scale sequencing have created a need for new techniques to process enormous amounts of genomic data. Microarrays, for example, generate so much gene expression data that it is usually analyzed with clustering algorithms. However, clustering algorithms have been ineffective for visualization because they do not consider the order of genes within each cluster. This paper proposes a hybrid genetic algorithm for finding the optimal order of microarray data, that is, of gene expression profiles. We formulate the problem as a new type of traveling salesman problem and apply a hybrid genetic algorithm to it. To use the 2D natural crossover, which operates on points in the plane, we apply Sammon’s mapping to project the microarray data into two dimensions. Experimental results show that the algorithm found improved gene orders for visualizing the gene expression profiles.
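The ordering problem described in the abstract can be illustrated with a small sketch. It is not the authors' method: the abstract does not give the exact formulation, the distance measure, or the details of the hybrid genetic algorithm. The sketch assumes an open-path objective (the sum of dissimilarities between genes that are adjacent in the display order), assumes `1 - Pearson correlation` as the dissimilarity, and uses a plain 2-opt local search in place of the genetic algorithm. The names `pearson_distance`, `order_cost`, and `two_opt` are hypothetical.

```python
import math


def pearson_distance(x, y):
    """Assumed dissimilarity: 1 - Pearson correlation of two expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant profile has no defined correlation; treat it as uncorrelated.
        return 1.0
    return 1.0 - sxy / math.sqrt(sxx * syy)


def order_cost(profiles, order, dist=pearson_distance):
    """Sum of dissimilarities between genes adjacent in the display order."""
    return sum(
        dist(profiles[order[i]], profiles[order[i + 1]])
        for i in range(len(order) - 1)
    )


def two_opt(profiles, order, dist=pearson_distance):
    """Reverse segments of the order for as long as that lowers the cost.

    This is a simple local search standing in for the paper's hybrid
    genetic algorithm, which the abstract does not specify.
    """
    order = list(order)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            for j in range(i + 1, len(order)):
                candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
                if order_cost(profiles, candidate, dist) < order_cost(profiles, order, dist) - 1e-12:
                    order = candidate
                    improved = True
    return order


# Toy data: genes 0 and 2 rise over time, genes 1 and 3 fall.
profiles = [
    [1.0, 2.0, 3.0],
    [3.0, 2.0, 1.0],
    [1.0, 2.0, 3.1],
    [3.0, 2.0, 0.9],
]
initial_order = [0, 1, 2, 3]
improved_order = two_opt(profiles, initial_order)
```

In the toy data the initial order alternates between rising and falling profiles, so every adjacent pair is dissimilar. The local search moves similar profiles next to each other, which is the effect a good display order is meant to have.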

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, SK., Kim, YH., Moon, BR. (2003). Finding the Optimal Gene Order in Displaying Microarray Data. In: Cantú-Paz, E., et al. Genetic and Evolutionary Computation — GECCO 2003. GECCO 2003. Lecture Notes in Computer Science, vol 2724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45110-2_116

  • DOI: https://doi.org/10.1007/3-540-45110-2_116

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40603-7

  • Online ISBN: 978-3-540-45110-5

  • eBook Packages: Springer Book Archive
