Abstract
We investigate discrete structures and the combinatorial modeling of weighted prefix trees for managing and analyzing DNA microarray data. We describe algorithms that construct weighted trees from these data. Using these trees, we propose methods to compute the appearance probability of a DNA microarray, to compare informational distances in gene expression between DNA microarrays, and to search for characteristic microarrays and for groups of candidate genes suggestive of a pathology.
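As a rough illustration of the idea, a weighted prefix tree can be sketched as a trie whose nodes count how many inserted words pass through them. The abstract does not give the paper's construction, so everything below is an assumption made for illustration: the class name `WeightedPrefixTree`, the encoding of a microarray as a word with one symbol per gene (`+` over-expressed, `-` under-expressed, `0` unchanged), and the reading of "appearance probability" as an empirical frequency.

```python
class WeightedPrefixTree:
    """Hypothetical trie: each node stores how many inserted words share its prefix."""

    def __init__(self):
        self.count = 0      # words whose prefix leads to this node
        self.ends = 0       # words that end exactly at this node
        self.children = {}  # symbol -> child node

    def insert(self, word):
        """Add one word (assumed: one discretised microarray profile)."""
        node = self
        node.count += 1
        for symbol in word:
            node = node.children.setdefault(symbol, WeightedPrefixTree())
            node.count += 1
        node.ends += 1

    def _find(self, word):
        """Return the node reached by reading `word`, or None if absent."""
        node = self
        for symbol in word:
            node = node.children.get(symbol)
            if node is None:
                return None
        return node

    def probability(self, word):
        """Empirical frequency of `word` among all inserted words."""
        node = self._find(word)
        if node is None or self.count == 0:
            return 0.0
        return node.ends / self.count

    def prefix_probability(self, prefix):
        """Empirical frequency of inserted words that start with `prefix`."""
        node = self._find(prefix)
        if node is None or self.count == 0:
            return 0.0
        return node.count / self.count


# Assumed toy data: four microarrays over three genes.
profiles = ["+0-", "+0-", "+00", "-0-"]
tree = WeightedPrefixTree()
for profile in profiles:
    tree.insert(profile)

p_full = tree.probability("+0-")           # 2 of 4 profiles
p_prefix = tree.prefix_probability("+0")   # 3 of 4 profiles
p_missing = tree.probability("000")        # never inserted
```

Storing counts on the nodes means the frequency of any profile or shared prefix can be read off by walking a single path, without rescanning the data.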
Tran, T., Nguyen, C.C. & Hoang, N.M. Management and analysis of DNA microarray data by using weighted trees. J Glob Optim 39, 623–645 (2007). https://doi.org/10.1007/s10898-007-9158-9
DOI: https://doi.org/10.1007/s10898-007-9158-9