DNA chips for species identification and biological phylogenies

Natural Computing

Abstract

The codeword design problem is an important problem in DNA computing and its applications. Several theoretical analyses as well as practical solutions for short oligonucleotides (up to 20-mers) have been generated recently. These solutions have, in turn, suggested new applications to DNA-based indexing and natural language processing, in addition to the obvious applications to the problems of reliability and scalability that generated them. Here we continue the exploration of this type of DNA-based indexing for biological applications. First, we show that DNA noncrosshybridizing (nxh) sets can be successfully applied to infer ab initio phylogenetic trees by providing a way to measure distances among different genomes indexed by sets of short oligonucleotides selected so as to minimize crosshybridization. These phylogenies are solidly established and well accepted in biology. The new technique is much more effective in terms of signal-to-noise ratio, cost and time than current methods. Second, we demonstrate that DNA indexing provides novel and principled insights into the phylogenesis of organisms hitherto inaccessible by current methods, such as a prediction that the Salmonella plasmid 50 was acquired horizontally, likely from some bacteria somewhat related to Yersinia. Finally, DNA indexing can be scaled up to universal DNA chips that have recently become readily available both in vitro and in silico. In particular, we show how one such recently obtained set of nxh 16-mers can be used as a universal coordinate system in DNA spaces to characterize very large groups (families, genera, and even phyla) of organisms on a uniform biomarker reference system, a veritable and comprehensive “Atlas of Life”, as it is or as it could be on earth.
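
The idea of indexing genomes by a fixed probe set and measuring distances between the resulting signatures can be illustrated with a minimal Python sketch. It is not the authors' method: the function names are hypothetical, hybridization is approximated by exact Watson-Crick complementarity (a real model would score binding thermodynamically), the Euclidean distance is an assumed choice of metric, and the probes and genomes are toy strings rather than real nxh 16-mers or sequences.

```python
from itertools import combinations
import math

# Assumption: only the four unambiguous bases occur in the sequences.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the Watson-Crick reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def signature(genome, probes):
    """Index a genome by a probe set.

    Assumption: a probe 'hybridizes' to a window of the genome only if the
    window is the probe's exact reverse complement. The result is the
    fraction of all hits that falls on each probe (or all zeros if no
    probe binds anywhere).
    """
    k = len(probes[0])
    targets = {reverse_complement(p): i for i, p in enumerate(probes)}
    counts = [0] * len(probes)
    for start in range(len(genome) - k + 1):
        hit = targets.get(genome[start:start + k])
        if hit is not None:
            counts[hit] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def distance(sig_a, sig_b):
    """Euclidean distance between two signatures (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sig_a, sig_b)))


def closest_pair(signatures):
    """Return the two names whose signatures are nearest to each other.

    This is the first merge step of a simple agglomerative tree builder.
    """
    return min(
        combinations(sorted(signatures), 2),
        key=lambda pair: distance(signatures[pair[0]], signatures[pair[1]]),
    )


if __name__ == "__main__":
    # Toy 4-mer probes and toy 'genomes', purely illustrative.
    probes = ["AAAC", "CCGA", "GGTA"]
    genomes = {
        "org_a": "GTTTGTTTTCGGACGT",
        "org_b": "GTTTGTTTTCGGTCGG",
        "org_c": "TACCTACCTACCGTTT",
    }
    sigs = {name: signature(seq, probes) for name, seq in genomes.items()}
    for name, sig in sigs.items():
        print(name, sig)
    print("closest pair:", closest_pair(sigs))
```

Repeating the closest-pair merge on the remaining signatures would yield a full distance-based tree; any standard hierarchical clustering routine could be substituted for that step.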

Acknowledgments

Many thanks to Abishek Logishetty and Jason Knisley for their help in producing the visualization of the signatures and trees above.

Author information

Corresponding author

Correspondence to Max H. Garzon.

About this article

Cite this article

Garzon, M.H., Wong, TY. DNA chips for species identification and biological phylogenies. Nat Comput 10, 375–389 (2011). https://doi.org/10.1007/s11047-010-9232-y

  • DOI: https://doi.org/10.1007/s11047-010-9232-y
