Skip to main content
Log in

Codeword design and information encoding in DNA ensembles

  • Published:
Natural Computing Aims and scope Submit manuscript

Abstract

Encoding of information in DNA-, RNA- and other biomolecules is animportant area of research in fields such as DNA computing,bioinformatics, and, conceivably, microbiology and genetics. This surveyfocuses on two fundamental problems, the codeword design problemand the representation problem of abiotic information, formassively parallel processing with DNA molecules. The first problemrequires libraries of DNA sequences to be designed so that specificduplexes are formed during annealing while simultaneously preventingother undesirable hybridizations from occurring in the course of acomputation in the tube. The second involves a search for efficient andcost-effective methods of representing non-biological information in DNAsequences for storage and retrieval of large amouns of data (tera- andpeta-byte scales). Two approaches are treated, namely thermodynamic andcombinatoric-computational. Both experimental and theoretical resultsare described. A reference list of major works in the area is given.Finally, some open problems deemed important for their possible impacton encoding of abiotic information representation and processing arediscussed.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Adleman L (1994) Molecular computation of solutions of combinatorial problems. Science 266: 1021–1024

    Google Scholar 

  • Allawi HT and SantaLucia J Jr (1997) Thermodynamics and NMR of internal G·Cmis matches in DNA. Biochemistry 36: 10581–10594

    Article  Google Scholar 

  • Andronescu M, Dees D, Slaybaugh L, Zhao Y, Condon AE, Cohen B and Skiena S (2002) Algorithms for testing that DNA word designs avoid secondary structures. In: Ohuchi Hagiya A. (ed), pp. 182–195

  • Arita M and Kobayashi S (2002) DNA sequence design using templates. New Generation Computing 20(3): 263–277

    Google Scholar 

  • Ausubel FM, Brent R, Kingston RE, Moore DD, Seidman JG, Smith JA, Struhl K, Wang-Iverson P and Bonitz SG (1993) Current Protocols in Molecular Biology. Greene Publishing Associates and Wiley-Interscience, New York

    Google Scholar 

  • Bahnzhaf W, Daida J, Eiben AE, Garzon MH, Hanovar V, Jakiela M and Smith RE (1999) Proc. of GECCO-99, the Genetic and Evolutionary Computation Conference, Orlando, Florida, Morgan Kaufmann, San Mateo, CA

  • Baum E (1995) Building an associative memory vastly larger than the brain. Science 268: 583–585

    Google Scholar 

  • Baum E (1996) DNA sequences useful for computation. In: Landweber and Baum (eds), pp. 122–127

  • Ben-Dor A, Karp R, Schwikowski B and Yakhini Z (2000) Universal DNA tag systems: A combinatorial design scheme. In: Proceedings of the Fourth International Conference on Computational Molecular Biology, ACM. Tokyo, Japan, April 8–11

  • Bi H, Chen J, Deaton R, Garzon M, Rubin H and Wood D (2003) A PCR-based protocol for in vitroselection of non-crosshybridizing oligonucleotides. J. of Natural Computing 2(3): 417–426

    Article  Google Scholar 

  • Bloomfield VA, Crothers DM and Tinoco I Jr (2000) Nucleic Acids: Structures, Properties, and Functions. University Science Books, Sausalito, CA

    Google Scholar 

  • Boneh D, Dunworth C, Lipton RJ and Sgall J (1996) Making DNA computers error-resistant. In: Landweber and Baum (eds), pp. 163–170

  • Borer PN, Dengler B, Tinoco I Jr. and Uhlenbeck OC (1974) Stability of ribonucleic acid double-stranded helices. J. Mol. Biol. 86: 843–853

    Article  Google Scholar 

  • Brenneman B and Condon A (2001) Sequence Design for Biomolecular Computation. In press. Available at http://www.cs.ubc.edu/~condon/papers/wordsurvey.ps

  • Cantor CR and Schimmel PR (1980) Biophysical Chemistry, Part III: The Behavior of Biological Macromolecules. Freeman, New York

    Google Scholar 

  • Carbone A and Seeman N (2002) Circuits and programmable self-assembling DNA structures. Proc. of the National Academy of Science USA 99: 12577–12582

    Article  Google Scholar 

  • Condon A and Rozenberg G eds (2000) Proc. 6th Int. Workshop on DNA-Based Computers DNA 2000, Leiden, The Netherlands. Lecture Notes in Computer Science LNCS 2054, Springer-Verlag

  • Cukras A, Faulhammer D, Lipton R and Landweber L (1998) Chess games: A model for RNA-based computation. In: Kari et al. (eds), pp. 35–45

  • Deaton R, Kim JW and Chen J (2003) Design and test of non-crosshybridizing oligonucleotide building blocks for DNA computers and nanostructures. Appl. Phys. Lett. 82: 1305–1307

    Article  Google Scholar 

  • Deaton RJ, Chen J, Bi H, Garzon M, Rubin H and Wood DH (2002a) A PCR-based protocol for in vitroselection of non-crosshybridizing oligonucleotides. In: Ohuchi Hagiya A (eds), pp. 105–114

  • Deaton RJ, Chen J, Bi H and Rose JA (2002b) A software tool for generating noncrosshybridizing libraries of DNA oligonucleotides. In: Ohuchi Hagiya A (eds), pp. 211–220

  • Deaton R, Murphy R, Rose J, Garzon M, Franceschetti D and Stevens SE Jr (1997a) Good encodings for DNA solution to combinatorial problems. Proc. IEEE Conference on Evolutionary Computation, IEEE Computer Society Press (1997), pp. 267–271

  • Deaton R, Garzon M and Rose JA (1997b) A DNA based artificial immune system for self-nonself discrimination. Proc. of the IEEE Int. Conference on Systems, Man and Cybernetics, Orlando. IEEE Press, Piscataway, NJ, pp. 369–374

    Google Scholar 

  • Deaton R, Murphy R, Garzon M, Franceschetti D and Stevens E Jr (1996) Good encodings for DNA solution to combinatorial problems. Second Princeton Conference on DNA-based Computers, June 1996. DIMACS series of the American Mathematical Society 44 (1998), pp. 247–258

    Google Scholar 

  • Faulhammer D, Cukras A, Lipton R and Landweber L (1999) When the knight falls: On constructing an RNA computer. In: Winfree D. Gifford (ed.), pp. 1–7

  • Feldkamp U, Saghafi S, Banzhaf W and Rouche H (2001) DNA sequence generator: A program for the construction of DNA sequences. In: Jonoska and Seeman, 2002, pp. 23–32

  • Fraenkel AS (1997) Protein folding, spin glass and computational complexity. In: Rubin and Wood (eds), pp. 101–121

  • Garey MR and Johnson DS (1979) Computers and Intractability. Freeman, New York

    Google Scholar 

  • Garzon M (ed) (2003a) Biomolecular machines and artificial evolution. Special Issue of the Journal of Genetic Programming and Evolvable Machines 4(2). Kluwer Academic Publishers

  • Garzon M, Blain D, Bobba K, Neel A and West M (2003b) Self-assembly of DNA-like structures in-silico. In: Garzon (ed.), pp. 185–200

  • Garzon M, Neel A and Chen H (2003c) Efficiency and reliability of DNA-based memories. In: E. Cantu-Paz et al. (eds), Proc. GECCO-2003, The Genetic and Evolutionary Programming Conference. Springer-Verlag Lecture Notes in Computer Science 2723, pp. 379–389

  • Garzon M, Bobba K and Hyde B (2003d) Digital information encoding on DNA. In: Jonoska N, Paun G, Rozenberg G (eds), Aspects of Molecular Computing. Springer-Verlag, Lecture Notes in Computer Science 2950, pp. 152–166

  • Garzon M, Neel A and Bobba K (2003e) Efficiency and reliability of semantic retrieval in DNA-based memories. In: J. Reif and J. Chen (eds) Proc. DNA9-2003, Int. Workshop on DNA-based Computers. Springer-Verlag Lecture Notes in Computer Science, in press

  • Garzon M and Oehmen C (2001) Biomolecular computing in virtual test tubes. In: Jonoska and Seeman, 2002, pp. 117–128

  • Garzon M and Deaton RJ (2000) Biomolecular computing: A definition. Kunstliche Intelligenz 1/00: 63–72

    Google Scholar 

  • Garzon M and Deaton RJ (1999) Biomolecular computing and programming. IEEE Trans. on Evolutionary Comp. 3(2): 36–50

    Google Scholar 

  • Garzon M, Deaton R, Rose JA and Franceschetti DR (1999b) Soft molecular computing. In: Winfree and Gifford (eds), pp. 89–98

  • Garzon M, Neathery PI, Deaton R, Murphy RC, Franceschetti DR and Stevens SE Jr (1997a) A new metric for DNA computing. In: Koza et al., 1997, pp. 472–478

  • Garzon M, Deaton R, Neathery P, Murphy RC, Franceschetti DR and Stevens E Jr (1997b) On the encoding problem for DNA computing. Poster at The Third DIMACS Workshop on DNA-based Computing, U of Pennsylvania. Preliminary Proceedings, pp. 230–237

  • Gray DM(1997) Derivation of nearest-neighbor properties from data on nucleic acid oligomer. I. Simple sets of independent sequences and the influence of absent nearest-neighbors. Biopolymers 42: 783–793

    Article  Google Scholar 

  • Gray DM and Tinoco I Jr (1970) A new approach to the study of sequence-dependent properties of polynucleotides. Biopolymers 9: 223–244

    Article  Google Scholar 

  • Hagiya M and Ohuchi A eds (2002) Proc. 8th Int. Meeting on DNA-Based Computers. Springer-Verlag Lecture Notes in Computer Science LNCS 2568. Springer-Verlag

  • Hartemink AJ and Gifford DK (1997) Thermodynamic simulation of deoxyoligonucleotide hybridization of DNA computation. In: Rubin and Wood (eds), pp. 25–37

  • Head T and Yamamura M and Gal S (2001) Relativized code concepts and multi-tube DNA dictionaries, in press

  • Head T, Yamamura M and Gal S (1999) Aqueous computing: Writing on molecules. Proceedings of the Congress on Evolutionary Computing (CEC'99)

  • Heitsch CE, Condon AE and Hood HH (2002) From RNA secondary structure to coding theory: A combinatorial approach. In: Hagiya and Ohuchi, pp. 125–136

  • Hinze T, Hatnik U and Sturm M (2001) An object oriented simulation of real occurring molecular biological processes for DNA computing and its experimental verification. In: Jonoska and Seeman, 2002, pp. 1–13

  • Holland JH (1992) Adaptation in Natural and Artificial Systems. MIT Press, Cambridge, MA

    Google Scholar 

  • Hopcroft, J and Ullman J (1979). Introduction to Automata Theory, Languages and Computation. Addison-Wesley, Reading, MA

    Google Scholar 

  • Hussini S, Kari L, Rubin H and Konstantinidis S (2001) Coding properties of DNA languages. In: Jonoska and Seeman, 2002, pp. 57–69

  • Jonoska, N and Seeman N eds (2001) Proc. 7th Int. Workshop on DNA-Based Computers DNA 2000, Tampa, Florida. Lecture Notes in Computer Science LNCS 2340, Springer-Verlag, 2002

  • Kari L, Kitto R and Thierrin G (2000) Codes, envolutions and DNA encoding. In: Proc. Workshop on Coding Theory, London, Ontario, in press

  • Kari L, Rubin H and Wood D eds (1998) Proc. 4th DIMACS workshop on DNA Computers, University of Pennsylvania, 1998. Special issue of Biosystems 52: 1–3 (1999)

  • Karp R, Kenyon C and Waarts O (1996) Error-resilient DNA Computation. Proc. 7th Annual Symposium on Discrete Algorithms SODA, pp. 458–467

  • Karp RM and Wigderson A (1985) A fast parallel algorithm for the maximal independent set problem. Journal of the Association for Computing Machinery 32: 762–773

    Google Scholar 

  • Klein JP, Leete TH and Rubin H (1997) A biomolecular implementation of logically reversible computation with minimal energy dissipation. In: Rubin and Wood, pp. 15–23

  • Kool ET (1999) New molecular strategies for joining, pairing, and amplifying. In: Winfree and Gifford, p. 62

  • Koza JR, Deb K, Dorigo M, Fogel DB, Garzon M, Iba H and Riolo RL (eds) (1998) Proc. 3rd Annual Genetic Programming Conference. Morgan Kaufmann, San Mateo, California

    Google Scholar 

  • Koza JR, Deb K, Dorigo M, Fogel DB, Garzon M, Iba H and Riolo RL (eds) (1997) Proc. 2nd Annual Genetic Programming Conference. Morgan Kaufmann, San Mateo, California

    Google Scholar 

  • Landweber, L and Kari L (1999) The evolution of cellular computing: Nature's solution to a computational problem. In: Kari et al., 1998, pp. 3–13

  • Landweber LF and Baum EB (eds) (1996) DNA Based Computers II. Proc. of the Second DIMACS Workshop on DNA-Based Computers, Princeton University. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Providence, RI: American Mathematical Society, vol. 44 (1998)

  • Luby M (1986) A simple parallel algorithm for the maximal independent set problem. SIAM Journal on Computing 15: 1036–1053

    Article  Google Scholar 

  • Mao C, LaBean TH, Reif JH and Seeman NC (2000) Logical computation using algorithmic self-assembly of DNA triple-crossover molecules. Nature 407: 493–496

    Article  Google Scholar 

  • Marathe A, Condon AE and Corn RM(1999) On combinatorial DNA word design. In:Winfree and Gifford, 1999, pp. 75–88

  • McCaskill JS (1990) The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers 29: 1105–1119

    Article  Google Scholar 

  • Mezard M, Parisi G and Virasoro MA (1987) Spin Glass Theory and Beyond.World Scientific, Singapore

    Google Scholar 

  • Mitchell J and Yurke B (2001) DNA scissors. In: Jonoska and Seeman, 2002, pp. 263–268

  • Monasson R, Zecchina R, Kirkpatrick S, Selman B and Troyansky L (1999) Determining computational complexity from characteristic phase transitions. Nature 400: 133–137

    Article  Google Scholar 

  • Mount D (2001) Bioinformatics: Sequence and Genome Analysis. Spring Harbor Lab Press

  • MD Nusser M and Deaton RJ (2003) DNA computing with in vitroselection. In: Garzon, pp. 173–183

  • Poland D and Scheraga HA (1970) Theory of Helix-Coil Transitions in Biopolymers. Academic Press, Inc., New York

    Google Scholar 

  • Reif JH, LaBean TH, Pirrung M, Rana VS, Guo B, Kingsford C and Wickman GS (2001) Experimental construction of very-large scale databases with associative search capability. In: Jonoska and Seeman, 2002, pp. 231–247

  • Roman J (1995) The Theory of Error-Correcting Codes. Springer-Verlag, Berlin

    Google Scholar 

  • Rose JA, Deaton RJ, Hagiya M and Suyama A (2000) The fidelity of the tag-antitag system. In: Jonoska and Seeman, 2001, pp. 138–149

  • Rose JA, Deaton R, Franceschetti DR, Garzon MH, Stevens SE Jr (1999) A statistical mechanical treatment of error in the annealing biostep of DNA computation. Banzhaf et al., pp. 1829–1834

  • Rose JA, Deaton R, Garzon M, Murphy RC, Franceschetti DR and Stevens SE Jr (1997) The effect of uniform melting temperatures on the efficiency of DNA computing. In: Rubin and Wood, pp. 35–42

  • Rubin H and Wood D (eds) (1997) Third DIMACS Workshop on DNA-Based Computers, The University of Pennsylvania, 1997. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Providence, RI. American Mathematical Society 48 (1999)

  • SantaLucia J (2001) DNA hybridization is not digital. Plenary address at the Seventh International Meeting on DNA Based Computers

  • SantaLucia J Jr (1998) A unified view of polymer, dumbell, and oligonucleotide nearestneighbor thermodynamics. Proc. Natl. Acad. Sci. 95: 1460–1465

    Article  Google Scholar 

  • SantaLucia J Jr, Allawi HT and Seneviratne PA (1996) Improved nearest-neighbor parameters for predicting DNA duplex stability. Biochemistry 35: 3555–3562

    Article  Google Scholar 

  • Shin SY, Kim DM, Lee IH and Zhang BT (2002) Evolutionary Sequence Generation for Reliable DNA Computing. In IEEE Congress on Evolutionary Computation 2002, in press

  • Shor PW (1996) Fault-tolerant quantum computation. Proc. IEEE Foundations of Computer Science FOCS, 56–65

  • Smith TF and Waterman MS (1981) The identification of common molecular subsequences. J. Mol. Biol. 147: 195–197

    Article  Google Scholar 

  • Wartell RM and Benight AS (1985) Thermal denaturation of DNA molecules. Physics Reports 126: 67–107

    Article  Google Scholar 

  • Watson JD, Hopkins NH, Roberts JW, Steitz JA and Weiner AM (1987) Molecular Biology of the Gene, 4th ed. The Benjamin/Cummings Publishing Co., Menlo Park, CA

    Google Scholar 

  • Weigt M and Hartmann AK (2000) Number of guards needed by a museum: A phase transition in vertex covering of random graphs. Phys. Rev. Lett. 84: 6118–6121

    Article  Google Scholar 

  • West M, Garzon MH and Blain D (2003) DNA-like genomes for evolution in silico. In: E. Cantu-Paz et al. (eds), Proc. GECCO-2003, The Genetic and Evolutionary Programming Conference. Springer-Verlag Lecture Notes in Computer Science 2723, pp. 413–424

  • Wetmur JG (1997) Physical chemistry of nucleic acid hybridization. In: Rubin and Wood, pp. 1–23

  • Winfree E and Gifford D eds. (1999) Fifth International Metting on DNA Based Computers, MIT, Boston, MA. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Providence, RI. American Mathematical Society, 44 (2000)

  • Winfree E, Liu F, Wenzler LA and Seeman NC (1998) Design and self-assembly of two-dimensional DNA crystals. Nature 394: 539–544

    Article  Google Scholar 

  • Yamamoto M, Yamashita J, Shiba T, Hirayama T, Takiya S, Suzuki K, Munekata M and Ohuchi A (1999) A study on the hybridization process in DNA computing. In: Winfree and Gifford, pp. 101–110

  • Yurke B and Mills AP Jr (2003) Using DNA to power Nanostructures. In: Garzon, 2003a, pp. 111–122

  • Yurke B, Tauberfield AJ, Mills AP Jr, Simmel FC and Neumann JL (2000) A DNA-fuelled molecular machine made of DNA. Nature 406: 605–608

    Article  Google Scholar 

  • Zhang BT and Shin SY (1998) Molecular algorithms for efficient and reliable DNA computing. In: Koza et al., 1998, pp. 735–742

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Max H. Garzon.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Garzon, M.H., Deaton, R.J. Codeword design and information encoding in DNA ensembles. Natural Computing 3, 253–292 (2004). https://doi.org/10.1023/B:NACO.0000036818.27537.c9

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1023/B:NACO.0000036818.27537.c9

Navigation