Abstract
DNA sequences are sequences over the quaternary DNA alphabet {A, C, G, T}. Two important properties of such sequences are their directedness and their ability to form duplexes through hybridization, i.e., the coalescing of two oppositely directed sequences. Biological experiments that exploit these properties require an ensemble of such sequences (a DNA code) consisting of pairs of DNA sequences referred to as Watson-Crick duplexes. Furthermore, for any two words of the DNA code that do not form a Watson-Crick duplex, the hybridization energy (a measure of the stability of a potential DNA duplex) must be bounded above by a constant specified by the conditions of the experiment. This problem can be naturally interpreted in terms of coding theory. Continuing our previous work, we consider a nonadditive similarity function for two DNA sequences, which most adequately models their hybridization energy. For the maximum cardinality of DNA codes based on this similarity, we establish a Singleton upper bound and present an example of an optimal construction. Using ensembles of DNA codes with special constraints on codewords, which we call Fibonacci ensembles, we obtain a random-coding lower bound on the maximum cardinality of DNA codes under this similarity function.
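The notion of a Watson-Crick duplex can be illustrated with a minimal sketch. It assumes only the standard base pairing (A with T, C with G) and the fact that the two strands of a duplex run in opposite directions. The function names and the four-word example code are hypothetical and chosen for illustration; the sketch does not compute the nonadditive similarity function studied in the paper.

```python
# Standard Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the oppositely directed, base-wise complementary sequence.

    A sequence and its reverse complement together form a Watson-Crick duplex.
    """
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_closed_under_reverse_complement(code):
    """Check that every word of the code has its duplex partner in the code."""
    return all(reverse_complement(word) in code for word in code)


# Hypothetical toy code made of two Watson-Crick pairs of length 4.
example_code = {"AACG", "CGTT", "ACCA", "TGGT"}

print(reverse_complement("AACG"))
print(is_closed_under_reverse_complement(example_code))
```

In this toy code, AACG pairs with CGTT and ACCA pairs with TGGT, so the closure check succeeds. A real DNA code would additionally have to satisfy the hybridization energy constraint on all non-duplex pairs.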
Additional information
Original Russian Text © A.G. D’yachkov, A.N. Kuzina, N.A. Polyansky, A. Macula, V.V. Rykov, 2014, published in Problemy Peredachi Informatsii, 2014, Vol. 50, No. 3, pp. 51–75.
Cite this article
D’yachkov, A.G., Kuzina, A.N., Polyansky, N.A. et al. DNA codes for nonadditive stem similarity. Probl Inf Transm 50, 247–269 (2014). https://doi.org/10.1134/S0032946014030041
DOI: https://doi.org/10.1134/S0032946014030041