Abstract
We develop and study the concept of similarity functions for q-ary sequences. For the case q = 4, these functions can be used for a mathematical model of the DNA duplex energy [1,2], which has a number of applications in molecular biology. Based on these similarity functions, we define a concept of DNA codes [1]. We give brief proofs for some of our unpublished results [3] connected with the well-known deletion similarity function [4–6]. This function is the length of the longest common subsequence; it is used in the theory of codes that correct insertions and deletions [5]. Principal results of the present paper concern another function, called the similarity of blocks. The difference between this function and the deletion similarity is that the common subsequences under consideration should satisfy an additional biologically motivated [2] block condition, so that not all common subsequences are admissible. We prove some lower bounds on the size of an optimal DNA code for the block similarity function. We also consider a construction of close-to-optimal DNA codes which are subcodes of the parity-check one-error-detecting code in the Hamming metric [7].
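As an illustration of the deletion similarity described above, the following minimal sketch computes the length of the longest common subsequence of two sequences by standard dynamic programming. The function name `deletion_similarity` and the use of strings over the alphabet {A, C, G, T} for the case q = 4 are assumptions made for this example; they are not notation from the paper. The block similarity is not sketched, because the abstract does not state the block condition.

```python
def deletion_similarity(x: str, y: str) -> int:
    """Return the length of the longest common subsequence of x and y.

    Hypothetical helper for illustration only: the abstract identifies the
    deletion similarity with this length, but the name and the string
    representation of q-ary sequences are assumptions.
    """
    # prev[j] holds the LCS length of the processed prefix of x and y[:j].
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                # Extend a common subsequence that ends just before both symbols.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last symbol of one of the two prefixes.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(deletion_similarity("ACGT", "ACGT"))
    print(deletion_similarity("ACGT", "TGCA"))
```

The sketch keeps only two rows of the dynamic-programming table, so memory use is linear in the length of the second sequence while running time is proportional to the product of the two lengths.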
REFERENCES
D'yachkov, A.G., Erdos, P.L., Macula, A.J., Rykov, V.V., Torney, D.C., Tung, C.-S., Vilenkin, P.A., and White, P.S., Exordium for DNA Codes, J. Combin. Optimization, 2003, vol. 7, no. 4, pp. 369–379.
D'yachkov, A.G., Macula, A.J., Pogozelski, W.K., Renz, T.E., Rykov, V.V., and Torney, D.C., A Weighted Insertion-Deletion Stacked Pair Thermodynamic Metric for DNA Codes, in Proc. 10th Int. Workshop on DNA Computing, Milan, Italy, 2004, pp. 90–103.
D'yachkov, A.G., Torney, D.C., Vilenkin, P.A., and White, P.S., On a Class of Codes for Insertion-Deletion Metric, in Proc. 2002 IEEE Int. Symp. on Information Theory, Lausanne, Switzerland, 2002, p. 372.
Levenshtein, V.I., Binary Codes Capable of Correcting Deletions, Insertions, and Reversals, Dokl. Akad. Nauk SSSR, 1965, vol. 163, no. 4, pp. 845–848 [Soviet Phys. Dokl. (Engl. Transl.), 1966, vol. 10, no. 8, pp. 707–710].
Levenshtein, V.I., Elements of Coding Theory, in Diskretnaya matematika i matematicheskie voprosy kibernetiki (Discrete Mathematics and Mathematical Problems of Cybernetics), Moscow: Nauka, 1974, pp. 207–305.
Levenshtein, V.I., Efficient Reconstruction of Sequences from Their Subsequences and Supersequences, J. Combin. Theory, Ser. A, 2001, vol. 93, no. 2, pp. 310–332.
MacWilliams, F.J. and Sloane, N.J.A., The Theory of Error-Correcting Codes, Amsterdam: North-Holland, 1977. Translated under the title Teoriya kodov, ispravlyayushchikh oshibki, Moscow: Svyaz', 1979.
D'yachkov, A.G. and Torney, D.C., On Similarity Codes, IEEE Trans. Inform. Theory, 2000, vol. 46, no. 4, pp. 1558–1564.
Vilenkin, P.A., Asymptotic Problems of Combinatorial Coding Theory and Information Theory, Cand. Sci. (Phys.-Math.) Dissertation, Moscow: Moscow State Univ., 2000.
Adleman, L., Molecular Computation of Solutions to Combinatorial Problems, Science, 1994, vol. 266, pp. 1021–1024.
Dancik, V., Expected Length of Longest Common Subsequence, PhD Thesis, Univ. of Warwick, UK, 1994.
Tenengolts, G.M., Nonbinary Codes, Correcting Single Deletions or Insertions, IEEE Trans. Inform. Theory, 1984, vol. 30, no. 5, pp. 766–769.
Levenshtein, V.I., Bounds for Deletion-Insertion Correcting Codes, in Proc. 2002 IEEE Int. Symp. on Information Theory, Lausanne, Switzerland, 2002, p. 371.
Additional information
Translated from Problemy Peredachi Informatsii, No. 4, 2005, pp. 57–77.
Original Russian Text Copyright © 2005 by D'yachkov, Vilenkin, Ismagilov, Sarbaev, Macula, Torney, White.
Cite this article
D'yachkov, A.G., Vilenkin, P.A., Ismagilov, I.K. et al. On DNA Codes. Probl Inf Transm 41, 349–367 (2005). https://doi.org/10.1007/s11122-006-0004-3
DOI: https://doi.org/10.1007/s11122-006-0004-3