Construction of DNA codes by using algebraic number theory

https://doi.org/10.1016/j.ffa.2015.10.008
Under an Elsevier user license
open archive

Abstract

The canonical structure of DNA has four bases, Thymine (T), Adenine (A), Cytosine (C), and Guanine (G), and DNA codes are regarded as sets of words over the alphabet Σ={A,C,G,T} that satisfy certain combinatorial conditions. Good DNA codes are desirable for DNA computation, DNA microarray technologies, and molecular barcodes. One of the main tasks in DNA code design is to obtain more codewords and better GC-content for a given fixed word length n. Existing heuristic methods work well for small n. In this paper, we present a systematic method for constructing good DNA codes for large n by using irreducible cyclic codes. Unlike traditional DNA code constructions, our method is based on algebraic number theory rather than on classical heuristic algorithms or conventional coding theory. Furthermore, compared with traditional DNA codes, our codes have a larger number of codewords and better GC-content. As far as we know, this is the first time that irreducible cyclic codes have been used to construct DNA codes of this type.
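The abstract does not spell out which combinatorial conditions are meant. As a minimal illustration, the Python sketch below checks three constraints that are commonly imposed on DNA codes in the literature: a fixed GC-content, a minimum pairwise Hamming distance, and a minimum distance to reverse complements. The function names and the parameters `d` and `w` are hypothetical and chosen here for illustration; they are not the paper's notation, and the sketch says nothing about the irreducible cyclic code construction itself.

```python
from itertools import combinations

# Watson-Crick complement of each base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def gc_content(word):
    """Number of positions of `word` holding G or C."""
    return sum(1 for base in word if base in "GC")


def hamming(u, v):
    """Hamming distance between two words of equal length."""
    if len(u) != len(v):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(u, v))


def reverse_complement(word):
    """Reverse the word, then complement every base."""
    return "".join(COMPLEMENT[base] for base in reversed(word))


def check_code(code, d, w):
    """Check three assumed constraints on a list of equal-length words.

    1. every word has GC-content exactly w;
    2. distinct words are at Hamming distance at least d;
    3. every word is at distance at least d from the reverse
       complement of every word, itself included.
    """
    if any(gc_content(x) != w for x in code):
        return False
    if any(hamming(x, y) < d for x, y in combinations(code, 2)):
        return False
    return all(
        hamming(x, reverse_complement(y)) >= d for x in code for y in code
    )
```

For example, the word ACGT equals its own reverse complement, so any code containing it fails the third check for every d ≥ 1.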

MSC

94B05
94B15
20G40
12E20
05E15

Keywords

DNA codes
Irreducible cyclic codes
Gauss sum
Gauss period
