DNA-LCEB: a high-capacity and mutation-resistant DNA data-hiding approach by employing encryption, error correcting codes, and hybrid twofold and fourfold codon-based strategy for synonymous substitution in amino acids

  • Original Article
  • Medical & Biological Engineering & Computing

Abstract

Data-hiding in deoxyribonucleic acid (DNA) sequences can be used to develop an organic memory and to track parent genes in offspring as well as in genetically modified organisms. However, the main concerns regarding data-hiding in DNA sequences are the survival of the organism and the successful extraction of the watermark from the DNA. This implies that the organism should live and reproduce without any functional disorder even in the presence of the embedded data. Consequently, synonymous substitution in the codons of amino acids becomes a primary option for watermarking, because it changes the nucleotide sequence without changing the encoded protein. In this regard, a hybrid watermark embedding strategy that employs synonymous substitution in both twofold and fourfold codons of amino acids is proposed. This work thus presents a high-capacity and mutation-resistant watermarking technique, DNA-LCEB, for hiding secret information in the DNA of living organisms. By employing the different types of synonymous codons of amino acids, the data storage capacity is significantly increased. It is further observed that the proposed DNA-LCEB, which combines synonymous substitution, lossless compression, encryption, and Bose–Chaudhuri–Hocquenghem (BCH) coding, is secure and performs better in terms of both capacity and robustness than existing DNA data-hiding schemes. The proposed DNA-LCEB is tested against different mutations, including silent, missense, and nonsense mutations, and provides substantial improvement in terms of mutation detection/correction rate and bits per nucleotide. A web application for DNA-LCEB is available at http://111.68.99.218/DNA-LCEB.
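
The idea of hiding bits through synonymous substitution in twofold and fourfold codons can be illustrated with a minimal Python sketch. It is not the authors' DNA-LCEB implementation: the compression, encryption, and BCH stages are omitted, and the bit-to-base mapping, the function names (`third_base_choices`, `embed`, `extract`, `translate`), and the rule for classifying codons are assumptions made for illustration only. Here a codon counts as fourfold when all four third bases give the same amino acid (2 bits per codon) and as twofold when its T/C or A/G partner gives the same amino acid (1 bit per codon), using the standard genetic code.

```python
# Illustrative sketch only; mapping and names are assumptions, not the paper's scheme.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))  # standard genetic code, '*' = stop
WIDTH = {0: 0, 2: 1, 4: 2}              # number of choices -> bits carried


def translate(dna):
    """Translate complete codons of an uppercase ACGT string into amino acids."""
    return "".join(CODON_TABLE[dna[k:k + 3]] for k in range(0, len(dna) - len(dna) % 3, 3))


def third_base_choices(codon):
    """Return the interchangeable third bases for this codon ('' if none)."""
    prefix, third = codon[:2], codon[2]
    aa = CODON_TABLE[codon]
    if aa == "*":
        return ""                       # never touch stop codons
    if all(CODON_TABLE[prefix + b] == aa for b in BASES):
        return BASES                    # fourfold family: 2 bits
    pair = "TC" if third in "TC" else "AG"
    if all(CODON_TABLE[prefix + b] == aa for b in pair):
        return pair                     # twofold pair: 1 bit
    return ""


def embed(dna, bits):
    """Hide a bit string in the third codon positions without changing the protein."""
    out, i = [], 0
    for k in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[k:k + 3]
        opts = third_base_choices(codon)
        width = WIDTH[len(opts)]
        if width and i + width <= len(bits):
            codon = codon[:2] + opts[int(bits[i:i + width], 2)]
            i += width
        out.append(codon)
    if i < len(bits):
        raise ValueError("cover sequence too short for payload")
    return "".join(out) + dna[len(out) * 3:]  # keep any trailing partial codon


def extract(dna, n_bits):
    """Recover n_bits by reading the third base of each usable codon."""
    bits = ""
    for k in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[k:k + 3]
        opts = third_base_choices(codon)
        width = WIDTH[len(opts)]
        if width and len(bits) + width <= n_bits:
            bits += format(opts.index(codon[2]), f"0{width}b")
    return bits


cover = "ATGGCTGATAAACTGTGGTAA"         # hypothetical cover gene
marked = embed(cover, "101101")
```

In this sketch the extractor must know the payload length in advance, and a single point mutation in a third base corrupts the recovered bits. That fragility is the kind of problem the error-correcting stage named in the abstract is meant to address.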

Acknowledgments

This work is supported by ICT R&D, Pakistan research grant project; ICTRDF/TR&D/2012/62-DEWS and COMSTECH-TWAS Joint Research Grants Program for Young Scientist; 12-216 RG/ITC/AS-C; UNESCO FR: 3240270865. We also thank Mr. Khurram Jawad for his help in improving the write-up of the manuscript.

Author information

Corresponding author

Correspondence to Asifullah Khan.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 153 kb)

About this article

Cite this article

Hafeez, I., Khan, A. & Qadir, A. DNA-LCEB: a high-capacity and mutation-resistant DNA data-hiding approach by employing encryption, error correcting codes, and hybrid twofold and fourfold codon-based strategy for synonymous substitution in amino acids. Med Biol Eng Comput 52, 945–961 (2014). https://doi.org/10.1007/s11517-014-1194-2

  • DOI: https://doi.org/10.1007/s11517-014-1194-2
