DNA motif discovery using chemical reaction optimization

  • Research Paper
  • Evolutionary Intelligence

Abstract

DNA motif discovery is the task of finding short, similar sequence elements within a set of nucleotide sequences. It has become essential in bioinformatics because of its applications in compression, summarization, and clustering. Motif discovery is an NP-hard problem, so no exact algorithm is known to solve it in polynomial time. Many optimization algorithms have been proposed for this problem, but none has proved superior across all of its difficulties. Chemical Reaction Optimization (CRO) is a population-based metaheuristic that adapts readily to optimization problems. Here, we propose an algorithm based on CRO to solve the DNA motif discovery problem. The four basic operators of CRO have been redesigned for this problem to search the solution space both locally and globally. Two additional operators (repair functions) are proposed to improve solution quality; they are applied to the final solution after the iteration stage of CRO to refine it. Combining the flexible elementary operators of CRO with the repair functions makes it possible to determine motifs more precisely. The proposed method is compared with traditional algorithms such as the Gibbs sampler, AlignACE (Aligns Nucleic Acid Conserved Elements), MEME (Multiple Expectation Maximization for Motif Elicitation), and ACRI (Ant-Colony-Regulatory-Identification) on real-world datasets. The experimental results show that the proposed algorithm yields higher-quality results than these algorithms in less running time. In addition, statistical tests have been performed to show the superiority of the proposed algorithm over other state-of-the-art methods in this area.
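To make the problem statement concrete, the following is a minimal sketch of motif discovery posed as a search over start positions. It is not the CRO algorithm of the paper: the abstract does not give the fitness function or the operator definitions, so the consensus score, the single-position move, and the names `consensus_score`, `local_move`, and `hill_climb` are all assumptions for illustration. The search loop is plain hill climbing, standing in for the local-search role that the redesigned CRO operators play.

```python
from collections import Counter
import random


def consensus_score(seqs, starts, k):
    """Sum, over the k motif columns, of the count of the most common base.

    Assumed scoring scheme (not taken from the paper): a higher score means
    the length-k windows chosen in each sequence agree more closely.
    """
    windows = [s[p:p + k] for s, p in zip(seqs, starts)]
    return sum(Counter(col).most_common(1)[0][1] for col in zip(*windows))


def local_move(seqs, starts, k, rng):
    """Hypothetical local operator: re-draw the start position in one sequence."""
    new = list(starts)
    i = rng.randrange(len(seqs))
    new[i] = rng.randrange(len(seqs[i]) - k + 1)
    return new


def hill_climb(seqs, k, steps=2000, seed=0):
    """Plain hill climbing over start positions (a stand-in, not CRO)."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - k + 1) for s in seqs]
    best = consensus_score(seqs, starts, k)
    for _ in range(steps):
        cand = local_move(seqs, starts, k, rng)
        score = consensus_score(seqs, cand, k)
        if score >= best:  # accept ties so the search can drift across plateaus
            starts, best = cand, score
    return starts, best


if __name__ == "__main__":
    # Toy input: the motif ACGT is planted at offsets 0, 2 and 1.
    toy = ["ACGTAC", "TTACGT", "GACGTT"]
    print(hill_climb(toy, k=4))
```

A population-based method such as CRO would maintain many candidate position vectors at once and mix local moves with more disruptive global ones; the sketch keeps a single candidate only to show the solution encoding and the scoring.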

Notes

  1. https://drive.google.com/open?id=1cEFVklnTfc5QZMxtSPhLSPFN25nJwm6K

Author information

Corresponding author

Correspondence to Sumit Kumar Saha.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Saha, S.K., Islam, M.R. & Hasan, M. DNA motif discovery using chemical reaction optimization. Evol. Intel. 14, 1707–1726 (2021). https://doi.org/10.1007/s12065-020-00444-2

  • DOI: https://doi.org/10.1007/s12065-020-00444-2
