Motif identification method based on Gibbs sampling and genetic algorithm

Cluster Computing

Abstract

The regulation of gene expression is central to an organism's genetic mechanisms, and motif identification is an important step in constructing expression regulatory networks. Building on the Gibbs sampling method, this work constructs a position weight matrix and proposes a motif recognition method based on a genetic algorithm. A scoring function is defined to update the population and obtain the converged position weight matrix, which allows motifs of different lengths to be identified. Simulated and experimental data sets were used to verify the accuracy and execution time of the algorithm.
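The abstract names a position weight matrix and a scoring function but does not reproduce either, so the following is only a minimal sketch of the general idea and not the authors' formulation. It assumes a DNA alphabet, a uniform background frequency of 0.25, add-one pseudocounts, and a log-odds score; the function names `build_pwm`, `log_odds_score`, and `best_window` are hypothetical.

```python
import math

ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0):
    """Estimate per-position base probabilities from equal-length sites.

    The pseudocount (an assumption of this sketch) keeps unseen bases
    from receiving probability zero.
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in ALPHABET})
    return pwm


def log_odds_score(pwm, candidate, background=0.25):
    """Sum of log2(p / background) over positions (assumed scoring form)."""
    return sum(
        math.log2(column[base] / background)
        for column, base in zip(pwm, candidate)
    )


def best_window(pwm, sequence):
    """Return (score, start) of the highest-scoring window in sequence."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif width")
    return max(
        (log_odds_score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


if __name__ == "__main__":
    pwm = build_pwm(["ACGT", "ACGT", "ACGA"])
    score, start = best_window(pwm, "TTACGTTT")
    print(start, round(score, 3))
```

In a Gibbs-sampling or genetic-algorithm search, a score of this kind would typically serve as the fitness used to compare candidate site sets; how the paper combines the two methods is described in the full text rather than on this page.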

Author information

Corresponding author

Correspondence to Xiaochun Sheng.

About this article

Cite this article

Sheng, X., Wang, K. Motif identification method based on Gibbs sampling and genetic algorithm. Cluster Comput 20, 33–41 (2017). https://doi.org/10.1007/s10586-016-0699-x

  • DOI: https://doi.org/10.1007/s10586-016-0699-x
