A new fast fuzzy Cocke–Younger–Kasami algorithm for DNA strings analysis

Molina-Lozano, Herón

doi:10.1007/s13042-011-0042-z

Original Article
Published: 17 August 2011

Volume 2, pages 209–218, (2011)
International Journal of Machine Learning and Cybernetics

Abstract

In this paper we present a variation of the Cocke–Younger–Kasami (CYK) algorithm for the analysis of fuzzy context-free languages, applied to DNA strings. We prove that the proposed variation runs in O(n) time and uses only O(2n) memory locations. The fuzzy context-free grammar (FCFG) is obtained from the DNA. The algorithm can be used to find regulatory motifs, among other applications. To demonstrate it, we present two examples. In the first, we show that it is possible to define a fuzzy grammar for a prototype DNA sequence and then find the membership grade of any arbitrary sequence against this specific pattern. In the second, we construct a fuzzy grammar from an alignment of promoters obtained by a sequence logo algorithm for the Escherichia coli K12 DNA string, and then show how the proposed method can be used to discover regulatory motifs.
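
To make the idea of a membership grade concrete, the following is a minimal sketch of the textbook fuzzy CYK recognizer with max-min composition over a grammar in Chomsky normal form. It is not the O(n) variant proposed in the article, which the abstract does not describe; this baseline runs in cubic time. The function names `fuzzy_cyk` and `prototype_grammar`, the nonterminal names `P0`, `R0`, and so on, and the mismatch grade of 0.2 are all assumptions made for illustration.

```python
def prototype_grammar(prototype, mismatch=0.2):
    """Build a toy fuzzy CNF grammar that matches one prototype string.

    Hypothetical construction: position i accepts its prototype base with
    grade 1.0 and any other base with grade `mismatch` (an assumed value).
    """
    m = len(prototype)
    if m < 2:
        raise ValueError("prototype must have at least two symbols")
    terminal_rules = {}
    for i, base in enumerate(prototype):
        for symbol in "ACGT":
            grade = 1.0 if symbol == base else mismatch
            terminal_rules[(f"P{i}", symbol)] = grade
    # R_i derives the suffix of the prototype that starts at position i.
    binary_rules = {}
    for i in range(m - 1):
        right = f"R{i + 1}" if i < m - 2 else f"P{m - 1}"
        binary_rules[(f"R{i}", f"P{i}", right)] = 1.0
    return terminal_rules, binary_rules


def fuzzy_cyk(word, terminal_rules, binary_rules, start="R0"):
    """Return the max-min membership grade of `word` (0.0 if not derivable)."""
    n = len(word)
    if n == 0:
        return 0.0
    # table[i][j] maps each nonterminal to its best grade for word[i..j].
    table = [[{} for _ in range(n)] for _ in range(n)]
    for i, symbol in enumerate(word):
        cell = table[i][i]
        for (lhs, terminal), grade in terminal_rules.items():
            if terminal == symbol:
                cell[lhs] = max(cell.get(lhs, 0.0), grade)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            cell = table[i][j]
            for k in range(i, j):
                left, right = table[i][k], table[k + 1][j]
                for (lhs, b, c), grade in binary_rules.items():
                    if b in left and c in right:
                        # A derivation is as strong as its weakest rule (min);
                        # the best derivation wins (max).
                        candidate = min(grade, left[b], right[c])
                        cell[lhs] = max(cell.get(lhs, 0.0), candidate)
    return table[0][n - 1].get(start, 0.0)


terminal_rules, binary_rules = prototype_grammar("ACGT")
exact = fuzzy_cyk("ACGT", terminal_rules, binary_rules)
one_mismatch = fuzzy_cyk("ACGA", terminal_rules, binary_rules)
wrong_length = fuzzy_cyk("ACG", terminal_rules, binary_rules)
```

Under these assumptions, a sequence identical to the prototype receives grade 1.0, a sequence with a mismatch receives the mismatch grade, and a sequence of a different length receives 0.0 because this toy grammar has no insertion or deletion rules. With max-min composition the grade reflects only the weakest rule used, so several mismatches score the same as one.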

Acknowledgments

This work was supported by the Instituto de Ciencia y Tecnologia del Distrito Federal (ICyTDF) under project No. PICCT08-22. We also acknowledge the support of the IPN (SIP-IPN, COFFA-IPN and PIFI-IPN). Any opinions, findings, conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the sponsoring agency.

Author information

Authors and Affiliations

Centro de Investigación en Computación, Instituto Politécnico Nacional, Av. Juan de Dios Bátiz s/n casi esq, Miguel Othón de Mendizábal, Col. Nueva Industrial Vallejo, CP 07738, México, DF, Mexico
Herón Molina-Lozano

Corresponding author

Correspondence to Herón Molina-Lozano.

About this article

Cite this article

Molina-Lozano, H. A new fast fuzzy Cocke–Younger–Kasami algorithm for DNA strings analysis. Int. J. Mach. Learn. & Cyber. 2, 209–218 (2011). https://doi.org/10.1007/s13042-011-0042-z

Received: 25 April 2011
Accepted: 26 July 2011
Published: 17 August 2011
Issue Date: September 2011
DOI: https://doi.org/10.1007/s13042-011-0042-z
