Skip to main content
Log in

Probabilistic Pattern Matching and the Evolution of Stochastic Regular Expressions

  • Published:
Applied Intelligence Aims and scope Submit manuscript

Abstract

The use of genetic programming for probabilistic pattern matching is investigated. A stochastic regular expression language is used. The language features a statistically sound semantics, as well as a syntax that promotes efficient manipulation by genetic programming operators. An algorithm for efficient string recognition based on approaches in conventional regular language recognition is used. When attempting to recognize a particular test string, the recognition algorithm computes the probabilities of generating that string and all its prefixes with the given stochastic regular expression. To promote efficiency, intermediate computed probabilities that exceed a given cut-off value will pre-empt particular interpretation paths, and hence prune unconstructive interpretation. A few experiments in recognizing stochastic regular languages are discussed. Application of the technology in bioinformatics is in progress.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Similar content being viewed by others

References

  1. D. Angluin, “Computational learning theory: survey and selected bibliography,” in Proceedings of the 24th Annual ACM Symposium on the Theory of Computing,ACMPress: NewYork, 1992, pp. 351–369.

    Google Scholar 

  2. K.S. Fu and T.L. Booth, “Grammatical inference: introduction and survey–Part I,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 5, no. 1, pp. 95–111, January 1975.

    Google Scholar 

  3. K.S. Fu and T.L. Booth, “Grammatical inference: introduction and survey–Part II,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 5, no. 4, pp. 409–423, July 1975.

    Google Scholar 

  4. Y. Sakakibara, “Recent advances of grammatical inference,” Theoretical Computer Science, vol. 185, pp. 15–45, 1997.

    Google Scholar 

  5. K.S. Fu and T. Huang, “Stochastic grammars and languages,” International Journal of Computer and Information Sciences, vol. 1, no. 2, pp. 135–170, 1972.

    Google Scholar 

  6. K.S. Fu, Syntactic Pattern Recognition and Applications, Prentice-Hall: Englewood Cliffs, NJ, 1982.

    Google Scholar 

  7. K. Lari and S.J. Young, “The estimation of stochastic contextfree grammars using the Inside-Outside algorithm,” Computer Speech and Language, vol. 4, pp. 35–56, 1990.

    Google Scholar 

  8. E. Charniak, Statistical Language Learning, MIT Press: Cambridge, MA, 1993.

    Google Scholar 

  9. J.E. Hopcroft and J.D. Ullman, Introduction toAutomata Theory, Languages, and Computation, Addison Wesley: Reading, MA, 1979.

    Google Scholar 

  10. R.C. Carrasco and M.L. Forcada, “Inferring stochastic regular grammars with recurrent neural networks,” in Proceedings 3rd International Colloqium on Grammatical Inference (ICGI96), edited by I. Miclet and C. de la Huguera, Springer-Verlag: Berlin 1996, pp. 274–286. LNAI 1147.

    Google Scholar 

  11. R.C. Carrasco and J. Oncina, “Learning deterministic regular grammars from stochastic samples in polynomial time,” Technical Report DLSI-96–01, Universidad de Alicante, April 1998.

  12. F.J. Maryanski and T.L. Booth, “Inference of finite-state probabilistic grammars,” IEEE Transactions on Computers, vol. C26, pp. 521–536, 1977.

    Google Scholar 

  13. A. van der Mude and A.Walker, “On the inference of stochastic regular grammars,” Information and Control, vol. 38, pp. 310–329, 1978.

    Google Scholar 

  14. L.R. Rabiner and B.H. Juang, “An introduction to Hidden markov models,” in IEEE ASSP Magazine, pp. 4–16, January 1986.

  15. L.R. Rabiner, “Atutorial on Hidden markov models and selected applications in speech recognition,” in Proceedings of the IEEE, vol. 77, no. 2, pp. 257–286, February 1989.

    Google Scholar 

  16. V.K. Garg, R. Kumar, and S.I. Marcus, “Probabilistic language framework for stochastic discrete event systems,” Technical Report 96–18, Institute for Systems Research, April 1996. http://www.isr.umc.edu/.

  17. H. Zhou and J.J. Grefenstette, “Induction of finite automata by genetic algorithms,” in Proc. 1986 IEEE Intl. Conference on Systems, Man, and Cybernetics, Atlanta, GA, IEEE Press: New York, 1986, pp. 170–174.

    Google Scholar 

  18. B.D. Dunay, F.E. Petry, and B.P. Buckles, “Regular language induction with genetic programming,” in Proc. 1st IEEE Conference on Evolutionary Computation, 1994, pp. 396–400.

  19. P. Dupont, “Regular grammatical inference from positive and negative samples by genetic search: the GIG method,” in 2nd Intl. Coll. on Grammatical Inference and Applications, Springer-Verlag: Berlin, 1994, pp. 236–245.

    Google Scholar 

  20. M. Tomita, “Dynamic construction of finite automata from examples using hill-climbing,” in 4th Annual Conf. of the Cognitive Science Soc., 1982, pp. 105–108.

  21. S. Brave, “Evolving deterministic finite automata using cellular encoding,” in Proc. Genetic Programming 1997, Stanford University, CA, USA, edited by J.R. Koza et al., Morgan Kaufmann: Los Altos, CA, 1997, pp. 39–44.

    Google Scholar 

  22. T. Longshaw, “Evolutionary learning of large grammars,” in Proc. Genetic Programming 1997, Stanford University, CA, USA, edited by J.R. Koza et al., Morgan Kaufmann, Los Altos, CA, 1997, pp. 406–409.

    Google Scholar 

  23. B. Svingen, “Learning regular languages using genetic programming,” in Proc. Genetic Programming 1998, edited by J.R. Koza et al., Morgan Kaufmann, Los Altos, CA, 1998, pp. 374–376.

    Google Scholar 

  24. P. Wyard, “Context free grammar induction using genetic algorithms,” in Proceedings 4th International Conference on Genetic Algorithms, 1991, pp. 514–518.

  25. M.M. Lankhorst, “Grammatical inference with a genetic algorithm,” in Proceedings of the 1994 EUROSIM Conference on Massively Parallel Processing Applications and Development, 1994, pp. 423–430.

  26. S. Lucas, “Structuring chromosomes for context-free grammar evolution,” in Proceedings 1st International Conference on Evolutionary Computation, IEEE Press: New York, 1994, pp. 130–135.

    Google Scholar 

  27. S. Sen and J. Janakiraman, “Learning to construct pushdown automata for accepting deterministic context-free languages,” Applications of Artificial Intelligence X: Knowledge-Based Systems, vol. 1707, pp. 207–213, 1992.

    Google Scholar 

  28. M.M. Lankhorst, “Agenetic algorithm for the induction of pushdown automata,” in 1995 IEEE Intl. Conference on Evolutionary Computation, IEEE Press: New York, 1995.

    Google Scholar 

  29. B.D. Dunay and F.E. Petry, “Solving complex problems with genetic algorithms,” in Proc. 6th Intl. Conf. on Genetic Algorithms, edited by L. Eshelman, Morgan Kaufmann: Los Altos, CA, 1995.

    Google Scholar 

  30. M. Schwehm and A. Ost, “Inference of stochastic regular grammars by massively parallel genetic algorithms,” in Proc. 6th Intl. Conf. on Genetic Algorithms, Morgan-Kaufmann: Los Altos, CA, 1995.

    Google Scholar 

  31. T.E. Kammeyer and R.K. Belew, “Stochastic context-free grammar induction with a genetic algorithm using local search,” in Foundations of Genetic Algorithms IV, edited by R.K. Belew and M. Vode, Morgan-Kaufmann: Los Altos, CA, 1997.

    Google Scholar 

  32. J. Stoy, Denotational Semantics, MIT Press: Cambridge, MA, 1977.

    Google Scholar 

  33. K. Subrahmaniam, A Primer in Probability, Marcel Dekker: New York, 1979.

    Google Scholar 

  34. M. Sipser, Introduction to the Theory of Computation,PWSPub. Co., 1996.

  35. M. Hennessy, The Semantics of Programming Languages–An Elementary Introduction Using Structural Operational Semantics, John Wiley and Sons: New York, 1990.

    Google Scholar 

  36. W.F. Clocksin and C.S. Mellish, Programming in Prolog, 4th ed, Springer-Verlag: Berlin, 1994.

  37. B.J. Ross, “Logic-based genetic programming with definite clause translation grammars,” in Proc. GECCO-99, edited by J.R. Koza et al., 1999.

  38. P.A. Whigham, “Grammatically-based genetic programming,” in Proceedings Workshop on Genetic Programming: From Theory to Real-World Applications, edited by J.P. Rosca, 1995, pp. 31–41.

  39. M.L. Wong and K.S. Leung, “Learning programs in different paradigms using genetic programming,” in Proceedings 4th Congress of the Italian Association for AI, 1995, pp. 353–364.

  40. A. Geyer-Shulz, “The next 700 programming languages for genetic programming,” in Proc. Genetic Programming 1997, Stanford University, CA, USA, edited by J.R. Koza et al., Morgan Kaufmann: Los Altos, CA, 1997, pp. 128–136.

    Google Scholar 

  41. H. Abramson and V. Dahl, Logic Grammars, Springer-Verlag, 1989.

  42. W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes in C, 2nd edn. Cambridge University Press: Cambridge, 1992.

    Google Scholar 

  43. A. Brazma, I. Jonassen, I. Eidhammer, and D. Gilbert, “Approaches to the automatic discovery of patterns in biosequences,” Technical Report 113, Department of Informatics, University of Bergen, Norway, December 1995.

    Google Scholar 

  44. K. Hofmann, P. Bucher, L. Falquet, and A. Bairoch, “The PROSITE database, its status in 1999,” Nucleic Acids Research, vol. 27, no. 1, pp. 215–219, 1999.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Ross, B.J. Probabilistic Pattern Matching and the Evolution of Stochastic Regular Expressions. Applied Intelligence 13, 285–300 (2000). https://doi.org/10.1023/A:1026524328760

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1023/A:1026524328760

Navigation