Learning stochastic finite automata from experts

de la Higuera, Colin

doi:10.1007/BFb0054066

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1433)

Included in the following conference series:

International Colloquium on Grammatical Inference

Abstract

We present a new learning problem, learning distributions from experts. In the case we study, the experts are stochastic deterministic finite automata (sdfa), and the task is to learn an sdfa from unrepeated examples. This models the situation where the data is not generated automatically but is presented in an order that depends on its probability, as would be the case with data presented by a human expert. Frequency measures therefore cannot be used directly to construct the underlying automaton or to adjust its probabilities. We prove that although polynomial identification with probability one is not always possible, a wide class of automata can be successfully identified under this criterion. Because the framework is new, it raises a variety of open problems.
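To make the setting concrete, here is a minimal sketch of an sdfa and of an expert-style presentation. It is an illustration only, not the paper's construction: the two-state automaton, its probabilities, and the names `TRANSITIONS`, `STOP`, `string_probability`, and `expert_presentation` are all hypothetical, and the reading of "expert" as someone who lists each string once, most probable first, is an assumption drawn from the abstract rather than the paper's formal definition.

```python
from itertools import product

# Hypothetical two-state sdfa over the alphabet {a, b} (not taken from the paper).
# TRANSITIONS[state][symbol] = (next_state, probability of that move)
# STOP[state] = probability of halting in that state
# In each state, the stop probability plus the outgoing probabilities sum to 1.
TRANSITIONS = {
    0: {"a": (1, 0.5), "b": (0, 0.25)},
    1: {"a": (1, 0.25), "b": (0, 0.25)},
}
STOP = {0: 0.25, 1: 0.5}
INITIAL = 0


def string_probability(word):
    """Probability that the automaton generates exactly `word`."""
    state, prob = INITIAL, 1.0
    for symbol in word:
        if symbol not in TRANSITIONS[state]:
            return 0.0
        state, step = TRANSITIONS[state][symbol]
        prob *= step
    return prob * STOP[state]


def expert_presentation(max_length):
    """List each string of length <= max_length once, most probable first.

    Assumed model of an expert: no string is repeated, so the learner
    sees only the order of the strings, never their frequencies.
    Ties are broken alphabetically to keep the output deterministic.
    """
    words = (
        "".join(letters)
        for n in range(max_length + 1)
        for letters in product("ab", repeat=n)
    )
    scored = [(w, string_probability(w)) for w in words]
    scored = [(w, p) for w, p in scored if p > 0]
    return sorted(scored, key=lambda wp: (-wp[1], wp[0]))


if __name__ == "__main__":
    for word, prob in expert_presentation(2):
        print(repr(word), prob)
```

In this sketch the learner receives only the ordered list of strings. Counting occurrences tells it nothing, because every string appears exactly once, which is why frequency-based estimation does not apply directly.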

Author information

Authors and Affiliations

EURISE, Université de Saint-Etienne, France
Colin de la Higuera

Editor information

Vasant Honavar, Giora Slutzki

About this paper

Cite this paper

de la Higuera, C. (1998). Learning stochastic finite automata from experts. In: Honavar, V., Slutzki, G. (eds) Grammatical Inference. ICGI 1998. Lecture Notes in Computer Science, vol 1433. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0054066

DOI: https://doi.org/10.1007/BFb0054066
Published: 23 May 2006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64776-8
Online ISBN: 978-3-540-68707-8
eBook Packages: Springer Book Archive
