Efficient learning of regular expressions from good examples

Brāzma, Alvis; Čerāns, Kārlis

doi:10.1007/3-540-58520-6_55

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 872)

Abstract

We consider the problem of restoring regular expressions from expressive examples. We define the class of unambiguous regular expressions; the union number of an expression, which shows how many union operations can occur directly under any single iteration; and the notion of an expressive example. We present a polynomial-time algorithm that tries to restore an unambiguous regular expression from one expressive example. We prove that if the union number of the expression is 0 or 1 and the example is long enough, then the algorithm correctly restores the original expression from one good example. The proof relies on original investigations in the theory of covering symbol sequences (words) by different sets of generators. The algorithm has been implemented, and we also report computer experiments which show that the proposed method is quite practical.
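
To make the notion of union number concrete, the sketch below computes it for a small expression tree. It rests on one reading of the abstract's wording, which may differ from the paper's formal definition: an alternative with n options counts as n - 1 union operations, and a union is "directly under" an iteration when no other iteration lies between them. The class names (Sym, Concat, Alt, Star) and both functions are hypothetical and do not come from the paper; this is not the restoration algorithm itself.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Sym:
    """A single alphabet symbol."""
    char: str


@dataclass(frozen=True)
class Concat:
    """Concatenation of sub-expressions."""
    parts: Tuple["Node", ...]


@dataclass(frozen=True)
class Alt:
    """Union of n options; counted here as n - 1 union operations (assumption)."""
    options: Tuple["Node", ...]


@dataclass(frozen=True)
class Star:
    """Iteration (Kleene star) of a body expression."""
    body: "Node"


Node = Union[Sym, Concat, Alt, Star]


def unions_directly_under(node: Node) -> int:
    """Count union operations in node without descending into nested iterations."""
    if isinstance(node, (Sym, Star)):
        # A nested iteration starts its own scope, so stop here.
        return 0
    if isinstance(node, Concat):
        return sum(unions_directly_under(p) for p in node.parts)
    if isinstance(node, Alt):
        own = len(node.options) - 1
        return own + sum(unions_directly_under(o) for o in node.options)
    raise TypeError(f"unexpected node: {node!r}")


def union_number(node: Node) -> int:
    """Largest count of unions directly under any single iteration (0 if none)."""
    if isinstance(node, Sym):
        return 0
    if isinstance(node, Star):
        return max(unions_directly_under(node.body), union_number(node.body))
    children = node.parts if isinstance(node, Concat) else node.options
    return max((union_number(c) for c in children), default=0)


# a b*          : no union under the iteration
ab_star = Concat((Sym("a"), Star(Sym("b"))))
# (a|b)* c      : one union directly under the iteration
alt_star = Concat((Star(Alt((Sym("a"), Sym("b")))), Sym("c")))
# (a (b|c|d)*)* : the unions belong to the inner iteration only
nested = Star(Concat((Sym("a"), Star(Alt((Sym("b"), Sym("c"), Sym("d")))))))

print(union_number(ab_star), union_number(alt_star), union_number(nested))
```

Under this reading, the first two expressions fall within the union number 0 or 1 case for which the abstract states the correctness result, while the third does not.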

Author information

Authors and Affiliations

Institute of Mathematics and Computer Science, University of Latvia, 29 Rainis Blvd., LV-1459, Riga, Latvia
Alvis Brāzma & Kārlis Čerāns

Editor information

Setsuo Arikawa, Klaus P. Jantke

About this paper

Cite this paper

Brāzma, A., Čerāns, K. (1994). Efficient learning of regular expressions from good examples. In: Arikawa, S., Jantke, K.P. (eds) Algorithmic Learning Theory. AII 1994, ALT 1994. Lecture Notes in Computer Science, vol 872. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58520-6_55

DOI: https://doi.org/10.1007/3-540-58520-6_55
Published: 03 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58520-6
Online ISBN: 978-3-540-49030-2
eBook Packages: Springer Book Archive
