Abstract
We consider the problem of restoring regular expressions from expressive examples. We define the class of unambiguous regular expressions, the union number of an expression, which indicates how many union operations can occur directly under any single iteration, and the notion of an expressive example. We present a polynomial-time algorithm that attempts to restore an unambiguous regular expression from a single expressive example. We prove that if the union number of the expression is 0 or 1 and the example is long enough, the algorithm correctly restores the original expression from one good example. The proof relies on original investigations in the theory of covering symbol sequences (words) by different sets of generators. The algorithm has been implemented, and we report computer experiments which show that the proposed method is quite practical.
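As a rough illustration of the union number, the sketch below computes it on a small expression tree. The abstract does not give the formal definition, so the reading used here is an assumption: for each iteration (star), count the union operators in its body that are reachable without passing through a nested iteration, and take the maximum over all iterations. The node classes `Sym`, `Cat`, `Alt`, `Star` and both function names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    ch: str            # a single alphabet symbol


@dataclass(frozen=True)
class Cat:
    left: object       # concatenation: left followed by right
    right: object


@dataclass(frozen=True)
class Alt:
    left: object       # union: left | right
    right: object


@dataclass(frozen=True)
class Star:
    body: object       # iteration: body*


def unions_directly_under(node) -> int:
    """Count union operators in `node`, not descending into nested stars.

    Assumption: "directly under an iteration" means reachable from the
    star's body without crossing another star.
    """
    if isinstance(node, (Sym, Star)):
        return 0
    own = 1 if isinstance(node, Alt) else 0
    return own + unions_directly_under(node.left) + unions_directly_under(node.right)


def union_number(node) -> int:
    """Maximum, over all stars in `node`, of the unions directly under that star."""
    if isinstance(node, Sym):
        return 0
    if isinstance(node, Star):
        return max(unions_directly_under(node.body), union_number(node.body))
    return max(union_number(node.left), union_number(node.right))


a, b, c, d = Sym("a"), Sym("b"), Sym("c"), Sym("d")

no_union = Star(Cat(a, b))                       # (ab)*
one_union = Star(Alt(a, b))                      # (a|b)*
two_unions = Star(Alt(Alt(a, b), c))             # (a|b|c)*
nested = Star(Alt(Cat(a, Star(Alt(b, c))), d))   # (a(b|c)*|d)*
```

Under this reading, `(ab)*` has union number 0, `(a|b)*` and `(a(b|c)*|d)*` have union number 1, and `(a|b|c)*` has union number 2, so only the first three would fall within the case covered by the correctness result stated above.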
© 1994 Springer-Verlag Berlin Heidelberg
Cite this paper
Brāzma, A., Čerāns, K. (1994). Efficient learning of regular expressions from good examples. In: Arikawa, S., Jantke, K.P. (eds) Algorithmic Learning Theory. AII 1994, ALT 1994. Lecture Notes in Computer Science, vol 872. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58520-6_55
DOI: https://doi.org/10.1007/3-540-58520-6_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58520-6
Online ISBN: 978-3-540-49030-2
eBook Packages: Springer Book Archive