Learning of regular expressions by pattern matching

Brāzma, Alvis

doi:10.1007/3-540-59119-2_194

Conference paper
First Online: 01 January 2005

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 904)

Abstract

We consider the problem of restoring regular expressions from good examples. We describe a natural learning algorithm for obtaining a “plausible” regular expression from one example. The algorithm is based on finding the longest substring that can be matched by some part of the expression obtained so far. We believe that the algorithm, to a certain extent, mimics how humans guess regular expressions from the same sort of examples. We show that for regular expressions of bounded length, successful learning takes time linear in the length of the example, provided that the example is “good”. Under certain natural restrictions, the run-time of the learning algorithm is also polynomial in unsuccessful cases. Finally, we discuss a computer experiment in learning regular expressions with the described algorithm, which shows that the proposed learning method is quite practical.
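
To make the idea of guessing an expression from a single example concrete, here is a toy sketch in Python. It is not the algorithm of the paper: it only collapses adjacent repetitions of a block into a starred group, which is a much simpler heuristic than the longest-substring matching described in the abstract. The function name `guess_regex` and the `min_repeats` threshold are assumptions introduced for illustration.

```python
import re


def guess_regex(example: str, min_repeats: int = 2) -> str:
    """Toy heuristic (an assumption, not the paper's method):
    collapse adjacent repetitions of a block into a starred group."""
    parts = []
    i, n = 0, len(example)
    while i < n:
        best_len, best_count = 0, 0
        # Try every block length that could repeat at least min_repeats times.
        for length in range(1, (n - i) // min_repeats + 1):
            block = example[i:i + length]
            count = 1
            # Count how many times the block repeats back to back.
            while example.startswith(block, i + count * length):
                count += 1
            # Keep the repetition that covers the most text.
            if count >= min_repeats and count * length > best_len * best_count:
                best_len, best_count = length, count
        if best_count:
            block = re.escape(example[i:i + best_len])
            # Single characters need no grouping parentheses.
            parts.append(f"({block})*" if best_len > 1 else f"{block}*")
            i += best_len * best_count
        else:
            # No repetition starts here: emit the character literally.
            parts.append(re.escape(example[i]))
            i += 1
    return "".join(parts)


guess = guess_regex("abababc")
print(guess)  # (ab)*c
print(bool(re.fullmatch(guess, "abababc")))  # True
```

The sketch scans left to right and never revisits earlier output, so it cannot recover nested stars or unions. It is meant only to show what a "plausible" expression for one example might look like.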

Author information

Authors and Affiliations

Institute of Mathematics and Computer Science, University of Latvia, 29 Rainis Blvd., LV-1459, Riga, Latvia
Alvis Brāzma

Editor information

Paul Vitányi

About this paper

Cite this paper

Brāzma, A. (1995). Learning of regular expressions by pattern matching. In: Vitányi, P. (eds) Computational Learning Theory. EuroCOLT 1995. Lecture Notes in Computer Science, vol 904. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-59119-2_194

DOI: https://doi.org/10.1007/3-540-59119-2_194
Published: 01 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-59119-1
Online ISBN: 978-3-540-49195-8
eBook Packages: Springer Book Archive