Abstract
Grammatical inference consists of learning formal grammars for unknown languages from learning data. Classically this data is raw: strings that belong to the language and possibly strings that do not. In this paper we study learning when additional information is available, such as the knowledge that the hidden language belongs to some known language, that the strings are typed, or that specific patterns must or may appear in the strings. We propose a general setting that covers these cases and provide algorithms that learn deterministic finite automata under these conditions. Furthermore, the number of examples needed for correct identification can drop drastically as the quality of the added information improves. We show that this general setting can cope with several well-known learning tasks.
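To make the idea of "typed" strings concrete, here is a minimal sketch under stated assumptions; it is not the algorithm of the paper. It assumes that the extra information is a complete DFA (the hypothetical `PARITY` table below) that assigns a type to every prefix, and that a state-merging learner may only merge two prefixes of the sample when they carry the same type. The sketch only counts how many merge candidates survive this filter; the names `prefixes`, `type_of`, and `merge_candidates` are illustrative.

```python
from itertools import combinations

def prefixes(samples):
    """All prefixes of the positive samples (the states of a prefix tree acceptor)."""
    return sorted({w[:i] for w in samples for i in range(len(w) + 1)})

def type_of(delta, start, word):
    """Type of a prefix: the state reached in the typing DFA (assumed complete)."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state

def merge_candidates(samples, delta=None, start=None):
    """Pairs of prefixes a state-merging learner would have to consider.

    Without a typing DFA every pair is a candidate. With one, only pairs of
    prefixes that share a type are kept (the assumption of this sketch).
    """
    pairs = list(combinations(prefixes(samples), 2))
    if delta is None:
        return pairs
    return [(u, v) for u, v in pairs
            if type_of(delta, start, u) == type_of(delta, start, v)]

# Hypothetical typing DFA over {a, b}: the type is the parity of the number of a's.
PARITY = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}

samples = ["aa", "abab"]
untyped = merge_candidates(samples)
typed = merge_candidates(samples, PARITY, 0)
print(len(untyped), len(typed))
```

On this toy sample the filter removes roughly half of the candidate merges, which illustrates in a small way why added information can reduce the amount of data a learner needs to rule out wrong hypotheses.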
This work was done when the second author visited the Departamento de Lenguajes y Sistemas Informáticos of the University of Alicante, Spain. The visit was sponsored by the Spanish Ministry of Education.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kermorvant, C., de la Higuera, C. (2002). Learning Languages with Help. In: Adriaans, P., Fernau, H., van Zaanen, M. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2002. Lecture Notes in Computer Science, vol 2484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45790-9_13
DOI: https://doi.org/10.1007/3-540-45790-9_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44239-4
Online ISBN: 978-3-540-45790-9
eBook Packages: Springer Book Archive