Learning n-Ary Node Selecting Tree Transducers from Completely Annotated Examples

Lemay, A.; Niehren, J.; Gilleron, R.

doi:10.1007/11872436_21

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4201)

Included in the following conference series:

International Colloquium on Grammatical Inference

Abstract

We present the first algorithm for learning n-ary node selection queries in trees from completely annotated examples by methods of grammatical inference. We propose to represent n-ary queries by deterministic n-ary node selecting tree transducers (n-NSTTs). These are tree automata that capture the class of monadic second-order definable n-ary queries. We show that n-NSTTs defining polynomially bounded n-ary queries can be learned from polynomial time and data. An application in Web information extraction yields encouraging results.
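To make the idea of an n-ary query represented by a tree automaton more concrete, here is a minimal Python sketch for n = 2. It assumes one common encoding: each candidate tuple of nodes is turned into a tree whose nodes carry a Boolean vector of length n, and a deterministic bottom-up automaton decides whether that annotated tree is accepted. The tree encoding, the state names, the `delta` function, and the example query ("a `name` node and a `price` node that are children of the same `item` node") are all hypothetical illustrations. This is not the paper's definition of n-NSTTs and not its learning algorithm, and the brute-force enumeration in `query` is only there to show what the selected tuples are.

```python
from itertools import product

# A tree is (symbol, [children]); a node is addressed by its path of child indices.
def nodes(tree, path=()):
    """Yield the paths of all nodes in preorder."""
    yield path
    for i, child in enumerate(tree[1]):
        yield from nodes(child, path + (i,))

def delta(symbol, bits, child_states):
    """Hypothetical bottom-up transition over labels (symbol, bits).

    States: "0"   no selected node in the subtree
            "A"   this node is the selected 'name' (first component)
            "B"   this node is the selected 'price' (second component)
            "OK"  the subtree contains one complete, valid pair
            "ERR" rejecting sink
    """
    if "ERR" in child_states:
        return "ERR"
    counts = (child_states.count("A"),
              child_states.count("B"),
              child_states.count("OK"))
    if bits == (0, 0):
        if counts == (0, 0, 0):
            return "0"
        if symbol == "item" and counts == (1, 1, 0):
            return "OK"   # selected name and price are siblings under an item
        if counts == (0, 0, 1):
            return "OK"   # a valid pair was found deeper in the tree
        return "ERR"
    if counts != (0, 0, 0):
        return "ERR"      # nothing may be selected below a selected node here
    if bits == (1, 0) and symbol == "name":
        return "A"
    if bits == (0, 1) and symbol == "price":
        return "B"
    return "ERR"

def run(tree, selected, path=()):
    """Evaluate the automaton on the tree annotated by the tuple `selected`."""
    symbol, children = tree
    bits = tuple(int(path == s) for s in selected)
    states = [run(c, selected, path + (i,)) for i, c in enumerate(children)]
    return delta(symbol, bits, states)

def query(tree, n=2):
    """Naive evaluation: test every n-tuple of nodes (exponential in n)."""
    all_nodes = list(nodes(tree))
    return [t for t in product(all_nodes, repeat=n) if run(tree, t) == "OK"]

doc = ("shop", [
    ("item", [("name", []), ("price", [])]),
    ("item", [("name", []), ("price", []), ("note", [])]),
])
print(query(doc))  # [((0, 0), (0, 1)), ((1, 0), (1, 1))]
```

Each returned tuple pairs the path of a `name` node with the path of the `price` node under the same `item`. Pairs that mix the two items are rejected because the automaton never sees both selected nodes under a single `item` parent.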

Author information

Authors and Affiliations

Mostrare project of INRIA Futurs, LIFL, University of Lille 3, Lille, France
A. Lemay & R. Gilleron
Mostrare project of INRIA Futurs, LIFL, INRIA Futurs, Lille, France
J. Niehren

Editor information

Editors and Affiliations

Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, 223-8522, Yokohama, Japan
Yasubumi Sakakibara
Dept. of Computer Science, Kyoto Sangyo University, Kamigamo Motoyama, Kita-ku, Kyoto, Japan
Satoshi Kobayashi
Japan Biological Informatics Consortium, 10F TIME24 Building, 2-45 Aomi, Koto-ku, 135-8073, Tokyo, Japan
Kengo Sato
Department of Information and Communication Engineering, Graduate School of Electro-Communications, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu-shi, 182-8585, Tokyo, Japan
Tetsuro Nishino
Department of Information and Communication Engineering, Faculty of Electro-Communications, The University of Electro-Communications, Chofugaoka 1–5–1, Chofu, 182-8585, Tokyo, Japan
Etsuji Tomita

About this paper

Cite this paper

Lemay, A., Niehren, J., Gilleron, R. (2006). Learning n-Ary Node Selecting Tree Transducers from Completely Annotated Examples. In: Sakakibara, Y., Kobayashi, S., Sato, K., Nishino, T., Tomita, E. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2006. Lecture Notes in Computer Science, vol 4201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11872436_21

DOI: https://doi.org/10.1007/11872436_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45264-5
Online ISBN: 978-3-540-45265-2
eBook Packages: Computer Science, Computer Science (R0)
