Learning n-Ary Node Selecting Tree Transducers from Completely Annotated Examples

  • Conference paper
Grammatical Inference: Algorithms and Applications (ICGI 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4201)

Abstract

We present the first algorithm for learning n-ary node selection queries in trees from completely annotated examples by methods of grammatical inference. We propose to represent n-ary queries by deterministic n-ary node selecting tree transducers (n-NSTTs). These are tree automata that capture the class of monadic second-order definable n-ary queries. We show that n-NSTTs defining polynomially bounded n-ary queries can be learned from polynomial time and data. An application in Web information extraction yields encouraging results.
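To make the terminology concrete, the following minimal sketch (not taken from the paper) shows one way to picture an n-ary query and a completely annotated example. It assumes trees are nested `(label, children)` pairs and nodes are addressed by paths of child indices. The query `name_price_pairs`, the helper `annotate`, and the sample document are all hypothetical illustrations. The Boolean-vector relabelling reflects the general idea of reading answer tuples off a tree whose labels are extended with n bits; it is not the authors' construction.

```python
from itertools import product

# Assumption: a tree is (label, [children]); a node is the path of child
# indices leading to it from the root, so the root is the empty tuple ().
def nodes(tree, path=()):
    """Yield (path, label) for every node of the tree."""
    label, children = tree
    yield path, label
    for i, child in enumerate(children):
        yield from nodes(child, path + (i,))

def name_price_pairs(tree):
    """Hypothetical binary (n = 2) query: (name, price) nodes with the same parent."""
    labelled = dict(nodes(tree))
    return {
        (p, q)
        for p, q in product(labelled, repeat=2)
        if labelled[p] == "name" and labelled[q] == "price"
        and p and q and p[:-1] == q[:-1]
    }

def annotate(tree, answer, path=()):
    """Relabel each node with (label, bits), where bits[i] is 1 iff the
    node is the i-th component of the given answer tuple."""
    label, children = tree
    bits = tuple(int(path == component) for component in answer)
    return ((label, bits),
            [annotate(c, answer, path + (i,)) for i, c in enumerate(children)])

doc = ("list", [("item", [("name", []), ("price", [])]),
                ("item", [("name", []), ("price", [])])])

# A completely annotated example pairs a tree with ALL of its answer tuples.
example = (doc, name_price_pairs(doc))
```

In this picture, a learner would receive pairs such as `example` and would have to generalise to a query that also returns the right tuples on unseen trees.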

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lemay, A., Niehren, J., Gilleron, R. (2006). Learning n-Ary Node Selecting Tree Transducers from Completely Annotated Examples. In: Sakakibara, Y., Kobayashi, S., Sato, K., Nishino, T., Tomita, E. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2006. Lecture Notes in Computer Science, vol 4201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11872436_21

  • DOI: https://doi.org/10.1007/11872436_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45264-5

  • Online ISBN: 978-3-540-45265-2

  • eBook Packages: Computer Science (R0)
