Probabilistic k-Testable Tree Languages

  • Conference paper
Grammatical Inference: Algorithms and Applications (ICGI 2000)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1891)

Abstract

In this paper, we present a natural generalization of k-gram models to stochastic tree languages, based on the k-testable class. In this class of models, frequencies are estimated for a probabilistic regular tree grammar which is bottom-up deterministic. One advantage of this approach is that the model can be updated incrementally. The method is an alternative to costly learning algorithms (such as inside-outside-based methods) and to algorithms that require larger samples (such as many state merging/splitting methods).
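The idea of estimating such a model from frequencies can be illustrated with a minimal sketch. It is not the authors' algorithm as published; it assumes one common formulation in which the state of a subtree is its top k-1 levels, a transition is identified by the top k levels, and each transition probability is the relative frequency count(top k levels) / count(top k-1 levels), by analogy with k-grams on strings. The tuple encoding of trees and the names `truncate`, `nodes`, and `KTestableTreeModel` are hypothetical choices made for this example, and no smoothing is applied.

```python
from collections import Counter


def truncate(tree, depth):
    """Keep only the top `depth` levels of a tree encoded as (label, *children)."""
    label, children = tree[0], tree[1:]
    if depth <= 1:
        return (label,)
    return (label,) + tuple(truncate(child, depth - 1) for child in children)


def nodes(tree):
    """Yield every subtree of `tree`, root first."""
    yield tree
    for child in tree[1:]:
        yield from nodes(child)


class KTestableTreeModel:
    """Frequency-based sketch of a probabilistic k-testable tree model (assumed formulation)."""

    def __init__(self, k):
        if k < 2:
            raise ValueError("k must be at least 2")
        self.k = k
        self.n_trees = 0
        self.root_counts = Counter()   # state observed at the root of each sample tree
        self.state_counts = Counter()  # top (k-1) levels, counted at every node
        self.fork_counts = Counter()   # top k levels, counted at every node

    def update(self, tree):
        """Add one tree to the sample; only counters change, so updates are incremental."""
        self.n_trees += 1
        self.root_counts[truncate(tree, self.k - 1)] += 1
        for node in nodes(tree):
            self.state_counts[truncate(node, self.k - 1)] += 1
            self.fork_counts[truncate(node, self.k)] += 1

    def probability(self, tree):
        """Root probability times the relative frequency of the fork at each node."""
        if self.n_trees == 0:
            return 0.0
        p = self.root_counts[truncate(tree, self.k - 1)] / self.n_trees
        for node in nodes(tree):
            seen = self.state_counts[truncate(node, self.k - 1)]
            if seen == 0:
                return 0.0  # unseen state: no estimate without smoothing
            p *= self.fork_counts[truncate(node, self.k)] / seen
        return p


# Usage: two sample trees, k = 2 (states are node labels).
t1 = ("S", ("a",), ("b",))
t2 = ("S", ("a",), ("a",))
model = KTestableTreeModel(k=2)
model.update(t1)
model.update(t2)
print(model.probability(t1))  # 0.5
```

Because the model is stored as counters, adding a new tree changes only a few counts and requires no re-estimation pass over earlier data, which is the sense in which a model of this kind can be updated incrementally.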

Work supported by the Spanish Comisión Interministerial de Ciencia y Tecnología through grant TIC97-0941.

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rico-Juan, J.R., Calera-Rubio, J., Carrasco, R.C. (2000). Probabilistic k-Testable Tree Languages. In: Oliveira, A.L. (ed.) Grammatical Inference: Algorithms and Applications. ICGI 2000. Lecture Notes in Computer Science, vol 1891. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45257-7_18

  • DOI: https://doi.org/10.1007/978-3-540-45257-7_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41011-9

  • Online ISBN: 978-3-540-45257-7

  • eBook Packages: Springer Book Archive
