Abstract
In this paper, we present a natural generalization of k-gram models to stochastic tree languages, based on the k-testable class. In this class of models, frequencies are estimated for a probabilistic regular tree grammar which is bottom-up deterministic. One advantage of this approach is that the model can be updated incrementally. The method is an alternative to costly learning algorithms (such as inside-outside-based methods) and to algorithms that require larger samples (such as many state merging/splitting methods).
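The idea of estimating frequencies for a bottom-up deterministic grammar, with incremental updates, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the construction used in the paper: trees are encoded as `(label, children)` tuples, the state of a node is taken to be its subtree truncated to depth k-1, a rule is a label together with the states of its children, and probabilities are plain relative frequencies with no smoothing. The names `root` and `KTestableTreeModel` are hypothetical.

```python
from collections import Counter


def root(tree, depth):
    """Truncate a (label, children) tree to its top `depth` levels (depth >= 1)."""
    label, children = tree
    if depth <= 1:
        return (label, ())
    return (label, tuple(root(child, depth - 1) for child in children))


class KTestableTreeModel:
    """Illustrative frequency-based tree model (assumed construction, k >= 2)."""

    def __init__(self, k):
        if k < 2:
            raise ValueError("this sketch assumes k >= 2")
        self.k = k
        self.rule_counts = Counter()   # (label, child states) -> occurrences
        self.state_counts = Counter()  # state -> occurrences
        self.root_counts = Counter()   # state of a whole tree -> occurrences
        self.num_trees = 0

    def _lhs_state(self, label, child_states):
        # The state reached by a rule depends only on the label and the
        # child states, which is what makes the model bottom-up deterministic.
        if self.k == 2:
            return (label, ())
        return (label, tuple(root(c, self.k - 2) for c in child_states))

    def add(self, tree):
        """Incremental update: only counters change, nothing is re-estimated."""
        self.num_trees += 1
        self.root_counts[root(tree, self.k - 1)] += 1
        stack = [tree]
        while stack:
            node = stack.pop()
            label, children = node
            child_states = tuple(root(c, self.k - 1) for c in children)
            self.rule_counts[(label, child_states)] += 1
            self.state_counts[root(node, self.k - 1)] += 1
            stack.extend(children)

    def rule_probability(self, label, child_states):
        """Relative frequency of a rule among all rules reaching the same state."""
        child_states = tuple(child_states)
        total = self.state_counts[self._lhs_state(label, child_states)]
        if total == 0:
            return 0.0
        return self.rule_counts[(label, child_states)] / total

    def tree_probability(self, tree):
        """Root-state frequency times the product of rule probabilities."""
        if self.num_trees == 0:
            return 0.0
        p = self.root_counts[root(tree, self.k - 1)] / self.num_trees
        stack = [tree]
        while stack and p > 0.0:
            label, children = stack.pop()
            states = [root(c, self.k - 1) for c in children]
            p *= self.rule_probability(label, states)
            stack.extend(children)
        return p
```

With k = 2 the states reduce to node labels, so the sketch behaves like a grammar read directly off the sample trees. Because `add` only increments counters, a new tree can be incorporated at any time and the probabilities returned afterwards reflect it immediately.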
Work supported by the Spanish Comisión Interministerial de Ciencia y Tecnología through grant TIC97-0941.
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rico-Juan, J.R., Calera-Rubio, J., Carrasco, R.C. (2000). Probabilistic k-Testable Tree Languages. In: Oliveira, A.L. (ed.) Grammatical Inference: Algorithms and Applications. ICGI 2000. Lecture Notes in Computer Science, vol 1891. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45257-7_18
DOI: https://doi.org/10.1007/978-3-540-45257-7_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41011-9
Online ISBN: 978-3-540-45257-7
eBook Packages: Springer Book Archive