Stochastic k-testable Tree Languages and Applications

Rico-Juan, Juan Ramón; Calera-Rubio, Jorge; Carrasco, Rafael C.

doi:10.1007/3-540-45790-9_16

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2484)

Included in the following conference series:

International Colloquium on Grammatical Inference

Abstract

In this paper, we describe a generalization of k-gram models to stochastic tree languages. These models are based on the k-testable class, a subclass of the languages recognizable by ascending tree automata. One advantage of this approach is that the probabilistic model can be updated incrementally. Another is that backing-off schemes can be defined. As an illustration of their applicability, the models have been used to compress tree data files at a better rate than string-based methods.
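The incremental-update property mentioned in the abstract can be illustrated with a small count-based sketch. This is not the authors' construction: it assumes the simplest case (k = 2, where each event is a node label together with the labels of its children), uses plain relative frequencies, and omits backing-off entirely. The class name `DepthTwoTreeModel`, the tuple encoding of trees, and the sample trees are all hypothetical choices made for illustration.

```python
from collections import Counter
from fractions import Fraction

# Assumed encoding: a tree is a tuple (label, child_1, ..., child_n); a leaf is (label,).


class DepthTwoTreeModel:
    """Illustrative count-based model over depth-2 tree fragments (assumed k = 2)."""

    def __init__(self):
        self.root_counts = Counter()       # label seen at the root of a sample tree
        self.expansion_counts = Counter()  # (label, child labels) seen at any node
        self.label_counts = Counter()      # how often each label was expanded
        self.num_trees = 0

    def update(self, tree):
        """Add one tree to the counts; earlier samples are never revisited."""
        self.num_trees += 1
        self.root_counts[tree[0]] += 1
        stack = [tree]
        while stack:
            label, *children = stack.pop()
            self.expansion_counts[(label, tuple(c[0] for c in children))] += 1
            self.label_counts[label] += 1
            stack.extend(children)

    def probability(self, tree):
        """Relative-frequency probability of a tree; 0 if any fragment is unseen."""
        if self.num_trees == 0:
            return Fraction(0)
        p = Fraction(self.root_counts[tree[0]], self.num_trees)
        stack = [tree]
        while stack:
            label, *children = stack.pop()
            if self.label_counts[label] == 0:
                return Fraction(0)  # label never observed: no estimate without smoothing
            key = (label, tuple(c[0] for c in children))
            p *= Fraction(self.expansion_counts[key], self.label_counts[label])
            stack.extend(children)
        return p


# Hypothetical sample: two small trees added one at a time.
t1 = ("S", ("a",), ("b",))
t2 = ("S", ("a",), ("S", ("a",), ("b",)))
model = DepthTwoTreeModel()
model.update(t1)
model.update(t2)
print(model.probability(t1))  # 2/3
print(model.probability(t2))  # 2/9
```

Because the model is only a set of counters, adding a new tree touches just the fragments that tree contains. Unseen fragments receive probability zero here, which is the gap a backing-off scheme would be expected to fill.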

Work supported by the Spanish Comisión Interministerial de Ciencia y Tecnología through grant TIC2000-1599-C02.

Author information

Authors and Affiliations

Departament de Llenguatges i Sistemes Informàtics, Universitat d’Alacant, E-03071, Alacant, Spain
Juan Ramón Rico-Juan, Jorge Calera-Rubio & Rafael C. Carrasco

Editor information

Editors and Affiliations

Perot Systems Nederland B.V., Hoefseweg 1, 3821 AE, Amersfoort, The Netherlands
Pieter Adriaans (Senior Research Advisor, Professor of Learning and Adaptive Systems)
ILLC/Computation and Complexity Theory, Universiteit van Amsterdam, Plantage Muidergracht 24, 1018 TV, Amsterdam, The Netherlands
Pieter Adriaans (Senior Research Advisor, Professor of Learning and Adaptive Systems)
School of Electrical Engineering and Computer Science, University of Newcastle, University Drive, Callaghan, NSW, 2308, Australia
Henning Fernau
Wilhelm-Schickard-Institut für Informatik, Universität Tübingen, Sand 13, 72076, Tübingen, Germany
Henning Fernau
FNWI/ILLC, Cognitive Systems and Information Processing Group, Universiteit van Amsterdam, Room B-5.39, Nieuwe Achtergracht 166, 1018 WV, Amsterdam, The Netherlands
Menno van Zaanen

About this paper

Cite this paper

Rico-Juan, J.R., Calera-Rubio, J., Carrasco, R.C. (2002). Stochastic k-testable Tree Languages and Applications. In: Adriaans, P., Fernau, H., van Zaanen, M. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2002. Lecture Notes in Computer Science, vol 2484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45790-9_16

DOI: https://doi.org/10.1007/3-540-45790-9_16
Published: 05 September 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44239-4
Online ISBN: 978-3-540-45790-9
eBook Packages: Springer Book Archive
