Grammatical Inference Using Suffix Trees

Geertzen, Jeroen; van Zaanen, Menno

doi:10.1007/978-3-540-30195-0_15

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3264)

Included in the following conference series:

International Colloquium on Grammatical Inference

Abstract

The goal of the Alignment-Based Learning (ABL) grammatical inference framework is to structure plain (natural language) sentences as if they are parsed according to a context-free grammar. The framework produces good results even when simple techniques are used. However, the techniques used so far have computational drawbacks, resulting in limitations with respect to the amount of language data to be used. In this article, we propose a new alignment method, which can find possible constituents in time linear in the amount of data. This solves the scalability problem and allows ABL to be applied to larger data sets.
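The abstract does not spell out the alignment method, so the following is only a minimal sketch of the general idea behind suffix-based alignment: index all word-level suffixes of the sentences and read off the word sequences that several sentences share. It assumes whitespace tokenization and uses a naive suffix trie, which is quadratic in sentence length and therefore not the linear-time construction the paper refers to. The function names `build_suffix_trie` and `shared_sequences` are hypothetical and chosen for illustration.

```python
def build_suffix_trie(sentences):
    """Insert every word-level suffix of every sentence into a nested-dict trie.

    Each node records the indices of the sentences whose suffixes pass
    through it. This naive construction is quadratic in sentence length;
    it is an illustrative stand-in, not a linear-time suffix tree.
    """
    root = {"children": {}, "sentences": set()}
    for idx, sentence in enumerate(sentences):
        words = sentence.split()  # assumption: whitespace tokenization
        for start in range(len(words)):
            node = root
            for word in words[start:]:
                node = node["children"].setdefault(
                    word, {"children": {}, "sentences": set()}
                )
                node["sentences"].add(idx)
    return root


def shared_sequences(root, min_sentences=2):
    """Collect word sequences found in at least `min_sentences` sentences
    that cannot be extended to the right without losing that support."""
    results = {}

    def walk(node, path):
        extended = False
        for word, child in node["children"].items():
            if len(child["sentences"]) >= min_sentences:
                extended = True
                walk(child, path + [word])
        if path and not extended:
            results[tuple(path)] = sorted(node["sentences"])

    walk(root, [])
    return results


if __name__ == "__main__":
    corpus = ["she sees the dog", "she sees a cat"]
    trie = build_suffix_trie(corpus)
    for words, sentence_ids in shared_sequences(trie).items():
        print(" ".join(words), sentence_ids)
```

In an alignment-based setting, the parts of two sentences that fall outside a shared sequence (here "the dog" and "a cat") would be the candidate constituents. How the paper selects and scores them is not stated in the abstract.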

Author information

Authors and Affiliations

ILK, Computational Linguistics, Tilburg University, Tilburg, The Netherlands
Jeroen Geertzen & Menno van Zaanen

Editor information

Editors and Affiliations

Institute of Informatics and Telecommunications, National Centre for Scientific Research “Demokritos”, Athens, Greece
Georgios Paliouras
Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, 223-8522, Yokohama, Japan
Yasubumi Sakakibara

Cite this paper

Geertzen, J., van Zaanen, M. (2004). Grammatical Inference Using Suffix Trees. In: Paliouras, G., Sakakibara, Y. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2004. Lecture Notes in Computer Science (LNAI), vol 3264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30195-0_15

DOI: https://doi.org/10.1007/978-3-540-30195-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23410-4
Online ISBN: 978-3-540-30195-0
eBook Packages: Springer Book Archive