How Many Dots Are Really Needed for Head-Driven Chart Parsing?

Smrž, Pavel; Kadlec, Vladimír

doi:10.1007/11611257_46

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3831)

Included in the following conference series:

International Conference on Current Trends in Theory and Practice of Computer Science

Abstract

This paper presents an improved form of head-driven chart parser that is suited to large context-free grammars. The basic method, HDddm (Head-Driven dependent dot move), is introduced first. Two variants then improve on the basic approach; both rest on the same idea, which is to reduce the number of chart edges by modifying the form of items (dotted rules). The first variant “unifies” the items that share the analyzed part of the relevant rule (thus, only one dot is needed to mark the position before and after the covered part). The second applies the inverse strategy: it “eliminates” the parts that have not been covered yet (no dot is needed). All the discussed alternatives are described in the form of parsing schemata. We also briefly mention a technique, employing a special trie-like data structure originally developed for Scrabble, that minimizes the extra information needed by the algorithms. We demonstrate the advantages of the described methods by a significant decrease in the number of chart edges. Results are given for the standard set of test grammars (and their respective inputs) as well as for a large and highly ambiguous Czech grammar.
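
The idea of reducing edges by changing the form of items can be illustrated with a toy sketch. The code below is not the HDddm algorithm or either variant from the paper; it only assumes that a head-driven item records a rule plus two dot positions around the covered part, ignores input positions and any restriction on the order of dot moves, and compares the number of such items with the number of distinct covered parts. The names `Rule`, `DottedItem`, `all_items`, `covered` and `unified`, and the four-rule grammar, are hypothetical.

```python
from typing import NamedTuple, Set, Tuple


class Rule(NamedTuple):
    lhs: str
    rhs: Tuple[str, ...]
    head: int  # index of the head symbol within rhs (assumed annotation)


class DottedItem(NamedTuple):
    rule: Rule
    left: int   # dot before the covered part
    right: int  # dot after the covered part


def all_items(rules) -> Set[DottedItem]:
    """Enumerate every item obtainable by growing outward from the head.

    Simplification: input positions and dot-move ordering are ignored.
    """
    items = set()
    for rule in rules:
        for left in range(rule.head, -1, -1):
            for right in range(rule.head + 1, len(rule.rhs) + 1):
                items.add(DottedItem(rule, left, right))
    return items


def covered(item: DottedItem) -> Tuple[str, ...]:
    """Return the symbols of the rule lying between the two dots."""
    return item.rule.rhs[item.left:item.right]


def unified(items) -> Set[Tuple[str, ...]]:
    """Collapse items that share the same covered part into one key."""
    return {covered(item) for item in items}


# Hypothetical toy grammar; the head positions are chosen for illustration.
RULES = [
    Rule("NP", ("Det", "N"), 1),
    Rule("NP", ("Det", "N", "PP"), 1),
    Rule("NP", ("Det", "Adj", "N"), 2),
    Rule("VP", ("V", "NP"), 0),
]

double_dotted = all_items(RULES)
shared = unified(double_dotted)
print(len(double_dotted), "double-dotted items")
print(len(shared), "distinct covered parts")
```

In this toy grammar the covered parts `("N",)` and `("Det", "N")` occur in more than one rule, so keying items on the covered part alone yields fewer entries than keeping one item per rule and dot pair. How the paper's variants recover the rule information afterwards is not described in the abstract and is not modelled here.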

This work was partially supported by the Grant Agency of the Czech Republic under the project 201/05/2781 and by the Ministry of Education of the Czech Republic, Research Plan MSM 6383917201.

Author information

Authors and Affiliations

Faculty of Information Technology, Brno University of Technology, Božetěchova 2, 612 66, Brno, Czech Republic
Pavel Smrž
Faculty of Informatics, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Vladimír Kadlec

Editor information

Editors and Affiliations

Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod Vodárenskou věží 2, 182 07, Prague 8, Czech Republic
Jiří Wiedermann & Július Štuller
Department of Information and Computer Sciences, University of Utrecht, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands
Gerard Tel
Faculty of Mathematics and Physics, Charles University, Prague
Jaroslav Pokorný
Institute of Informatics and Software Engineering, Faculty of Informatics and Information Technologies, Slovak University of Technology, Ilkovičova 3, 842 16, Bratislava
Mária Bieliková

About this paper

Cite this paper

Smrž, P., Kadlec, V. (2006). How Many Dots Are Really Needed for Head-Driven Chart Parsing? In: Wiedermann, J., Tel, G., Pokorný, J., Bieliková, M., Štuller, J. (eds) SOFSEM 2006: Theory and Practice of Computer Science. SOFSEM 2006. Lecture Notes in Computer Science, vol 3831. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11611257_46

DOI: https://doi.org/10.1007/11611257_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31198-0
Online ISBN: 978-3-540-32217-7
eBook Packages: Computer Science, Computer Science (R0)