Where Do Parsing Errors Come From

Müürisep, Kaili; Nigol, Helen

doi:10.1007/978-3-540-87391-4_22


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5246)

Included in the following conference series:

International Conference on Text, Speech and Dialogue


Abstract

This paper discusses issues in developing a parser for spoken Estonian that is based on an existing parser for the written language and employs the Constraint Grammar framework.

When a corpus of face-to-face everyday conversations was used as the training and testing material, the parser achieved a recall of 97.6% and a precision of 91.8%. Parsing institutional phone calls turned out to be a more complicated task, with recall dropping by 3%. In this paper, we focus on parsing nonfluent speech with a rule-based parser. We give an overview of the parsing errors and of ways to overcome them.
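The recall and precision figures above can be read in the way that is common for Constraint Grammar evaluation, where the parser may leave more than one syntactic tag on a word. The sketch below assumes that convention: recall is the share of words whose correct tag survives, and precision is the share of correct tags among all tags left in the output. The abstract does not state the exact formulas, so this is an assumption; the function name `evaluate` and the tag names are hypothetical and serve only as illustration.

```python
from typing import List, Set, Tuple


def evaluate(gold: List[str], predicted: List[Set[str]]) -> Tuple[float, float]:
    """Return (recall, precision) for a parser that may leave several tags per word.

    Assumed definitions (not taken from the paper):
      recall    = words whose gold tag is among the retained tags / all words
      precision = words whose gold tag is among the retained tags / all retained tags
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # A word counts as correct if its gold tag is still among the retained tags.
    correct = sum(1 for g, p in zip(gold, predicted) if g in p)
    # Every tag left on a word counts towards the size of the output.
    emitted = sum(len(p) for p in predicted)

    recall = correct / len(gold) if gold else 0.0
    precision = correct / emitted if emitted else 0.0
    return recall, precision


# Illustrative four-word utterance; the third word remains ambiguous.
gold_tags = ["@SUBJ", "@FMV", "@OBJ", "@ADVL"]
parser_output = [{"@SUBJ"}, {"@FMV"}, {"@OBJ", "@SUBJ"}, {"@ADVL"}]
recall, precision = evaluate(gold_tags, parser_output)
print(f"recall={recall:.3f} precision={precision:.3f}")
```

Under these definitions, leaving a word ambiguous keeps recall high but lowers precision, which is one way a rule-based parser can show recall above precision, as in the figures reported here.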



Author information

Authors and Affiliations

Institute of Computer Science, University of Tartu, J. Liivi 2, 50409, Tartu, Estonia
Kaili Müürisep
Institute of Estonian and General Linguistics, University of Tartu, Ülikooli 18, 50090, Tartu, Estonia
Helen Nigol


Editor information

Petr Sojka Aleš Horák Ivan Kopeček Karel Pala


Cite this paper

Müürisep, K., Nigol, H. (2008). Where Do Parsing Errors Come From. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2008. Lecture Notes in Computer Science, vol 5246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87391-4_22


DOI: https://doi.org/10.1007/978-3-540-87391-4_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87390-7
Online ISBN: 978-3-540-87391-4
eBook Packages: Computer Science, Computer Science (R0)
