Automatic Recognition of the Function of Singular Neuter Pronouns in Texts and Spoken Data

  • Conference paper
Anaphora Processing and Applications (DAARC 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5847)

Abstract

We describe the results of unsupervised (clustering) and supervised (classification) learning experiments aimed at recognising the function of singular neuter pronouns in Danish corpora of written and spoken language. Danish singular neuter pronouns comprise personal and demonstrative pronouns. They are very frequent and have many functions, including non-referential, cataphoric, deictic and anaphoric uses. The antecedents of discourse-anaphoric singular neuter pronouns can be nominal phrases of different gender and number, verbal phrases, adjectival phrases, clauses or discourse segments of varying size, and the pronouns can refer to both individual and abstract entities. Danish neuter pronouns occur in more constructions than the corresponding English pronouns it, this and that, and are distributed differently. The classification experiments show a significant improvement over the baseline on all types of data. The best results were obtained on text data, while the worst were obtained on free-conversational, multi-party dialogues.
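
The comparison against a baseline mentioned in the abstract can be illustrated with a minimal sketch. It assumes a majority-class baseline and plain accuracy as the measure; the abstract does not say which baseline or metric the paper uses. The function names, the label strings and the toy data are all invented for illustration and are not taken from the paper's corpora or results.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return the most frequent label in the training data (assumed baseline)."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(gold, predicted):
    """Fraction of items whose predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Invented toy labels for pronoun functions; not real annotation data.
train = ["anaphoric", "anaphoric", "non-referential",
         "cataphoric", "anaphoric", "deictic"]
gold = ["anaphoric", "non-referential", "anaphoric", "cataphoric"]

# Hypothetical output of some trained classifier on the same four items.
classifier_output = ["anaphoric", "non-referential", "anaphoric", "anaphoric"]

# The baseline assigns the majority training label to every test item.
baseline_label = majority_baseline(train)
baseline_acc = accuracy(gold, [baseline_label] * len(gold))
classifier_acc = accuracy(gold, classifier_output)

print(f"baseline ({baseline_label}): {baseline_acc:.2f}")
print(f"classifier: {classifier_acc:.2f}")
```

A classifier is only informative to the extent that it beats such a trivial predictor, which is why results of this kind are reported relative to the baseline for each type of data.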

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Navarretta, C. (2009). Automatic Recognition of the Function of Singular Neuter Pronouns in Texts and Spoken Data. In: Lalitha Devi, S., Branco, A., Mitkov, R. (eds) Anaphora Processing and Applications. DAARC 2009. Lecture Notes in Computer Science, vol 5847. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04975-0_2

  • DOI: https://doi.org/10.1007/978-3-642-04975-0_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04974-3

  • Online ISBN: 978-3-642-04975-0

  • eBook Packages: Computer Science (R0)