Taggers Gonna Tag: An Argument against Evaluating Disambiguation Capacities of Morphosyntactic Taggers

  • Conference paper
Text, Speech and Dialogue (TSD 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7499)

Abstract

Tagging of inflectional languages is usually performed in two stages: morphological analysis and morphosyntactic disambiguation. A number of published papers limit their evaluation to the second stage without asking what a tagger is supposed to do. In this article, we highlight this important question and discuss possible answers. We also argue that a fair evaluation requires assessing the whole system, which is very rarely done in the literature. Finally, we show the results of a full evaluation of three Polish morphosyntactic taggers. The discrepancy between our results and those published earlier is striking, showing that these issues make a practical difference.
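The gap between evaluating only the disambiguation stage and evaluating the whole system can be illustrated with a small sketch. This is not the evaluation procedure or metric used in the paper, which the abstract does not specify. The `Token` record, the tag names `T1` to `T4`, and the five-token sample are all hypothetical. The sketch assumes that a disambiguation-only score simply skips tokens for which the morphological analyser never proposed the reference tag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical record for one evaluated token."""
    form: str
    gold: str              # reference tag from the annotated corpus
    candidates: frozenset  # tags proposed by the morphological analyser
    chosen: str            # tag finally output by the tagger


def full_accuracy(tokens):
    """Share of ALL tokens whose final tag equals the reference tag."""
    return sum(t.chosen == t.gold for t in tokens) / len(tokens)


def disambiguation_accuracy(tokens):
    """Same measure, restricted to tokens where the analyser offered the reference tag."""
    covered = [t for t in tokens if t.gold in t.candidates]
    return sum(t.chosen == t.gold for t in covered) / len(covered)


# Made-up sample: three correct tokens, one disambiguation error,
# and one token the analyser could not cover at all.
sample = [
    Token("a", "T1", frozenset({"T1", "T2"}), "T1"),
    Token("b", "T2", frozenset({"T2"}), "T2"),
    Token("c", "T3", frozenset({"T1", "T3"}), "T1"),  # wrong choice among candidates
    Token("d", "T4", frozenset({"T1"}), "T1"),        # reference tag never proposed
    Token("e", "T1", frozenset({"T1", "T3"}), "T1"),
]

full = full_accuracy(sample)
disamb = disambiguation_accuracy(sample)
print(f"whole system: {full:.2f}, disambiguation only: {disamb:.2f}")
```

On this toy sample the disambiguation-only score is higher than the whole-system score, because the token the analyser failed to cover is left out of the denominator.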

Work financed by the Innovative Economy Programme, POIG.01.01.02-14-013/09.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Radziszewski, A., Acedański, S. (2012). Taggers Gonna Tag: An Argument against Evaluating Disambiguation Capacities of Morphosyntactic Taggers. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2012. Lecture Notes in Computer Science, vol 7499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32790-2_9

  • DOI: https://doi.org/10.1007/978-3-642-32790-2_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32789-6

  • Online ISBN: 978-3-642-32790-2

  • eBook Packages: Computer Science, Computer Science (R0)
