
Analyzing Likert Scale Inter-annotator Disagreement

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 151)

Abstract

Assessment of annotation reliability is typically undertaken as a quality assurance measure, providing a sound basis for answering the research questions that rely on the annotated data. We argue that the assessment of inter-rater reliability can itself provide a source of information more directly related to the background research. The discussion is anchored in the analysis of conversational dominance in the MULTISIMO corpus. Other research has explored factors in dialogue (e.g. big-five personality traits and conversational style of participants) as predictors of independently perceived dominance. Rather than assessing the contributions of experimental factors to perceived dominance as a unitary aggregated response variable, after first verifying an acceptable level of inter-rater reliability, we use the variability in inter-annotator agreement itself as a response variable. We argue for the general applicability of this approach in exploring research hypotheses that focus on qualities assessed with multiple annotations.
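The idea of treating per-item variability in Likert ratings as a response variable, rather than collapsing ratings into a single aggregate after a reliability check, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it uses the median absolute deviation (MAD) as one robust per-item spread measure, and the data, function name, and rating scenario are hypothetical.

```python
import statistics

def item_disagreement(ratings):
    """Per-item disagreement among Likert ratings from multiple annotators.

    Returns the median absolute deviation (MAD) of the ratings:
    0 indicates perfect agreement; larger values indicate more spread.
    """
    med = statistics.median(ratings)
    return statistics.median(abs(r - med) for r in ratings)

# Hypothetical data: three annotators rate perceived dominance of four
# speakers on a 5-point Likert scale (rows = items, columns = annotators).
ratings = [
    [5, 5, 4],  # near-consensus
    [3, 3, 3],  # perfect agreement
    [1, 3, 5],  # strong disagreement
    [2, 4, 4],
]

# One disagreement score per item; these scores, not the mean ratings,
# would then serve as the response variable in subsequent modeling.
disagreement = [item_disagreement(r) for r in ratings]  # [0, 0, 2, 0]
```

The per-item scores could then be regressed on experimental factors (e.g. personality traits or conversational style) to ask which conditions make dominance judgements harder to agree on, rather than only what the consensus judgement is.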


Notes

  1. In full compliance with the terms of consent provided by the participants, 18 of these dialogues are included in the publicly available version of the corpus.

  2. This survey was conducted independently, and rankings are reported in a database related to the game, http://familyfeudfriends.arjdesigns.com//, last accessed 11.05.2018.

  3. Correctness of the answers and their rankings is determined by responses to an independent survey of a sample of 100 people.

  4. The first two rows of this table are provided for the sake of completeness: it does not appear rational to propose a Likert scale with only one point, and if the experimental question required only two points, it seems unlikely that one would approach the binary judgement using a Likert scale.


Acknowledgements

The research leading to these results has received funding from (a) the ADAPT Centre for Digital Content Technology, funded under the SFI Research Centres Programme (Grant 13/RC/2106) and co-funded under the European Regional Development Fund, and (b) the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 701621 (MULTISIMO).

Author information

Correspondence to Maria Koutsombogera.


Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Vogel, C., Koutsombogera, M., Costello, R. (2020). Analyzing Likert Scale Inter-annotator Disagreement. In: Esposito, A., Faundez-Zanuy, M., Morabito, F., Pasero, E. (eds) Neural Approaches to Dynamics of Signal Exchanges. Smart Innovation, Systems and Technologies, vol 151. Springer, Singapore. https://doi.org/10.1007/978-981-13-8950-4_34
