Abstract
Assessment of annotation reliability is typically undertaken as a quality assurance measure in order to provide a sound fulcrum for establishing the answers to research questions that require the annotated data. We argue that the assessment of inter-rater reliability can provide a source of information more directly related to the background research. The discussion is anchored in the analysis of conversational dominance in the MULTISIMO corpus. Other research has explored factors in dialogue (e.g. big-five personality traits and conversational style of participants) as predictors of independently perceived dominance. Rather than assessing the contributions of experimental factors to perceived dominance as a unitary aggregated response variable following verification of an acceptable level of inter-rater reliability, we use the variability in inter-annotator agreement as a response variable. We argue for the general applicability of this approach in exploring research hypotheses that focus on qualities assessed with multiple annotations.
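As a rough illustration only (the chapter does not specify an implementation), the Python sketch below shows one way the idea could be operationalised: per-item disagreement among Likert ratings is computed (here as the mean absolute deviation from the item median) and then treated as the response variable in a regression on experimental factors. All data values, variable names (annotator_1, extraversion, speaking_time), and the particular choice of dispersion measure and regression model are assumptions made for exposition, not the authors' method.

```python
# Hypothetical sketch (not from the chapter): use per-item inter-annotator
# disagreement on Likert ratings as a response variable, rather than an
# aggregated dominance score.
import pandas as pd
import statsmodels.api as sm

# Rows = annotated items (e.g. a participant in a session), columns = annotators,
# values = Likert dominance ratings; illustrative data only.
ratings = pd.DataFrame({
    "annotator_1": [5, 3, 4, 2, 5],
    "annotator_2": [4, 3, 5, 2, 3],
    "annotator_3": [5, 1, 4, 3, 4],
})

# Per-item disagreement: mean absolute deviation of each item's ratings
# from that item's median (one of several possible dispersion measures).
item_median = ratings.median(axis=1)
disagreement = ratings.sub(item_median, axis=0).abs().mean(axis=1)

# Hypothetical predictors for the same items (e.g. personality and behaviour measures).
predictors = pd.DataFrame({
    "extraversion": [0.8, 0.2, 0.6, 0.1, 0.9],
    "speaking_time": [120, 45, 90, 30, 150],
})

# Ordinary least squares with disagreement (not the mean rating) as the response.
X = sm.add_constant(predictors)
model = sm.OLS(disagreement, X).fit()
print(model.summary())
```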
Notes
1. In complete compliance with the terms of consent provided by the participants, 18 of these dialogues are represented in the publicly available version of the corpus.
2. This survey was conducted independently, and rankings are reported in a database related to the game, http://familyfeudfriends.arjdesigns.com//, last accessed 11.05.2018.
3. The correctness of the answers and their rankings is determined by responses to an independent survey of a sample of 100 people.
4. The first two rows of this table are provided for the sake of completeness: it does not appear rational to propose a Likert scale with only one point, and if the experimental question required only two points, it seems unlikely that one would approach the binary judgement using a Likert scale.
Acknowledgements
The research leading to these results has received funding from (a) the ADAPT Centre for Digital Content Technology, funded under the SFI Research Centres Programme (Grant 13/RC/2106) and co-funded under the European Regional Development Fund, and (b) the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 701621 (MULTISIMO).
Cite this chapter
Vogel, C., Koutsombogera, M., Costello, R. (2020). Analyzing Likert Scale Inter-annotator Disagreement. In: Esposito, A., Faundez-Zanuy, M., Morabito, F., Pasero, E. (eds) Neural Approaches to Dynamics of Signal Exchanges. Smart Innovation, Systems and Technologies, vol 151. Springer, Singapore. https://doi.org/10.1007/978-981-13-8950-4_34