Research article
DOI: 10.1145/2786451.2786492

Crowdsourcing ground truth for Question Answering using CrowdTruth

Published: 28 June 2015

Abstract

Gathering training and evaluation data for open-domain tasks, such as general question answering, is challenging. Typically, ground truth data is provided by human expert annotators; in an open domain, however, experts are difficult to define, and the annotation process can be lengthy and expensive. Crowdsourcing has therefore become a mainstream approach for filling this gap, i.e. for gathering human interpretation data. Yet, like traditional expert annotation, most crowdsourcing methods use majority voting to measure annotation quality and thus aim to identify a single right answer for each example, even though many annotation tasks admit multiple interpretations and hence multiple correct answers to the same question. We present CrowdTruth, a crowdsourcing-based approach for efficiently gathering ground truth data, in which disagreement-based metrics are used to harness the multitude of human interpretations and to measure the quality of the resulting ground truth. We exemplify our approach in two semantic-interpretation use cases for answering questions.
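The abstract contrasts majority voting with disagreement-based quality metrics. As an illustration only, the following minimal Python sketch shows what such a disagreement-aware score can look like, loosely inspired by the CrowdTruth idea of aggregating worker annotations into a vector per annotated unit; the toy data, function names, and the cosine-style answer score are assumptions for this sketch, not the exact metrics defined in the paper.

from collections import Counter
from math import sqrt

# Toy example: for one question, each worker selects one or more candidate answers.
worker_annotations = [
    {"Barack Obama"},           # worker 1
    {"Barack Obama", "Obama"},  # worker 2
    {"Obama"},                  # worker 3
    {"Michelle Obama"},         # worker 4
]

def annotation_vector(annotations):
    """Aggregate all worker selections into a count vector over candidate answers."""
    counts = Counter()
    for selected in annotations:
        counts.update(selected)
    return counts

def answer_score(vector, answer):
    """Cosine of the aggregated vector with the unit vector for one answer:
    a graded, disagreement-aware score in [0, 1] rather than a binary
    majority-vote decision."""
    norm = sqrt(sum(v * v for v in vector.values()))
    return vector.get(answer, 0) / norm if norm else 0.0

vec = annotation_vector(worker_annotations)
for answer in sorted(vec):
    print(answer, round(answer_score(vec, answer), 2))
# Barack Obama 0.67, Michelle Obama 0.33, Obama 0.67: both near-synonymous answers
# retain high scores, whereas majority voting would force a single winner.

A majority vote over the same toy data would keep at most one answer; the graded scores preserve the relative support of each interpretation, which is the kind of signal disagreement-based metrics exploit.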


Cited By

  • (2024) The State of Pilot Study Reporting in Crowdsourcing: A Reflection on Best Practices and Guidelines. Proceedings of the ACM on Human-Computer Interaction, 8(CSCW1), 1-45. DOI: 10.1145/3641023. Online publication date: 26-Apr-2024.
  • (2017) CSQuaRE: Approach for Quality Control in Crowdsourcing. Web Engineering, 592-599. DOI: 10.1007/978-3-319-60131-1_47. Online publication date: 1-Jun-2017.
  • (2016) Exploiting Disagreement Through Open-Ended Tasks for Capturing Interpretation Spaces. Proceedings of the 13th International Conference on The Semantic Web: Latest Advances and New Domains, Volume 9678, 873-882. DOI: 10.1007/978-3-319-34129-3_56. Online publication date: 29-May-2016.


    Published In

    WebSci '15: Proceedings of the ACM Web Science Conference
    June 2015
    366 pages
    ISBN: 9781450336727
    DOI: 10.1145/2786451
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. Crowdsourcing
    2. Disagreement
    3. Gold Standard Annotation

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    WebSci '15: ACM Web Science Conference
    June 28 - July 1, 2015
    Oxford, United Kingdom

    Acceptance Rates

    Overall Acceptance Rate 245 of 933 submissions, 26%

