
s-AWARE: Supervised Measure-Based Methods for Crowd-Assessors Combination

  • Conference paper
  • First Online:

Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12260)

Abstract

Ground-truth creation is one of the most demanding activities, in terms of time, effort, and resources, involved in building an experimental collection. For this reason, crowdsourcing has emerged as a viable option to reduce the costs and time invested in it.

An effective assessor merging methodology is crucial to guarantee good ground-truth quality. Classical approaches aggregate the labels from multiple assessors using some voting and/or classification method. Recently, Assessor-driven Weighted Averages for Retrieval Evaluation (AWARE) has been proposed as an unsupervised alternative that combines the evaluation measures computed from the multiple judgments, rather than the labels themselves.
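The contrast between the two families can be made concrete with a small sketch. This is a minimal illustration under assumptions of ours, not code from the paper: the function names, the binary-label layout, and the plain (optionally weighted) average of per-assessor measures are all choices made here for clarity.

    # Minimal sketch contrasting label aggregation with measure-based combination.
    # Data layout and names are illustrative assumptions, not the paper's code.
    from collections import Counter

    def majority_vote(labels_per_assessor):
        """Classical label aggregation: each document receives its most frequent label.
        labels_per_assessor: dict assessor_id -> {doc_id: 0/1 relevance label}."""
        merged = {}
        all_docs = set().union(*(d.keys() for d in labels_per_assessor.values()))
        for doc in all_docs:
            votes = [labels[doc] for labels in labels_per_assessor.values() if doc in labels]
            merged[doc] = Counter(votes).most_common(1)[0][0]
        return merged

    def measure_based_combination(measures_per_assessor, weights=None):
        """AWARE-style idea: combine the evaluation measures (e.g. the AP of a run
        computed against each assessor's judgments) instead of merging labels first.
        measures_per_assessor: dict assessor_id -> measure value for one run/topic."""
        if weights is None:
            weights = {a: 1.0 for a in measures_per_assessor}
        total = sum(weights.values())
        return sum(weights[a] * m for a, m in measures_per_assessor.items()) / total

With uniform weights the second function corresponds to the simplest unsupervised setting; the role of s-AWARE, introduced next, is to learn more informative per-assessor weights from training data.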

In this paper, we propose s-AWARE, a supervised version of AWARE. We tested s-AWARE against a range of state-of-the-art methods and the unsupervised AWARE on several TREC collections. We analysed how the performance of these methods changes as assessors’ judgement sparsity increases, highlighting that s-AWARE is an effective approach in a realistic scenario.
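The abstract does not spell out how the supervised weights are learned, so the following is only a rough, assumption-laden sketch of the general recipe: estimate per-assessor weights on training topics for which a gold standard is available, then reuse them on unseen topics. The inverse-squared-error closeness used here is our own choice for illustration, not necessarily the criterion adopted by s-AWARE.

    import numpy as np

    def learn_assessor_weights(train_measures, gold_measures):
        """Illustrative supervised weighting: assessors whose measures track the gold
        standard more closely on training topics receive larger weights.
        train_measures: shape (n_assessors, n_train_topics);
        gold_measures: shape (n_train_topics,)."""
        errors = np.mean((train_measures - gold_measures) ** 2, axis=1)  # per-assessor MSE
        closeness = 1.0 / (errors + 1e-9)       # smaller error -> larger weight
        return closeness / closeness.sum()      # normalise weights to sum to 1

    def combine_on_test(test_measures, weights):
        """Weighted average of per-assessor measures on unseen topics.
        test_measures: shape (n_assessors, n_test_topics)."""
        return test_measures.T @ weights        # -> one combined value per test topic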


Notes

  1. The original AWARE methodology considered additional ways to quantify “closeness”, i.e. the Frobenius norm, Kullback-Leibler Divergence (KLD), and AP Correlation (APC). Here, we focus on the two approaches which produced the best and most stable results across different configurations.
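For concreteness, the first two closeness notions mentioned in the footnote can be computed as below. The matrix layout (one topics-by-runs measure matrix per assessor) and the normalisation of the measure profiles before the KLD are assumptions made for this illustration only.

    import numpy as np

    def frobenius_distance(A, B):
        """Frobenius-norm distance between two topics-by-runs measure matrices."""
        return float(np.linalg.norm(A - B, ord="fro"))

    def kld(p, q, eps=1e-12):
        """Kullback-Leibler Divergence KL(p || q) between two discrete distributions,
        e.g. normalised measure profiles of two assessors."""
        p = np.asarray(p, dtype=float) + eps
        q = np.asarray(q, dtype=float) + eps
        p, q = p / p.sum(), q / q.sum()
        return float(np.sum(p * np.log(p / q)))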


Author information


Corresponding author

Correspondence to Luca Piazzon.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Ferrante, M., Ferro, N., Piazzon, L. (2020). s-AWARE: Supervised Measure-Based Methods for Crowd-Assessors Combination. In: Arampatzis, A., et al. (eds.) Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2020. Lecture Notes in Computer Science, vol. 12260. Springer, Cham. https://doi.org/10.1007/978-3-030-58219-7_2


  • DOI: https://doi.org/10.1007/978-3-030-58219-7_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58218-0

  • Online ISBN: 978-3-030-58219-7

  • eBook Packages: Computer Science, Computer Science (R0)
