
s-AWARE: Supervised Measure-Based Methods for Crowd-Assessors Combination

  • Conference paper
  • First Online:

Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12260)

Abstract

Ground-truth creation is one of the most demanding activities, in terms of time, effort, and resources, involved in building an experimental collection. For this reason, crowdsourcing has emerged as a viable option to reduce the costs and time invested in it.

An effective assessor merging methodology is crucial to guarantee good ground-truth quality. Classical approaches aggregate the labels from multiple assessors using some voting and/or classification method. Recently, Assessor-driven Weighted Averages for Retrieval Evaluation (AWARE) has been proposed as an unsupervised alternative that combines the evaluation measures computed from the multiple judgments, rather than the labels themselves.
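The contrast between the two families can be made concrete with a small sketch. This is a minimal illustration under assumptions of ours, not code from the paper: the function names, the binary-label layout, and the plain (optionally weighted) average of per-assessor measures are all choices made here for clarity.

    # Minimal sketch contrasting label aggregation with measure-based combination.
    # Data layout and names are illustrative assumptions, not the paper's code.
    from collections import Counter

    def majority_vote(labels_per_assessor):
        """Classical label aggregation: each document receives its most frequent label.
        labels_per_assessor: dict assessor_id -> {doc_id: 0/1 relevance label}."""
        merged = {}
        all_docs = set().union(*(d.keys() for d in labels_per_assessor.values()))
        for doc in all_docs:
            votes = [labels[doc] for labels in labels_per_assessor.values() if doc in labels]
            merged[doc] = Counter(votes).most_common(1)[0][0]
        return merged

    def measure_based_combination(measures_per_assessor, weights=None):
        """AWARE-style idea: combine the evaluation measures (e.g. the AP of a run
        computed against each assessor's judgments) instead of merging labels first.
        measures_per_assessor: dict assessor_id -> measure value for one run/topic."""
        if weights is None:
            weights = {a: 1.0 for a in measures_per_assessor}
        total = sum(weights.values())
        return sum(weights[a] * m for a, m in measures_per_assessor.items()) / total

With uniform weights the second function corresponds to the simplest unsupervised setting; the role of s-AWARE, introduced next, is to learn more informative per-assessor weights from training data.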

In this paper, we propose s-AWARE, a supervised version of AWARE. We tested s-AWARE against a range of state-of-the-art methods and the unsupervised AWARE on several TREC collections. We analysed how the performance of these methods changes as assessors’ judgement sparsity increases, highlighting that s-AWARE is an effective approach in a realistic scenario.
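The abstract does not spell out how the supervised weights are learned, so the following is only a rough, assumption-laden sketch of the general recipe: estimate per-assessor weights on training topics for which a gold standard is available, then reuse them on unseen topics. The inverse-squared-error closeness used here is our own choice for illustration, not necessarily the criterion adopted by s-AWARE.

    import numpy as np

    def learn_assessor_weights(train_measures, gold_measures):
        """Illustrative supervised weighting: assessors whose measures track the gold
        standard more closely on training topics receive larger weights.
        train_measures: shape (n_assessors, n_train_topics);
        gold_measures: shape (n_train_topics,)."""
        errors = np.mean((train_measures - gold_measures) ** 2, axis=1)  # per-assessor MSE
        closeness = 1.0 / (errors + 1e-9)       # smaller error -> larger weight
        return closeness / closeness.sum()      # normalise weights to sum to 1

    def combine_on_test(test_measures, weights):
        """Weighted average of per-assessor measures on unseen topics.
        test_measures: shape (n_assessors, n_test_topics)."""
        return test_measures.T @ weights        # -> one combined value per test topic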


Notes

  1. The original AWARE methodology considered additional ways to quantify “closeness”, i.e. the Frobenius norm, Kullback-Leibler Divergence (KLD), and AP Correlation (APC). Here, we focus on the two approaches which produced the best and most stable results across different configurations.
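For concreteness, the first two closeness notions mentioned in the footnote can be computed as below. The matrix layout (one topics-by-runs measure matrix per assessor) and the normalisation of the measure profiles before the KLD are assumptions made for this illustration only.

    import numpy as np

    def frobenius_distance(A, B):
        """Frobenius-norm distance between two topics-by-runs measure matrices."""
        return float(np.linalg.norm(A - B, ord="fro"))

    def kld(p, q, eps=1e-12):
        """Kullback-Leibler Divergence KL(p || q) between two discrete distributions,
        e.g. normalised measure profiles of two assessors."""
        p = np.asarray(p, dtype=float) + eps
        q = np.asarray(q, dtype=float) + eps
        p, q = p / p.sum(), q / q.sum()
        return float(np.sum(p * np.log(p / q)))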


Author information


Corresponding author

Correspondence to Luca Piazzon.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Ferrante, M., Ferro, N., Piazzon, L. (2020). s-AWARE: Supervised Measure-Based Methods for Crowd-Assessors Combination. In: Arampatzis, A., et al. (eds.) Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2020. Lecture Notes in Computer Science, vol. 12260. Springer, Cham. https://doi.org/10.1007/978-3-030-58219-7_2


  • DOI: https://doi.org/10.1007/978-3-030-58219-7_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58218-0

  • Online ISBN: 978-3-030-58219-7

  • eBook Packages: Computer Science, Computer Science (R0)
