Abstract
Crowdsourcing can be used to determine a total order over a set of objects (e.g., the top-10 NBA players) based on crowd opinions. The ranking problem is typically decomposed into a set of microtasks (e.g., pairwise comparisons), which are distributed to a large number of workers; the workers' answers are then aggregated to infer the ranking. The number of microtasks depends on the budget allocated to the problem. Intuitively, the more microtask answers collected, the more accurate the inferred ranking becomes. However, it is often hard to decide in advance the budget required for an accurate ranking. We study how the ranking process can be terminated early while still achieving a high-quality ranking and substantial budget savings. We use statistical tools to estimate the quality of the ranking result at any stage of the crowdsourcing process, and we terminate the process as soon as the desired quality is reached. Our proposed early-stopping module can be seamlessly integrated with most existing inference algorithms and task assignment methods. Extensive experiments show that our early-stopping module outperforms existing general-purpose stopping criteria.
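To make the workflow described above concrete, the following is a minimal Python sketch of such an early-stopping loop. It is an illustration under assumptions not taken from the paper: microtasks are random pairwise comparisons, answers are aggregated by a simple win-rate ranking (standing in for a real inference algorithm such as Bradley-Terry), and the process stops once successive inferred rankings stay within a small Kendall tau distance of each other for a few consecutive rounds, a crude stand-in for the paper's statistical quality estimate. The callable `ask_crowd` and all thresholds are hypothetical.

```python
import random
from itertools import combinations

def infer_ranking(wins, totals, objects):
    # Rank by empirical win rate; a real system would plug in an
    # inference algorithm such as Bradley-Terry here.
    return sorted(objects,
                  key=lambda o: wins[o] / totals[o] if totals[o] else 0.5,
                  reverse=True)

def tau_distance(r1, r2):
    # Fraction of object pairs that the two rankings order differently.
    p1 = {o: i for i, o in enumerate(r1)}
    p2 = {o: i for i, o in enumerate(r2)}
    pairs = list(combinations(r1, 2))
    flips = sum((p1[a] < p1[b]) != (p2[a] < p2[b]) for a, b in pairs)
    return flips / len(pairs)

def rank_with_early_stopping(objects, ask_crowd, batch=25, eps=0.02,
                             patience=3, budget=2000):
    wins = {o: 0 for o in objects}
    totals = {o: 0 for o in objects}
    prev, stable_rounds, spent = None, 0, 0
    while spent < budget:
        for _ in range(batch):                 # one batch of microtasks
            a, b = random.sample(objects, 2)   # task-assignment stand-in
            winner = ask_crowd(a, b)           # hypothetical crowd call
            wins[winner] += 1
            totals[a] += 1
            totals[b] += 1
            spent += 1
        cur = infer_ranking(wins, totals, objects)
        if prev is not None and tau_distance(prev, cur) <= eps:
            stable_rounds += 1                 # ranking barely moved
            if stable_rounds >= patience:      # declare a stable state
                return cur, spent
        else:
            stable_rounds = 0
        prev = cur
    return prev, spent                         # budget exhausted

if __name__ == "__main__":
    # Toy usage: a simulated crowd that prefers the larger number
    # 80% of the time.
    objs = list(range(10))
    def noisy_crowd(a, b):
        return max(a, b) if random.random() < 0.8 else min(a, b)
    ranking, cost = rank_with_early_stopping(objs, noisy_crowd)
    print(f"ranking={ranking}, microtasks used={cost}")
```

The stopping rule here is deliberately simple: it declares the ranking stable when it stops changing, whereas the module proposed in the paper estimates result quality statistically at each stage.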
Notes
1. A formal definition of the stable state is provided in Sect. 2.2.
Acknowledgement
Leong Hou U was funded by the National Key R&D Plan of China (2019YFB2102100), the FDCT Macau (SKL-IOTSC-2018-2020), and UM RC (MYRG2019-00119-FST). Caihua Shan and Reynold Cheng were supported by HK RGC (Projects HKU 17229116, 106150091, and 17205115), HKU (Projects 104004572, 102009508, and 104004129), and HK ITF (Project MRP/029/18). Nikos Mamoulis was co-financed by the European Regional Development Fund through the Research–Create–Innovate project "Proximiot" (T1EDK-04810).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Shan, C., Hou U, L., Mamoulis, N., Cheng, R., Li, X. (2020). A General Early-Stopping Module for Crowdsourced Ranking. In: Nah, Y., Cui, B., Lee, S.W., Yu, J.X., Moon, Y.S., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12113. Springer, Cham. https://doi.org/10.1007/978-3-030-59416-9_19
DOI: https://doi.org/10.1007/978-3-030-59416-9_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59415-2
Online ISBN: 978-3-030-59416-9
eBook Packages: Computer Science (R0)