Abstract
Crowdsourcing can be used to determine a total order over a set of objects (e.g., the top-10 NBA players) based on crowd opinions. The ranking problem is typically decomposed into a set of microtasks (e.g., pairwise comparisons), which are distributed to a large number of workers; the workers' answers are then aggregated to infer the ranking. The number of microtasks depends on the budget allocated to the problem. Intuitively, the more microtask answers collected, the more accurate the inferred ranking becomes. However, it is often hard to decide in advance the budget required for an accurate ranking. We study how the ranking process can be terminated early while still achieving a high-quality ranking and substantial budget savings. We use statistical tools to estimate the quality of the ranking result at any stage of the crowdsourcing process, and we terminate the process as soon as the desired quality is reached. Our proposed early-stopping module can be seamlessly integrated with most existing inference algorithms and task assignment methods. Extensive experiments show that our early-stopping module outperforms existing general-purpose stopping criteria.
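To make the workflow described above concrete, the following is a minimal Python sketch of such an early-stopping loop. It is an illustration under assumptions not taken from the paper: microtasks are random pairwise comparisons, answers are aggregated by a simple win-rate ranking (standing in for a real inference algorithm such as Bradley-Terry), and the process stops once successive inferred rankings stay within a small Kendall tau distance of each other for a few consecutive rounds, a crude stand-in for the paper's statistical quality estimate. The callable `ask_crowd` and all thresholds are hypothetical.

```python
import random
from itertools import combinations

def infer_ranking(wins, totals, objects):
    # Rank by empirical win rate; a real system would plug in an
    # inference algorithm such as Bradley-Terry here.
    return sorted(objects,
                  key=lambda o: wins[o] / totals[o] if totals[o] else 0.5,
                  reverse=True)

def tau_distance(r1, r2):
    # Fraction of object pairs that the two rankings order differently.
    p1 = {o: i for i, o in enumerate(r1)}
    p2 = {o: i for i, o in enumerate(r2)}
    pairs = list(combinations(r1, 2))
    flips = sum((p1[a] < p1[b]) != (p2[a] < p2[b]) for a, b in pairs)
    return flips / len(pairs)

def rank_with_early_stopping(objects, ask_crowd, batch=25, eps=0.02,
                             patience=3, budget=2000):
    wins = {o: 0 for o in objects}
    totals = {o: 0 for o in objects}
    prev, stable_rounds, spent = None, 0, 0
    while spent < budget:
        for _ in range(batch):                 # one batch of microtasks
            a, b = random.sample(objects, 2)   # task-assignment stand-in
            winner = ask_crowd(a, b)           # hypothetical crowd call
            wins[winner] += 1
            totals[a] += 1
            totals[b] += 1
            spent += 1
        cur = infer_ranking(wins, totals, objects)
        if prev is not None and tau_distance(prev, cur) <= eps:
            stable_rounds += 1                 # ranking barely moved
            if stable_rounds >= patience:      # declare a stable state
                return cur, spent
        else:
            stable_rounds = 0
        prev = cur
    return prev, spent                         # budget exhausted

if __name__ == "__main__":
    # Toy usage: a simulated crowd that prefers the larger number
    # 80% of the time.
    objs = list(range(10))
    def noisy_crowd(a, b):
        return max(a, b) if random.random() < 0.8 else min(a, b)
    ranking, cost = rank_with_early_stopping(objs, noisy_crowd)
    print(f"ranking={ranking}, microtasks used={cost}")
```

The stopping rule here is deliberately simple: it declares the ranking stable when it stops changing, whereas the module proposed in the paper estimates result quality statistically at each stage.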
Notes
1. A formal definition of the stable state is provided in Sect. 2.2.
Acknowledgement
Leong Hou U was funded by the National Key R&D Plan of China (2019YFB2102100), the FDCT Macau (SKL-IOTSC-2018-2020), and UM RC (MYRG2019-00119-FST). Caihua Shan and Reynold Cheng were supported by HK RGC (Projects HKU 17229116, 106150091, and 17205115), HKU (Projects 104004572, 102009508, and 104004129), and HK ITF (Project MRP/029/18). Nikos Mamoulis was co-financed by the European Regional Development Fund through the Research–Create–Innovate project "Proximiot" (T1EDK-04810).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Shan, C., Hou U, L., Mamoulis, N., Cheng, R., Li, X. (2020). A General Early-Stopping Module for Crowdsourced Ranking. In: Nah, Y., Cui, B., Lee, S.W., Yu, J.X., Moon, Y.S., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12113. Springer, Cham. https://doi.org/10.1007/978-3-030-59416-9_19
DOI: https://doi.org/10.1007/978-3-030-59416-9_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59415-2
Online ISBN: 978-3-030-59416-9
eBook Packages: Computer Science (R0)