A General Early-Stopping Module for Crowdsourced Ranking

  • Conference paper
  • First Online:
Database Systems for Advanced Applications (DASFAA 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12113)

Included in the following conference series: Database Systems for Advanced Applications (DASFAA)

Abstract

Crowdsourcing can be used to determine a total order for an object set (e.g., the top-10 NBA players) based on crowd opinions. This ranking problem is often decomposed into a set of microtasks (e.g., pairwise comparisons), which are distributed to a large number of workers whose answers are aggregated to infer the ranking. The number of microtasks depends on the budget allocated for the problem. Intuitively, the more microtask answers are collected, the more accurate the ranking becomes; however, it is often hard to decide the budget required for an accurate ranking. We study how the ranking process can be terminated early while still achieving a high-quality ranking and substantial budget savings. We use statistical tools to estimate the quality of the ranking result at any stage of the crowdsourcing process and terminate the process as soon as the desired quality is achieved. Our proposed early-stopping module can be seamlessly integrated with most existing inference algorithms and task assignment methods. Extensive experiments show that our early-stopping module outperforms other existing general stopping criteria.
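To make the general idea concrete, the following is a minimal, self-contained Python sketch of this kind of stopping rule; it illustrates the approach described in the abstract and is not the module proposed in the paper. It aggregates pairwise answers into a ranking by empirical win rate, estimates the quality of that ranking by bootstrap resampling the answers collected so far, and stops once the estimated quality reaches a target. All names (rank_by_wins, should_stop), the 80% simulated worker accuracy, and the bootstrap-agreement quality estimate are illustrative assumptions.

```python
import random
from collections import defaultdict
from itertools import combinations


def rank_by_wins(items, answers):
    """Rank items by empirical win rate over the pairwise answers seen so far.

    `answers` is a list of (winner, loser) tuples collected from the crowd."""
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for winner, loser in answers:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    score = {x: (wins[x] / appearances[x]) if appearances[x] else 0.0 for x in items}
    return sorted(items, key=lambda x: score[x], reverse=True)


def pairwise_agreement(rank_a, rank_b):
    """Fraction of item pairs that the two rankings order the same way."""
    pos_a = {x: i for i, x in enumerate(rank_a)}
    pos_b = {x: i for i, x in enumerate(rank_b)}
    pairs = list(combinations(rank_a, 2))
    same = sum(1 for x, y in pairs if (pos_a[x] < pos_a[y]) == (pos_b[x] < pos_b[y]))
    return same / len(pairs)


def should_stop(items, answers, quality_target=0.9, n_bootstrap=200, seed=0):
    """Estimate ranking quality by bootstrap resampling the collected answers;
    stop once resampled rankings almost always agree with the current one."""
    # Do not even consider stopping before every pair has been compared once.
    if len({frozenset(a) for a in answers}) < len(list(combinations(items, 2))):
        return False, 0.0
    rng = random.Random(seed)
    current = rank_by_wins(items, answers)
    total = 0.0
    for _ in range(n_bootstrap):
        resample = [answers[rng.randrange(len(answers))] for _ in range(len(answers))]
        total += pairwise_agreement(current, rank_by_wins(items, resample))
    estimated_quality = total / n_bootstrap
    return estimated_quality >= quality_target, estimated_quality


# Toy simulation: hidden true order a > b > c > d, workers answer correctly 80% of the time.
items = ["a", "b", "c", "d"]
true_pos = {x: i for i, x in enumerate(items)}
rng = random.Random(1)
answers = []
for n in range(1, 81):
    x, y = rng.sample(items, 2)
    better, worse = (x, y) if true_pos[x] < true_pos[y] else (y, x)
    answers.append((better, worse) if rng.random() < 0.8 else (worse, better))
    stop, quality = should_stop(items, answers)
    if stop:
        print(f"stopping after {n} answers; estimated quality = {quality:.2f}, "
              f"ranking = {rank_by_wins(items, answers)}")
        break
else:
    print("budget of 80 answers exhausted without reaching the quality target")
```

In the sketch, the win-rate aggregation is a stand-in for any inference algorithm (e.g., Bradley-Terry) and the simulated answer stream is a stand-in for any task assignment method, mirroring how the abstract positions the early-stopping module as a plug-in component.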


Notes

  1. A formal definition of the stable state is provided in Sect. 2.2.


Acknowledgement

Leong Hou U was funded by the National Key R&D Plan of China (2019YFB2102100), the FDCT Macau (SKL-IOTSC-2018-2020), and UM RC (MYRG2019-00119-FST). Caihua Shan and Reynold Cheng were supported by HK RGC (RGC Projects HKU 17229116, 106150091, and 17205115), HKU (Projects 104004572, 102009508, and 104004129), and HK ITF (ITF project MRP/029/18). Nikos Mamoulis has been co-financed by the European Regional Development Fund, Research–Create–Innovate project “Proximiot” (T1EDK-04810).

Author information


Corresponding author

Correspondence to Xiang Li.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Shan, C., Hou U, L., Mamoulis, N., Cheng, R., Li, X. (2020). A General Early-Stopping Module for Crowdsourced Ranking. In: Nah, Y., Cui, B., Lee, S.W., Yu, J.X., Moon, Y.S., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol. 12113. Springer, Cham. https://doi.org/10.1007/978-3-030-59416-9_19

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-59416-9_19

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59415-2

  • Online ISBN: 978-3-030-59416-9

  • eBook Packages: Computer Science, Computer Science (R0)
