ABSTRACT
Crowdsourcing is an emerging paradigm that harnesses a mass of users to perform various types of tasks. In this tutorial we focus on a particular form of crowdsourcing, namely crowd (or mob) datasourcing, whose goal is to obtain, aggregate, or process data. We survey crowd datasourcing solutions in various contexts, explain the need for a principled solution, describe advances toward achieving such a solution, and highlight the remaining gaps.
Index Terms
- Mob data sourcing