Social network mining of requester communities in crowdsourcing markets

  • Original Article
Social Network Analysis and Mining

Abstract

Crowdsourcing is a new computing approach in which human tasks are outsourced to a large number of human workers. Crowdsourcing has attracted attention not only from industry but also from various academic communities. Amazon Mechanical Turk (AMT) was the first commercial platform to offer crowdsourcing services to its customers. AMT is often referred to as a platform supplying ‘artificial artificial intelligence’. Recent research has not addressed the community structure of large-scale crowdsourcing platforms. In this work, we discuss detailed statistics of the popular AMT marketplace to provide insights into task properties and requester behavior. We present a model that automatically infers requester communities from task keywords. Hierarchical clustering is used to identify relations between the keywords associated with tasks. We present novel techniques to rank communities and requesters using a graph-based algorithm. Furthermore, we introduce models and methods for discovering relevant crowdsourcing brokers, who can act as intermediaries between requesters and platforms such as AMT.
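To make the keyword-clustering idea concrete, the following is a minimal sketch of agglomerative (hierarchical) clustering over task keywords. It is not the authors' implementation: the abstract does not state the distance measure or linkage rule, so the Jaccard distance over shared tasks, the single-linkage rule, the `threshold` cut-off, and the toy `tasks` data are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of task ids."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def keyword_index(tasks):
    """Map each keyword to the set of task ids that carry it."""
    index = {}
    for task_id, keywords in tasks.items():
        for kw in keywords:
            index.setdefault(kw, set()).add(task_id)
    return index


def cluster_keywords(tasks, threshold=0.7):
    """Single-linkage agglomerative clustering of keywords (assumed setup).

    Repeatedly merges the two closest clusters until the smallest
    inter-cluster distance exceeds `threshold` (a hypothetical cut-off).
    Returns (clusters, merges); `merges` records the merge order,
    i.e. the hierarchy.
    """
    index = keyword_index(tasks)
    clusters = [frozenset([kw]) for kw in sorted(index)]
    merges = []

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(jaccard_distance(index[x], index[y]) for x in c1 for y in c2)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        d = linkage(c1, c2)
        if d > threshold:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((sorted(c1), sorted(c2), d))
    return clusters, merges


# Hypothetical toy data: task id -> keywords attached by the requester.
tasks = {
    "t1": {"transcription", "audio"},
    "t2": {"transcription", "audio", "english"},
    "t3": {"image", "tagging"},
    "t4": {"image", "tagging", "photo"},
    "t5": {"survey"},
}

clusters, merges = cluster_keywords(tasks)
for cluster in clusters:
    print(sorted(cluster))
```

Under these assumptions, keywords that always appear on the same tasks merge first, and keywords with no shared tasks stay in separate clusters. Requesters could then be assigned to communities according to the keyword clusters their tasks fall into, although the mapping actually used in the article is not described in the abstract.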

Notes

  1. http://www.mturk.com/mturk/welcome.

Author information

Corresponding author

Correspondence to Daniel Schall.

About this article

Cite this article

Schall, D., Skopik, F. Social network mining of requester communities in crowdsourcing markets. Soc. Netw. Anal. Min. 2, 329–344 (2012). https://doi.org/10.1007/s13278-012-0080-x

  • DOI: https://doi.org/10.1007/s13278-012-0080-x
