
Developer recommendation for Topcoder through a meta-learning based policy model

Published in: Empirical Software Engineering

Abstract

Crowdsourcing Software Development (CSD) has emerged as a new software development paradigm, and Topcoder is now the largest competition-based CSD platform. Many organizations use Topcoder to outsource their software tasks to crowd developers in the form of open challenges. To facilitate timely completion of crowdsourced tasks, it is important to find the right developers, i.e., those most likely to win a challenge. Many developer recommendation methods for CSD platforms have been proposed recently, but they often make unrealistic assumptions about developer status or application scenarios; for example, they consider only skillful developers, or only developers already registered with the challenges. In this paper, we propose a meta-learning based policy model, which first filters out developers who are unlikely to participate in or submit to a given challenge and then recommends the top-k developers with the highest probability of winning the challenge. We collected Topcoder data between 2009 and 2018 to evaluate the proposed approach. The results show that our approach successfully identifies developers for posted challenges regardless of the developers' current registration status. In particular, our approach works well in recommending new winners. The accuracy of top-5 recommendation ranges from 30.1% to 91.1%, significantly outperforming the results achieved by related work.
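The two-stage policy described above can be illustrated with a minimal sketch: one classifier estimates how likely each developer is to participate in (and submit to) a challenge, developers below a threshold are filtered out, and a second classifier ranks the remainder by predicted win probability. All names, features, labels, and thresholds here are hypothetical placeholders for illustration, not the paper's actual feature set or meta-learned models.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical per-developer features for one posted challenge:
# [past registrations, past submissions, past wins, skill-match score]
X = rng.random((200, 4))

# Toy labels standing in for historical participation/winning records.
participated = (X[:, 0] + X[:, 3] > 1.0).astype(int)
won = ((X[:, 2] > 0.7) & (participated == 1)).astype(int)

# Stage-1 model: will the developer participate/submit at all?
participation_model = LogisticRegression().fit(X, participated)
# Stage-2 model: among participants, who is likely to win?
winner_model = LogisticRegression().fit(X, won)

def recommend(X_dev, k=5, min_participation=0.5):
    """Filter out unlikely participants, then return the indices of the
    top-k remaining developers ranked by predicted win probability."""
    p_part = participation_model.predict_proba(X_dev)[:, 1]
    candidates = np.where(p_part >= min_participation)[0]
    p_win = winner_model.predict_proba(X_dev[candidates])[:, 1]
    return candidates[np.argsort(p_win)[::-1][:k]]

top5 = recommend(X, k=5)
print(top5)
```

The filtering stage is what lets the approach work "regardless of the current registration status of the developers": ranking only plausible participants avoids wasting the top-k slots on developers who would never enter the challenge.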


Notes

  1. https://github.com/zhangzhenyu13/CSDMetalearningRS

  2. https://www.topcoder.com/

  3. https://www.freelancer.ca/

  4. https://www.upwork.com/

  5. https://www.kaggle.com/competitions/

  6. https://github.com/zhangzhenyu13/CSDMetalearningRS/tree/master/Baselines


Acknowledgements

This work was supported in part by the National Key Research and Development Program of China under Grant No. 2016YFB1000804, and in part by the National Natural Science Foundation of China under Grant Nos. 61702024, 61828201, and 61421003.

Corresponding author

Correspondence to Hailong Sun.

Additional information

Communicated by: Kelly Blincoe



Cite this article

Zhang, Z., Sun, H. & Zhang, H. Developer recommendation for Topcoder through a meta-learning based policy model. Empir Software Eng 25, 859–889 (2020). https://doi.org/10.1007/s10664-019-09755-0

