Abstract
In this work, different social data mining approaches are used to characterize the user churn and social traits of the Bioinformatics community over the first ten years of Stack Overflow. The proposed workflow consists of a four-step procedure that characterizes users based on the social exchange exhibited by the Bioinformatics community and the Developers communities, notably the Python, Matlab and R programming communities. The motivation is to reduce user churn and improve the quality of social interactions by categorizing user traits and understanding how Stack Overflow, and other Stack Exchange databases, may better respond to current user concerns and interests. To that end, initial user identification was complemented by a second categorization focused on user “genealogy”. The goal was to explore the evolution of user interests and, in particular, the migration and swapping of users across different communities over the years. Notably, a considerable number of Bioinformatics users have moved to the Developers communities. An in-depth exploration of the most popular topics of conversation in Bioinformatics enabled a better understanding of the triggers and contents of these conversations.
Acknowledgements
This study was supported by the Consellería de Educación, Universidades e Formación Profesional (Xunta de Galicia) under the scope of the strategic funding of ED431C2018/55-GRC Competitive Reference Group, and the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UIDB/04469/2020 unit and BioTecNorte operation (NORTE-01-0145-FEDER-000004) funded by the European Regional Development Fund under the scope of Norte2020 - Programa Operacional Regional do Norte. SING group thanks CITI (Centro de Investigación, Transferencia e Innovación) from the University of Vigo for hosting its IT infrastructure.
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Pérez-López, R., Blanco, G., Fdez-Riverola, F., Lourenço, A. (2021). The Activity of Bioinformatics Developers and Users in Stack Overflow. In: Panuccio, G., Rocha, M., Fdez-Riverola, F., Mohamad, M., Casado-Vara, R. (eds) Practical Applications of Computational Biology & Bioinformatics, 14th International Conference (PACBB 2020). PACBB 2020. Advances in Intelligent Systems and Computing, vol 1240. Springer, Cham. https://doi.org/10.1007/978-3-030-54568-0_3
DOI: https://doi.org/10.1007/978-3-030-54568-0_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-54567-3
Online ISBN: 978-3-030-54568-0
eBook Packages: Intelligent Technologies and Robotics (R0)