Editing Behavior Analysis for Predicting Active and Inactive Users in Wikipedia

Influence and Behavior Analysis in Social Networks and Social Media (ASONAM 2018)

Abstract

These days, user-generated content platforms such as social media, question-answering websites, and open collaboration systems are a source of information for many. These platforms survive thanks to the pool of active contributors who generate content. As a consequence, they continuously face the problem of acquiring new users and retaining them on the platform.

In this paper, we consider the case of English Wikipedia, a well-established open collaboration system, and study the problem of predicting whether or not an editor will become inactive and stop contributing to the encyclopedia. Knowing this information can help the administrative community take timely engagement actions to keep users contributing longer.

We propose a predictive model leveraging contributors’ editing behavior to identify active vs. inactive Wikipedia users. Our experiments show that our method achieves an AUROC of at least 0.97 in predicting editors who will become inactive and can predict inactive users earlier than the baseline.
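For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen editor who becomes inactive receives a higher score than a randomly chosen editor who stays active. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `auroc`, the label convention (1 = became inactive), and the toy scores are all assumptions made for illustration.

```python
def auroc(labels, scores):
    """AUROC via the pairwise (Mann-Whitney) formulation.

    labels: 1 = editor became inactive (assumed positive class), 0 = stayed active.
    scores: model output, higher meaning more likely to become inactive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    # Count positive/negative pairs that are ranked correctly; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up data for illustration only (not taken from the chapter).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auroc(labels, scores))  # 8 of the 9 pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so an AUROC of 0.97 means nearly all such pairs are ordered correctly. The quadratic loop is chosen for clarity; a sort-based rank computation would be preferable on large datasets.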

Notes

  1. A meta-page is a page that is not a regular article; it can be, for instance, a User page (where editors describe themselves) or an article Talk page (where editors discuss the content of the associated Wikipedia article).

Corresponding author

Correspondence to Francesca Spezzano.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Arelli, H., Spezzano, F., Shrestha, A. (2019). Editing Behavior Analysis for Predicting Active and Inactive Users in Wikipedia. In: Kaya, M., Alhajj, R. (eds) Influence and Behavior Analysis in Social Networks and Social Media. ASONAM 2018. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-02592-2_7

  • DOI: https://doi.org/10.1007/978-3-030-02592-2_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02591-5

  • Online ISBN: 978-3-030-02592-2

  • eBook Packages: Social Sciences (R0)
