On the Relation of Edit Behavior, Link Structure, and Article Quality on Wikipedia

  • Conference paper
Complex Networks and Their Applications VIII (COMPLEX NETWORKS 2019)

Part of the book series: Studies in Computational Intelligence (SCI, volume 882)

Abstract

Arguments between editors occur frequently when articles are edited on Wikipedia. These conflicts occasionally lead to destructive behavior and diminish article quality. Currently, the relation between editing behavior, link structure, and article quality is not well understood in our community, even though understanding this relation may facilitate editing processes and improve article quality on Wikipedia. To shed light on this complex relation, we classify edits for 13,045 articles and perform an in-depth analysis of a 4,800-article subsample. Additionally, we build a network of wikilinks (internal Wikipedia hyperlinks) between articles. Using this data, we compute parsimonious metrics to quantify editing and linking behavior. Our analysis reveals that controversial articles differ considerably from others on almost all metrics, while slight trends are also detectable for higher-quality articles. With our work, we assist online collaboration communities, especially Wikipedia, in the long-term improvement of article quality by identifying deviant behavior via simple sequence-based edit metrics and network-based article metrics.
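To make the two metric families named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a sequence-based edit metric can be illustrated by first-order transition probabilities between consecutive edit labels, and a network-based article metric by in- and out-degree in the wikilink graph. The function names, the edit labels "content" and "revert", and the toy data are all hypothetical; the abstract does not specify which metrics the paper actually uses.

```python
from collections import Counter, defaultdict


def transition_probabilities(edit_labels):
    """Estimate first-order transition probabilities between consecutive edit labels."""
    counts = defaultdict(Counter)
    for current, following in zip(edit_labels, edit_labels[1:]):
        counts[current][following] += 1
    return {
        label: {nxt: n / sum(successors.values()) for nxt, n in successors.items()}
        for label, successors in counts.items()
    }


def degree_counts(wikilinks):
    """Count incoming and outgoing wikilinks per article from (source, target) pairs."""
    in_degree, out_degree = Counter(), Counter()
    for source, target in wikilinks:
        out_degree[source] += 1
        in_degree[target] += 1
    return in_degree, out_degree


# Hypothetical toy data for illustration only; not taken from the paper.
edits = ["content", "revert", "content", "content", "revert"]
links = [("Evolution", "Charles_Darwin"), ("Nikola_Tesla", "Evolution")]

probs = transition_probabilities(edits)
in_deg, out_deg = degree_counts(links)
```

In this toy sequence, two of the three transitions out of a "content" edit lead to a "revert", so a high value of that probability is the kind of simple signal that could flag contested articles under the stated assumptions.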

Notes

  1. https://en.wikipedia.org/wiki/Evolution

  2. https://en.wikipedia.org/wiki/Nikola_Tesla

  3. https://en.wikipedia.org/wiki/Wikipedia:Huggle

  4. https://github.com/ruptho/editlinkquality-wikipedia

  5. https://en.wikipedia.org/wiki/Wikipedia:Labels/Edit_types/Taxonomy

  6. https://en.wikipedia.org/wiki/Wikipedia:Content_assessment

  7. https://en.wikipedia.org/wiki/Wikipedia:WikiProject

  8. https://github.com/wikimedia/revscoring

  9. https://en.wikipedia.org/wiki/Wikipedia:List_of_controversial_issues

  10. https://en.wikipedia.org/wiki/Wikipedia:Lamest_edit_wars

  11. https://en.wikipedia.org/wiki/User:ClueBot_NG

  12. https://scikit-learn.org

  13. https://networkx.github.io

  14. https://en.wiktionary.org/wiki/ASCII_art

  15. https://en.wikipedia.org/wiki/Wikipedia:Protection_policy#semi

Acknowledgments

Tiago Santos is a recipient of a DOC Fellowship of the Austrian Academy of Sciences at the Institute of Interactive Systems and Data Science of the Graz University of Technology.

Author information

Correspondence to Thorsten Ruprechter.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Ruprechter, T., Santos, T., Helic, D. (2020). On the Relation of Edit Behavior, Link Structure, and Article Quality on Wikipedia. In: Cherifi, H., Gaito, S., Mendes, J., Moro, E., Rocha, L. (eds) Complex Networks and Their Applications VIII. COMPLEX NETWORKS 2019. Studies in Computational Intelligence, vol 882. Springer, Cham. https://doi.org/10.1007/978-3-030-36683-4_20
