research-article

On the Alignment Between Self-Declared Gender Identity and Topical Content from Wikipedia

Published: 28 June 2021

Abstract

Wikipedia is an important information source for much of the world. One well-established problem is that editors of Wikipedia are overwhelmingly men. This gender gap in participation has raised the concern that Wikipedia's content is biased as a result. The problem is hard to study because the relationships between participation, gender identity, and content have not been established. Prior studies, mostly with children, have shown some differences in topical preferences based on sex. However, this issue has not been studied with adults and has not been considered from more than a binary stance. In this study, we work to understand how gender identity relates to topical preferences. In an empirical study, we ask participants to declare a gender identity and then present them with pairs of topical article content from Wikipedia. Drawing on thousands of participants and tens of thousands of paired content trials, we uncover relationships between self-declared gender identity and topical preferences. Further, by focusing on topics that have a statistically significant bias, we leverage two of Wikipedia's category systems to illustrate relative categorical differences that are similar to categorical differences described in prior work. The discussion focuses on the subtlety of these differences, potential future research, and the implications for interventions based on topical content. Further, the results help us reflect on relationships that might explain the persistent and worsening gender gap in participation.

    • Published in

      ACM Transactions on Social Computing, Volume 4, Issue 2
      June 2021
      171 pages
      EISSN: 2469-7826
      DOI: 10.1145/3467472

      Copyright © 2021 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 28 June 2021
      • Accepted: 1 February 2021
      • Revised: 1 January 2021
      • Received: 1 December 2019

      Qualifiers

      • research-article
      • Research
      • Refereed