DOI: 10.1145/3597638.3614476
poster

Large-Scale Anonymized Text-based Disability Discourse Dataset

Published: 22 October 2023

Abstract

The participation of people with disabilities in online discussions about disability and accessibility is a critical area of study. Previous research has examined this participation on social media platforms qualitatively, but large-scale analysis of social media content written by people with disabilities remains underexplored. This paper presents a pioneering large-scale study of disability communities on Reddit. We developed an anonymized text-based dataset of 1.5 million comments posted on three subreddits: r/disability, r/Blind, and r/ADHD. Using topic modeling, we analyzed the dataset and identified eight highly coherent categories, with their associated keywords, that are common across the three subreddits. We contribute the Anonymized Disability Discourse Reddit Corpus (ADDReC), which comprises these 1.5 million comments and features the eight disability discourse categories.
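
The abstract does not describe how the comments were anonymized. As a rough illustration only, the sketch below shows one common way to scrub Reddit comment text using Python's standard library. The regular expressions, the placeholder tokens (`<USER>`, `<URL>`, `<EMAIL>`), the salted-hash pseudonyms, and the record fields (`subreddit`, `author`, `body`) are all assumptions made for this example, not the authors' pipeline or the ADDReC schema.

```python
import hashlib
import re

# Hypothetical patterns: the paper's actual anonymization rules are not
# stated in the abstract, so these are illustrative assumptions.
URL = re.compile(r"https?://\S+")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
# Matches "u/name" or "/u/name" when not preceded by a word character or slash.
USER_MENTION = re.compile(r"(?<![\w/])/?u/[A-Za-z0-9_-]+")


def scrub_text(text: str) -> str:
    """Replace URLs, e-mail addresses, and user mentions with placeholders."""
    text = URL.sub("<URL>", text)        # URLs first, so paths like /u/x inside them go too
    text = EMAIL.sub("<EMAIL>", text)
    text = USER_MENTION.sub("<USER>", text)
    return text


def pseudonymize_author(author: str, salt: str) -> str:
    """Map a username to a stable pseudonym using a salted SHA-256 hash."""
    digest = hashlib.sha256((salt + author).encode("utf-8")).hexdigest()
    return "user_" + digest[:12]


def anonymize_comment(comment: dict, salt: str) -> dict:
    """Return a copy of a comment record with the author and body anonymized.

    The field names used here are assumed for the example.
    """
    return {
        "subreddit": comment["subreddit"],
        "author": pseudonymize_author(comment["author"], salt),
        "body": scrub_text(comment["body"]),
    }


if __name__ == "__main__":
    raw = {
        "subreddit": "r/Blind",
        "author": "example_name",
        "body": "Thanks u/some_user, see https://example.com/x",
    }
    print(anonymize_comment(raw, salt="keep-this-secret"))
```

Pattern-based scrubbing of this kind only removes explicit identifiers. Names, locations, or health details written in free text would need further review before a corpus could be treated as anonymized.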

Cited By

  • (2024) Crafting Disability Fairness Learning in Data Science: A Student-Centric Pedagogical Approach. Proceedings of the 55th ACM Technical Symposium on Computer Science Education V. 1, 944-950. https://doi.org/10.1145/3626252.3630815. Online publication date: 7-Mar-2024

    Published In

    ASSETS '23: Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility
    October 2023
    1163 pages
    ISBN:9798400702204
    DOI:10.1145/3597638
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. ADHD
    2. Reddit
    3. blind
    4. dataset
    5. disability
    6. discourse

    Qualifiers

    • Poster
    • Research
    • Refereed limited

    Conference

    ASSETS '23

    Acceptance Rates

    ASSETS '23 Paper Acceptance Rate 55 of 182 submissions, 30%;
    Overall Acceptance Rate 436 of 1,556 submissions, 28%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 92
    • Downloads (Last 6 weeks): 7
    Reflects downloads up to 17 Feb 2025
