Published November 18, 2022 | Version 1.0.0
Dataset Open

Emotional and Cognitive Changes Surrounding Online Depression Identity Claims

  • 1. University of Michigan
  • 2. The University of Texas at Austin

Description

The repository contains the data files needed to replicate the LIWC analysis in our paper, Emotional and Cognitive Changes Surrounding Online Depression Identity Claims. The data files contain anonymized user IDs, timestamps, the identity claim time, LIWC variables, a boolean marking post vs. comment, and a boolean marking mental health vs. other subreddit. They are named ic_liwc.csv (for users with identity claims) and control_liwc.csv (for users without identity claims). The identity claims themselves are excluded from these files, but metadata about them is required to split users into groups, so we provide that metadata in a separate file, ic_properties.csv.
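Because the exact column names are not listed here, a reasonable first step is to inspect the header row of each CSV before attempting the analysis. The following is a minimal sketch, not part of the released materials. It assumes the CSV files are packaged inside ic_data.zip, are UTF-8 encoded, and start with a header row. The function name list_csv_headers is hypothetical.

```python
import csv
import io
import zipfile


def list_csv_headers(zip_path):
    """Map each CSV member of a zip archive to its first row.

    Assumption: every CSV is UTF-8 encoded and its first row is a header.
    `zip_path` may be a filesystem path or a binary file-like object.
    """
    headers = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Skip directories and non-CSV members.
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as raw:
                # Decode the member as text without extracting it to disk.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                # Read only the first row; an empty file yields an empty list.
                headers[name] = next(csv.reader(text), [])
    return headers
```

Calling list_csv_headers("ic_data.zip") on the downloaded archive should return one entry per CSV file, which shows the actual names of the user ID, timestamp, LIWC, and boolean columns without loading the full 1.8 GB into memory.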

We have also included files with n-gram counts to reproduce our n-gram analysis, and files with the patterns used to collect the data.

The Reddit posts themselves will not be made widely available. This follows the lead of Cohan et al., whose data collection process we follow and who release raw text data only to researchers upon request.

Files

ic_data.zip

Files (1.8 GB)

Name: ic_data.zip
Size: 1.8 GB
md5:409f999ae6f8be8c12f0136e2a5b0cee
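
The published MD5 checksum can be used to confirm that the 1.8 GB download completed without corruption. The following is a minimal sketch using Python's standard hashlib module. The function names and the chunk size are illustrative choices, and it assumes the checksum above refers to ic_data.zip.

```python
import hashlib

# Checksum as listed on this record (assumed to belong to ic_data.zip).
EXPECTED_MD5 = "409f999ae6f8be8c12f0136e2a5b0cee"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Stream the file so a large archive never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, matches_published_checksum("ic_data.zip") returning False would suggest an incomplete or altered download, and the file should be fetched again before unpacking.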