Understanding mental health issues in different subdomains in social networking services: a focus on linguistic features
This study examines the linguistic characteristics of user posts on specific mental disorder subreddit channels (depression, anxiety, bipolar, borderline personality disorder, schizophrenia, autism, and mental health) on Reddit. The researchers use statistical analysis methods and natural language processing techniques, including BERT embeddings, to identify distinct linguistic patterns associated with each mental health issue. They successfully cluster the posts using supervised and unsupervised learning methods. The findings suggest that patients with different mental health issues exhibit unique lexical and semantic patterns in their online social networking activities. The study highlights the potential of linguistic analysis and machine learning for understanding mental health issues and aiding online interventions. The dataset used in the study is publicly available online.
- Keywords: mental health, sentiment analysis, mental disorder, text analysis, NLP, clustering.
The sample dataset is available in the reddit_mentalhealth_sample.csv file. We are only allowed to distribute the data for research purposes. To obtain the full dataset, please complete the request form at https://forms.gle/5KV2oGytJXEetnic9
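
As a starting point for exploring the sample file, the following is a minimal sketch that loads the CSV with the Python standard library and prints its size and column headers. The column layout of reddit_mentalhealth_sample.csv is not documented here, so the name `subreddit` used below is a hypothetical placeholder, as are the helper names `load_sample` and `count_by_column`; check the real headers before relying on them.

```python
import csv
from collections import Counter
from pathlib import Path


def load_sample(path):
    """Read a CSV file into a list of dicts, one dict per row."""
    # "utf-8-sig" also accepts plain UTF-8 and strips a leading BOM if present.
    with Path(path).open(newline="", encoding="utf-8-sig") as handle:
        return list(csv.DictReader(handle))


def count_by_column(rows, column):
    """Count rows per distinct non-empty value of `column`."""
    return Counter(row[column] for row in rows if row.get(column))


if __name__ == "__main__":
    sample_path = Path("reddit_mentalhealth_sample.csv")
    if sample_path.exists():
        rows = load_sample(sample_path)
        print("rows:", len(rows))
        print("columns:", list(rows[0].keys()) if rows else [])
        # ASSUMPTION: "subreddit" is a hypothetical column name, not taken
        # from the dataset documentation; replace it with a real header.
        if rows and "subreddit" in rows[0]:
            print(count_by_column(rows, "subreddit").most_common())
    else:
        print("Sample file not found:", sample_path)
```

Running the script from the repository root should list the actual headers, which can then replace the placeholder column name.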