Abstract
Individuals who identify as sexual and gender minorities, including lesbian, gay, bisexual, transgender, queer, and others (LGBTQ+), are more likely to experience poorer health than their heterosexual and cisgender counterparts. One primary driver of these health disparities is minority stress (i.e., chronic and social stressors unique to LGBTQ+ communities' experiences adapting to the dominant culture). This stress is frequently expressed in LGBTQ+ users' posts on social media platforms. However, these expressions are not just straightforward manifestations of minority stress. They involve linguistic complexity (e.g., idiom or lexical diversity), which makes them challenging for many traditional natural language processing methods to detect. In this work, we designed a hybrid model that combines Graph Neural Networks (GNN) with Bidirectional Encoder Representations from Transformers (BERT), a pre-trained deep language model, to improve classification performance in minority stress detection. We evaluated our model on a benchmark social media dataset for minority stress detection (LGBTQ+ MiSSoM+). The dataset comprises 5,789 human-annotated Reddit posts from LGBTQ+ subreddits. Our approach extracts hidden linguistic nuances through pretraining on a vast amount of raw data, while also using transductive learning to jointly develop representations for both labeled training data and unlabeled test data. The RoBERTa-GCN model achieved an accuracy of 0.86 and an F1 score of 0.86, surpassing the baseline models in predicting LGBTQ+ minority stress. Improved prediction of minority stress expressions on social media could lead to digital health interventions that improve the wellbeing of LGBTQ+ people, a community with high rates of stress-sensitive health problems.
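
The transductive step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the three-node graph, the two-dimensional features (standing in for document embeddings from a pre-trained language model), and the function names `normalize_adjacency` and `propagate` are all hypothetical. The sketch only shows how one graph-convolution propagation step lets an unlabeled test post receive information from a labeled training post it is connected to.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def propagate(norm_adj, features):
    """One propagation step (no learned weights): mix each node with its neighbours."""
    n, d = len(features), len(features[0])
    return [[sum(norm_adj[i][k] * features[k][j] for k in range(n))
             for j in range(d)]
            for i in range(n)]


# Hypothetical toy graph: posts 0 and 1 are labeled (training),
# post 2 is unlabeled (test) and linked to post 0.
adjacency = [
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
# Hypothetical stand-ins for language-model embeddings of each post.
embeddings = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 0.0],
]

hidden = propagate(normalize_adjacency(adjacency), embeddings)
for node, vector in enumerate(hidden):
    print(node, vector)
```

In this toy setting, the test post (node 2) starts with an all-zero embedding and ends up sharing the representation of its labeled neighbour (node 0). In a full model, a training loss would be computed on the labeled nodes only, while every node, labeled or not, takes part in propagation.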