Developing Deep Learning Models for Depression Detection in Texts

Depression is a major mental health disorder affecting a significant portion of the world population. The methods most commonly employed for depression detection are clinical interviews and questionnaire surveys, in which psychiatric assessment tables are used to establish a mental disorder prognosis. Analyzing texts written by an individual can serve as an additional knowledge source for diagnosing depression. Consequently, using deep learning models to distinguish depressed from non-depressed individuals based on the words in their social media posts has become a focus of recent research. A major challenge is the lack of large depression-labeled datasets for training models for depression detection in texts. In addition, selecting a data augmentation (DA) method to augment the small datasets that are available is difficult. We therefore developed a methodology, named DAMEVAL, for the evaluation of DA methods for text classification. DAMEVAL proposes a set of evaluation measures and benchmark NLP datasets for evaluating and comparing DA methods, creating a reference that makes it easier for users to select a DA method. In this dissertation, we extracted and analyzed the textual indicators of depression symptoms present in texts posted in online forums, and the distribution of these indicators across depressed and non-depressed social media users. We also computed weights, using the TF-IDF method, based on the extracted depression symptom indicators present in users' posts. Subsequently, we introduced a weighted deep learning model named DEP-BERTCNN, based on the computed depression indicator weights, for depression detection in texts in online forums. DEP-BERTCNN uses a combination of a pre-trained BERT language model, an attention model, and a convolutional neural network (CNN) to classify forum users as depressed or non-depressed. The DEP-BERTCNN model was trained and evaluated on the large-scale Reddit Self-reported Depression Dataset (RSDD).
Our model outperforms several baseline methods for depression detection in texts, demonstrating the effectiveness of combining a deep learning model with linguistic indicators associated with depression symptoms. In summary, we aim to develop a beneficial system that can easily be used to detect depression in texts and that enables policy makers to respond promptly to mental health escalations.
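
As a rough illustration of the TF-IDF weighting step mentioned above, the following minimal Python sketch computes per-user weights for a small set of indicator terms. It is not the dissertation's implementation: the INDICATORS lexicon, the tokenizer, the smoothed IDF formula, and the choice to treat all posts of one user as a single document are assumptions made only for this example.

```python
import math
from collections import Counter

# Hypothetical indicator lexicon (assumption): the actual indicators
# extracted in the dissertation are not listed in this abstract.
INDICATORS = {"hopeless", "tired", "worthless", "insomnia"}

PUNCT = ".,!?;:\"'()"


def tokenize(text):
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    tokens = (tok.strip(PUNCT).lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def indicator_tfidf(posts_by_user, indicators=INDICATORS):
    """Return {user: {indicator: weight}} using a smoothed TF-IDF.

    Each user's posts are concatenated and treated as one document
    (an assumption of this sketch).
    """
    docs = {
        user: Counter(tokenize(" ".join(posts)))
        for user, posts in posts_by_user.items()
    }
    n_docs = len(docs)
    # Document frequency: number of users whose posts contain the term.
    df = {
        term: sum(1 for counts in docs.values() if counts[term] > 0)
        for term in indicators
    }
    weights = {}
    for user, counts in docs.items():
        total = sum(counts.values()) or 1  # avoid division by zero
        weights[user] = {}
        for term in indicators:
            if counts[term] == 0:
                continue
            tf = counts[term] / total
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1  # smoothed IDF
            weights[user][term] = tf * idf
    return weights
```

Calling indicator_tfidf on a dictionary that maps user names to lists of posts returns, for each user, only the indicator terms that actually occur in that user's posts. Such weights could then be supplied to a downstream classifier alongside the text, although how DEP-BERTCNN consumes them is not specified here.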

Depression Detection, BERT, CNN, Attention, Weights, TF-IDF, Texts, NLP
Portions of this document appear in: Aigbe, Steve Aibuedefe, and Christoph Eick. "Learning domain-specific word embeddings from covid-19 tweets." In 2021 IEEE International Conference on Big Data (Big Data), pp. 4307-4312. IEEE, 2021.