Early Detection of Depression






Depression is a mental disorder that affects more than 300 million people worldwide. An individual suffering from depression functions poorly in life and is prone to other diseases; in the worst case, depression leads to suicide. Many impediments prevent expert care from reaching people suffering from depression in time, such as the social stigma associated with mental disorders, a lack of trained health-care professionals, and ignorance of the signs of depression owing to a lack of awareness of the disease. Moreover, the World Health Organization (WHO) reports that individuals who are depressed are often not correctly diagnosed, while others are misdiagnosed and prescribed antidepressants. Thus, there is a strong need to automatically assess the risk of depression.

Identification of depression from social media has been framed as a classification problem in the field of Natural Language Processing (NLP). In this work, we study NLP approaches that can successfully extract information from textual data to enhance the identification of depression. These approaches perform feature extraction to build document representations. The main difficulties in detecting depression in a social media environment are the scarcity of data for users with depression and the inherent noise of social media data. We attempt to address these issues by using representations that can naturally cope with a social media environment. Specifically, we propose the use of Distributed Term Representations (DTRs) to capture information that supervised machine learning methods can use to learn from and classify users suffering from depression. Experimental evaluation provides evidence that DTRs are more effective for depression detection than traditional representations such as Bag of Words (BOW) and representations based on neural word embeddings. In fact, we have obtained state-of-the-art results for depression detection with the Document Occurrence Representation (DOR), with an F1-score of 0.66 on the depressed class. For early detection of depression, we have obtained the lowest reported Early Risk Detection Error (ERDE) using Pyramidal, a newly adapted method for computing document representations.
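
The idea behind a document occurrence representation can be illustrated with a minimal sketch. It assumes the common formulation in which each term is described by the training documents it occurs in, and a document is represented by averaging the vectors of its terms. The raw-count weighting, the averaging step, the function names and the toy posts are illustrative assumptions, not the configuration evaluated in this work.

```python
from collections import Counter


def build_term_vectors(train_docs):
    """Map each term to a vector holding one weight per training document.

    Assumption: the weight is the raw count of the term in that document.
    """
    n_docs = len(train_docs)
    term_vectors = {}
    for doc_index, tokens in enumerate(train_docs):
        for term, count in Counter(tokens).items():
            vector = term_vectors.setdefault(term, [0.0] * n_docs)
            vector[doc_index] = float(count)
    return term_vectors


def represent_document(tokens, term_vectors, n_docs):
    """Average the term vectors of the tokens that were seen in training."""
    total = [0.0] * n_docs
    known = 0
    for token in tokens:
        vector = term_vectors.get(token)
        if vector is None:
            continue  # terms unseen in training contribute nothing
        known += 1
        for i, weight in enumerate(vector):
            total[i] += weight
    if known == 0:
        return total
    return [value / known for value in total]


# Hypothetical toy posts, already tokenized.
train_docs = [
    ["i", "feel", "tired", "and", "sad"],
    ["great", "day", "at", "the", "beach"],
    ["sad", "and", "alone", "again"],
]
term_vectors = build_term_vectors(train_docs)

new_post = ["so", "tired", "and", "sad"]
doc_vector = represent_document(new_post, term_vectors, len(train_docs))
print(doc_vector)
```

The resulting vector has one dimension per training document, so a new post ends up close to the training posts with which it shares vocabulary. Such vectors could then be passed to any supervised classifier.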



Natural Language Processing, Health care