Sentiment and Topic Classification Approaches to African News Articles About COVID-19 and Vaccines






This project presents methods for analyzing news articles about COVID-19 and vaccination published by African news sources. The current worldwide health crisis makes it important to understand how these topics are viewed around the world so that future decisions are well informed. Analyzing the sentiment and topic of news articles can give insight into how people view and react to the virus and the vaccines. The project addresses two questions. First, which of the existing sentiment analysis approaches (off-the-shelf dictionaries, crowd-coding, and expert coding) is most accurate in determining the sentiment of news stories about COVID-19 and COVID-19 vaccines in African countries? Second, which approach (supervised, semi-supervised, unsupervised, or deep learning) best identifies the topic of those news stories? To answer the first question, we applied multiple popular sentiment dictionaries and crowdsourcing with Amazon Mechanical Turk, and compared the results against our own manually labeled gold standard. To answer the second, we trained several topic classification models and compared their accuracy. We conclude that dictionary approaches to sentiment analysis are not accurate enough for COVID-19 and vaccine-related news, because words in this domain carry context-specific meanings that can change their sentiment. Topic classification requires further research, although our trained models appear to achieve higher accuracy than the general-purpose news topic classification model trained on New York Times data.
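The dictionary limitation described above can be illustrated with a minimal sketch. It is not the project's code: the lexicon, the headlines, and the gold labels below are all invented for illustration, and the scoring rule (sum of word polarities) is only one simple way a dictionary approach may work. It shows how a word such as "positive" can mislead a general-purpose dictionary when it describes a test result, and how predictions can be scored against a manually labeled gold standard.

```python
# Hypothetical toy lexicon; NOT taken from any published sentiment dictionary.
TOY_LEXICON = {
    "positive": 1, "good": 1, "safe": 1, "effective": 1,
    "negative": -1, "bad": -1, "death": -1, "crisis": -1,
}


def dictionary_sentiment(text, lexicon=TOY_LEXICON):
    """Label a text by summing the polarity of each word found in the lexicon."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    score = sum(lexicon.get(t, 0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def accuracy(predicted, gold):
    """Fraction of predicted labels that match the gold-standard labels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Invented headlines paired with invented gold labels (assumptions for this sketch).
EXAMPLES = [
    ("Ten more patients test positive for COVID-19", "negative"),
    ("Minister tests negative and returns to work", "positive"),
    ("New vaccine shown to be safe and effective", "positive"),
    ("Hospitals warn of deepening crisis", "negative"),
]

if __name__ == "__main__":
    predictions = [dictionary_sentiment(text) for text, _ in EXAMPLES]
    gold_labels = [label for _, label in EXAMPLES]
    for (text, gold), pred in zip(EXAMPLES, predictions):
        print(f"{pred:>8} (gold: {gold:>8})  {text}")
    print("accuracy:", accuracy(predictions, gold_labels))
```

In this toy setting the two headlines about test results are mislabeled, because the lexicon treats "positive" and "negative" by their general meaning rather than their medical one.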