Computational Methods for Tweet Summarization and Emotion Extraction
Gathering insights from social media has gained significant importance in the last decade. As the volume of social media data continues to grow, frameworks that can analyze its content automatically are of critical importance. Twitter is a micro-blogging service that generates a massive amount of textual content every day. Throughout our research, we concentrate on using Twitter, the most popular micro-blogging site, for the task of sentiment analysis. We demonstrate how to compile a corpus automatically for the purposes of sentiment analysis and opinion mining. Sentiment analysis classifies texts according to the orientation of the opinions and emotions they contain. In this project, we are interested in evaluating popular sentiment analysis tools that automatically determine emotions in tweets, and in developing computational methods that summarize the content of a large set of tweets. To compare the sentiment analysis tools, we created several benchmarks of manually annotated tweet datasets and evaluated the tools against them. We also addressed some of the most common challenges in sentiment analysis. For the summarization of tweets, we designed and developed algorithms that extract keywords and key sentences as a summary of a set of tweets. Finally, we developed a tool that creates a distance matrix for a set of tweets, relying on the popular TF-IDF framework.
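As an illustration of the last point, the following is a minimal sketch of how a TF-IDF distance matrix for a set of tweets might be computed. It is not the tool described above: the whitespace tokenizer, the smoothed IDF formula, the choice of cosine distance, and all function names (`tokenize`, `tfidf_vectors`, `cosine_distance`, `distance_matrix`) are assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real tweets would likely
    # need handling for hashtags, mentions, URLs and punctuation.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency is the relative frequency within the document.
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen here: an empty document is maximally distant.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(docs):
    """Pairwise cosine distances between the TF-IDF vectors of the documents."""
    vecs = tfidf_vectors(docs)
    n = len(vecs)
    return [
        [0.0 if i == j else cosine_distance(vecs[i], vecs[j]) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical sample tweets, for illustration only.
    tweets = [
        "great phone love it",
        "love this great phone",
        "terrible traffic today",
    ]
    for row in distance_matrix(tweets):
        print(["%.3f" % d for d in row])
```

Tweets that share weighted terms end up closer together in the matrix, while tweets with no terms in common sit at the maximum distance of 1. Such a matrix could serve as input to clustering or to the selection of representative tweets.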