Automatic Characterization of Stories
Computerized systems capable of generating high-level story descriptions have many potential real-life applications. However, building such systems requires teaching computers to obtain an abstract understanding of natural language stories algorithmically, which is a non-trivial problem in Artificial Intelligence and Natural Language Processing. In this thesis, we tackle the challenge of automatically characterizing stories at a high level by generating a set of tags from narrative texts written in English. We start by presenting a background study of the problem, discussing the resources required for research, and proposing a new corpus to facilitate research on high-level story understanding, with tag prediction for movies selected as an application of this problem. We then focus on designing methods for high-level story understanding from written narratives and for predicting tags for movies from their written plot synopses. First, we employ a wide range of linguistic features to design a machine learning approach for generating descriptive tags for stories from narrative texts. Next, we design a neural methodology for modeling the flow of emotions throughout stories and enhance a system that uses a high-level representation of narrative texts to predict tags. Furthermore, we exploit the hierarchical structure of text documents to encode the synopses and strengthen the tag prediction mechanism. In the final part of this thesis, we demonstrate a technique that uses user reviews to generate tags for characterizing stories at a high level. We have made the new dataset, the source code of our systems, and a live tag prediction system publicly available to the community to encourage further exploration in the direction of automatic story characterization.