Automatic Characterization of Stories

Date

2020-05

Abstract

Computerized systems capable of generating high-level story descriptions have many potential real-life applications. However, building such systems requires teaching computers to obtain an abstract understanding of natural language stories algorithmically, which is a non-trivial problem in Artificial Intelligence and Natural Language Processing. In this thesis, we tackle the challenge of automatically characterizing stories at a high level by generating a set of tags from narrative texts written in English. We start by presenting a background study of the problem, discuss the resources required for research, and propose a new corpus to facilitate research on high-level story understanding, selecting tag prediction for movies as an application of this problem. We then focus on designing methods for high-level story understanding from written narratives and for predicting tags for movies from their written plot synopses. First, we employ a wide range of linguistic features to design a machine learning approach for generating descriptive tags for stories from narrative texts. Next, we design a neural methodology for modeling the flow of emotions throughout stories and enhance a system that uses a high-level representation of narrative texts to predict tags. We furthermore exploit the hierarchical structure of text documents to encode the synopses and strengthen the tag prediction mechanism. In the final part of this thesis, we demonstrate a technique that uses user reviews to generate tags for characterizing stories at a high level. We have made the new dataset, the source code of our systems, and a live tag prediction system publicly available to the community to encourage further exploration of automatic story characterization.

Keywords

NLP, Narrative Analysis, Deep Learning

Citation

Portions of this document appear in: Sudipta Kar, Suraj Maharjan, A. Pastor López-Monroy, and Thamar Solorio. MPST: A corpus of movie plot synopses with tags. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Paris, France, May 2018. European Language Resources Association (ELRA). ISBN 978-2-9517408-9-1. And in: Kar, Sudipta, Suraj Maharjan, and Thamar Solorio. "Folksonomication: Predicting tags for movies from plot synopses using emotion flow encoded neural network." In Proceedings of the 27th International Conference on Computational Linguistics, pp. 2879-2891. 2018. And in: Kar, Sudipta, Gustavo Aguilar, and Thamar Solorio. "Multi-view Characterization of Stories from Narratives and Reviews using Multi-label Ranking." arXiv preprint arXiv:1908.09083 (2019).