Automatic Characterization of Stories

dc.contributor.advisor: Solorio, Thamar
dc.contributor.committeeMember: Verma, Rakesh M.
dc.contributor.committeeMember: Lapata, Mirella
dc.creator: Kar, Sudipta
dc.date.accessioned: 2020-06-04T03:04:03Z
dc.date.available: 2020-06-04T03:04:03Z
dc.date.created: May 2020
dc.date.issued: 2020-05
dc.date.submitted: May 2020
dc.date.updated: 2020-06-04T03:04:04Z
dc.description.abstract: Computerized systems capable of generating high-level story descriptions have many potential real-life applications. However, enabling computers to do so requires teaching them to obtain an abstract understanding of natural language stories algorithmically, which is a non-trivial problem in Artificial Intelligence and Natural Language Processing. In this thesis, we tackle the challenge of automatically characterizing stories at a high level by generating a set of tags from narrative texts written in English. We start by presenting a background study of the problem and discussing the resources required for research, and we propose a new corpus to facilitate research on high-level story understanding, selecting tag prediction for movies as an application of this problem. Then, we focus on designing methods for high-level story understanding from written narratives and for predicting tags for movies from their written plot synopses. First, we employ a wide range of linguistic features to design a machine learning approach for generating descriptive tags for stories from narrative texts. Next, we design a neural methodology for modeling the flow of emotions throughout stories and use it to enhance a system that predicts tags from a high-level representation of narrative texts. We furthermore exploit the hierarchical structure of text documents to encode the synopses and strengthen the tag prediction mechanism. In the final part of this thesis, we demonstrate a technique that uses user reviews to generate tags for characterizing stories at a high level. We made the new dataset, the source code of the systems, and a live tag prediction system publicly available to the community to encourage further exploration of automatic story characterization.
dc.description.department: Computer Science, Department of
dc.format.digitalOrigin: born digital
dc.format.mimetype: application/pdf
dc.identifier.citation: Portions of this document appear in: Sudipta Kar, Suraj Maharjan, A. Pastor López-Monroy, and Thamar Solorio. MPST: A corpus of movie plot synopses with tags. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Paris, France, May 2018. European Language Resources Association (ELRA). ISBN 978-2-9517408-9-1. And in: Kar, Sudipta, Suraj Maharjan, and Thamar Solorio. "Folksonomication: Predicting tags for movies from plot synopses using emotion flow encoded neural network." In Proceedings of the 27th International Conference on Computational Linguistics, pp. 2879-2891. 2018. And in: Kar, Sudipta, Gustavo Aguilar, and Thamar Solorio. "Multi-view Characterization of Stories from Narratives and Reviews using Multi-label Ranking." arXiv preprint arXiv:1908.09083 (2019).
dc.identifier.uri: https://hdl.handle.net/10657/6709
dc.language.iso: eng
dc.rights: The author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. UH Libraries has secured permission to reproduce any and all previously published materials contained in the work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).
dc.subject: NLP, Narrative Analysis, Deep Learning
dc.title: Automatic Characterization of Stories
dc.type.dcmi: Text
dc.type.genre: Thesis
thesis.degree.college: College of Natural Sciences and Mathematics
thesis.degree.department: Computer Science, Department of
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Houston
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy

Files

Original bundle

Name: KAR-DOCTORALTHESISEDD-2020.pdf
Size: 3.6 MB
Format: Adobe Portable Document Format

License bundle

Name: PROQUEST_LICENSE.txt
Size: 4.43 KB
Format: Plain Text
Name: LICENSE.txt
Size: 1.81 KB
Format: Plain Text