Stylistically Aware Representations of Books
Maharjan, Suraj 1986-
The conscious or unconscious choices an author makes to use some language forms consistently over other possible forms constitute that author's style. Capturing the style embedded in documents has a wide range of applications across many domains. In this dissertation, we propose a multitude of hand-crafted lexical, syntactic, and stylistic features, together with novel deep learning methods, to capture different stylistic markers embedded in documents. The methods are general enough to be applied to any domain. Here, we evaluate them on an interesting and important domain: books. A deeper study of stylistic variations will reveal the dos and don'ts of successful authors, which might help authors shape their writing and help readers discover new books suited to their taste. We empirically show that traditional hand-crafted features and deep learning methods capture complementary information, which, when carefully combined, yields better performance. Moreover, we find that adding an auxiliary task of genre classification to the primary task of success prediction improves results. Next, we propose a novel multimodal neural architecture that incorporates genre supervision to assign weights to individual feature types. Compared to previous ad hoc feature combinations, which are time-consuming and rigid, this method can dynamically tailor the weights given to feature types based on the characteristics of each book. We then explore authors' dexterity in using the flow of emotions across an entire book to captivate readers. We show that modeling the sequential flow of emotions across an entire book performs better than ignoring this information. Finally, we propose a novel method to learn stylistically aware embeddings for authors by feeding in the stylistic traits of their writing. These embeddings also prove to be assets in predicting books' likability.