Solorio, Thamar
2022-06-17
December 2
2021-12
December 2
Portions of this document appear in: Shafaei, Mahsa, Niloofar Safi Samghabadi, Sudipta Kar, and Thamar Solorio. "Age Suitability Rating: Predicting the MPAA Rating Based on Movie Dialogues." In Proceedings of The 12th Language Resources and Evaluation Conference, pp. 1327-1335. 2020; and in: Shafaei, Mahsa, Christos Smailis, Ioannis A. Kakadiaris, and Thamar Solorio. "A Case Study of Deep Learning Based Multi-Modal Methods for Predicting the Age-Suitability Rating of Movie Trailers." RANLP (2021).
https://hdl.handle.net/10657/9270
In this dissertation, we discuss methods toward building a system that automatically detects objectionable content in online media. Movies, animations, trailers, and video blogs are widely accessible to younger audiences through movie service providers (e.g., Amazon and Netflix), YouTube, movie theatres, and the web in general. Online content helps us learn and can inspire societal change. But it can also contain objectionable content that negatively affects viewers' behavior, especially that of children. Some media content (such as movies, books, and trailers) does have a rating system. For example, the rating system for movies, adopted from the Motion Picture Association of America (MPAA), consists of manual inspection of movies to assign an age rating. However, this rating system has two issues. First, the current system announces a single rating for the whole content. Yet suitability depends in part on culture, people's backgrounds, and children's emotional and cognitive skills. Thus, a single rating is not always helpful, and more details are needed. Second, this manual process does not scale to the ever-increasing number of videos available online.
As the first step towards the main goal of this dissertation (detecting objectionable content), we design, implement, and evaluate a system capable of predicting the age-suitability rating of movies and trailers without human inspection, exploring different models for the task. The proposed system either uses only the movie script as input or combines cues from the acoustic, visual, and textual modalities to detect objectionable content. The script-based system can be used in the early stages of production, when only the script is available. The multi-modal version, however, can be used after production, when the video is complete. Finally, we expand our multi-modal model to automatically generate the list of objectionable elements in any kind of video. In this dissertation, we focus exclusively on "Comic Mischief" elements, which no one has attempted previously. Along with the system, we present the largest corpus of movie scripts, accompanied by metadata, poster images, and movie trailers, that are rated by the MPAA. We also compile a dataset of a wide range of videos whose scenes are tagged with comic mischief elements. Finally, we make the implementation and data resources available for further research.
application/pdf
eng
The author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. UH Libraries has secured permission to reproduce any and all previously published materials contained in the work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).
Objectionable Content
Deep Learning Architecture
Detecting Objectionable Content in Online Media
2022-06-17
Thesis
born digital