Estimating the Emotional Content of an Image from the Observer's Eye Scan Pattern
Abstract
This study aims to predict the emotional gist of an image, namely the level of arousal it elicits (low or high) and the kind of emotion (positive or negative valence), from the pattern of eye movements of a human observer. Images were selected based on their arousal and valence ratings. Observers (n = 32) viewed the images in random order while their eye movements were recorded with a head-mounted eye tracker. Features describing saccades and fixations were extracted from the eye scan patterns and fed into a random forest classifier in MATLAB. Ten-fold cross-validation yielded a classification accuracy of 57% for low versus high arousal images and 56% for positive versus negative valence images (a priori probability = 50%). Several dynamic features were added in an attempt to improve performance, but they did not help. Finally, to check whether the images themselves differ between classes, a convolutional neural network was trained directly on the images. That model reached a classification accuracy of 85% for valence and 75% for arousal.
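The 10-fold cross-validation procedure mentioned in the abstract can be illustrated with a minimal sketch. This is not the study's MATLAB pipeline: the data are synthetic, the two features are hypothetical stand-ins for saccade and fixation measures, and a nearest-centroid rule is swapped in for the random forest to keep the example dependency-free. All function names are illustrative assumptions.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_centroids(X, y):
    """Placeholder classifier (NOT a random forest): per-class feature means."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

def cross_val_accuracy(X, y, k=10, seed=0):
    """Mean held-out accuracy over k folds."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        model = fit_centroids(train_X, train_y)
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)

def make_synthetic(n=200, shift=0.3, seed=1):
    """Synthetic two-class data (assumption): classes differ by a small mean shift."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # balanced classes, so chance level is 50%
        X.append([rng.gauss(label * shift, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y

X, y = make_synthetic()
accuracy = cross_val_accuracy(X, y, k=10)
print(f"10-fold accuracy: {accuracy:.2f} (chance level: 0.50)")
```

With balanced classes, the reported accuracy is compared against the 50% a priori probability, as in the abstract. In practice the placeholder would be replaced by a random forest implementation, for example MATLAB's `TreeBagger` or scikit-learn's `RandomForestClassifier`.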