Visual Summarization of Lecture Videos to Enhance Navigation
Recorded lecture videos are a popular and essential learning resource. A fundamental limitation of lecture videos is that viewers cannot quickly access content of interest. Several lecture video management portals have introduced additional navigation features such as indexing, captioning, and search. Lecture video indexing is the automatic partitioning of videos into smaller segments, each discussing a particular topic. However, these indexes do not describe the content of the segment. My goal is to create a visual summary containing a subset of images extracted from a lecture video segment to enhance navigation. The quality of a visual summary depends on the uniqueness and importance of the images. Uniqueness is achieved by selecting a diverse set of images with low similarity to one another. Importance is the desirability of an image for inclusion in the summary. Experimental results indicate that a combination of keypoint matching and color histograms works best to identify unique objects, and that a combination of the size and the number of keypoints closely approximates the desirability of an image for inclusion in the summary. This dissertation presents a graph-based algorithm that selects a subset of unique and important images for a visual summary. The results of this research have been implemented in a real-world lecture video management portal called Videopoints. The evaluation is based on summaries provided by Videopoints users on a dataset of 120 video segments. The graph-based heuristic algorithm for identifying summary images achieves an F1-measure of 66% with frequently selected images as the ground truth and 79% with the union of all user-selected images as the ground truth. For 93.8% of algorithm-selected visual summary images, at least one user also selected that image for their summary or considered it similar to another image they selected. Over 70% of automatically generated summaries were rated as good or very good by the users on a 4-point scale from poor to very good. Overall, the results establish that the methodology introduced in this dissertation produces good-quality visual summaries that are practically useful for lecture video navigation.
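To make the idea of selecting unique and important images over a graph more concrete, the following is a minimal sketch of one possible greedy heuristic. It is not the algorithm from the dissertation, whose details are not given in this abstract. The sketch assumes that each image already has an importance score and that pairwise similarities are available as a symmetric matrix; the function name `select_summary`, the `threshold` parameter, and the `max_images` parameter are all hypothetical. Images are treated as graph nodes, an edge joins any two images whose similarity reaches the threshold, and the most important remaining image is picked repeatedly while its neighbors are discarded.

```python
from typing import List, Optional, Sequence


def select_summary(
    importance: Sequence[float],
    similarity: Sequence[Sequence[float]],
    threshold: float = 0.5,
    max_images: Optional[int] = None,
) -> List[int]:
    """Greedy illustration only (an assumption, not the dissertation's method).

    importance[i]    -- assumed importance score of image i
    similarity[i][j] -- assumed symmetric similarity between images i and j
    threshold        -- hypothetical cutoff above which two images count as similar
    max_images       -- optional cap on the summary size
    """
    n = len(importance)

    # Build the similarity graph: an edge joins images that are "too similar".
    neighbors = {
        i: {j for j in range(n) if j != i and similarity[i][j] >= threshold}
        for i in range(n)
    }

    remaining = set(range(n))
    summary: List[int] = []

    while remaining and (max_images is None or len(summary) < max_images):
        # Pick the most important remaining image; ties go to the lower index.
        best = max(remaining, key=lambda i: (importance[i], -i))
        summary.append(best)
        # Drop the chosen image and everything similar to it, so the
        # summary never contains two images joined by an edge.
        remaining -= neighbors[best] | {best}

    return summary


if __name__ == "__main__":
    # Four images: 0 and 1 are near-duplicates, as are 2 and 3.
    importance = [0.9, 0.8, 0.4, 0.7]
    similarity = [
        [1.0, 0.9, 0.1, 0.1],
        [0.9, 1.0, 0.1, 0.1],
        [0.1, 0.1, 1.0, 0.8],
        [0.1, 0.1, 0.8, 1.0],
    ]
    print(select_summary(importance, similarity))
```

In this toy input, one image is kept from each pair of near-duplicates, and within each pair the one with the higher importance score is chosen. A real system would derive the similarity and importance values from image features such as the keypoint matches and color histograms mentioned above, which this sketch leaves out.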