DEVELOPMENT AND EVALUATION OF TEXT-BASED INDEXING FOR LECTURE VIDEOS
Varghese, Varun 1978-
Lecture videos are a highly valued learning resource, as surveys of students at the University of Houston have confirmed. Their usability is limited, however, because continuous video playback offers no direct way to reach a topic of interest within a video. Providing index points that represent the topic segments can significantly improve accessibility. Identifying topics manually is time-consuming. Image analysis can identify scene changes in a video, but these may not coincide with topic changes. The text within a lecture video, by contrast, describes the topics. This thesis proposes an automatic text-based approach to indexing lecture videos that provides topic-based segmentation. The methodology splits the video into smaller segments at slide transition points, where the scene changes, by computing the image difference between consecutive frames. Optical character recognition then extracts the text from the image that represents each segment. The indexing algorithm detects topic changes based on text similarity, combining neighboring segments with high text similarity to form a topic. For performance evaluation, the output of the text-based algorithm is compared with that of a non-text-based method and with the ideal output based on the ground truth. The evaluation set comprised 23 videos from diverse subjects. The results indicate that text-based indexing is more accurate than the other approaches: the best text-based algorithm achieved a maximum average accuracy of 74%, while the simpler methods achieved 67% under the same test conditions. Error analysis revealed that text similarity alone cannot accurately detect topic changes within a video.
The author suggests combining text and image as well as considering the semantics to further improve the indexing accuracy.
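
The merging step described in the abstract, in which neighboring segments with high text similarity are combined into one topic, can be illustrated with a minimal sketch. This is not the thesis's actual algorithm: the Jaccard word-overlap measure, the 0.3 threshold, the function names, and the sample slide texts are all assumptions chosen for illustration, and the input stands in for OCR output that has already been extracted for each segment.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two word sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def merge_segments(segment_texts, threshold=0.3):
    """Group consecutive segments whose text similarity meets the threshold.

    Each segment is compared only with its immediate predecessor.
    The threshold value is an illustrative assumption, not a value
    taken from the thesis.
    Returns a list of topics, each a list of segment indices.
    """
    topics = []
    prev_tokens = None
    for i, text in enumerate(segment_texts):
        tokens = tokenize(text)
        if prev_tokens is not None and jaccard(prev_tokens, tokens) >= threshold:
            topics[-1].append(i)   # similar to the previous segment: same topic
        else:
            topics.append([i])     # dissimilar (or first segment): new topic
        prev_tokens = tokens
    return topics


def index_points(topics):
    """The first segment of each topic serves as its index point."""
    return [topic[0] for topic in topics]


# Hypothetical OCR text for four slide segments.
segments = [
    "Binary search trees insert delete",
    "Binary search trees search operation",
    "Hash tables collision chaining",
    "Hash tables open addressing collision",
]
topics = merge_segments(segments)
print(topics)                # [[0, 1], [2, 3]]
print(index_points(topics))  # [0, 2]
```

Comparing each segment only with its immediate neighbor keeps the sketch short. As the error analysis in the abstract notes, text similarity alone cannot accurately detect every topic change, so a threshold of this kind would need tuning against ground truth.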