Contextual Descriptors for Human Activity Recognition






Human activity recognition is one of the most challenging problems in computer vision and has received considerable attention from the community in recent years. Its applications are diverse, ranging from activity understanding for intelligent surveillance systems to improved human-computer interaction. The goal of human activity recognition is to automatically recognize ongoing activities in an unknown video (i.e., a sequence of image frames). The problem is difficult for several reasons: the complexity of human motion, the spatial and temporal variation that arises from differences in the duration of activities, the changing spatial characteristics of the human form, and the contextual information involved in performing each activity. Over the past few years, a number of approaches have addressed these challenges by designing effective, compact descriptors that encode activity characteristics together with context; however, the mechanisms for incorporating this information vary across approaches.

In this dissertation, I present efficient techniques for learning and recognizing human activities. The primary goal of this research is to design compact but rich descriptors, along with effective algorithms, that provide a useful activity representation for recognizing either a single human activity or a collective activity in a crowded scene. For single human activity recognition, I introduce subject-centric descriptors that incorporate both local and global representations, providing robustness against noise and partial occlusion as well as invariance to changes in image scale. For collective activity recognition, I present context-based descriptors that efficiently encode human activity characteristics together with contextual information, leading to improved methods for analyzing group activities in a crowded scene.

My results focus on recognizing single human activities and collective activities in crowded scenes. I demonstrate on several public datasets the efficiency of the proposed descriptors in encoding human activity. Moreover, I show how to combine contextual information with human activity characteristics when analyzing human activities in a crowded scene.



Human action recognition, Contextual descriptors