Face Recognition in Unconstrained Conditions: Improving Face Alignment and Constructing a Pose-Invariant Compact Biometric Template
Face recognition has advanced significantly in the past decade; however, challenges remain under unconstrained conditions involving variations in pose, illumination, and occlusion. Existing solutions tackle the unconstrained face recognition problem in two ways: (i) controlling the variations of the input to the recognition system, and (ii) improving the robustness of the recognition system to these variations. In the first approach, the face frontalization module in 3D-aided face recognition significantly reduces pose variation by mapping a facial image to a frontalized texture space with the help of a 3D facial model. However, because face frontalization relies heavily on the projection matrix generated by face alignment, its performance is largely constrained by the robustness of face alignment under unconstrained conditions. In the second approach, an ensemble deep neural network model for recognition has been demonstrated to be robust to pose variations. However, the biometric template generated by the ensemble model is much larger than the template generated by an individual model. This dissertation presents solutions to both problems: improving the robustness of face alignment under unconstrained conditions and significantly reducing the biometric template size. The first contribution is a Globally Optimized Dual-Pathway (GoDP) landmark detector that is robust to head pose variations of up to 90°. The second contribution is a pose estimation algorithm, Annotated Face Model-based Alignment (AFMA), that estimates head pose without landmarks. The third contribution is a pose estimation algorithm, Sensible-Points based reinforced Hypothesis Refinement (SHR), that is robust to facial occlusion. The fourth contribution is a pose estimation algorithm, Convolutional Point-set Representation-based Face Alignment (CPRFA), that is robust to both facial occlusion and large head pose variations.
The fifth contribution is a neural network architecture, Mask-Guided Compact Template Learning (MGCTL), that reduces the template size of an ensemble deep model by more than an order of magnitude using self-occlusion masks. When GoDP and MGCTL are plugged into a 3D-aided face recognition pipeline, state-of-the-art performance is achieved on multiple databases in terms of both face recognition accuracy and template matching speed.