Human Detection in the Wild
dc.contributor.committeeMember | Kakadiaris, Ioannis A. | |
dc.contributor.committeeMember | Prasad, Saurabh | |
dc.contributor.committeeMember | Eick, Christoph F. | |
dc.contributor.committeeMember | Gabriel, Edgar | |
dc.creator | Shi, Lei | |
dc.creator.orcid | 0000-0002-9536-336X | |
dc.date.accessioned | 2021-06-09T17:33:00Z | |
dc.date.created | August 2020 | |
dc.date.issued | 2020-08 | |
dc.date.submitted | August 2020 | |
dc.date.updated | 2021-06-09T17:33:02Z | |
dc.description.abstract | Human detection remains a challenging task due to the problems caused by occlusion variance. Visible-body bounding boxes are typically used as an extra supervision signal to improve the performance of human detection. However, visible-body assisted approaches produce a large number of false positives, which result from a lack of adequate and discriminative full-body contextual information. Because the face is the most discriminative feature of the human head and body, face detection has attracted much attention. Despite great progress in accurate face detection, detecting multi-scale faces, especially small faces, remains a challenging problem. Existing approaches to multi-scale face detection can be categorized into two-stage and single-stage face detectors. In two-stage face detectors, input pyramids or multi-scale feature maps are deployed to give the network more facial information for learning discriminative features at various scales. However, they can increase the training difficulty and complexity of the network. In single-stage face detectors, feature fusion and context aggregation have been used to enrich contextual information. However, treating reliable information and noise equally can leave considerable noise in the features fused at different levels. Moreover, dilated convolutions in the context aggregation module can cause gridding artifacts. The goal of this dissertation is to design, develop, and evaluate human detection algorithms to solve the above problems. The three contributions of this dissertation can be summarized as follows: (i) A decoupled visible region network for human detection was designed, developed, and evaluated to overcome the occlusion challenge. 
The proposed human detector improved MR$^{-2}$ from $11.24$ to $10.50$ when compared to Bi-box, which is inspired by our work, on the CityPersons dataset. (ii) A two-stage face detector was designed, developed, and evaluated to overcome the scale challenge. It improves mAP by $12.1\%$ over our baseline on the WIDER FACE dataset. (iii) A single-stage face detector was designed, developed, and evaluated to overcome the scale challenge. The proposed method achieves the best performance with an mAP of $77.0\%$ on the UFDD dataset. | |
dc.description.department | Computer Science, Department of | |
dc.format.digitalOrigin | born digital | |
dc.format.mimetype | application/pdf | |
dc.identifier.citation | Portions of this document appear in: L. Shi, X. Xu, I. A. Kakadiaris. “SSFD+: A Robust Two-stage Face Detector,” IEEE Transactions on Biometrics, Behavior, and Identity Science, 1 (2019), 181-191.; L. Shi, X. Xu, I. A. Kakadiaris. “SANet: Smoothed Attention Network for Single-stage Face Detector,” In Proceedings of IEEE International Conference on Biometrics, (Crete, Greece, 2019), pp. 11-20.; L. Shi, X. Xu, I. A. Kakadiaris. “SSFD: A Face Detector using A Single-scale Feature Map,” In Proceedings of IEEE International Conference on Biometrics: Theory, Applications, and Systems, (LA, CA, 2018), pp. 1-10. | |
dc.identifier.uri | https://hdl.handle.net/10657/7759 | |
dc.language.iso | eng | |
dc.rights | The author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. UH Libraries has secured permission to reproduce any and all previously published materials contained in the work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s). | |
dc.subject | Human detection, face detection | |
dc.title | Human Detection in the Wild | |
dc.type.dcmi | Text | |
dc.type.genre | Thesis | |
local.embargo.lift | 2022-08-01 | |
local.embargo.terms | 2022-08-01 | |
thesis.degree.college | College of Natural Sciences and Mathematics | |
thesis.degree.department | Computer Science, Department of | |
thesis.degree.discipline | Computer Science | |
thesis.degree.grantor | University of Houston | |
thesis.degree.level | Doctoral | |
thesis.degree.name | Doctor of Philosophy |