Aerosol Data Modeling & Similarity Assessment – a Probabilistic Approach

dc.contributor.advisor: Eick, Christoph F.
dc.contributor.committeeMember: Shi, Weidong
dc.contributor.committeeMember: Choi, Yunsoo
dc.creator: Anchlia, Puja 1987-
dc.creator.orcid: 0000-0003-0764-5912
dc.date.accessioned: 2018-02-15T19:44:30Z
dc.date.available: 2018-02-15T19:44:30Z
dc.date.created: December 2015
dc.date.issued: 2015-12
dc.date.submitted: December 2015
dc.date.updated: 2018-02-15T19:44:31Z
dc.description.abstract: Bio-threat detection is a problem of paramount importance to modern society. One of the major requirements of a system that detects such threats is a low false alarm rate. An important cause of false alarms is the system's inability to take into account the ambient aerosol background of the location being monitored. Aerosol backgrounds typically differ across locations: an agent may be naturally present in one location but unusual in another. Although earlier work has characterized the aerosol backgrounds of certain locations, no general algorithm is available that creates an aerosol background model for a particular location. This work centers on the design and implementation of a Sensor Modeling Toolbox. The toolbox assumes that the sensor data come from a Gaussian mixture distribution and therefore uses Gaussian mixture models (GMMs) to model them. The Expectation Maximization algorithm is used to estimate the parameters of the GMM, and the Bayesian Information Criterion (BIC) is used to select the best model by weighing the log-likelihood of the GMM against its complexity (the number of parameters to be estimated). The toolbox provides various functionalities, the major ones being to create a model for a set of sensor observations, to generate a model of the aerosol background at a particular location, and to assess the similarity between two sets of sensor readings by means of two novel distance functions. The functionalities of the toolbox are evaluated on two real-world datasets obtained from aerosol sensors deployed on the campus of the University of Houston. Because labeled data were unavailable, this work assesses the quality of the results by comparing them with the characteristics of the raw input data and shows that they are in good agreement.
It is also observed that it is usually better to remove outliers from datasets before creating a GMM. The experimental results demonstrate that the toolbox is useful for modeling and analyzing aerosol data and that it provides important capabilities for building a successful bio-threat detection system in the future.
dc.description.department: Computer Science, Department of
dc.format.digitalOrigin: born digital
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10657/2157
dc.language.iso: eng
dc.rights: The author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).
dc.subject: Aerosol data modeling
dc.subject: Similarity assessment
dc.subject: Gaussian mixture models
dc.subject: Bio-threat detection
dc.subject: Aerosol background modeling
dc.title: Aerosol Data Modeling & Similarity Assessment – a Probabilistic Approach
dc.type.dcmi: Text
dc.type.genre: Thesis
thesis.degree.college: College of Natural Sciences and Mathematics
thesis.degree.department: Computer Science, Department of
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Houston
thesis.degree.level: Masters
thesis.degree.name: Master of Science
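
Illustrative sketch (not part of the record): the abstract describes fitting Gaussian mixture models with Expectation Maximization and choosing the number of components with the Bayesian Information Criterion. The thesis toolbox itself is not reproduced here, so the following is only a minimal sketch of that general idea for one-dimensional readings, written with the Python standard library. The function names (fit_gmm_1d, bic, select_gmm), the quantile-based initialization, the variance floor, and the synthetic data are all assumptions made for illustration, not the author's implementation or data.

```python
import math
import random


def component_logs(x, weights, means, variances):
    """Return log(weight_j * N(x | mean_j, var_j)) for every component j."""
    return [
        math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for w, m, v in zip(weights, means, variances)
    ]


def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def fit_gmm_1d(data, k, iters=100, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture with EM (hypothetical helper)."""
    n = len(data)
    ordered = sorted(data)
    # Assumption: deterministic start with means on evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    spread = max(sum((x - centre) ** 2 for x in data) / n, var_floor)
    variances = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each reading.
        resp = []
        for x in data:
            logs = component_logs(x, weights, means, variances)
            total = log_sum_exp(logs)
            resp.append([math.exp(v - total) for v in logs])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            # Assumption: floor the variance so no component collapses to a point.
            variances[j] = max(var, var_floor)
    log_lik = sum(
        log_sum_exp(component_logs(x, weights, means, variances)) for x in data
    )
    return (weights, means, variances), log_lik


def bic(log_lik, k, n):
    """BIC = p * ln(n) - 2 * log-likelihood; lower is better."""
    n_params = 3 * k - 1  # k means + k variances + (k - 1) free weights
    return n_params * math.log(n) - 2.0 * log_lik


def select_gmm(data, max_k=4):
    """Fit mixtures with 1..max_k components and pick the lowest BIC."""
    fits = {}
    for k in range(1, max_k + 1):
        params, log_lik = fit_gmm_1d(data, k)
        fits[k] = (bic(log_lik, k, len(data)), params)
    best_k = min(fits, key=lambda k: fits[k][0])
    return best_k, fits


# Synthetic stand-in for sensor readings: two well-separated modes.
rng = random.Random(0)
readings = [rng.gauss(0.0, 1.0) for _ in range(300)]
readings += [rng.gauss(8.0, 1.0) for _ in range(300)]
best_k, fits = select_gmm(readings)
print("components chosen by BIC:", best_k)
```

The penalty term p * ln(n) is what keeps the criterion from always preferring more components: each extra component adds three parameters in this one-dimensional setting, so it has to raise the log-likelihood enough to pay for them. Real sensor data would usually be multivariate, and a library implementation with full covariance matrices would be the more practical choice there.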

Files

Original bundle

Name: ANCHLIA-THESIS-2015.pdf
Size: 3.09 MB
Format: Adobe Portable Document Format

License bundle

Name: LICENSE.txt
Size: 1.81 KB
Format: Plain Text