Robust Domain Adaptation Using Active Learning

Dhar Gupta, Kinjal 1982-

dc.contributor.advisor	Vilalta, Ricardo
dc.contributor.committeeMember	Eick, Christoph F.
dc.contributor.committeeMember	Chen, Guoning
dc.contributor.committeeMember	Mahabal, Ashish
dc.creator	Dhar Gupta, Kinjal 1982-
dc.creator.orcid	0000-0001-9498-6365
dc.date.accessioned	2018-11-30T19:20:05Z
dc.date.available	2018-11-30T19:20:05Z
dc.date.created	August 2016
dc.date.issued	2016-08
dc.date.submitted	August 2016
dc.date.updated	2018-11-30T19:20:06Z
dc.description.abstract	Traditional machine learning algorithms assume that training and test datasets are generated from the same underlying distribution, an assumption that does not hold for most real-world datasets. As a result, a model trained on the training dataset fails to produce good classification accuracy on the test dataset. One way to mitigate this problem is to use domain adaptation techniques, which build a new model on the unlabeled test dataset (the target dataset) by transferring information from a related but labeled training dataset (the source dataset), even when their underlying distributions differ. Another important issue is that domain adaptation makes no allowance for obtaining class labels on the test dataset during the training phase. This issue can be handled by active learning techniques, which assume a budget for labeling instances in the target domain. Active learning finds the most informative instances of the test dataset for an expert to label, improving classification accuracy on the unlabeled test dataset. The goal of this research is to build an optimal classifier on the target dataset by using information related to model complexity. We propose a novel domain adaptation technique that uses active learning to find the optimal value of a parameter of a class of models, yielding the best classifier on the target dataset; unlike other domain adaptation methods, it does not assume that the class-conditional probabilities are equivalent across the domains. This research also proposes a novel data-alignment technique that allows the source model to be used directly on the target when the distributions differ by a linear shift, thus avoiding the need to build an entirely new classifier on the target domain. Empirical results show that our methods yield better classification accuracy than state-of-the-art methods.
dc.description.department	Computer Science, Department of
dc.format.digitalOrigin	born digital
dc.format.mimetype	application/pdf
dc.identifier.citation	Portions of this document appear in: Vilalta, Ricardo, Kinjal Dhar Gupta, and Lucas Macri. "A machine learning approach to Cepheid variable star classification using data alignment and maximum likelihood." Astronomy and Computing 2 (2013): 46-53. And in: Vilalta, Ricardo, Kinjal Dhar Gupta, and Lucas Macri. "Domain adaptation under data misalignment: An application to cepheid variable star classification." In 2014 22nd International Conference on Pattern Recognition (ICPR), pp. 3660-3665. IEEE, 2014. And in: Vilalta, Ricardo, Kinjal Dhar Gupta, and Ashish Mahabal. "Star classification under data variability: an emerging challenge in astroinformatics." In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 241-244. Springer, Cham, 2015. And in: Gupta, Kinjal Dhar, Ricardo Vilalta, Vicken Asadourian, and Lucas Macri. "Adapting Predictive Models for Cepheid Variable Star Classification Using Linear Regression and Maximum Likelihood." Proceedings of the International Astronomical Union 10, no. S306 (2014): 319-321.
dc.identifier.uri	http://hdl.handle.net/10657/3520
dc.language.iso	eng
dc.rights	The author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. UH Libraries has secured permission to reproduce any and all previously published materials contained in the work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).
dc.subject	Domain adaptation
dc.subject	Active learning
dc.subject	Machine learning
dc.subject	Model Complexity
dc.title	Robust Domain Adaptation Using Active Learning
dc.type.dcmi	Text
dc.type.genre	Thesis
thesis.degree.college	College of Natural Sciences and Mathematics
thesis.degree.department	Computer Science, Department of
thesis.degree.discipline	Computer Science
thesis.degree.grantor	University of Houston
thesis.degree.level	Doctoral
thesis.degree.name	Doctor of Philosophy

Files

Original bundle

Name: DHARGUPTA-DISSERTATION-2016.pdf
Size: 814.34 KB
Format: Adobe Portable Document Format

Collections

Published ETD Collection