dc.contributor.advisor: Vilalta, Ricardo
dc.creator: Sui, Bangsheng 1987-
dc.date.accessioned: 2014-02-13T15:22:14Z
dc.date.available: 2014-02-13T15:22:14Z
dc.date.created: December 2013
dc.date.issued: 2013-12
dc.identifier.uri: http://hdl.handle.net/10657/523
dc.description.abstract: Analyzing high-dimensional data is a major challenge in machine learning. To deal with the curse of dimensionality, many effective and efficient feature-selection algorithms have been developed recently. However, most feature-selection algorithms assume that features are independent; they identify relevant features mainly by each feature's individual correlation with the target concept. These algorithms can perform well when the assumption of feature independence holds, but they may perform poorly in domains where feature interactions exist. Because of feature interactions, a feature that shows little correlation with the target concept on its own can in fact be highly correlated with it when considered together with other features. Removing such features can severely harm the performance of the classification model. In this thesis, we first present a general view of feature interaction. We formally define feature interaction in terms of information theory. We propose a practical algorithm to identify feature interactions and perform feature selection based on the identified interactions. We then compare the performance of our algorithm with that of some well-known feature-selection algorithms that assume feature independence. The comparison shows that, by taking feature interactions into account, our feature-selection algorithm can achieve better performance on datasets where interactions abound.
dc.format.mimetype: application/pdf
dc.language.iso: eng
dc.subject: feature selection
dc.subject: machine learning
dc.subject: feature interaction
dc.subject: information gain
dc.subject.lcsh: Computer science
dc.title: INFORMATION GAIN FEATURE SELECTION BASED ON FEATURE INTERACTIONS
dc.date.updated: 2014-02-13T15:22:19Z
dc.type.genre: Thesis
thesis.degree.name: Master of Science
thesis.degree.level: Masters
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Houston
thesis.degree.department: Computer Science
dc.contributor.committeeMember: Eick, Christoph F.
dc.contributor.committeeMember: Kaiser, Klaus
dc.type.dcmi: Text
dc.format.digitalOrigin: born digital
dc.description.department: Computer Science
thesis.degree.college: College of Natural Sciences and Mathematics
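
The abstract's central claim, that a feature with little individual correlation with the target can become highly informative when considered together with other features, can be illustrated with a toy XOR dataset. The sketch below is not the thesis's algorithm or its formal definition of feature interaction. It is a minimal illustration that assumes information gain is computed as H(Y) - H(Y | X) and that uses the difference between the joint gain and the sum of the individual gains as a simple, assumed interaction score. All function and variable names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of hashable values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_gain(feature, target):
    """Information gain of `feature` about `target`: H(Y) - H(Y | X)."""
    total = len(target)
    groups = {}
    for f, t in zip(feature, target):
        groups.setdefault(f, []).append(t)
    # Weighted average entropy of the target within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(target) - conditional


# Toy dataset (an assumption for illustration): the target is x1 XOR x2.
rows = list(product([0, 1], repeat=2))
x1 = [a for a, b in rows]
x2 = [b for a, b in rows]
y = [a ^ b for a, b in rows]

gain_x1 = information_gain(x1, y)                    # feature 1 alone
gain_x2 = information_gain(x2, y)                    # feature 2 alone
gain_joint = information_gain(list(zip(x1, x2)), y)  # both features together

# Assumed interaction score: joint gain beyond the sum of individual gains.
interaction = gain_joint - gain_x1 - gain_x2

print("gain(x1)      =", gain_x1)
print("gain(x2)      =", gain_x2)
print("gain(x1, x2)  =", gain_joint)
print("interaction   =", interaction)
```

On this dataset each feature alone carries no information about the target, while the pair determines it completely. A selector that ranks features only by individual information gain would discard both, which is the failure mode the abstract describes.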

