ENHANCING MALWARE DETECTION THROUGH BEHAVIORAL MODELLING AND FEATURE LEARNING

dc.contributor.advisor: Huang, Stephen
dc.contributor.committeeMember: Leiss, Ernst L.
dc.contributor.committeeMember: Shi, Weidong
dc.contributor.committeeMember: Conklin, Wm. Arthur
dc.creator: El Aassal, Ayman
dc.creator.orcid: 0000-0002-0225-4544
dc.date.accessioned: 2024-01-27T01:59:34Z
dc.date.created: December 2023
dc.date.issued: 2023-12
dc.date.updated: 2024-01-27T01:59:34Z
dc.description.abstract: The growing frequency of data breaches and cyberattacks due to malware infections in recent years highlights the significance of ongoing research in malware detection. Malicious software, or malware for short, often undergoes numerous mutations to avoid detection by signature-based antivirus software. The abundance of malware variants has made detection increasingly complex. Mainstream cybersecurity vendors favor static analysis methods for their speed and scalability: they assess incoming files, generate their signatures, and cross-reference them against a database of known malicious signatures. However, this form of analysis is susceptible to obfuscation, where attackers modify malware code in superfluous ways to produce a new signature that antivirus software does not yet recognize. For this reason, this work focuses on analyzing the run-time execution of programs to extract their behavior and classify them as malware or benign. This dissertation addresses the persistent challenge posed by ever-evolving malware variants by introducing a framework designed to capture the run-time behavior of programs through graph modelling and deep learning methods. The proposed approach parses the log of native functions called by a program during its execution. This parsing process enables the creation of Behavior Call Graphs (BCGs) using a novel methodology that emphasizes the connections between these native functions. Graph structures can effectively represent intricate relationships within the data, facilitating the extraction of relevant information that might otherwise be difficult to capture. This research employs two different methods to analyze these BCGs. The first extracts domain-expert features, while the second leverages deep learning algorithms to generate the features automatically.
However, conventional deep learning methods such as neural networks and convolutional neural networks are not designed to take graphs as input. To address this limitation, we adopted feature learning algorithms that automatically embed graph structures into feature vectors within a multi-dimensional space. This dissertation validates the effectiveness of these approaches in analyzing BCGs generated from Windows and Android applications to identify and capture the malicious behavior of malware variants. This research can help companies and software publishers test the safety of uploaded or shared applications and prevent malware from spreading to their end users.
dc.description.department: Computer Science, Department of
dc.format.digitalOrigin: born digital
dc.format.mimetype: application/pdf
dc.identifier.citation: Portions of this document appear in: El Aassal, Ayman, and Shou-Hsuan Stephen Huang. "Learning Discriminative Representations for Malware Family Classification." In International Conference on Hybrid Intelligent Systems, pp. 1327-1336. Cham: Springer Nature Switzerland, 2022.
dc.identifier.uri: https://hdl.handle.net/10657/16221
dc.language.iso: eng
dc.rights: The author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. UH Libraries has secured permission to reproduce any and all previously published materials contained in the work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).
dc.subject: Malware Detection, Graph Modelling, Machine Learning, Representation Learning
dc.title: ENHANCING MALWARE DETECTION THROUGH BEHAVIORAL MODELLING AND FEATURE LEARNING
dc.type.dcmi: text
dc.type.genre: Thesis
dcterms.accessRights: The full text of this item is not available at this time because the student has placed this item under an embargo for a period of time. The Libraries are not authorized to provide a copy of this work during the embargo period.
local.embargo.lift: 2025-12-01
local.embargo.terms: 2025-12-01
thesis.degree.college: College of Natural Sciences and Mathematics
thesis.degree.department: Computer Science, Department of
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Houston
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy
