Year of Award

2023

Document Type

Thesis

Degree Type

Master of Science (MS)

Degree Name

Computer Science

Department or School/College

Computer Science

Committee Chair

David Opitz

Committee Members

David Opitz, Andrew Ware, Patricia Duce

Keywords

malware, vulnerability, transfer learning, machine learning, cyber security, binary

Subject Categories

Data Science | Statistical Models

Abstract

Malware detection and vulnerability detection are important cybersecurity tasks. Previous research has successfully applied a variety of machine learning methods to both. However, despite their potential synergies, previous research has yet to unite these two tasks. Given the recent success of transfer learning in many domains, such as language modeling and image recognition, this thesis investigated the use of transfer learning to improve vulnerability detection. Specifically, we pre-trained a series of models to detect malicious binaries and used the weights from those models to kickstart the detection of vulnerable binaries. In our study, we also investigated five different data representations of portable executable binaries, all but one of which showed positive transfer in at least one experiment. The single-channel image and tf-idf assembly instruction count embedding were particularly successful, increasing the accuracy of a non-transfer, randomly initialized model from 77.2% to 95.8%.

© Copyright 2023 Sean Patrick McNulty