Vulnerability Detection in Assembly Code Using Deep Learning

Thangavelu, Karthiga

Files

Thangavelu_Karthiga_MEng_2023.pdf (932.57 KB)

Date

2023-03-01

Authors

Thangavelu, Karthiga

Abstract

Language modelling for source code is a state-of-the-art method that has developed significantly in recent years. Its applications include code completion, translation from one programming language to another, translation of text documents to code, and detection of vulnerabilities in source code. Unlike modelling source code in languages such as C, C++ or Python, modelling assembly language is a tedious process, and most feature-engineering approaches for assembly code are manual. In this project, patterns in assembly code are recognized, and malicious code is distinguished from non-malicious code. Strings of jumps are introduced into the assembly code to make it non-malicious. The pattern recognition and classification process consists of three main tasks. First, the strings of jumps are introduced into the assembly code, and the code is tokenized. Second, instructions are converted to vectors using an assembly language model for instruction embedding based on the BERT transformer, which minimizes manual pre-processing of the dataset. The final task is a downstream task in which the instruction embeddings are fed into an LSTM network that classifies code as malicious or non-malicious using an assembly code dataset. The performance of the model is evaluated using accuracy, the confusion matrix, recall, precision, and F1 score.
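The abstract names the evaluation metrics but not how they were computed. As a minimal sketch, assuming binary labels where 1 means malicious and 0 means non-malicious, they can be derived from the confusion-matrix counts as follows. The function name `binary_metrics` and the sample labels are hypothetical and are not taken from the project.

```python
def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix counts and derived metrics for binary labels.

    Assumption: label 1 = malicious, label 0 = non-malicious.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malicious, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malicious, missed

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {
        # Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]]
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
    for name, value in binary_metrics(y_true, y_pred).items():
        print(name, value)
```

Recall matters most when a missed malicious sample is costly, while precision matters most when false alarms are costly; F1 balances the two.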

Keywords

Vulnerability detection, Assembly code, Transformer-based model, Instruction embedding

URI

http://hdl.handle.net/1828/14805

Collections

Graduate Projects (Electrical and Computer Engineering)
