Mining Phishing Campaigns using FP Trees

dc.contributor.author: Karanth, Anirudh
dc.contributor.supervisor: Traore, Issa
dc.date.accessioned: 2020-08-04T22:02:06Z
dc.date.available: 2020-08-04T22:02:06Z
dc.date.copyright: 2020 (en_US)
dc.date.issued: 2020-08-04
dc.degree.department: Department of Electrical and Computer Engineering
dc.degree.level: Master of Engineering M.Eng. (en_US)
dc.description.abstract: Phishing is a fraudulent online activity in which attackers try to obtain a user's sensitive information, such as credit card numbers, social security numbers, or passwords, by disguising themselves as a legitimate entity via emails, text messages, or phone calls. It has been reported that in 2019 nearly 4% of all emails were phishing emails, which corresponds to about 3.4 billion emails. Analyzing those phishing emails is an important step towards understanding the motivation and methods of phishers. However, manually analyzing such an astronomical amount of data is infeasible and ineffective, considering that phishers are always finding novel methods to evade detection. One way to keep up with the huge amount of data and the growing sophistication of evasion tactics is to focus the analysis on phishing campaigns. A phishing campaign is a collection of phishing emails built from the same template. This report adapts and extends previous work on spam campaigns to mine phishing campaigns. The phishing campaigns are mined using a Frequent Pattern Tree (FP Tree), and they are identified by investigating the contribution of different email features. Experiments are conducted using a dataset consisting of over 17,342 phishing messages, yielding 231 different campaigns in the best case. The campaigns found for a given set of parameters are very stable, with an error percentage of around 1.5%. (en_US)
dc.description.scholarlevel: Graduate (en_US)
dc.identifier.uri: http://hdl.handle.net/1828/11971
dc.language.iso: en (en_US)
dc.rights: Available to the World Wide Web (en_US)
dc.subject: FP Trees (en_US)
dc.subject: Data Mining (en_US)
dc.subject: Phishing (en_US)
dc.subject: Unsupervised Learning (en_US)
dc.subject: Python (en_US)
dc.title: Mining Phishing Campaigns using FP Trees (en_US)
dc.type: project (en_US)
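
Illustrative sketch (not taken from the report): the abstract says campaigns are mined with an FP Tree built over email features. The Python below shows, under stated assumptions, how a generic FP Tree can be built from per-email feature sets and how frequent root-to-node paths can be read off it. The feature strings, the `min_support` threshold, and the names `FPNode`, `build_fp_tree`, and `frequent_paths` are all hypothetical; the report's actual features, parameters, and campaign-selection rule are not given in this record.

```python
from collections import Counter


class FPNode:
    """One node of an FP Tree: an item plus how many transactions pass through it."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP Tree from an iterable of item sets (hypothetical email features)."""
    # Count in how many transactions each item occurs.
    freq = Counter(item for t in transactions for item in set(t))
    keep = {item for item, c in freq.items() if c >= min_support}
    root = FPNode(None)
    for t in transactions:
        # Keep only frequent items, ordered by descending support (ties by name),
        # so that transactions sharing common features share a tree prefix.
        items = sorted((i for i in set(t) if i in keep), key=lambda i: (-freq[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


def frequent_paths(node, min_support, prefix=()):
    """Yield (path, count) for each root-to-node path whose count meets min_support."""
    for item, child in sorted(node.children.items()):
        if child.count >= min_support:
            path = prefix + (item,)
            yield path, child.count
            yield from frequent_paths(child, min_support, path)


# Toy data: each email is reduced to a set of made-up feature strings.
emails = [
    {"subj:verify account", "host:example-bank.test", "lang:en"},
    {"subj:verify account", "host:example-bank.test", "lang:en"},
    {"subj:verify account", "host:other.test", "lang:en"},
    {"subj:invoice", "host:pay.test", "lang:en"},
]
tree = build_fp_tree(emails, min_support=2)
paths = list(frequent_paths(tree, 2))
for path, count in paths:
    print(count, " > ".join(path))
```

In this toy run, the two emails that share subject, host, and language end up on the same deepest path, which is the intuition behind grouping emails built from one template. How deep a path must be, or how sharply counts must drop, before a node is treated as a campaign is a design choice this sketch leaves open.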

Files

Original bundle
Name: Karanth_Anirudh_MEng_2020.pdf
Size: 761.56 KB
Format: Adobe Portable Document Format
Description: Project Report
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
Description: