Overcoming Imbalanced Class Distribution and Overfitting in Financial Fraud Detection: An Investigation Using A Modified Form of K-Fold Cross Validation Approach to Reach Representativeness

dc.contributor.author: Rocha Bezerra Junior, Joao Batista
dc.contributor.supervisor: Damian, Daniela
dc.contributor.supervisor: Murray, Adam
dc.date.accessioned: 2023-08-17T23:13:37Z
dc.date.available: 2023-08-17T23:13:37Z
dc.date.copyright: 2023 (en_US)
dc.date.issued: 2023-08-17
dc.degree.department: Department of Computer Science (en_US)
dc.degree.level: Master of Science M.Sc. (en_US)
dc.description.abstract: According to the Internet Crime Report 2022, complaints filed from 2018 to 2022 totaled 3.26 million, with reported financial losses of $27.6 billion. Institutions interested in mitigating cybercrime have been developing technology for this purpose, and researchers have been contributing to help them stay ahead of fraudulent systems. Machine learning and deep learning are applied in a variety of studies that use past transactions to learn how to avoid fraudulent transactions in the real-world networks of financial institutions. This thesis proposes a modified version of the k-fold cross-validation technique (the Full Sets approach), applied to the PaySim synthetic dataset and submitted to a neural network model. The results are compared with those of a splitting method that uses five folds, each containing 20% of the dataset, applied to the same model, and then with the machine learning algorithms Random Forest (RF), Logistic Regression (LR), and AdaBoost (AB). The metrics used to evaluate model performance are accuracy, precision, recall, F1 score, specificity, AUC-ROC, and PRC. (en_US)
dc.description.scholarlevel: Graduate (en_US)
dc.identifier.uri: http://hdl.handle.net/1828/15268
dc.language: English (eng)
dc.language.iso: en (en_US)
dc.rights: Available to the World Wide Web (en_US)
dc.subject: Financial fraud detection (en_US)
dc.subject: Synthetic dataset (en_US)
dc.subject: Machine learning (en_US)
dc.subject: Neural network (en_US)
dc.subject: Imbalanced dataset (en_US)
dc.title: Overcoming Imbalanced Class Distribution and Overfitting in Financial Fraud Detection: An Investigation Using A Modified Form of K-Fold Cross Validation Approach to Reach Representativeness (en_US)
dc.type: Thesis (en_US)
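
The abstract mentions a baseline split of five folds holding 20% of the dataset each, and lists threshold-based evaluation metrics. The following is a minimal sketch of that kind of setup, not the thesis's own code: it does not implement the Full Sets approach, which the abstract does not describe. The function names `stratified_folds` and `binary_metrics`, the toy labels, and the 2% fraud ratio are all hypothetical, and keeping the class ratio equal across folds (stratification) is an assumption added here for illustration. AUC-ROC and PRC are left out because they require predicted scores rather than hard labels.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with roughly the same class ratio.

    Stratification is an assumption of this sketch; the abstract only says
    that each of the five folds holds 20% of the dataset.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold receives its share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def binary_metrics(y_true, y_pred):
    """Compute threshold-based metrics, treating label 1 as fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "specificity": ratio(tn, tn + fp),
    }


# Toy labels: 1000 transactions with a made-up 2% fraud rate.
labels = [1] * 20 + [0] * 980
folds = stratified_folds(labels, k=5)
for number, fold in enumerate(folds, start=1):
    fraud = sum(labels[i] for i in fold)
    print(f"fold {number}: {len(fold)} samples, {fraud} fraud")

# Toy predictions, only to exercise the metric function.
example = binary_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
print(example)
```

In a cross-validation loop, each fold would serve once as the held-out set while the remaining four are used for training. On data this imbalanced, accuracy alone says little, which is presumably why the abstract also lists recall, specificity, and the curve-based scores.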

Files

Original bundle
Name: Bezerra_Joao_MSc_2023.pdf
Size: 2.66 MB
Format: Adobe Portable Document Format