Text augmentation using a graph-based approach and clonal selection algorithm

dc.contributor.author: Ahmed, Hadeer
dc.contributor.author: Traore, Issa
dc.contributor.author: Mamun, Mohammad
dc.contributor.author: Saad, Sherif
dc.date.accessioned: 2023-11-06T17:50:09Z
dc.date.available: 2023-11-06T17:50:09Z
dc.date.copyright: 2023 [en_US]
dc.date.issued: 2023
dc.description.abstract: Annotated data is critical for machine learning models, but producing large amounts of data with high-quality labeling is a time-consuming and labor-intensive process. Natural language processing (NLP) and machine learning models have traditionally relied on the labels given by human annotators with varying degrees of competency, training, and experience. These kinds of labels are incredibly problematic because they are defined and enforced by arbitrary and ambiguous standards. In order to solve these issues of insufficient high-quality labels, researchers are now investigating automated methods for enhancing training and testing data sets. In this paper, we demonstrate how our proposed method improves the quality and quantity of data in two cybersecurity problems (fake news identification & sensitive data leak) by employing the clonal selection algorithm (CLONALG) and abstract meaning representation (AMR) graphs, and how it improves the performance of a classifier by at least 5% on two datasets. [en_US]
dc.description.reviewstatus: Reviewed [en_US]
dc.description.scholarlevel: Faculty [en_US]
dc.description.sponsorship: This project was supported in part by collaborative research funding from the National Research Council of Canada’s Artificial Intelligence for Logistics Program. [en_US]
dc.identifier.citation: Ahmed, H., Traore, I., Mamun, M., Saad, S. (2023). Text augmentation using a graph-based approach and clonal selection algorithm. Machine Learning with Applications, 11, 100452. https://doi.org/10.1016/j.mlwa.2023.100452. [en_US]
dc.identifier.uri: https://doi.org/10.1016/j.mlwa.2023.100452
dc.identifier.uri: http://hdl.handle.net/1828/15585
dc.language.iso: en [en_US]
dc.publisher: Machine Learning with Applications [en_US]
dc.subject: Data augmentation
dc.subject: Unstructured data
dc.subject: Cybersecurity
dc.subject: Text generation
dc.subject: Clonal selection
dc.subject.department: Department of Electrical and Computer Engineering
dc.title: Text augmentation using a graph-based approach and clonal selection algorithm [en_US]
dc.type: Article [en_US]

Files

Original bundle
Name: ahmed_hadeer_MachLearnAppl_2023.pdf
Size: 1.05 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2 KB
Format: Item-specific license agreed upon to submission