Machine learning and unlearning for IoT anomaly detection

dc.contributor.author: Fan, Jiamin
dc.contributor.supervisor: Wu, Kui
dc.date.accessioned: 2023-04-22T00:09:08Z
dc.date.available: 2023-04-22T00:09:08Z
dc.date.copyright: 2023 [en_US]
dc.date.issued: 2023-04-21
dc.degree.department: Department of Computer Science [en_US]
dc.degree.level: Doctor of Philosophy Ph.D. [en_US]
dc.description.abstract: Despite the booming market of the Internet of Things (IoT), the weak security protection of IoT devices makes anomaly detection in IoT systems extremely challenging. This dissertation tackles three critical problems in IoT anomaly detection: i) the fast update of deep learning-based detection models, ii) the non-independent and identically distributed (non-IID) problem in federated learning (FL) based anomaly detection, and iii) root cause analysis of anomalies. First, to update deep learning-based detection models in IoT anomaly detection systems, we propose a new machine unlearning method called ViFLa, which groups training data based on estimated unlearning probability and treats each group as a virtual client in the federated learning framework. Since the virtual clients are physically on the same machine, ViFLa leverages only the concept of data/local model isolation in federated learning, without incurring any network communication. To tackle the non-IID problem caused by the data grouping strategy, ViFLa designs an enhanced class distribution weighted sum (ECDWS) aggregation method based on Kullback–Leibler divergence and an attention mechanism. It also introduces a new state transition ring mechanism into the statistical query (SQ) learning framework to update the local model of each virtual client quickly. Using real-world IoT traffic data, we showcase the benefit of ViFLa regarding its efficiency and completeness for model updates in the context of IoT traffic anomaly detection. Second, we develop a new anomaly detection approach, called ClusterFLADS, which relies on clustered federated learning to address the issue of non-IID data among different clients in traditional federated detection systems.
ClusterFLADS takes advantage of the false predictions of inappropriate global models, together with knowledge of temperature scaling and catastrophic forgetting, to reveal distributional similarities between the training data of different clusters and the test data. To improve the clustering speed, we introduce an efficient feature extraction scheme that exploits the different roles each layer of a neural network plays in the learning process. We evaluate the performance of ClusterFLADS using real-world IoT trace data in various scenarios. The results show that ClusterFLADS can cluster clients accurately and efficiently, with a 100% true positive rate and no false positives over various data distributions. Third, to cope with the diverse anomaly scenarios that may be encountered by FL-based IoT anomaly detection systems, we design Score-VAE, a new framework to identify the root causes of anomalies based on a variational autoencoder (VAE) network. Score-VAE can work with existing IoT anomaly detection systems built over the FL framework. To achieve lifelong learning, Score-VAE builds a separate global model for each abnormal scenario, so the intervention of new scenarios will not render the existing system unusable. To obtain the better generalization and collaboration capacities required by IoT systems, Score-VAE adopts a privacy-preserving training scheme and a Hamming tests scheme. To further improve model performance, Score-VAE employs a VAE network with dynamic loss, which exploits knowledge of multi-task learning, stopping gradients and distributions. Evaluation results with real-world IoT trace data collected from different scenarios demonstrate that Score-VAE can accurately discover the root causes of alarms triggered by the IoT anomaly detection system. [en_US]
dc.description.scholarlevel: Graduate [en_US]
dc.identifier.uri: http://hdl.handle.net/1828/14962
dc.language: English [eng]
dc.language.iso: en [en_US]
dc.rights: Available to the World Wide Web [en_US]
dc.subject: Internet of Things [en_US]
dc.subject: anomaly detection [en_US]
dc.subject: machine learning [en_US]
dc.title: Machine learning and unlearning for IoT anomaly detection [en_US]
dc.type: Thesis [en_US]

Files

Original bundle
Name: Fan_Jiamin_PhD_2023.pdf
Size: 9.56 MB
Format: Adobe Portable Document Format
Description: PhD dissertation
License bundle
Name: license.txt
Size: 2 KB
Format: Item-specific license agreed upon to submission
Description: