Machine learning and unlearning for IoT anomaly detection

Fan, Jiamin

dc.contributor.author	Fan, Jiamin
dc.contributor.supervisor	Wu, Kui
dc.date.accessioned	2023-04-22T00:09:08Z
dc.date.available	2023-04-22T00:09:08Z
dc.date.copyright	2023	en_US
dc.date.issued	2023-04-21
dc.degree.department	Department of Computer Science
dc.degree.level	Doctor of Philosophy Ph.D.	en_US
dc.description.abstract	Despite the booming market of the Internet of Things (IoT), the weak security protection of IoT devices makes anomaly detection in IoT systems extremely challenging. This dissertation tackles three critical problems in the anomaly detection of IoT: i) the fast update of deep learning-based detection models, ii) the non-independent and identically distributed (non-IID) problem in federated learning (FL) based anomaly detection, and iii) root cause analysis of anomalies. First, to update deep learning-based detection models in IoT anomaly detection systems, we propose a new machine unlearning method called ViFLa, which groups training data based on estimated unlearning probability and treats each group as a virtual client in the federated learning framework. Since the virtual clients are physically on the same machine, ViFLa only leverages the concept of data/local model isolation in federated learning without incurring any network communication. To tackle the non-IID problem caused by the data grouping strategy, ViFLa designs an enhanced class distribution weighted sum (ECDWS) aggregation method based on Kullback–Leibler divergence and attention mechanism. It also introduces a new state transition ring mechanism into the statistical query (SQ) learning framework to update the local model of each virtual client quickly. Using real-world IoT traffic data, we showcase the benefit of ViFLa regarding its efficiency and completeness for model updates in the context of IoT traffic anomaly detection. Second, we develop a new anomaly detection approach, called ClusterFLADS, which depends on clustered federated learning to address the issue of non-IID data amongst different clients in traditional federated detection systems.
ClusterFLADS takes advantage of the false predictions of inappropriate global models, together with knowledge of temperature scaling and catastrophic forgetting, to reveal distributional similarities between the training data of different clusters and the test data. To improve the clustering speed, we introduce an efficient feature extraction scheme by exploiting the difference in the role each layer of a neural network plays in the learning process. We evaluate the performance of ClusterFLADS using real-world IoT trace data in various scenarios. The results show that ClusterFLADS can cluster clients accurately and efficiently, with a 100% true positive rate and no false positives over various data distributions. Third, to cope with the diverse anomaly scenarios that may be encountered by FL-based IoT anomaly detection systems, we design Score-VAE, a new framework to identify the root causes of anomalies based on a variational autoencoder (VAE) network. Score-VAE can work with existing IoT anomaly detection systems built over the FL framework. To achieve lifelong learning, Score-VAE builds a separate global model for each abnormal scenario, so the intervention of new scenarios will not render the existing system unusable. To obtain better generalization and collaboration capacities required by the IoT systems, Score-VAE adopts a privacy-preserving training scheme and a Hamming tests scheme. To further improve model performance, Score-VAE employs a VAE network with dynamic loss, which exploits knowledge of multi-task learning, stopping gradients and distributions. Evaluation results with real-world IoT trace data collected from different scenarios demonstrate that Score-VAE can accurately discover the root causes of alarms triggered by the IoT anomaly detection system.	en_US
dc.description.scholarlevel	Graduate	en_US
dc.identifier.uri	http://hdl.handle.net/1828/14962
dc.language	English	eng
dc.language.iso	en	en_US
dc.rights	Available to the World Wide Web	en_US
dc.subject	Internet of Things	en_US
dc.subject	anomaly detection	en_US
dc.subject	machine learning	en_US
dc.title	Machine learning and unlearning for IoT anomaly detection	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Fan_Jiamin_PhD_2023.pdf
Size: 9.56 MB
Format: Adobe Portable Document Format
Description: PhD dissertation

License bundle

Name: license.txt
Size: 2 KB
Format: Item-specific license agreed upon to submission
Description:

Collections

Electronic Theses and Dissertations (ETD)