Enhancing Self-Organizing Maps with numerical criteria: a case study in SCADA networks




Wei, Tianming




Self-Organizing Maps (SOM) can visualize multi-dimensional data through two-dimensional mappings. By applying unsupervised learning techniques to SOM representations, we can further enhance visual inspection for change detection. To measure changes in self-organizing maps more accurately than simple visual inspection allows, we apply the Gaussian Mixture Model (GMM) and Kullback-Leibler Divergence (KLD) on top of SOM-trained maps. The main contribution of this dissertation is the addition of numerical methods to SOM algorithms, with anomaly detection as the example domain. Extensive trace-based simulations show that our techniques can uncover anomalies with an accuracy of 100% at an anomaly mixture rate as low as 12% on the CTU-13 dataset. Tuning the KLD threshold further reduces the mixture rate to 7%, significantly augmenting visual inspection in detecting low-rate anomalies. Suitable hierarchical and distributed SOM-based approaches are also explored, along with other approaches in the literature. Hierarchies in SOM can show the correlations among the neural cells on the self-organizing maps. To obtain higher accuracy in anomaly detection, we suggest adding a new dimension of labels in the second layer of SOM training. For more general distributed SOM-based algorithms, we also investigate the use of principal component analysis (PCA) for the separation of dimensions. With the PCA-transformed dataset, the inner dependencies can be preserved at a manageable scale. As a case study, this dissertation uses a SOM-based approach for anomaly detection in Supervisory Control And Data Acquisition (SCADA) networks. We further investigate the use of SOM for Quality of Service (QoS) in wireless SCADA networks.
To address the long computing time required to optimize cached contents, the new SOM-based approach can also learn and predict sub-optimal caching locations while maintaining a prediction error of 28%.
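To illustrate the idea of scoring map changes numerically rather than by eye, the following minimal sketch compares two SOM hit-count maps with a discrete Kullback-Leibler divergence and flags a change when the score exceeds a threshold. It is not the dissertation's implementation: the GMM fitting stage is omitted, and the function names `kl_divergence` and `is_anomalous`, the smoothing constant `eps`, the toy 2x2 maps, and the threshold value 0.1 are all assumptions made for illustration.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-9):
    """Discrete KL divergence D(p || q) between two hit-count maps.

    Both inputs are flattened, smoothed with eps to avoid log(0),
    and normalized to sum to 1. (eps is an illustrative assumption.)
    """
    p = np.asarray(p, dtype=float).ravel() + eps
    q = np.asarray(q, dtype=float).ravel() + eps
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))


def is_anomalous(baseline_hits, window_hits, threshold):
    """Flag a window whose hit map diverges from the baseline map.

    The threshold is a tunable parameter; its value here is hypothetical.
    """
    return kl_divergence(window_hits, baseline_hits) > threshold


# Toy 2x2 maps: counts of samples mapped to each SOM cell (made-up numbers).
baseline = [[40, 30], [20, 10]]
shifted = [[10, 20], [30, 40]]

print(is_anomalous(baseline, baseline, threshold=0.1))
print(is_anomalous(baseline, shifted, threshold=0.1))
```

Lowering the threshold makes the check more sensitive to small shifts in the map, in the spirit of the threshold tuning described above, at the cost of more false alarms.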



Self-Organizing Maps, Numerical Methods, SCADA Networks