Graduate Projects (Electrical and Computer Engineering)

Recent Submissions

  • Item
    Simulating NoC Mesh and Torus Topologies
    (2017) Khan, Muhammad Ahsan; Gebali, Fayez
    An interconnection network is a programmable system that transports data between terminals. The interconnection network matters because it is a limiting factor in the performance of many systems. A network on chip (NoC) plays a vital role in memory latency and memory bandwidth, two key performance measures in computer systems. Topology is another of the most important performance factors. In this project, the two most significant topologies, mesh and torus, are studied. A study of these two topologies is conducted at various flit injection rates with different combinations of virtual channels. The main objective of this project is to explain how virtual channels affect throughput and latency on different topologies. The comparative evaluation of the topologies will help to explore more features in detail, which will support future NoC development.
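As a rough illustration of why topology alone affects latency, the sketch below compares the average minimal hop count between node pairs in a k-by-k mesh and a k-by-k torus. It is not the simulator used in the project: the function names are hypothetical, and it ignores virtual channels, flit injection rates, and contention entirely.

```python
from itertools import product


def ring_distance(a, b, k, wrap):
    """Hops between coordinates a and b along one dimension of size k."""
    d = abs(a - b)
    # A torus has wrap-around links, so the shorter way round can be taken.
    return min(d, k - d) if wrap else d


def average_hops(k, wrap):
    """Average minimal hop count over all ordered pairs of distinct nodes."""
    nodes = list(product(range(k), repeat=2))
    total = 0
    pairs = 0
    for x1, y1 in nodes:
        for x2, y2 in nodes:
            if (x1, y1) == (x2, y2):
                continue
            total += ring_distance(x1, x2, k, wrap)
            total += ring_distance(y1, y2, k, wrap)
            pairs += 1
    return total / pairs


if __name__ == "__main__":
    for k in (4, 8):
        print(k, "mesh:", average_hops(k, wrap=False),
              "torus:", average_hops(k, wrap=True))
```

The wrap-around links shorten the average path in the torus, which is one reason the two topologies can respond differently to the same traffic load.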
  • Item
    Breast Cancer Prediction Using Machine Learning Algorithms
    (2024) Shahzad, Zeeshan Ali; Gulliver, T. Aaron
    Breast cancer has become a pressing global health issue with its prevalence increasing worldwide. The rise in breast cancer cases is a cause for concern as it not only affects the physical and emotional well-being of individuals but also places a significant burden on the healthcare system. Early detection and timely intervention are critical factors in effectively combatting this disease. The ability to predict and diagnose breast cancer at its earliest stages can make a profound difference in patient outcomes, potentially saving countless lives. In recent years, the importance of Machine Learning (ML) in the field of healthcare has become paramount. This study considers the utility of supervised ML models to address the challenges posed by breast cancer using the publicly available Breast Cancer Wisconsin (Diagnostic) dataset from the University of California Irvine (UCI) ML repository. The Logistic Regression, Decision Tree, Random Forest, Support Vector Machine (SVM), Naive Bayes and K-Nearest Neighbors (KNN) classifiers are implemented using Jupyter Notebook with Python programming. The goal of the proposed methodology is accurate breast cancer prediction. First, data preprocessing is employed to clean the dataset by removing null values and duplicates, and handling missing data. In order to balance the target labels of the dataset, the Synthetic Minority Oversampling Technique (SMOTE) is employed. Then, Principal Component Analysis (PCA) is used to reduce the dimensions of the dataset. The number of components is varied (n=2, 5, 10, 15). For training and testing the ML models, five data splits, namely 80/20, 70/30, 50/50, 30/70, and 20/80, are employed to assess the impact on model performance. The performance of the models is evaluated using the metrics accuracy, precision, recall, F1-score, and execution time. The results obtained show that SVM and Logistic Regression outperform the other models, with SVM having an accuracy of 98.2% and an execution time of 9.99 ms with an 80/20 split using 10 features, and Logistic Regression having an accuracy of 97.9% and an execution time of 8.42 ms with a 50/50 split using 15 features.
  • Item
    PUF Evaluation Metrics on 7 Series FPGA: Comparative Analysis of Arbiter, XOR Arbiter, and Double Arbiter PUFs for Uniqueness, Randomness, and Stability
    (2024-01-23) Lunagariya, Janviben; Sima, Mihai; Papadopoulos, Chris
    Hardware security modules play a crucial role in protecting and preserving technologically integrated systems that are used in daily life. They employ cryptographic protocols to secure a system against adversaries. Generally, cryptographic algorithms and security keys are essential for maintaining the security of a system. Cryptography uses a secret key to encipher and decipher the data. The confidential keys are stored in non-volatile memory, making them easily accessible to potential attackers. The hardware security primitive known as the Physical Unclonable Function (PUF) is a promising alternative for enhancing the security of interconnected devices. Physical Unclonable Functions are specialized circuit components that exploit the subtle variations inherent in microchip fabrication. These variances enable the creation of unique "fingerprint" output sequences, or responses, in reaction to specific inputs, or challenges. The random, device-specific nature of these variations and the difficulty of replicating them, even by the original manufacturer using identical methods, tools, and parameters, make PUFs an excellent choice for cryptographic key generation. Moreover, these characteristics are designed to remain unchanged, reinforcing their suitability for this application. The Arbiter PUF is a type of delay-based PUF that utilizes signal delay-line time differences. However, previous studies indicated that the Arbiter PUF, when implemented on Xilinx Virtex-5 FPGAs, produced nearly identical responses and therefore exhibited low uniqueness. Other variants of the Arbiter PUF, such as the XOR Arbiter PUF and the Double Arbiter PUF, were introduced to address this issue. The Double Arbiter PUF generates highly unique responses from duplicated Arbiter PUFs on FPGAs at a cost comparable to that of the 2-XOR Arbiter PUF. The Double Arbiter PUF differs from the 2-XOR version in its mode of operation, particularly regarding wire assignment between the arbiter and the final selector output signals. This study evaluates these PUFs for uniqueness, randomness, and stability on Xilinx 7-series FPGA devices and seeks to identify a new Arbiter PUF operation mode that is feasible for FPGA implementation. We propose the 3-1 Double Arbiter PUF, which includes an extra duplicated Arbiter PUF, yielding three Arbiter PUFs that produce a 1-bit response. When compared with the 3-XOR Arbiter PUF, the 3-1 Double Arbiter PUF shows better response uniqueness and randomness, estimated at 50%, indicating that the evaluation metrics of the PUF can be improved by using a new Arbiter PUF operation mode. We show that the new mode of operation improves the uniqueness and randomness of the Arbiter PUF for 16-, 32-, and 64-bit selector pairs over 65,536 responses.
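To make the uniqueness and randomness metrics concrete, here is a minimal software sketch based on the commonly used additive linear delay model of an Arbiter PUF. It is an assumption-laden toy, not the FPGA implementation evaluated in the project: the Gaussian stage weights, the function names, and the sizes chosen are all hypothetical, and the Double Arbiter wiring is not modeled. Uniqueness is computed as the average pairwise inter-device Hamming distance (ideal value 50%), and randomness as the fraction of 1 bits in the responses (ideal value 50%).

```python
import random


def make_puf(n_stages, rng):
    """One simulated device: a delay weight per stage plus a bias term."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_stages + 1)]


def features(challenge):
    """Parity feature vector of the additive delay model.

    phi[i] is the product of (1 - 2*c[j]) for j >= i, and phi[n] = 1.
    """
    phi = [1] * (len(challenge) + 1)
    for i in range(len(challenge) - 1, -1, -1):
        phi[i] = phi[i + 1] * (1 - 2 * challenge[i])
    return phi


def response(weights, challenge):
    """1-bit response: the sign of the accumulated delay difference."""
    delta = sum(w * p for w, p in zip(weights, features(challenge)))
    return 1 if delta > 0 else 0


def xor_response(pufs, challenge):
    """k-XOR Arbiter PUF: XOR of k arbiter responses to one challenge."""
    bit = 0
    for weights in pufs:
        bit ^= response(weights, challenge)
    return bit


def uniqueness(devices, challenges):
    """Average pairwise inter-device Hamming distance, in percent."""
    resp = [[response(w, c) for c in challenges] for w in devices]
    total, pairs = 0.0, 0
    for i in range(len(resp)):
        for j in range(i + 1, len(resp)):
            hd = sum(a != b for a, b in zip(resp[i], resp[j]))
            total += hd / len(challenges)
            pairs += 1
    return 100.0 * total / pairs


def randomness(devices, challenges):
    """Fraction of 1 bits over all devices and challenges, in percent."""
    bits = [response(w, c) for w in devices for c in challenges]
    return 100.0 * sum(bits) / len(bits)


if __name__ == "__main__":
    rng = random.Random(0)
    devices = [make_puf(64, rng) for _ in range(10)]
    challenges = [[rng.randint(0, 1) for _ in range(64)] for _ in range(256)]
    print("uniqueness %:", uniqueness(devices, challenges))
    print("randomness %:", randomness(devices, challenges))
```

In this idealized model every simulated device has independent delays, so uniqueness sits near 50% by construction. The low uniqueness reported on real FPGAs comes from layout-induced delay bias that the sketch does not capture.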
  • Item
    Improving Large Graph Visualization Using a Paging Mechanism
    (2023-11-15) Jafarrangchi, Fatemeh; Traore, Issa; Woungang, Isaac
    The activity and event network (AEN) model captures the network activities and events using a large random dynamic graph that is continuously maintained and updated as new information and data arrive. The AEN engine leverages extensive graph database technology in creating, maintaining, and visualizing the produced graph. Because the graph can become very large (e.g., have millions of nodes) over time, a visual analysis by a security analyst can be unwieldy, overwhelming, and thus counterproductive. This thesis presents an extension of the AEN graph engine visualization module, which consists of developing a timeline feature that improves the visualization process by allowing the analyst to access and work on segments or portions of the graph as needed. A graph paging mechanism was developed to implement the timeline feature, where a graph is structured into multiple pages that enable navigating back and forth and other related functionality. To reduce memory/storage usage, the proposed graph paging mechanism supports consolidating fine-grained changes into coarser-grained ones without losing the timeline integrity or altering the order in which the changes occurred. An experimental evaluation using the CIC 2017 IDS evaluation dataset yielded improved results in visualizing and handling large graphs while achieving low performance overhead in terms of response time, CPU time, and memory utilization.
  • Item
    Optimizing the Movement Path in a Network of Ground and Aerial Mobile Robots in the Field of Communications and Transportation
    (2023-10-12) Doostniae, Reza; Baniasadi, Amirali
    In digital systems and industries related to the IoT, especially in mechatronic systems and communication industries, as well as land and air transportation, motion sensors and routing systems play a crucial and indispensable role. The primary goal of this research is to find the optimal movement path while effectively avoiding obstacles. In other words, the path should be chosen in a way that ensures the robot does not collide with any obstacles, whether they are stationary or moving objects. To achieve this, the shapes of obstacles are extracted, and the need to avoid them is determined. By implementing obstacle avoidance algorithms, we can guarantee safe and reliable navigation for the robot. This paper describes an object-oriented software system for continuous optimization by a new metaheuristic method, the Bat Algorithm, based on the echolocation behavior of bats. The Bat Algorithm has been used successfully for many optimization problems, and a corresponding program is also available in MATLAB.
  • Item
    Assessing the Effectiveness of Snort in Detecting Malicious URLs
    (2023-08-29) Zuva, Simbarashe; Traore, Issa; Woungang, Isaac
    Web attacks have been on the rise in recent years, and organisations are constantly searching for new and better ways to detect and block the corresponding attack vectors. Some of the prominent attributes of web attack vectors are malicious domains used to trigger or sustain these attacks, for instance, through launching phishing attacks or by hosting command and control (C&C) infrastructures. Accurately detecting and blocking malicious domains has become increasingly difficult because attackers use evasive techniques to mask their activities, emulating legitimate network traffic to a high degree of accuracy and employing tactics such as domain generation algorithms (DGA) and fast flux DNS. Snort, an open-source intrusion detection system, has traditionally been utilized to detect network intrusions through network traffic signature analysis. However, while Snort has subsequently been upgraded to enable the detection of web attacks, its effectiveness in detecting malicious domains is questionable because of the coarse-grained nature of web attack signatures. At the same time, it is reasonable to assume that there is an implicit relation between granular attacks and the usage or occurrence of malicious domains. In this project, a platform is developed to explore and assess experimentally the ability of Snort to detect malicious domains. The proposed approach extracts useful indicators of compromise (IoC) from the granular Snort alerts triggered by web visits and leverages this information to establish whether the corresponding URLs are benign or malicious. The platform was built around a headless Chrome browser and the pfSense open-source firewall, which has a built-in Snort engine. The experimental evaluation, conducted using a public dataset of benign and malicious domains, yielded important insights into the strengths and limitations of Snort in detecting malicious domains, and helped identify directions for future improvements.
  • Item
    Design and Implementation of a new Visualization Aided Anomaly Detection Framework
    (2023-08-21) Farag, Ahmed; Traore, Issa; Yousef, Waleed
    In today's data-driven world, the identification of unusual patterns or anomalies in data sets has become increasingly vital, especially in the realm of security data, where the detection of these atypical patterns can preempt security threats. This is the juncture where our work, as an extension to UNAVOIDS (Unsupervised and Nonparametric Approach for Visualizing Outliers and Invariant Detection Scoring), becomes instrumental. UNAVOIDS is a distinctive model that integrates specialized techniques for both detection algorithms and visualization methods, operating within a unique space known as the Neighborhood Cumulative Distribution Function (NCDF) space. In this two-dimensional space, each data point is transformed into a unique 2D curve, facilitating visual identification and examination. A salient feature of UNAVOIDS is its fully unsupervised nature, which requires neither prior training nor specific data inputs, eliminating the need for parameter selection or tuning. Another feature is its assignment of a deviation score to each unusual data point, offering a clear gauge of its abnormality. In this study, we successfully deployed UNAVOIDS across four platforms: the Python Package Index (PyPI); a RESTful API; a software tool named VAAD, which integrates UNAVOIDS with the Data Visualization Platform (DVP); and a custom Microsoft Power BI visual. Two main challenges were tackled in this implementation. First, handling large datasets within the RESTful API posed an ongoing challenge. To address this, we adopted compression over file streaming, enabling the efficient transmission of data within the API constraints. Second, creating an interactive visual representation presented a significant challenge due to the unique nature of the data, where each observation is mapped to a 2D curve. We overcame this challenge by mapping curve indices and implementing a reflection mechanism for interactivity between selected curves and other visuals. Our study contributes to the practical implementation and effectiveness of UNAVOIDS, and all these implementations, along with their documentation, are accessible from the official repository of the ISOT lab. These implementations, catering to users from various sectors including research and development, demonstrate the versatility and effectiveness of UNAVOIDS in diverse environments.
  • Item
    A Long-Range Transmission Network for Animal Sighting in the Wilderness
    (2023-08-16) Zhang, Yan; Li, Kin Fun
    When wild animals are monitored in the vast wilderness of Canada, data transmission is considerably challenging due to the lack of effective network service provided by telecom operators or carriers, especially in sparsely populated areas. A Long-Range Transmission Network for a wildlife detection system using low-power and low-cost embedded software and hardware is designed and implemented. The objective of the system is to transmit the results of wildlife identification with environmental data through independent long-range networking. The system consists of a camera-embedded system for wildlife image capturing and environmental data logging, a user system for scanning images and notifications, and a LoRaWAN network for long-range transmission. Once a targeted animal is detected and identified, the system issues an alarm in the monitored area and sends a LoRa data frame to an application server for further analysis and user notification. The transmission distance of data is effectively extended through the relay between nodes. The system can process up to nine frames per second from the camera and identify the designated wildlife with high accuracy by asynchronous multi-threading in a low-cost embedded system. The application could be beneficial for a variety of purposes in the vast and diverse wilderness areas, such as traffic alarms for large wild animals crossing roads, monitoring of wildlife migrations by biologists, or a warning system in urban areas when there is a potential threat to the public such as approaching dangerous animals.
  • Item
    Stock Market Prediction using LSTM and Markov Chain Models: A Case Study of Royal Bank of Canada Stock
    (2023-08-09) Kumar, Amer; Gebali, Fayez; El-Kharashi, Mohamed Watheq
    Stock price prediction is one of the most important aspects of financial investment. This research aims to provide insights into the dynamics of stock prices, enabling more informed decision-making in financial investments by combining two modeling approaches: LSTM networks and Markov chains. Using a four-layer long short-term memory (LSTM) architecture and the Root Mean Square Error (RMSE) as the loss function, we aim to capture temporal dependencies and patterns to predict closing prices. Furthermore, we employ a three-state Markov chain and estimate its transition matrix, from which metrics such as the steady-state distribution and mean hitting times are calculated. The preliminary results indicate that this approach is promising for stock market prediction, as the LSTM has predictive power that caters more to long-term temporal trends while the Markov chain provides probabilities of staying in and transitioning between states. The findings of the study highlight the effectiveness of combining LSTM and Markov chain models in capturing the intricate dynamics of stock market data and predicting stock market prices.
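The Markov chain side of this approach can be sketched in a few lines of plain Python. The example below is an illustration under stated assumptions, not the author's code: the state definition (down, flat, up, split by a hypothetical return threshold), the function names, and the iterative solvers are all chosen here for brevity. It estimates a transition matrix from a state sequence, approximates the steady-state distribution by power iteration, and approximates mean hitting times by fixed-point iteration on h_i = 1 + sum over j != target of P[i][j] * h_j.

```python
def to_states(closes, threshold=0.002):
    """Map daily returns to states: 0 = down, 1 = flat, 2 = up.

    The threshold is a hypothetical choice made for this sketch.
    """
    states = []
    for prev, cur in zip(closes, closes[1:]):
        r = (cur - prev) / prev
        states.append(0 if r < -threshold else 2 if r > threshold else 1)
    return states


def transition_matrix(states, n=3):
    """Row-normalized counts of observed state-to-state transitions."""
    counts = [[0] * n for _ in range(n)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        s = sum(row)
        # A state that was never left gets a uniform row as a fallback.
        matrix.append([c / s for c in row] if s else [1.0 / n] * n)
    return matrix


def steady_state(P, iters=1000):
    """Power iteration pi <- pi P, starting from the uniform distribution."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def mean_hitting_times(P, target, iters=2000):
    """Expected number of steps to first reach `target` from each state."""
    n = len(P)
    h = [0.0] * n
    for _ in range(iters):
        h = [0.0 if i == target else
             1.0 + sum(P[i][j] * h[j] for j in range(n) if j != target)
             for i in range(n)]
    return h


if __name__ == "__main__":
    # A made-up transition matrix used only to exercise the functions.
    P = [[0.5, 0.5, 0.0],
         [0.25, 0.5, 0.25],
         [0.0, 0.5, 0.5]]
    print("steady state:", steady_state(P))
    print("mean hitting times to 'up':", mean_hitting_times(P, target=2))
```

Both iterative solvers assume an irreducible, aperiodic chain. For a three-state chain a direct linear solve would work equally well, and the iterations are used here only to avoid external dependencies.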
  • Item
    Stock Price Prediction Using Natural Language Processing and Machine Learning
    (2023-08-09) Amer, Ahmed; Gebali, Fayez; El-Kharashi, Mohamed
    Predicting the stock market is a notoriously difficult problem that many people have tried to solve. Can real-time textual data in the form of tweets be used to predict stock movements? In this project, different natural language processing methods are used to process Twitter data and determine its sentiment. Based on the sentiment, further analysis is done using machine learning techniques to try to predict next-day returns for individual stocks. Two and three different features were used to try to predict the next day's percentage change. The metrics used to assess the methodology were accuracy, precision, and cumulative percentage gain or loss using a specific strategy or method. The results of this project suggest that using tweets as input for natural language processing and machine learning can achieve average accuracies and result in strategies that have consistently beaten the market in terms of cumulative returns.
  • Item
    Optimizing Demand Response in Deregulated Electricity Markets: A Customer-Centric Game Theory Approach
    (2023-08-01) Goudarzi, Arman; Traore, Issa
    In the era of IoT-enabled smart grid technologies and the ever-increasing integration of renewable energy sources, the need for efficient and customer-oriented demand response programs is becoming crucial for the stability and flexibility of power systems. In this regard, this report presents an innovative customer-centric game theory-based demand response (CC-GTDR) method for managing electricity consumption during periods of high demand in a deregulated electricity market. The proposed CC-GTDR method combines both incentive-based and price-based demand response programs while emphasizing customer benefits and flexibility of choice. A fuzzy analytic hierarchy process based on non-linear programming (FAHPNLP) is employed to determine the optimum weightings of the designed multi-criteria objective function of the study. To solve the proposed model, a hybrid optimization algorithm is implemented, which merges enthusiasm-assisted teaching and learning-based optimization (EaTLBO) with an enhanced variant of particle swarm optimization (EPSO). The study investigates various dynamic pricing mechanisms, such as time-of-use pricing, real-time pricing, and their combinations, in deregulated electricity markets. The proposed approach demonstrates significant improvements in overall load and peak load reductions, as well as utility profit gains. Additionally, the integration of renewable energy sources (RESs) within the CC-GTDR and profit-based dynamic cost environmental economic dispatch (DCEED) model results in substantial reductions in NOx emissions. The developed CC-GTDR model contributes to a more resilient and efficient electrical system by prioritizing customer engagement and empowerment, ultimately enhancing grid reliability and facilitating the integration of renewable resources.
  • Item
    Maximizing Energy Efficiency in Energy Management System using Optimization Algorithm in Microgrids
    (2023-07-04) Shah, Sarthak Umeshkumar; Baniasadi, Amirali
    Due to technological advancements, population growth, and urbanization, the demand for electricity is increasing day by day. Meeting the global electricity demand is a challenge considering its socio-economic and environmental impacts. Energy Management Systems (EMS) are becoming a vital topic of discussion, as renewable energy sources such as solar, wind, hydro, and energy storage systems are being considered. EMS is becoming an essential component of a microgrid, as the system works when connected with the grid and also in islanded mode, connected with renewable sources. However, the increasing use of renewable energy resources is causing operational efficiency and reliability issues. Additionally, meeting demands during high energy consumption and reducing costs during high demand for electricity are challenging. Therefore, optimization techniques are being implemented to solve issues related to demand response and cost reduction. The proposed approach focuses on minimizing the total cost of energy consumption, taking into account demand, load control, energy storage systems, and PV systems using the novel algorithm Ant Colony Optimization. The results demonstrate that the Ant Colony Optimization algorithm is effective in reducing costs and can be used to address increasing demands and constraints related to energy management in microgrids. Future work may include fault detection, power quality improvement through optimization algorithms in the real-world grid model, and automating it to prevent losses, power outages, and asset failures.
  • Item
    Deep Learning-Based Automatic Modulation Classification for Telecommunication Systems
    (2023-05-31) Sanatimehrizi, Sara; Baniasadi, Amirali
    Modulation schemes play a crucial role in various communication systems, as they enable the transmission of information through electromagnetic signals. Accurately identifying the modulation scheme employed in a signal is essential for efficient signal processing, interference mitigation, and overall system performance. However, predicting modulation schemes based solely on their features remains a challenging task due to the complexity and variability of modern communication signals. This thesis addresses the problem of modulation scheme prediction by developing and evaluating a model and algorithm capable of analyzing the distinctive features of different modulation schemes. The dataset used in this study is a real-time series dataset obtained from MCI, consisting of 36,000 signals with features such as Modulation, In-phase Signal, Quadrature Signal, and Signal-to-Interference-plus-Noise Ratio. The goal is to train a fully connected neural network to accurately classify and predict the modulation used in unknown signals. Experimental results demonstrate the effectiveness of the proposed algorithm, with a validation accuracy of 83.33% and an overall accuracy of 93.90%. While these results indicate the algorithm's capability to predict modulation types and classify instances accurately, there is room for improvement, particularly for real-world scenarios. The proposed model and algorithm provide a solid foundation for enhancing signal processing and system performance in communication systems. By accurately identifying modulation schemes, this research contributes to the advancement of efficient communication techniques. Future work in this area can build upon these findings and further refine the algorithm, potentially yielding improved accuracy and robustness when applied to real-world scenarios.
  • Item
    Authentication Algorithms modelling and Simulations of an Arbiter PUF
    (2023-05-08) Khan, Vaseem; Gebali, Fayez
    Physical attacks represent a threat to intellectual property, confidential data, and service security because they typically involve reading and modifying data. Attackers frequently have access to tools and resources that can be utilised, either invasively or non-invasively, to read or corrupt memory. Secret keys for cryptographic techniques are often kept in memory. Physical Unclonable Functions (PUFs), which dynamically construct keys only when necessary and do not need to retain them on a powered-off chip, appear to be a potential remedy for such issues. PUFs are circuit primitives that use inherent differences between microchips introduced during the manufacturing process to produce distinctive "fingerprint" output sequences (responses) to a particular input (challenge). The PUF is an excellent choice for creating cryptographic keys since these variations are stochastic, device-specific, hard to duplicate even by the same manufacturer using similar procedures, tools, and settings, and are intended to be static. A delay-based PUF, the arbiter PUF, is the subject of our study. It benefits from the differences in propagation delays that are present between two symmetrical channels. Without the need for helper data or secure sketch techniques, we created algorithms that may be used to enable solid authentication and secret key generation. Finally, we present data that demonstrates how these devices behave and how their functionality is influenced by the chosen authentication mechanism and key system variables.
  • Item
    Improving the Efficiency of a New Malicious Domain Prediction System
    (2023-05-02) Arora, Aashish; Gebali, Fayez; Traore, Issa
    Cybersecurity is a key concern in today’s digital era, and a large number of cyberattacks are launched every day. Malicious domains represent one of the media through which attacks are launched and malicious artifacts are spread. While many malicious domains are known and blacklisted, a sizable number of new domains registered by cybercriminals are unknown to blacklist maintainers, and as such can be used undetected in ongoing and future hacking campaigns. The Domain Prediction System (DPS) is a prototype malicious domain prediction system developed by one of the industry partners of the ISOT Lab. Based on a small number of seed blacklisted domains, DPS generates a list of associated registered domains that can potentially be malicious in the future. Predicting malicious domains is a long, laborious process that involves mining and iterating over billions of registered domains. This project focuses on reviewing, evaluating, and improving the performance of the prototype implementation of DPS. The code provided had several efficiency issues and produced inaccurate outputs. This report therefore identifies problems in the existing code and proposes solutions to improve performance. Additionally, some experimental details are presented to demonstrate effectiveness. Furthermore, a Flask web-based application was developed to host the project and make it easier to use.
  • Item
    Dark Web Traffic Detection Using Supervised Machine Learning
    (2023-04-29) Nezhad, Sahra Zangeneh; Baniasadi, Amirali
    The purpose of this study is to examine the feasibility of utilizing machine learning algorithms for distinguishing and categorizing VPN and TOR traffic on the dark web. The dark web, often referred to as the inaccessible or shadow aspect of the internet, is marked by its anonymity and inability to be indexed by search engines, making it a common platform for illegal activities such as drug trafficking, money laundering, and cybercrime. Both Virtual Private Networks (VPNs) and The Onion Router (TOR) are commonly employed technologies for anonymizing web traffic and accessing the dark web. While these technologies can be used for legitimate purposes, such as protecting privacy and bypassing internet censorship, they can also be exploited by cybercriminals. To achieve our objective, we will leverage a dataset of dark web traffic, specifically the CIC-Darknet2020 dataset, which comprises a comprehensive and diverse collection of network traffic captures from the dark web, incorporating traffic features from both TOR and VPN technologies. Our model will be constructed using supervised machine learning methods, specifically classification algorithms including the Random Forest (RF), Support Vector Machine (SVM), Naive Bayes (NB), and Decision Tree (J48) classifiers. The experiments will be performed using five-fold and ten-fold cross-validation, and 66/34 and 80/20 percentage splits, utilizing the open-source software WEKA. The performance of the model will be evaluated based on parameters such as execution time, accuracy, precision, F-measure, and recall. The results of this study indicate that the Decision Tree (J48) classifier surpasses the other classifiers in terms of accuracy, achieving 99.6% accuracy with an execution time of 15 seconds for ten-fold cross-validation.
  • Item
    Assessing the Effectiveness of Malicious Domain Prediction Using Machine Learning
    (2023-04-28) Bu, Jinlin; Traore, Issa
    Malicious domains are a serious threat to network security as they deceive users into accessing them, leading to information disclosure, identity theft, and economic losses. Despite efforts to tackle this problem, cybercriminals continue to buy and use brand-new domains to evade detection, bypassing network defenses and endangering users' security. Predicting future malicious domains in advance can greatly reduce their harm. The Domain Prediction System (DPS) developed by one of the industry partners of the Information Security and Object Technology (ISOT) Lab aims to predict potentially malicious domains in advance, but the effectiveness of the system needs to be tested as it is uncertain whether the predicted domains will be used for malicious purposes. This report introduces the problem's background and describes the dataset used in the experiments. It then evaluates the effectiveness of the DPS by comparing two sets of models: baseline and predictive models. The baseline models were obtained by training and testing different machine learning (ML) classifiers using existing (known) benign and malicious domains. The predictive models were obtained by training the ML classifiers using domains generated by the DPS that may be used for malicious purposes, and testing using the same benign domains as previously. The evaluation of the predictive models on the same test set as the baseline models yielded comparable performance measures, providing a strong indication of the utility and credibility of the predicted domains.
  • Item
    DDoS Attacks Detection using Machine Learning
    (2023-04-28) Sabir, Mohammed Younus; Gebali, Fayez
    The advancement in information technology has created a new era known as the Internet of Things (IoT). This new technology has allowed things to be connected to the Internet, for example smart TVs, printers, cameras, smartphones, and smartwatches. This trend has enhanced the lifestyle of the users of these devices, and it provides new services and applications to them. The fast growth of IoT has made the inclusion and connection of these devices commonplace. Though the usage of IoT devices brings many advantages, it also brings various challenges. Among the many existing challenges, the Distributed Denial of Service (DDoS) attack is a relatively simple but very powerful technique to attack intranet and Internet resources. Usually, in this attack, legitimate users are deprived of web-based services by many compromised machines. DDoS attacks can be implemented in the network, transport, and application layers using different protocols, such as TCP, UDP, ICMP, and HTTP. The CIC-DDoS2019 dataset consists of 11 different DDoS attacks and benign traffic with 88 features. In this report, data for six DDoS attacks and benign data has been used. The Info Gain Attribute Evaluator was used to extract the twenty-four most important features. The Machine Learning (ML) algorithms studied are Bayesian Network (BayesNet), K-Nearest Neighbors (KNN), and J48. The experiments have been performed using the Waikato Environment for Knowledge Analysis (WEKA) tool with five-fold validation. Accuracy, precision, recall, F-measure, and execution time have been used as the performance metrics. From the results obtained, J48 performed best among the algorithms in terms of accuracy, precision, recall, and F-measure.
  • Item
    Agentless Host Intrusion Detection Using Machine Learning Techniques
    (2023-04-12) Liu, Jianfeng; Traore, Issa
    With the rise in the frequency and sophistication of cyberattacks, host intrusion detection systems (HIDSs) have become an essential component in monitoring and protecting endpoints in the network security perimeter. Current HIDSs rely on a local software agent deployed on the monitored host that collects and processes or pre-processes required data. However, this architecture has adverse effects such as increased attack surface, and high maintenance cost and overhead. Recently, a generic agentless endpoint framework that transparently collects raw data from the monitored host was proposed by Ghaleb et al. [1], along with a basic threshold-based statistical model for intrusion detection as an initial proof of concept. This report extends the generic agentless framework by collecting a new dataset with more attack vectors and by developing and comparing six machine learning models: k-nearest neighbors, logistic regression, naïve Bayes, decision tree, random forest, and support vector machine. The experimental evaluation using the collected dataset confirmed the feasibility of agentless host intrusion detection, with increased detection efficiency and effectiveness.
All items in UVicSpace are protected by copyright, with all rights reserved.