Graduate Projects (Computer Science)

Recent Submissions

Now showing 1 - 20 of 81
  • Item
    An Exploratory Study of Data Physicalization Using Household Objects
    (2024-03-20) Ramesh, Shanker; Somanath, Sowmya; Perin, Charles
    We explore people’s perceptions and ideas regarding creating data physicalizations using household objects such as chairs, flower pots, and photo frames to enable data-driven self-reflection. By conducting a sketching-based qualitative study with 11 participants we identified styles of physical encoding participants used, strategies for creating physicalizations they employed, and techniques for constructing physicalizations they relied on. From the study results we contribute i) a bottom-up list of physical variables people might use for different data types, ii) a comparison between the theory about visual variables and the empirical use of physical variables, iii) an identification of the need for flexible taxonomies for physical representations, and iv) a discussion of the relationship between social pressure and the location of physical representations in the household.
  • Item
    Unified User Interface Development for Cross-Disciplinary Scientific Computing Applications
    (2023-11-27) Shah, Prerak; Schneider, Dr. Teseo
    In this study, we create a cross-platform unified user interface for general-purpose scientific computing. The project is motivated by the growing importance of scientific computing, especially in the field of geometric computing. Building visual user interfaces has remained a non-trivial task, even with the swift advancement of computational tools and their incorporation into other fields. Many current systems struggle to combine developer integration features with user accessibility. The main purpose of this project is to find out whether it is possible to create a single user interface that encompasses all of the primary paradigms of scientific computing and allows developers to easily integrate their code bases while still being understandable to users in unrelated professions. To achieve this, the proposed user interface solution makes use of the cross-platform compatibility and stability offered by web-based applications. By leveraging a robust ecosystem of TypeScript and JavaScript libraries, the user interface (UI) operates within mainstream browsers and benefits from the stability of browsers such as Chrome. The project also presents a novel strategy that uses an on-site Representational State Transfer (REST) server and tackles the challenging problem of cross-platform functionality. This server is in charge of binary execution, shape conversion (particularly for geometric objects), file storage, and query handling. The ultimate goal is to create a general interface specification protocol and a user interface that can be customised to meet the specific needs of different scientific computing projects, even those that support dynamic components.
  • Item
    A Conceptual Design: The Verification of QIR’s Generic Quantum Optimization Passes
    (2023-08-30) Naghavi, Paria; Weber, Jens
    This report outlines the design of a verification system for QIR quantum optimization passes using the Vellvm interpreter, along with the QWIRE semantics and memory model. The primary objective is to establish a foundation for future advancements in quantum optimization and verification techniques. The system focuses on a small circuit in a one-qubit system but can be scaled up for more complex analytics and optimization passes in QAT. The design includes three components: the QIR input, the Vellvm parser, and the verification tools of QWIRE. The input consists of the QIR quantum blocks before and after the optimization pass. The parser utilizes QWIRE semantics to map the QIR code to Coq objects and types, generating a Coq AST for analysis. The verification is performed by the Coq proof assistant, which checks for well-formedness and equivalence of the density matrices of the QIR code blocks before and after the quantum optimization pass is applied. The design can be extended to other generic or targeted quantum transformations in QAT, including target-specific gate merging. The overall goal is to provide an agnostic and hybrid-compatible verification system that can improve the reliability of quantum computations on diverse hardware platforms.
  • Item
    CADConverter-For Converting Complex CAD files into HDF5 Format
    (2023-08-29)
    In this paper, we offer an effective approach for extracting data from Computer-Aided Design and Manufacturing (CAD/CAM) STEP files and converting them to the self-descriptive, open-source HDF5 format. This format provides for the seamless integration of data and metadata inside a single file, making it ideal for dealing with complex data. The difficulties associated with converting CAD files between applications and the sophisticated data organisation inside multiple CAD file formats have motivated our research. The international standard ISO 10303, or STEP (‘STandard for the Exchange of Product Model data’), addresses the format complexity and the problem of converting CAD data between different computer-aided design (CAD) systems. We reviewed current CAD datasets such as ABC and Fusion360 for geometrical data processing and created an algorithm that converts CAD models to an open-source, human-readable format that can be fed directly to deep learning algorithms. This eliminates the requirement for third-party software for file conversions and boosts the efficiency of CAD data administration and processing. To do this, we employ a method that combines data pre-processing with the development of an algorithm that converts around one million CAD models to the HDF5 format and uses the derived data for a number of applications, including machine learning and data analysis. The findings of the study lead to a better understanding of CAD data processing procedures and establish the framework for future research and development. We also integrate the converted dataset with PyTorch and TensorFlow to address current CAD system constraints, such as the limitations of standard neutral 3D CAD file formats that are difficult to understand. This opens the way for more efficient and simplified procedures in CAD-intensive sectors.
  • Item
    CrowdLabs: A platform for human behaviour and perception studies
    (2023-08-29) Parikh, Kunal; Haworth, Brandon
    Advancements in technology encourage researchers to investigate intricate subjects such as human perception and behaviour. Although engineering supports scientists, it in turn demands enormous effort from them. A system that reduces researchers’ time and toil is therefore needed. The proposed system, CrowdLabs, allows researchers to conduct a wide variety of human experiments and simulations and can scale up to capture fast-flowing data streams without them worrying about most of the underlying engineering concerns. The system has two major roles: the experimenter, who creates new trials and collects several forms of data for their studies, and the participants, who take part in these experiments. This modular framework offers scientists a minimal backend for different perspective controllers, data collection, and control, letting them load and manage their experimental designs, repeat scenes, randomize sequences, and integrate training through a simple interface. The system was assessed with an interactive experiment to determine its capability to conduct a wide variety of complex human perception and behavior experiments. Subsequently, the system’s performance was evaluated for CPU, GPU, memory, and frame rates. This analysis indicated that the system was responsive and robust for all the experiment scenarios. Furthermore, the system was evaluated extensively and several avenues for future work have been identified.
  • Item
    Zero Trust Network Architecture
    (2023-07-04) Srinivasan, Priyadharsini; Wu, Kui
    In light of the rapid advancement of digital technology and the increasing popularity of cloud-based services, it has become necessary to revisit traditional approaches to cybersecurity. Businesses are facing ever-more sophisticated cyberthreats, both from within and outside their networks, which have exposed the limitations of perimeter-based security solutions. The zero-trust architecture (ZTA) has emerged as a promising model for cybersecurity, focusing on resource security instead of network perimeter protection. This project provides a comprehensive overview of ZTA, including its fundamental principles and the Zero Trust Network Access (ZTNA) architecture. The project focuses on an in-depth analysis of Cisco Duo’s Multi-Factor Authentication (MFA) system using Wireshark to capture network traffic on a local PC. This analysis provides a comprehensive understanding of the user identity verification process using multiple authentication factors. Additionally, the project discusses the driving forces behind the adoption of ZTA, as well as the challenges and opportunities that it presents. The project explores real-world ZTA implementations, including Google’s BeyondCorp and Microsoft’s Zero Trust Network Architecture. The application of ZTA in various fields, including big data, cloud computing, and the Internet of Things (IoT), is also investigated. The project concludes by discussing potential future research directions in ZTA, emphasizing the need for more complex trust algorithms, continuous verification and authentication techniques, and standardized frameworks for applying ZTA in various sectors and use cases. Overall, this project provides a comprehensive and detailed examination of the zero-trust architecture, its applications, and its potential for improving cybersecurity in an increasingly digitized world.
  • Item
    Agile Requirements Change Management Model For Global Software Development
    (2023-05-22) Koulecar, Neha Sheilesh; Damian, Daniela
    We propose a comprehensive and robust agile requirements change management (ARCM-GSD) model that addresses the limitations of existing models and is tailored for agile software development in the global software development paradigm. To achieve this goal, we conducted an exhaustive literature review and an empirical study with RCM industry experts. Our study evaluated the effectiveness of the proposed RCM model in a real-world setting and identified any limitations or areas for improvement. The results of our study provide valuable insights into how the proposed ARCM-GSD model can be applied in agile global software development environments to improve software development practices and optimize project success rates.
  • Item
    SmartCAD For Exploring Complex CAD files in YAML Format
    (2023-04-28) Madduri, Vamsi Naga Sai Chandra; Schneider, Teseo
    In this study, we develop an efficient approach for sampling and extracting data from Computer-Aided Design And Manufacturing (CAD/CAM) files using an open-source YAML format. Our project is driven by the challenges encountered while performing file conversions between multiple CAD software and the complex data organization within CAD file structures, including proprietary, Model-Based Definition (MBD), and non-MBD CAD files. By employing the YAML format for CAD files, our approach eliminates the need for third-party software for file conversions, reduces compatibility issues, and improves the efficiency of CAD data management and processing. To achieve this, we implement a methodology involving preprocessing, feature extraction, and computing winding numbers from the YAML-formatted CAD files. The proposed method serves as a foundation for a future project that will convert approximately one million CAD models into the YAML format and leverage the extracted data for various purposes, such as model learning and data analysis. The findings of this study contribute to a deeper understanding of CAD data processing techniques and provide a foundation for further research and development. Our work addresses the current limitations of CAD systems, including the restrictions of conventional neutral 3D CAD file formats. It paves the way for more effective and streamlined workflows in industries that rely heavily on CAD technology.
  • Item
    IoT Security Using Machine Learning Methods
    (2023-04-17) Hosseini Goki, Seyedamiryousef; Wu, Kui
    The rapid growth of internet-connected devices has made robust cybersecurity measures essential to protect against cyber threats. IoT cybersecurity includes various methods and technologies to secure internet-connected devices and systems from cyber attacks. The unique nature of IoT devices and systems poses several challenges to cybersecurity, including limited processing power, minimal security features, and vulnerability to attacks like DoS and DDoS. Cybersecurity strategies for IoT include encryption, authentication, access control, and threat detection and response, which utilize machine learning and artificial intelligence technologies to identify and respond to potential cyber attacks in real-time. The report discusses two projects related to cybersecurity in IoT environments, one focused on developing an intrusion detection system (IDS) based on deep learning algorithms to detect DDoS attacks, and another focused on identifying potential abnormalities in IoT networks using a fingerprint. These projects highlight the importance of prioritizing cybersecurity measures to protect against the growing number of cyber threats facing IoT devices and systems.
  • Item
    An exploratory study regarding the ease-of-use, comprehensibility, and usefulness of the Empirical Standards Checklists
    (2022-12-21) Cupryk, Cassandra; Storey, Margaret-Anne; Larios, Enrique
    Context/Background: Novice researchers have stated that being provided with guidelines for reviewing empirical research papers would be helpful. The Empirical Standards Checklist Generator is a tool that can generate a variety of Empirical Standards Checklists. An Empirical Standards Checklist contains the core criteria that can be used to review Software Engineering empirical papers. This exploratory study aims to determine whether novice researchers could benefit from using the Empirical Standards Checklists to help them review Software Engineering empirical papers. Objective: To investigate whether novice researchers perceive the Empirical Standards Checklists as easy to understand, easy to use, and useful for reviewing empirical papers. Methods: Seven participants completed a survey to evaluate the Empirical Standards Checklists and then participated in a group discussion. During the survey, the participants used the appropriate Empirical Standards Checklists to review a qualitative survey paper and a repository mining paper. They then highlighted the items from the Empirical Standards Checklists that were difficult to comprehend. The participants also answered survey questions that exposed their perceptions of the Empirical Standards Checklists' comprehensibility, ease-of-use, and usefulness. Results: The majority of the participants had positive perceptions of the Empirical Standards Checklists' comprehensibility, ease-of-use, and usefulness. Conclusion: This exploratory study demonstrates that the Empirical Standards Checklist is a promising new tool for reviewing Software Engineering empirical research papers.
  • Item
    Data Visualization of COVID-19 in Canada
    (2022-09-30) Pan, Suyin; Wu, Kui; Thomo, Alex
    Data visualization has been essential in fighting the COVID-19 pandemic in the past two years. Interactive dashboards helped people track, analyze, and predict the spread of the disease effectively. In this project, we created tools for visualizing the COVID-19 data in Canada to demonstrate how vaccination helped combat the pandemic, what group of people were impacted the most, and what the variant of concern was in each wave.
  • Item
    User Concern in App Reviews, a study of perceived privacy violation among user sentiments and other contribute factors
    (2022-05-02) Cheng, Yue; Damian, Daniela; Ernst, Neil
    Privacy, a significant factor in software usage, also provides software developers with additional insights into how applications can be improved. However, it is a delicate matter that touches on user behaviour and the amount of information users are willing to share. With the rise of mobile applications, the collection of user information has also become a prominent concern. User chatter on the Google Play Store can help identify whether privacy concerns are problematic or not. However, little research has been conducted to study privacy violations and their contributing factors. In this project, we proposed an LDA-based privacy identification model that assesses the factors relating to user privacy concerns in Google Play Store user reviews. A total of 45,114,727 rows of data were scraped from the Google Play Store and later filtered and processed into workable data. With the help of the Gensim LDA library, we obtained a coherence score of 0.604 and identified eight topics of various subjects. We later arranged these subjects into their corresponding categories, which could be used to analyze why specific privacy terms are more sensitive while others are not.
  • Item
    Abstract and Metaphoric visualization of emotionally sensitive data
    (2022-04-28) Malik, Mona
    Standard visualizations such as bar charts and scatterplots, especially those representing qualitative, emotionally sensitive issues, fail to build a connection between the data that the visualization represents and the viewer of the visualization. To address this challenge, the information visualization community has become increasingly interested in exploring creative visualization techniques that could potentially help viewers relate to the suffering and pain in emotionally sensitive data. We contribute to this open question by investigating whether visualizations that rely on metaphors (i.e., that involve existing mental images such as a tree or a person image) with some emotional connection can foster viewers’ empathy and engagement with the data. Specifically, we conducted an empirical study in which we compare the effect of visualization type (metaphoric and abstract) on people’s engagement and empathy when exposed to emotionally sensitive data (data about sexual harassment in academia). We designed a metaphoric visualization that relies on the metaphor of a flower symbolizing life, beauty, and fragility which might help the viewers to relate to the victim, build some emotional connection, and an abstract visualization that relies on purely geometric forms with which people should not have any existing emotional connection. In our study, we found no clear difference in engagement and empathy between metaphoric and abstract visualization. Our findings indicate that female participants were slightly more engaged and empathic with both visualizations compared to other participants. Additionally, we learned that measuring empathy in a data visualization is a complex task. Informed by these findings on how people engage and empathize with metaphoric and abstract visualization, newer and improved visualization and experiences can be developed for similar emotionally sensitive topics that are emotionally charged and fear-provoking.
  • Item
    Mining GitHub Issues for Bugs, Feature Requests and Questions
    (2021-12-14) Jokhio, Marvi; Ernst, Neil A.
    The maintenance and success of software projects depend heavily on up-to-date and bug-free code. Tools like issue tracking systems (ITS) play an important role in processing the hundreds of new issues that big software projects receive daily, but issue processing and triaging critically depend on assigning accurate labels that determine each issue's type (e.g., bug, feature, question and so on). This labelling is a time-consuming and tedious task and hence needs automated solutions. Automatic classification of issues is challenging because issue text is semantically ambiguous and contains code, links, package and method names, commands, etc. In this work, we propose supervised and unsupervised mining techniques for GitHub issues using text only. With the supervised machine learning technique, we show that our model can classify issues into the bug, feature, and question classes with an AUC score of 86.7%. We also propose a technique to extract topics from GitHub issues using Latent Dirichlet Allocation (LDA) to analyze the types of development issues faced by developers.
  • Item
    Research on modern computer architecture optimization techniques: implementation and measurements for big data processing
    (2021-09-11) He, Yan; Weber, Jens
    With the rapid development of big data computing, programmers need to improve the efficiency of big data processing. In our daily development process, we normally focus on bug-free deployment and often overlook another aspect of software engineering, performance optimization. In fact, a great deal of programming effort is required to achieve good performance. Many techniques have the potential to significantly improve software performance. To achieve efficiency, I first summarize some optimization techniques that deal with memory-bound issues and then move on to parallel programming to deal with compute-bound problems. As an example, I apply as many of the techniques mentioned in the report as possible to an implementation of a popular algorithm, PageRank, in C++17. With constant feedback from performance measurement and profiling, we can see a drastic speed-up after implementing all these optimization techniques. I also experiment with a real-world graph dataset provided by IMDb, which can be used to rank the top movies, and evaluate performance before and after optimization. This report can hopefully provide future directions for applying different optimization techniques to various big data processing applications.
  • Item
    Design and implementation of a safe bi-directional document exchange system in health information systems.
    (2021-04-27) Tyagi, Aakash; Weber, Jens
    Present-day health care systems are information-dense and rely progressively on computer-based information systems. Unfortunately, the use of many of these information systems is predominantly restricted to the collection and retrieval of patient records. The failure to digitally communicate and transfer medical record information to other health information systems is undoubtedly a major barrier to efficient patient care and other clinical decision making, and interoperability between heterogeneous systems remains a challenge even after decades of investment. The implementation of a safe bi-directional document exchange system is a part of enhancing interoperability by providing a safe messenger system to support some of the core interaction medical documents. Lack of interoperability can also jeopardize patient safety and lead to technology-related medical accidents. Hence, the project also focuses on the safety aspect of the system and provides an in-depth hazard analysis along with the implementation of a safety monitoring system to notify users about unexpected system failures.
  • Item
    Utility-based summarization for large graphs
    (2021-04-24) Singh, Jasbir; Srinivasan, Venkatesh; Thomo, Alex
    A fundamental challenge in graph mining is the ever-increasing size of datasets. Graph summarization aims to find a compact representation resulting in faster query algorithms and reduced storage needs. The flip side of graph summarization is often a loss of utility, which significantly diminishes its usability. The key question we address in this work is: how can a graph be summarized with some loss of utility while staying above a user-specified threshold? Kumar and Efstathopoulos proposed a method to address this question, but with limitations. In our work, we present a highly scalable algorithm, which forgoes the expensive iterative process that hampers previous work. Our algorithm achieves this by combining a memory reduction technique and a novel binary-search approach. In contrast to the competition, we are able to handle web-scale graphs on a single machine without performance impediment as the utility threshold (and size of summary) decreases. Previous works suffer from conceptual limitations and lack of scalability.
  • Item
    Interactive edge-bundled parallel coordinates
    (2021-04-20) Li, Ziang; Storey, Margaret-Anne
    Parallel coordinates are a well-researched visualization technique to represent multidimensional data. There are many variations of parallel coordinates for different application needs. This report describes a visualization solution for large multidimensional data. The report also proposes an evaluation plan to investigate non-linear data relationships through variants of parallel coordinates. Based on this proposal, a web-based application of bundled parallel coordinates was designed and implemented. This visualization supports different clustering methods and violin plots to discover data distribution. It also has a series of interaction features such as brushing, reordering of axes, and zooming. A pilot study was also conducted to evaluate the perception of non-linear data relationships through variants of parallel coordinates, and the results helped the formation of a hypothesis: interactions help the discovery of non-linear data relationships, and standard parallel coordinates can better support such tasks than bundled parallel coordinates. An evaluation is needed in future work to evaluate this hypothesis.
  • Item
    Sentinel-3 Satellite Chlorophyll-a Concentration Validation
    (2020-08-31) Kaur, Gaganjot; Coady, Yvonne; Damian, Daniela
    Ocean health is crucial for the balance of the ecosystem. Therefore, continuous monitoring of oceans is an important task undertaken by remote sensing satellites. The European Space Agency’s (ESA) Sentinel 3, deployed in Earth’s orbit, keeps track of chlorophyll concentration in the oceans. This project is focused on validating the chlorophyll concentration data obtained by Sentinel 3. The data collected from the satellite is compared with the data directly retrieved from the ocean with the help of British Columbia (BC) ferries. The BC ferries are equipped with instruments and sensors that estimate the amount of chlorophyll. The area of study is coastal British Columbia, especially the southern Strait of Georgia. The goal of this project is to find how correlated the two datasets are. The project is extremely data-centric and involves extensive pre-processing and exploration followed by the design of an efficient methodology for validation. The validation methodology is analyzed using statistical measures such as the Pearson correlation. This project also sheds light on the assumptions and uncertainties involved in the data collection procedures which can affect the consistency and reliability of data.
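    As an illustration of the statistical measure named in this abstract, the following is a minimal Python sketch of the Pearson correlation coefficient between two paired series. The function name `pearson_r` and the sample values are hypothetical and are not taken from the project; it assumes the satellite and ferry measurements have already been matched in time and location.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each series
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired readings (illustrative values only)
satellite = [1.2, 2.5, 3.1, 4.8, 5.0]
in_situ = [1.0, 2.7, 2.9, 5.1, 4.6]
print(round(pearson_r(satellite, in_situ), 3))
```

    A value near 1 would indicate that the two series rise and fall together; a value near 0 would indicate little linear relationship.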
  • Item
    Detecting Rework in Software: A Machine Learning Technique
    (2020-08-31) Nguyen, Minh Phuc; Damian, Daniela
    Rework, which is a major software development activity, is the additional effort needed to correct or improve completed work. Software organizations generally spend 40 to 50% of their effort on avoidable rework, which could have been foreseen. The cost of rework could exceed 50% of the project cost if it is identified late in the development cycle. Therefore, rework detection is key to efficient software development and to reducing project cost. However, little research has been conducted on rework identification and categorization. In this project, we proposed a machine learning based rework detection model that can classify a development task as rework or not based on its description. With the help of data augmentation, we achieved an F1-score of 0.72 and an accuracy of 0.74. We designed a flexible rework detection service architecture, which could be integrated with collaborative development platforms. Based on the trained model, we implemented a proof-of-concept service in Python and integrated it into Jira.
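    For readers unfamiliar with the two metrics reported in this abstract, the following minimal Python sketch shows how an F1-score and an accuracy are computed for a binary classifier. The function name `f1_and_accuracy` and the labels are hypothetical; this is not the project's model or data, and it assumes label 1 means "rework" and 0 means "not rework".

```python
def f1_and_accuracy(y_true, y_pred):
    """Return (f1, accuracy) for binary labels where 1 is the positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty label lists of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / len(pairs)
    return f1, accuracy


# Hypothetical labels: 1 = rework, 0 = not rework
f1, acc = f1_and_accuracy([1, 1, 1, 0], [1, 1, 0, 0])
print(f1, acc)
```

    Reporting both numbers is useful because accuracy alone can look good on imbalanced data, while F1 reflects how well the positive class is found.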
All items in UVicSpace are protected by copyright, with all rights reserved.