Evaluation of network inference algorithms and their effects on network analysis for the study of small metabolomic data sets

dc.contributor.author: Greenyer, Haley
dc.contributor.supervisor: Jabbari, Hosna
dc.contributor.supervisor: Stege, Ulrike
dc.date.accessioned: 2022-05-24T23:46:19Z
dc.date.available: 2022-05-24T23:46:19Z
dc.date.copyright: 2022 (en_US)
dc.date.issued: 2022-05-24
dc.degree.department: Department of Computer Science (en_US)
dc.degree.level: Master of Science M.Sc. (en_US)
dc.description.abstract: Motivation: Alzheimer’s Disease (AD) is a highly prevalent, neurodegenerative disease which causes gradual cognitive decline. As documented in the literature, evidence has recently mounted for the role of metabolic dysfunction in AD. Metabolomic data has therefore been increasingly used in AD studies. Metabolomic disease studies often suffer from small sample sizes and inflated false discovery rates. It is therefore of great importance to identify algorithms best suited for the inference of metabolic networks from small cohort disease studies. For future benchmarking, and for the development of new metabolic network inference methods, it is similarly important to identify appropriate performance measures for small sample sizes. Results: The performances of 13 different network inference algorithms, including correlation-based, regression-based, information theoretic, and hybrid methods, were assessed through benchmarking and structural network analyses. Benchmarking was performed on simulated data with known structures across six sample sizes using three different summative performance measures: area under the Receiver Operating Characteristic Curve, area under the Precision Recall Curve, and Matthews Correlation Coefficient. Structural analyses (commonly applied in disease studies), including betweenness, closeness, and eigenvector centrality were applied to simulated data. Differential network analysis was additionally applied to experimental AD data. Based on the performance measure benchmarking and network analysis results, I identified Probabilistic Context Likelihood Relatedness of Correlation with Biweight Midcorrelation (PCLRCb) (a novel variation of the PCLRC algorithm) to be best suited for the prediction of metabolic networks from small-cohort disease studies. Additionally, I identified Matthews Correlation Coefficient as the best measure with which to evaluate the performance of metabolic network inference methods across small sample sizes. (en_US)
dc.description.scholarlevel: Graduate (en_US)
dc.identifier.uri: http://hdl.handle.net/1828/13964
dc.language: English (eng)
dc.language.iso: en (en_US)
dc.rights: Available to the World Wide Web (en_US)
dc.subject: Alzheimer's (en_US)
dc.subject: Metabolomics (en_US)
dc.subject: Differential Network Analysis (en_US)
dc.subject: sample size (en_US)
dc.subject: network inference (en_US)
dc.subject: mouse model (en_US)
dc.title: Evaluation of network inference algorithms and their effects on network analysis for the study of small metabolomic data sets (en_US)
dc.type: Thesis (en_US)

Files

Original bundle
Name: Greenyer_Haley_MASc_2022.pdf
Size: 6.59 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 2 KB
Format: Item-specific license agreed upon to submission