Techniques for analyzing high throughput molecular biology data

dc.contributor.author: Lu, Linghong
dc.contributor.supervisor: Lesperance, Mary
dc.date.accessioned: 2011-09-09T21:00:58Z
dc.date.available: 2011-09-09T21:00:58Z
dc.date.copyright: 2011 [en_US]
dc.date.issued: 2011-09-09
dc.degree.department: Department of Mathematics and Statistics
dc.degree.level: Master of Science M.Sc. [en_US]
dc.description.abstract: The application of ultrahigh-field Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS) technology to identify and quantify metabolomics data is relatively new. An important feature of FTICR-MS metabolomics data is the high percentage of missing values. In this thesis, missing value analysis showed that missing value percentages were as high as 50% and that the control treatment, NaOH.ww, had the highest missing value percentage among the treatments in the aqueous FTICR-MS data sets. A simulation study was done for the FTICR-MS data to compare three selection methods: the Kruskal-Wallis test and the MTP and Limma functions in Bioconductor, an open source project to facilitate the analysis of high-throughput data. The study showed that MTP was sensitive to variations among treatments, while the Kruskal-Wallis test was relatively conservative in detecting variations. As a result, MTP had a much higher false positive rate than the Kruskal-Wallis test. The sensitivity and false positive rate of Limma fell between those of the Kruskal-Wallis test and MTP. Data sets with missing values were also simulated to assess the performance of imputation methods. The study showed that variances among treatments diminished or disappeared after imputation, but no new differentially expressed masses were created. This gave us confidence in using imputation methods. A summary of the analysis results for some of the frogSCOPE data sets is given in the last chapter as an illustration. [en_US]
dc.description.scholarlevel: Graduate [en_US]
dc.identifier.uri: http://hdl.handle.net/1828/3566
dc.language.iso: en [en_US]
dc.rights.temp: Available to the World Wide Web [en_US]
dc.subject: high throughput biology data [en_US]
dc.subject: statistical analysis [en_US]
dc.subject: differential expression [en_US]
dc.title: Techniques for analyzing high throughput molecular biology data [en_US]
dc.type: Thesis [en_US]
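
Illustration (not part of the record): the abstract mentions two computations that are easy to show on a small scale, the missing value percentage per treatment and the Kruskal-Wallis test applied to one mass. The sketch below is a minimal Python version using only the standard library. It is not the thesis's code, which relied on Bioconductor functions. The function names, the treatment label "treatment_A", and all intensity values are made up for illustration; only the label NaOH.ww comes from the abstract. The sketch returns the H statistic only and does not compute a p-value.

```python
from collections import defaultdict


def missing_percent_by_treatment(values, treatments):
    """Percentage of missing (None) intensities within each treatment."""
    total = defaultdict(int)
    missing = defaultdict(int)
    for value, treatment in zip(values, treatments):
        total[treatment] += 1
        if value is None:
            missing[treatment] += 1
    return {t: 100.0 * missing[t] / total[t] for t in total}


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the usual tie correction.

    `groups` is a list of non-empty lists of observed (non-missing) values,
    one list per treatment.
    """
    # Pool all observations, remembering which group each came from.
    pooled = sorted((v, g) for g, vals in enumerate(groups) for v in vals)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy 1-based ranks i+1..j and share their average.
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3.0 * (n + 1)
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0:
        return 0.0  # every observation is identical, so no evidence of a difference
    return h / correction


# Made-up intensities for a single mass; None marks a missing value.
values = [10.2, None, 9.8, None, 14.1, 13.7, 15.0, None]
treatments = ["NaOH.ww"] * 4 + ["treatment_A"] * 4

print(missing_percent_by_treatment(values, treatments))

# Drop missing values before testing, then group by treatment.
observed = defaultdict(list)
for value, treatment in zip(values, treatments):
    if value is not None:
        observed[treatment].append(value)
print(kruskal_wallis_h(list(observed.values())))
```

Under the null hypothesis, H is approximately chi-square distributed with (number of treatments minus 1) degrees of freedom, which is how a p-value would normally be obtained.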

Files

Original bundle
Name: Lu_Linghong_MSc_2011.pdf.pdf
Size: 1.06 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.74 KB
Format: Item-specific license agreed upon to submission