A Bayesian Group Sparse Multi-Task Regression Model for Imaging Genomics

dc.contributor.author: Greenlaw, Keelin
dc.contributor.supervisor: Lesperance, M. L.
dc.contributor.supervisor: Nathoo, Farouk
dc.date.accessioned: 2015-08-26T22:41:18Z
dc.date.available: 2016-08-21T11:22:07Z
dc.date.copyright: 2015 (en_US)
dc.date.issued: 2015-08-26
dc.degree.department: Department of Mathematics and Statistics
dc.degree.level: Master of Science M.Sc. (en_US)
dc.description.abstract: Recent advances in technology for brain imaging and high-throughput genotyping have motivated studies examining the influence of genetic variation on brain structure. In this setting, high-dimensional regression for multi-SNP association analysis is challenging because the brain imaging phenotypes are multivariate and there is a desire to incorporate a biological group structure among SNPs based on the genes to which they belong. Wang et al. (Bioinformatics, 2012) have recently developed an approach for simultaneous estimation and SNP selection based on penalized regression with a novel group l_{2,1}-norm penalty, which encourages sparsity at the gene level. A limitation of that approach is that it provides only a point estimate. We address this limitation by developing a corresponding Bayesian formulation based on a three-level hierarchical model that allows for full posterior inference using Gibbs sampling. For the selection of tuning parameters, we consider three techniques: (i) a fully Bayes approach with hyperpriors, (ii) empirical Bayes with implementation based on a Monte Carlo EM algorithm, and (iii) cross-validation (CV). When the number of SNPs is greater than the number of observations, we find that both the fully Bayes and empirical Bayes approaches overestimate the tuning parameters, leading to overshrinkage of regression coefficients. To understand this problem, we derive an approximation to the marginal likelihood and investigate its shape under different settings. Our investigation sheds some light on the problem and suggests the use of cross-validation or its approximation with WAIC (Watanabe, 2010) when the number of SNPs is relatively large. Properties of our Gibbs-WAIC approach are investigated using a simulation study, and we apply the methodology to a large dataset collected as part of the Alzheimer's Disease Neuroimaging Initiative. (en_US)
dc.description.scholarlevel: Graduate (en_US)
dc.identifier.uri: http://hdl.handle.net/1828/6577
dc.language: English (eng)
dc.language.iso: en (en_US)
dc.rights: Available to the World Wide Web (en_US)
dc.rights.uri: http://creativecommons.org/licenses/by-nc/2.5/ca/ (*)
dc.subject: Bayesian Shrinkage (en_US)
dc.subject: Imaging Genomics (en_US)
dc.subject: Tuning Parameter Selection (en_US)
dc.subject: Multivariate Regression (en_US)
dc.title: A Bayesian Group Sparse Multi-Task Regression Model for Imaging Genomics (en_US)
dc.type: Thesis (en_US)
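
Illustrative note on the abstract: the group l_{2,1}-norm penalty it mentions can be sketched in a few lines. The sketch below is not taken from the thesis. It assumes the common form of the penalty, in which the coefficient matrix W has one row per SNP and one column per imaging phenotype, SNPs are partitioned into genes, and the penalty is the sum over genes of the Frobenius norm of that gene's rows. The function name `group_l21_norm` and the toy matrices are hypothetical.

```python
import math

def group_l21_norm(W, groups):
    """Sum over groups of the Frobenius norm of each group's rows of W.

    W      : list of rows, one row per SNP, one column per phenotype.
    groups : list of lists of row indices, one list per gene (assumed partition).
    """
    total = 0.0
    for rows in groups:
        squared = sum(W[i][j] ** 2 for i in rows for j in range(len(W[i])))
        total += math.sqrt(squared)
    return total

# Hypothetical grouping: SNPs 0-1 belong to gene A, SNPs 2-3 to gene B.
groups = [[0, 1], [2, 3]]

# All signal concentrated in gene A.
W_concentrated = [[3.0, 4.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]

# Same total squared magnitude, but spread across both genes.
W_spread = [[3.0, 0.0], [0.0, 0.0], [4.0, 0.0], [0.0, 0.0]]

penalty_concentrated = group_l21_norm(W_concentrated, groups)
penalty_spread = group_l21_norm(W_spread, groups)
print(penalty_concentrated, penalty_spread)
```

Both toy matrices have the same overall Frobenius norm, yet the one whose nonzero coefficients sit in a single gene receives the smaller penalty, which is the sense in which a penalty of this form favours sparsity at the gene level.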

Files

Original bundle
Name: Greenlaw_Keelin_MSc_2015.pdf
Size: 2.6 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.74 KB
Format: Item-specific license agreed upon to submission