Model-based clustering for identifying disease-associated SNPs in case-control genome-wide association studies
Date
2019
Authors
Xu, Y.
Xing, L.
Su, J.
Zhang, Xuekui
Qiu, W.
Publisher
Scientific Reports
Abstract
Genome-wide association studies (GWASs) aim to detect genetic risk factors for complex human
diseases by identifying disease-associated single-nucleotide polymorphisms (SNPs). The traditional
SNP-wise approach along with multiple testing adjustment is over-conservative and lacks power
in many GWASs. In this article, we propose a model-based clustering method that transforms the
challenging high-dimension-small-sample-size problem into a low-dimension-large-sample-size problem
and borrows information across SNPs by grouping SNPs into three clusters. We pre-specify the cluster
patterns by the difference in minor allele frequencies of SNPs between cases and controls, and enforce the patterns
with prior distributions. In simulation studies, our proposed model outperforms the traditional
SNP-wise approach, showing better control of the false discovery rate (FDR) and higher sensitivity. We
re-analyzed two real studies to identify SNPs associated with severe bortezomib-induced peripheral
neuropathy (BiPN) in patients with multiple myeloma (MM). The original analysis in the literature
failed to identify SNPs after FDR adjustment. Our proposed method not only detected the reported
SNPs after FDR adjustment but also discovered a novel BiPN-associated SNP rs4351714 that has been
reported to be related to MM in another study.
Citation
Xu, Y., Xing, L., Su, J., Zhang, X., & Qiu, W. (2019). Model-based clustering for identifying disease-associated SNPs in case-control genome-wide association studies. Scientific Reports, 9. https://doi.org/10.1038/s41598-019-50229-6