LASSO based Resample Model Averaging for Genetic Association Studies

Last Modified
  • March 22, 2019
Creator
  • Sabourin, Jeremy Alan
    • Affiliation: College of Arts and Sciences, Department of Statistics and Operations Research
Abstract
  • Significance testing one SNP at a time has proven useful for identifying genomic regions that harbor variants affecting human disease. In theory, simultaneous modeling of multiple loci should help when considering complex diseases affected by multiple predictors. In practice, however, multi-locus models are typically applied in an ad hoc fashion: conditioning on the top SNPs, with limited exploration of the model space and no assessment of how sensitive model choice is to sampling variability. Formal alternatives exist but are seldom used. When considering complex traits in humans, the genetic model is most often assumed to include additive SNP effects only. When non-additive effects such as dominance or overdominance are present, additive-only models can be underpowered. We first present LLARRMA, a resample model averaging method based on the LASSO that allows for additive SNP effects. It estimates, for each SNP, the probability that it would be included in a multiple-SNP model in alternative realizations of the data. We show, in simulations based on real GWAS data, that LLARRMA identifies a set of candidates that is enriched for causal loci relative to single-locus analysis. We next generalize the resample model averaging framework and present LLARRMA-dawg, a generalized resample model averaging method based on the group LASSO that allows for both additive and non-additive SNP effects. We show, in simulations based on real GWAS data, that LLARRMA-dawg identifies a set of candidates that is enriched for causal loci relative to LLARRMA when non-additive effects are present. We examine how the LLARRMA-dawg framework can be extended to other problems in which multiple model predictors are required to model the effects of a single variable. The final portion of this dissertation describes additional information that can be extracted from resample model averaging. Specifically, we examine how one can identify response-specific variable relationships based on the models selected under resampling. This gives the researcher more information about the predictors than the standard pairwise correlation structure, which does not account for the response.
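The core idea in the abstract, estimating for each SNP the probability that it would be selected into a multi-SNP LASSO model across alternative realizations of the data, can be illustrated with a small sketch. This is not the dissertation's implementation: the fixed penalty `lam`, the half-size subsampling without replacement, the simulated genotypes, and all function names (`lasso_cd`, `inclusion_probabilities`, `simulate`) are assumptions made here for illustration only. It uses only the Python standard library.

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator used in LASSO coordinate descent."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def center(X, y):
    """Center each column of X and the response y (removes the intercept)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    ybar = sum(y) / n
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    return Xc, [v - ybar for v in y]

def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate descent for (1/(2n))*||y - X b||^2 + lam*||b||_1.
    Assumes X and y are already centered."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column in this subsample
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def inclusion_probabilities(X, y, lam, n_resamples=40, frac=0.5, seed=0):
    """Fraction of subsamples in which each predictor has a nonzero LASSO
    coefficient. Subsampling scheme and fixed lam are illustrative assumptions."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    m = max(2, int(frac * n))
    counts = [0] * p
    for _ in range(n_resamples):
        idx = rng.sample(range(n), m)  # subsample without replacement
        Xs, ys = center([X[i] for i in idx], [y[i] for i in idx])
        beta = lasso_cd(Xs, ys, lam)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return [c / n_resamples for c in counts]

def simulate(n=120, p=10, seed=1):
    """Hypothetical data: genotypes coded 0/1/2; SNPs 0 and 3 are causal."""
    rng = random.Random(seed)
    X = [[float(rng.choice([0, 1, 2])) for _ in range(p)] for _ in range(n)]
    y = [1.0 * row[0] - 1.0 * row[3] + rng.gauss(0.0, 1.0) for row in X]
    return X, y

X, y = simulate()
probs = inclusion_probabilities(X, y, lam=0.2)
for j, q in enumerate(probs):
    print(f"SNP {j}: estimated inclusion probability {q:.2f}")
```

In this toy setting the two simulated causal SNPs should be selected in most subsamples, while the remaining SNPs are selected only occasionally. Ranking SNPs by these frequencies is the kind of candidate list the abstract describes. A group-LASSO variant for non-additive effects would penalize each SNP's additive and dominance columns jointly and is not shown here.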
Rights statement
  • In Copyright
Advisor
  • Nobel, Andrew
Degree
  • Doctor of Philosophy
Degree granting institution
  • University of North Carolina at Chapel Hill
Graduation year
  • 2013