Enhancing eQTL analysis techniques with special attention to the transcript dependency structure

Schwarz, John Carter

Last Modified

March 20, 2019

Creator

Schwarz, John Carter
- Affiliation: Gillings School of Global Public Health, Department of Biostatistics

Abstract

Gene expression microarray analysis and genetic marker association studies are two common experimental methods in the genetic literature. A growing number of studies combine these two experiments into a single study known as an expression quantitative trait loci (eQTL) study. Analysis of eQTL data has been performed on several organisms, including yeast, maize, mouse, and human. We propose a set of methods to effectively analyze eQTL data by properly transforming and adjusting analysis models. Our methods address several issues often left out of eQTL analysis, including population stratification and adjustment for racial and ethnic classifications, adjustment for multiple covariates, and the influence of extreme outlying observations. Additionally, we propose a statistic that provides significance for trans bands (i.e., genetic markers that harbor a large number of eQTL) without the computational intensity of permutation testing. Most methods that identify a significance threshold for trans band activity either use simple binning approaches or rely on complex statistical methods that may require many assumptions and restrictions. We use a parametric approach that relies on known distributions and simple approximations to develop a significance threshold. The advantages of our methods are that they account for correlation structures in the gene expression data and for correlation between genetic markers. Also, by using a parametric approach, we do not rely on permutation testing, which can be computationally daunting for even modestly sized studies.
In the second part, we focus on multiple testing in genetic applications. We study family-wise error control by quantifying the probability that our test statistic crosses a defined threshold. The existing methods that employ this technique leave room for adjustments and modifications that allow for use in a variety of situations. We also explore the idea of considering discoveries as clumps of genetic markers instead of individual markers. By considering a clump as a single discovery, we can redefine the false discovery rate in terms of clumps rather than single hypotheses. Additionally, we provide some modifications to better model complex correlation structures and to handle situations in which limited information on the markers is available.

Date of publication

August 2010

DOI

https://doi.org/10.17615/hqfa-sp02

Resource type

Dissertation

Rights statement

In Copyright

Note

"... in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Biostatistics."

Advisor

Wright, Fred A.

Language

English

Publisher

University of North Carolina at Chapel Hill

Place of publication

Chapel Hill, NC

Access right

Open access

Date uploaded

March 18, 2013

Items

Thumbnail	Title	Date Uploaded	Visibility	Actions
		2019-04-11	Public	Download