Integrated Analysis of Multiple Data Sets With Biomedical Applications

Last Modified
  • March 19, 2019
Creator
  • Li, Gen
    • Affiliation: College of Arts and Sciences, Department of Statistics and Operations Research
Abstract
  • It is increasingly common to have measurements from multiple platforms on the same set of samples in modern biomedical sciences. In this dissertation, we develop novel methodologies for integrated analysis of multiple data sets. In particular, we devise a supervised principal component analysis framework that achieves dimension reduction of the primary data with guidance from an auxiliary data set. It extracts accurate and interpretable low-rank structures that are potentially driven by the auxiliary information. We further extend the method to accommodate special features of data such as functionality and high dimensionality through regularization. Numerical examples demonstrate that the proposed methodologies have clear advantages over existing methods. In addition, we develop a Bayesian hierarchical model for multi-tissue eQTL analysis. It exploits shared information in multiple tissues to increase the power of eQTL discovery and improve tissue specificity assessment. The method has been adopted by the Genotype-Tissue Expression (GTEx) consortium and successfully applied to the nine-tissue pilot data.
Rights statement
  • In Copyright
Advisor
  • Zhang, Kai
  • Wright, Fred A.
  • Marron, James Stephen
  • Nobel, Andrew
  • Shen, Haipeng
Degree
  • Doctor of Philosophy
Degree granting institution
  • University of North Carolina at Chapel Hill Graduate School
Graduation year
  • 2015
Place of publication
  • Chapel Hill, NC
Access
  • There are no restrictions to this item.