Integrated Analysis of Multiple Data Sets With Biomedical Applications
- Last Modified
- March 19, 2019
- Creator
- Li, Gen
- Affiliation: College of Arts and Sciences, Department of Statistics and Operations Research
- Abstract
- It is increasingly common to have measurements from multiple platforms on the same set of samples in modern biomedical sciences. In this dissertation, we develop novel methodologies for integrated analysis of multiple data sets. In particular, we devise a supervised principal component analysis framework that achieves dimension reduction of the primary data with guidance from an auxiliary data set. It extracts accurate and interpretable low-rank structures that are potentially driven by the auxiliary information. We further extend the method to accommodate special features of data such as functionality and high dimensionality through regularization. Numerical examples demonstrate that the proposed methodologies have clear advantages over existing methods. In addition, we develop a Bayesian hierarchical model for multi-tissue eQTL analysis. It exploits shared information in multiple tissues to increase the power of eQTL discovery and improve tissue specificity assessment. The method has been adopted by the Genotype-Tissue Expression (GTEx) consortium and successfully applied to the nine-tissue pilot data.
- Date of publication
- May 2015
- Rights statement
- In Copyright
- Advisor
- Wright, Fred A.
- Marron, James Stephen
- Shen, Haipeng
- Zhang, Kai
- Nobel, Andrew
- Degree
- Doctor of Philosophy
- Degree granting institution
- University of North Carolina at Chapel Hill Graduate School
- Graduation year
- 2015
- Place of publication
- Chapel Hill, NC
- Access
- There are no restrictions to this item.
- Parents:
This work has no parents.
Items
Title | Date Uploaded | Visibility |
---|---|---|
Li_unc_0153D_15483.pdf | 2019-04-09 | Public |