Bioinformatics Tools for Exploring Regulatory Mechanisms

Gene expression is the fundamental initial step in the flow of genetic information in biological systems, and it is controlled by multiple precisely coordinated regulatory mechanisms, including structural and epigenetic regulation. Dysregulation of gene expression plays an important role in the development of a broad range of diseases. Modern high-throughput technologies provide unprecedented opportunities to investigate these diverse regulatory mechanisms on a genome-wide scale. Here we develop several methods to analyze these omics profiles. First, Hi-C experiments generate genome-wide contact frequencies between pairs of loci by sequencing DNA segments ligated from loci in close spatial proximity. To detect biologically meaningful interactions between loci, we propose a hidden Markov random field (HMRF) based Bayesian method that rigorously models interaction probabilities in two-dimensional space based on the contact frequency matrix. By borrowing information from neighboring locus pairs, our method demonstrates superior reproducibility and statistical power in both simulation studies and real data analysis. Second, DNA methylation is a key epigenetic mark involved in both normal development and disease progression. To facilitate joint analysis of methylation data from multiple platforms with varying resolution, we propose a penalized functional regression model to impute missing methylation data. By incorporating functional predictors, our model uses information from non-local probes to improve imputation quality. We compared the performance of our functional model with that of linear regression and of the best single-probe surrogate, both in real data and in simulations, and our method showed higher imputation accuracy. A simulated association study further demonstrated that our method substantially improves the statistical power to identify trait-associated methylation loci in epigenome-wide association studies (EWAS).
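As a rough illustration of the imputation comparison described above, the following minimal sketch imputes one unmeasured site from a set of measured probes and contrasts the result with a best single-probe surrogate. It is not the thesis's model: plain ridge regression is substituted for the penalized functional regression, the data are synthetic, and all names (`ridge_fit`, `best_single_probe`, `true_weights`, the penalty `lam=1.0`) are assumptions made for this example.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge regression; the intercept is not penalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    intercept = y_mean - x_mean @ beta
    return beta, intercept


def best_single_probe(X, y):
    """Index of the probe with the largest absolute correlation with y."""
    corr = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    return int(np.argmax(corr))


# Synthetic stand-in data (an assumption, NOT data from the thesis):
# rows are samples, columns are measured probes, y is the site to impute.
rng = np.random.default_rng(0)
n_train, n_test, n_probes = 200, 100, 20
X = rng.normal(size=(n_train + n_test, n_probes))
true_weights = np.exp(-np.arange(n_probes) / 5.0)  # influence decays across probes
y = X @ true_weights + rng.normal(scale=0.5, size=n_train + n_test)

X_train, X_test = X[:n_train], X[n_train:]
y_train, y_test = y[:n_train], y[n_train:]

# Multi-probe imputation with an L2 penalty (stand-in for the functional model).
beta, intercept = ridge_fit(X_train, y_train, lam=1.0)
mse_ridge = np.mean((X_test @ beta + intercept - y_test) ** 2)

# Baseline: univariate least-squares fit on the single most correlated probe.
j = best_single_probe(X_train, y_train)
slope, offset = np.polyfit(X_train[:, j], y_train, 1)
mse_single = np.mean((slope * X_test[:, j] + offset - y_test) ** 2)

print(f"ridge MSE: {mse_ridge:.3f}, single-probe MSE: {mse_single:.3f}")
```

In this toy setting the target depends on many probes, so pooling information across them gives a lower held-out error than any single surrogate. A functional model would additionally constrain the coefficients to vary smoothly with genomic position, which this sketch does not attempt.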
Finally, we applied an integrative analysis to characterize molecular systems associated with hepatocellular carcinoma (HCC). Dysregulation of inflammation-related genes plays a pivotal role in the development of HCC. We performed array-based analyses to comprehensively investigate the contributions of DNA methylation and somatic copy number aberration (SCNA) to the aberrant expression of inflammation-related genes in 30 HCCs and paired non-tumor tissues. The results were validated in public datasets and in an additional sample set of 47 paired HCCs and non-tumor tissues. We found that DNA methylation and SCNA together accounted for less than 30% of the aberrant expression of inflammation-related genes, suggesting that other molecular mechanisms might play a major role in the dysregulation observed in HCCs.