The use of microarray data integration to improve cancer prognosis

Last Modified
  • March 22, 2019
  • Zhang, Zhe
    • Affiliation: School of Medicine, UNC/NCSU Joint Department of Biomedical Engineering
  • Microarray is a high-throughput technology used to simultaneously measure the expression of thousands of genes in each sample. It therefore has the potential to benefit the treatment of complicated diseases like cancer. This study sought to improve the application of microarray technologies to clinical medicine in two separate but related phases. The first phase dealt with the generation of clinically valuable expression profiles from microarray data. By re-analyzing several published cancer datasets, we first confirmed that microarray data provide information about the prognosis of cancer patients beyond currently used indexes such as tumor size. At the same time, we noticed that those indexes generally confounded the correlation between gene expression and cancer outcome, so the contents of expression profiles were highly dependent on the clinical background of the sampled patients. Consequently, this study showed that integrating multiple datasets yields more general and reproducible cancer expression profiles. A novel data analysis procedure incorporating bootstrap re-sampling and training/testing validation was used to impartially compare expression profiling strategies. The results showed that after two independent datasets were integrated, the resultant expression profiles more correctly differentiated cancer patients in terms of disease outcome. The second phase of this study was to develop MAMA (Meta-Analysis of MicroArray), a data mining platform for conveniently collecting, managing, and analyzing multiple microarray datasets together. The complete MAMA system included three components: a relational database storing microarray cancer datasets; a web server providing access to the database; and a client-side application implementing data manipulation and analysis methods. MAMA had an open-source framework that allowed other developers to plug in their own data analysis methods. Moreover, it made cross-dataset analysis possible by standardizing the annotation of samples and sequences in microarray datasets.
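The abstract mentions a procedure that combines bootstrap re-sampling with training/testing validation but does not give its details. The following is only a minimal sketch of that general idea, not the dissertation's actual method. It assumes that each bootstrap sample serves as the training set, that the samples left out of it serve as the testing set, and that a nearest-centroid classifier stands in for whatever profiling strategy is being evaluated. All function names (`bootstrap_split`, `fit_centroids`, `predict`, `bootstrap_accuracy`) are hypothetical.

```python
import random
from statistics import mean


def bootstrap_split(n, rng):
    """Draw n indices with replacement (training); unused indices form the test set."""
    train = [rng.randrange(n) for _ in range(n)]
    in_bag = set(train)
    test = [i for i in range(n) if i not in in_bag]
    return train, test


def fit_centroids(X, y, idx):
    """Assumed stand-in classifier: mean expression vector per outcome class."""
    centroids = {}
    for label in {y[i] for i in idx}:
        rows = [X[i] for i in idx if y[i] == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def bootstrap_accuracy(X, y, n_rounds=200, seed=0):
    """Average test-set accuracy over bootstrap rounds."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        train, test = bootstrap_split(len(X), rng)
        if not test:
            continue  # every sample was drawn; nothing left to test on
        centroids = fit_centroids(X, y, train)
        if len(centroids) < 2:
            continue  # training sample contained a single outcome class
        correct = sum(predict(centroids, X[i]) == y[i] for i in test)
        scores.append(correct / len(test))
    return mean(scores) if scores else float("nan")
```

Under these assumptions, strategies could be compared by calling `bootstrap_accuracy` once on the expression matrix of a single dataset and once on the matrix obtained by integrating two datasets, using the same `seed` so that both are scored over the same sequence of random draws.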
Date of publication
Resource type
Rights statement
  • In Copyright
  • "... in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Biomedical Engineering."
  • Hsiao, Henry
  • Fenstermacher, David Alan
  • Doctor of Philosophy
Graduation year
  • 2006
