High level integration of genomic data for improving prediction, prognostication and classification of breast tumors

Last Modified
  • March 20, 2019
  • Weigman, Victor John
    • Affiliation: College of Arts and Sciences, Department of Biology
  • The explosion of genomic data has forever transformed the way we approach our understanding of diseases and their effect on human systems, shifting the paradigm of discovery from a paucity of biological data to an overload. Within a decade our understanding of breast cancer has grown from one disease with a few regimented treatments to several well-characterized subtypes of well-understood pathology and clinical significance. Gene expression microarrays prompted a wave of characterization studies that have now produced hundreds of molecular portraits profiling specific clinical outcome groups, most of which contain genes strongly associated with breast cancer. Only a few of these gene signatures are robust across datasets, hinting at overfitting of selection sets and sample selection bias. This subjective selection of gene/sample pairings was addressed through a novel biclustering method, Large Average Submatrices (LAS), by using objective statistical assumptions to recreate robust expression signatures. However, the identification of specific interacting partners and downstream events is either rare or unconfirmed in these studies. The identification of Copy Number Aberrations (CNAs) is a vital step in linking the addition of genomic health to specific clinical states in these subtypes. To this end, SWITCH (SupWald Identification of dna copy CHanges) was developed to call CNAs in aCGH platforms and associate them with subtypes. Subtypes displayed novel CNA profiles, most notably in Basal tumors, whose disruptions of key DNA damage response genes (BRCA1, WEE1 and RAD-complex genes) were conserved in suitably paired mouse models. Identifying pathway states in subtypes is necessary for individualized therapies, and this work shows the effects of such therapies as well as suggesting new pathway targets. Following EGFR inhibition, 90% of Basal tumors showed higher pathway activity than other subtypes, which suggests increased sensitivity. Analysis of conserved expression and CNA patterns in C(3)-Tag mice showed an overwhelming activation of PI3K-mediated AKT survival in Basal tumors across species, a feature of these tumors not previously seen. The work presented here demonstrates how to validate biological hypotheses in a genomic era and the power of data integration to highlight biological mechanisms that lie hidden under the volumes of modern genomic data.
Date of publication
Resource type
Rights statement
  • In Copyright
  • "... in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Biology and the program in Bioinformatics and Computational Biology."
  • Dangl, Jeffery L.
Place of publication
  • Chapel Hill, NC
  • Open access
