Computational Tools to Aid the Design and Development of a Genetic Reference Population

Last Modified
  • March 19, 2019
Creator
  • Welsh, Catherine
    • Affiliation: College of Arts and Sciences, Department of Computer Science
Abstract
  • Model organisms are important tools in biological and medical research. A key component of a genetic model organism is a known and reproducible genome. In the early 1900s, geneticists developed methods for fixing genomes by inbreeding. First-generation genetic models used inbreeding to create disease models from animals with spontaneous or stimulated mutations. Recently, geneticists have begun to develop a second generation of models that better represent the diversity of the human population. One such model is the Collaborative Cross (CC), a mouse model derived from eight founders. I have been involved in developing the CC since its early stages. In particular, I am interested in speeding up the inbreeding process, which currently takes an average of thirty-six generations to achieve complete fixation. To that end, I developed a simulator that replicates the breeding process and used it to test various breeding strategies before applying them to the CC. Applying the simulation techniques to live mice required a fast, low-cost way to monitor their genomes at each generation. As a result, two genotyping arrays were designed: MUGA, a first-generation array with 7,851 markers, and MegaMUGA, a second-generation array with 77,800 markers. Both arrays were designed specifically to be maximally informative for the CC population. Using these genotyping arrays, one can determine from which of the eight CC founders each part of a developing mouse line's genome is inherited. I refer to these as haplotype reconstructions, and they serve as the input to my simulations as well as to various other monitoring tools. To determine the accuracy of these haplotype reconstructions, I used DNA sequencing data for three samples that were also genotyped, and compared the haplotype reconstructions from the DNA sequencing data to those from the genotyping array data.
Date of publication
Resource type
Rights statement
  • In Copyright
Advisor
  • Valdar, William
  • Pardo-Manuel de Villena, Fernando
  • Prins, Jan
  • McMillan, Leonard
  • Wang, Wei
Degree
  • Doctor of Philosophy
Degree granting institution
  • University of North Carolina at Chapel Hill Graduate School
Graduation year
  • 2014
Place of publication
  • Chapel Hill, NC
  • There are no restrictions to this item.
