Data Representation and Basis Selection to Understand Variation of Function Valued Traits

Last Modified
  • March 20, 2019
  • Gaydos, Travis L.
    • Affiliation: College of Arts and Sciences, Department of Statistics and Operations Research
  • Many fields, including evolutionary biology, collect data in which a curve corresponds to each individual. The curve is then the statistical atom of analysis, and the area of statistics that studies such data is known as Functional Data Analysis (FDA). A common goal in FDA is to understand the variation among curves, and Principal Components Analysis (PCA) is often a useful tool for this. However, PCA often yields undesirable results when the amounts of variation explained by different directions are not significantly different. Directions of low variation often do not explain significantly differing amounts of variation, so in subspaces of low variation it is difficult to separate biological signal from noise using variation measures. This dissertation shows a way to separate biological signal from noise by quantifying the simplicity structure of curves in a subspace of low variation. It also develops asymptotic properties of subspaces of low variation and of subspaces of biological signal. The results of PCA also depend heavily on how the data are represented. This dissertation produces a representation of data curves, similar to those used in shape statistics, by exploiting the developmental stages of insects. With this representation, variation that is usually most efficiently modeled by non-linear methods under typical grid-based FDA representations can be modeled using linear PCA. The dissertation also presents a method for simultaneously visualizing multiple t-tests to understand the slope structure of data sets.
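As a rough illustration of the grid-based PCA setting the abstract describes, the following minimal sketch simulates curves on a common grid and computes principal directions from the singular value decomposition of the centered data matrix. It is not the dissertation's method: the function name `curve_pca`, the two sine-shaped modes, the score scales, and the noise level are all assumptions chosen for illustration.

```python
import numpy as np

def curve_pca(curves):
    """PCA of curves sampled on a common grid (rows = individuals)."""
    centered = curves - curves.mean(axis=0)
    # SVD of the centered data matrix; rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variances = s**2 / (curves.shape[0] - 1)
    return vt, variances, variances / variances.sum()

# Simulated data (assumed, not from the dissertation): two smooth modes plus noise.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
n_curves = 200
phi1 = np.sqrt(2) * np.sin(np.pi * grid)       # hypothetical high-variation mode
phi2 = np.sqrt(2) * np.sin(2 * np.pi * grid)   # hypothetical lower-variation mode
scores = rng.normal(size=(n_curves, 2)) * np.array([3.0, 1.0])
noise = rng.normal(scale=0.2, size=(n_curves, grid.size))
curves = scores @ np.vstack([phi1, phi2]) + noise

directions, variances, ratio = curve_pca(curves)
print("share of variation, first four directions:", np.round(ratio[:4], 4))
```

In this simulation the first two directions carry clearly different shares of variation, while the remaining directions carry small, similar shares, which is the situation in which variation measures give little help in separating signal from noise.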
Date of publication
Resource type
Rights statement
  • In Copyright
  • Marron, James Stephen
  • Open access
