Methods for Adapting Global Mass Spectrometry Based Metabolomics to the Clinical Environment

Last Modified
  • March 22, 2019
  • Wulff, Jacob
    • Affiliation: Gillings School of Global Public Health, Department of Biostatistics
  • Metabolomics is a maturing field with successful applications in research areas such as biomarker discovery and mechanisms of disease. Because it can profile hundreds or even thousands of biochemicals simultaneously, many of which are also used in laboratory diagnostics, the technology has the potential to replace a battery of clinical tests with a single test. However, the current state of global analysis presents several challenges for the clinical environment. This dissertation addresses two of these challenges. The first is the handling of missing values when comparing an individual sample against a reference population. The second is the semi-quantitative nature of liquid chromatography-mass spectrometry (LC-MS). The first paper explores basic properties of metabolites, specifically the statistical distribution of metabolite concentrations and the correlation between them. In human sample sets covering three different sample materials appropriate for clinical testing, raw ion counts are shown to be strongly non-normal, with a consistently heavy right skew. Natural log-transformation is effective at removing this skewness and inducing Gaussian behavior, though departures from normality may persist in the tails of the distributions. After removing artifact-related features, correlation between library-matched metabolites is also shown to be only moderate in most cases. In the second paper, the log transformation is used to account for missing values when estimating population parameters of a reference cohort. Missing values are largely attributed to the true level falling below the detection limit of the instrument. Combining this assumption with the Gaussian model leads to two parametric approaches for estimating population parameters. Using a combination of simulations and real metabolomic datasets, these methods are shown to outperform the imputation approaches that are standard in the field.
The third paper addresses merging multiple global LC-MS metabolomic datasets of the same biological sample type. Typical normalization methods meant to account for sample-to-sample variation are presented and compared with alternative approaches that use technical replicates and within-batch scaling. Concentrations from targeted analysis of eight clinical biomarkers are used to show the superiority of these alternative approaches.
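The idea behind the second paper (a Gaussian model on the log scale, with missing values treated as falling below a detection limit) can be illustrated with a minimal sketch. This is not the dissertation's own implementation: the abstract does not say which two parametric estimators are used, so the sketch assumes one generic option, maximum likelihood for a left-censored normal distribution fitted by expectation-maximization. The function name `fit_left_censored_normal`, the single known detection limit, and the example ion counts are all hypothetical.

```python
import math


def _pdf(z):
    # Standard normal density.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    # Standard normal cumulative distribution function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fit_left_censored_normal(observed, n_censored, limit, iters=500, tol=1e-10):
    """Estimate (mu, sigma) of a normal distribution by EM.

    observed   : log-scale values that were detected
    n_censored : number of missing values, ASSUMED to lie below `limit`
    limit      : detection limit on the log scale (hypothetical, assumed known)
    """
    n = len(observed) + n_censored
    s1 = sum(observed)
    s2 = sum(x * x for x in observed)
    # Start from the naive estimates that ignore the missing values.
    mu = s1 / len(observed)
    sigma = math.sqrt(max(s2 / len(observed) - mu * mu, 1e-12))
    for _ in range(iters):
        a = (limit - mu) / sigma
        lam = _pdf(a) / max(_cdf(a), 1e-300)
        # E-step: first and second moments of a value known only to be < limit.
        e1 = mu - sigma * lam
        e2 = mu * mu + sigma * sigma - sigma * (limit + mu) * lam
        # M-step: update the parameters using the completed moments.
        new_mu = (s1 + n_censored * e1) / n
        new_var = (s2 + n_censored * e2) / n - new_mu * new_mu
        new_sigma = math.sqrt(max(new_var, 1e-12))
        converged = abs(new_mu - mu) < tol and abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return mu, sigma


# Hypothetical raw ion counts for one metabolite; two further samples
# are missing and assumed to fall below a detection limit of 500 counts.
raw_counts = [1200.0, 3400.0, 560.0, 9800.0, 2100.0, 760.0]
log_counts = [math.log(c) for c in raw_counts]
mu_hat, sigma_hat = fit_left_censored_normal(log_counts, 2, math.log(500.0))
```

Because the censored values are accounted for rather than dropped, the estimated mean sits below the naive mean of the detected values, which is the direction of correction one would expect when missingness is driven by a detection limit.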
Date of publication
Resource type
Rights statement
  • In Copyright
  • Edwards, Lloyd
  • Cai, Jianwen
  • Hudgens, Michael
  • Suchindran, Chirayath
  • McDunn, Jonathan
  • Doctor of Public Health
Degree granting institution
  • University of North Carolina at Chapel Hill Graduate School
Graduation year
  • 2018
