Application of Novel Statistical Methods for Biomarker Selection to HIV Infection Data

Last Modified
  • March 19, 2019
  • Pierre-Louis, Bosny J.
    • Affiliation: Gillings School of Global Public Health, Department of Biostatistics
  • The past decade has seen an explosion in the availability and use of biomarker data as a result of innovative discoveries and the recent development of new biological and molecular techniques. Biomarkers are essential for at least four key purposes in biomedical research and public health practice: they are used to detect and diagnose disease, to establish prognosis, to identify patients who are most likely to benefit from selected therapies, and to guide clinical decision making. Determining the predictive and diagnostic value of these biomarkers, singly or in combination, is essential to using them effectively, and this need has spurred the development of new statistical methodologies to assess the relationship between biomarkers and clinical outcomes. One active area of research is the development of variable importance measures, a class of estimators that could reliably capture the effect of a specific biomarker on a clinical outcome. The central question addressed in this dissertation is the following: given a large set of biomarkers that potentially predict a clinical outcome, how can one determine which ones are the most important? In the first paper, we estimate a targeted variable importance measure through van der Laan's theory of targeted maximum likelihood estimation (TMLE) in the point treatment setting and use the same objective function to compute an alternative measure of marginal variable importance based on weights from a flexible propensity score model. Covariate-adjusted targeted variable importance measures are compared to estimates from this alternative methodology and to incremental value estimates from partial ROC curves. In the second paper, we extend the applicability of the TMLE methodology to analyze longitudinal repeated measures data.
The second paper addresses the lack of a generally accepted approach for generating a longitudinal variable importance index by proposing an estimator that combines TMLE with computation of the area under or above the LOESS curve. A graphical method is proposed for visually assessing how long a biomarker retains its predictive power, information that could be used to determine when repeated measures of a biomarker should be taken. Finally, in the third paper, we take right censoring of the outcome variable into account and achieve biomarker selection in the presence of confounding and potentially informative censoring through the use of stabilized weights in a time-dependent Cox proportional hazards model. A dataset from the Hormonal Contraception and HIV Genital Shedding and Disease Progression Study, which includes longitudinal HIV infection data on a sample of 306 HIV-infected adult women from Uganda and Zimbabwe, was used to develop and evaluate the methods discussed in the three papers. This study collected information on a number of biomarkers related to HIV infection, including plasma viral load, HIV subtype, CD4 and CD8 lymphocyte counts, hemoglobin level, and herpes simplex virus 2 (HSV-2). The relationships of these biomarkers with changes in CD4 cell counts were considered in three different contexts: cross-sectional, longitudinal, and survival. In short, baseline CD4 cell count, HIV subtype, and HSV-2 were found to be important biomarkers for the outcome variable studied.
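The idea of stabilized weights mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation, which relies on a flexible propensity score model and a time-dependent Cox model. Here the weights P(A = a) / P(A = a | W = w) are estimated by simple frequencies, which assumes a binary biomarker A and a single categorical confounder W. The function names and the toy data are hypothetical.

```python
from collections import Counter


def stabilized_weights(exposure, stratum):
    """Stabilized inverse-probability weights P(A=a) / P(A=a | W=w).

    Probabilities are estimated by simple frequencies, which only works
    when the confounder W is a single categorical variable (an assumption
    made here purely for illustration).
    """
    n = len(exposure)
    marginal = Counter(exposure)             # counts of A = a
    joint = Counter(zip(exposure, stratum))  # counts of (A = a, W = w)
    by_stratum = Counter(stratum)            # counts of W = w
    weights = []
    for a, w in zip(exposure, stratum):
        p_marginal = marginal[a] / n
        p_conditional = joint[(a, w)] / by_stratum[w]
        weights.append(p_marginal / p_conditional)
    return weights


def weighted_mean_difference(outcome, exposure, weights):
    """Weighted mean outcome for A = 1 minus weighted mean outcome for A = 0."""
    def wmean(level):
        num = sum(w * y for y, a, w in zip(outcome, exposure, weights) if a == level)
        den = sum(w for a, w in zip(exposure, weights) if a == level)
        return num / den
    return wmean(1) - wmean(0)


# Hypothetical toy data: binary biomarker A, binary confounder W, outcome Y.
A = [1, 1, 1, 0, 1, 0, 0, 0]
W = [1, 1, 1, 1, 0, 0, 0, 0]
Y = [5.0, 6.0, 7.0, 4.0, 3.0, 2.0, 1.0, 3.0]

sw = stabilized_weights(A, W)
effect = weighted_mean_difference(Y, A, sw)
print(sw)
print(effect)
```

In this toy data the biomarker is more common in the stratum with higher outcomes, so the weighted contrast is smaller than the unadjusted difference in means. When every combination of A and W is observed, the stabilized weights sum to the sample size, which is one reason they tend to be less variable than unstabilized weights.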
Date of publication
Resource type
Rights statement
  • In Copyright
  • "... in partial fulfillment of the requirements for the degree of DrPH in the Department of Biostatistics of the UNC Gillings School of Global Public Health."
  • Suchindran, Chirayath
Place of publication
  • Chapel Hill, NC
  • Open access
