Affiliation: Gillings School of Global Public Health, Department of Biostatistics
The era of big data has brought both challenges and opportunities for biomedical decision making. New technologies make it possible to collect large data sets that can be used to tailor treatment. In this dissertation, we develop machine learning methods for data-driven biomedical decision making with the goal of leveraging patient heterogeneity to make more precise treatment decisions.
Many problems in biomedical decision making can be expressed as classification problems. The relative costs of false positives and false negatives differ across application domains, and the resulting trade-off is often displayed using a receiver operating characteristic (ROC) curve. In the first chapter, we develop an ROC curve estimator using a weighted support vector machine (SVM).
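As a rough illustration of how a weighted SVM can trace out an ROC curve, one generic formulation is sketched below. The notation (labels $y_i \in \{-1, 1\}$, weight $w$, function class $\mathcal{F}$, penalty $\lambda$) is assumed here for illustration and is not necessarily the estimator developed in the chapter.

```latex
\[
\hat{f}_w = \arg\min_{f \in \mathcal{F}} \;
\frac{1}{n} \sum_{i=1}^{n}
\bigl\{ w\, 1(y_i = 1) + (1 - w)\, 1(y_i = -1) \bigr\}
\bigl\{ 1 - y_i f(x_i) \bigr\}_{+}
+ \lambda \lVert f \rVert^{2},
\qquad w \in (0, 1).
\]
```

Under this sketch, each weight $w$ yields a classifier $\mathrm{sign}\{\hat{f}_w(x)\}$. Larger $w$ penalizes false negatives more heavily, so both the true positive rate and the false positive rate tend to increase with $w$. Plotting the pair of rates over a grid of $w$ values gives an estimate of the ROC curve.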
Precision medicine is the paradigm of incorporating individual patient factors into treatment decisions. It is formalized through individualized treatment regimes (ITRs), which are maps from the covariate space into the treatment space. The optimal ITR is defined as the regime that maximizes the mean of a clinical outcome. The estimation of an optimal ITR from mobile health data is complicated by the fact that observations occur over time at a fine granularity and there is no definite time horizon. In the second chapter, we develop an estimation method for optimal ITRs in mobile health.
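For a single treatment decision, the definition of the optimal ITR can be written compactly as follows. The symbols are assumed for illustration: $\mathcal{X}$ is the covariate space, $\mathcal{A}$ the treatment space, and $Y^{*}(d)$ the potential outcome under regime $d$, with larger values taken to be better. The mobile health setting described above, with no definite time horizon, requires a different notion of value that is not reproduced here.

```latex
\[
d : \mathcal{X} \to \mathcal{A},
\qquad
V(d) = E\{ Y^{*}(d) \},
\qquad
d^{\mathrm{opt}} = \arg\max_{d} V(d).
\]
```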
Clinical decision making often requires balancing trade-offs between multiple outcomes while accounting for patient preferences. This creates a disconnect with the traditional definition of the optimal ITR, which is stated in terms of a single outcome. If an instrument to elicit patient preferences is available, one can construct a composite outcome to identify the optimal ITR. However, such an instrument is often unavailable. In the third chapter, we introduce a method for estimating a composite outcome from treatment decisions in observational data under the assumption that clinicians act approximately optimally.
Direct search methods, such as outcome weighted learning (OWL), estimate the optimal ITR by maximizing an inverse probability weighted estimator (IPWE) of the mean outcome over a class of ITRs. In the final chapter, we show that the IPWE objective function is a profile log-likelihood for a semiparametric model. We use this characterization to develop efficient computational algorithms and exploratory data analysis techniques for estimating the optimal ITR.
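To make the direct search idea concrete, the following minimal sketch computes an IPWE of the mean outcome under a candidate regime and maximizes it over a small class of threshold rules. Everything here is assumed for illustration: the function names `ipwe_value` and `best_threshold`, the toy data, the known propensity of 0.5, and the grid search, which stands in for the surrogate-loss optimization used by OWL and is not the algorithm developed in the chapter.

```python
def ipwe_value(rule, covariates, treatments, outcomes, propensities):
    """Inverse probability weighted estimate of the mean outcome under `rule`.

    propensities[i] is the probability of the treatment actually received,
    P(A = treatments[i] | X = covariates[i]), assumed known here.
    """
    total = 0.0
    for x, a, y, p in zip(covariates, treatments, outcomes, propensities):
        # Only patients whose observed treatment agrees with the rule contribute.
        if rule(x) == a:
            total += y / p
    return total / len(outcomes)


def best_threshold(thresholds, covariates, treatments, outcomes, propensities):
    """Grid search over rules of the form 'treat (1) if x > t, else -1'."""
    best_t, best_v = None, float("-inf")
    for t in thresholds:
        v = ipwe_value(lambda x, t=t: 1 if x > t else -1,
                       covariates, treatments, outcomes, propensities)
        if v > best_v:
            best_t, best_v = t, v
    return best_t, best_v


# Hypothetical toy data: one covariate, treatments coded as -1 / 1.
covariates = [0.2, 0.8, 0.5, 0.9]
treatments = [-1, 1, 1, -1]
outcomes = [1.0, 2.0, 3.0, 4.0]
propensities = [0.5, 0.5, 0.5, 0.5]  # e.g. a 1:1 randomized trial

value_at_half = ipwe_value(lambda x: 1 if x > 0.5 else -1,
                           covariates, treatments, outcomes, propensities)
chosen_t, chosen_v = best_threshold([0.1, 0.5, 0.85],
                                    covariates, treatments, outcomes, propensities)
```

With real data the propensities would typically be estimated in observational studies, and the class of regimes would be richer than a one-dimensional threshold; the sketch is only meant to show what objective a direct search method maximizes.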