Non-parametric and semi-parametric methods for parsimonious statistical learning with complex data

Last Modified
  • March 19, 2019
  • Dasgupta, Sayan
    • Affiliation: Gillings School of Global Public Health, Department of Biostatistics
  • In clinical research, non-parametric and semi-parametric methods are gaining importance as statistical tools for drawing inferences from accumulated data. They require fewer assumptions and apply more widely than the corresponding parametric methods. Because they are robust, some statisticians see these methods as leaving less room for improper use and misunderstanding. In this dissertation we study some of these non-parametric and semi-parametric methods in statistical learning and their applications to various areas of biomedical research. In the first part of the dissertation, we study the application of temporal process regression to medical adherence. Adherence refers to the act of conforming to the provider's recommendations with respect to the timing, dosage, and frequency of medication taking. Here we assess the effect of drug adherence in a study of viral resistance to antiviral therapy for chronic Hepatitis C. We use Temporal Process Regression (Fine, Yan, and Kosorok 2004) to model adherence as a longitudinal predictor of sustained virologic response (SVR). We show that adherence has a significant effect on SVR, and that this analysis can serve as an archetype for more statistically efficient analyses of medical adherence in studies where the common practice until now has been to report summary statistics. In the second part of the dissertation, we develop an approach to feature elimination in support vector machines based on recursive elimination of features. We present theoretical properties of this method and show that it is uniformly consistent in finding the correct feature space under certain generalized assumptions. We present case studies showing that the assumptions are met in most practical situations, and simulation studies demonstrating the performance of the proposed approach. In the third part of the dissertation, we turn our attention to feature selection in Q-learning. Here we discuss three different methods for feature selection, all based on the same central idea of feature screening through ranking in a sequential backward selection scheme. We discuss the applicability of the methods, reason from heuristics stemming from our earlier work on feature selection in support vector machines, and give results showing their performance in various simulated settings.
Date of publication
Resource type
Rights statement
  • In Copyright
  • Zeng, Donglin
  • Truong, Kinh
  • Kosorok, Michael
  • Goldberg, Yair
  • Esserman, Denise
  • Cole, Stephen
  • Doctor of Philosophy
Degree granting institution
  • University of North Carolina at Chapel Hill Graduate School
Graduation year
  • 2014
Place of publication
  • Chapel Hill, NC
  • This item is restricted from public view for 1 year after publication.
