Diagnostic Measures for Missing Covariate Data and Semiparametric Models for Neuroimaging

Last Modified
  • March 20, 2019
  • Shi, Xiaoyan
    • Affiliation: Gillings School of Global Public Health, Department of Biostatistics
  • This dissertation comprises two major topics: (a) diagnostic measures for generalized linear models (GLMs) with missing covariate data, and (b) semiparametric models for neuroimaging data. The first topic is covered in two thesis papers. In the first paper, we carry out an in-depth investigation of how to assess the influence of observations and detect model misspecification in GLMs with missing covariate data. Our diagnostic measures include case-deletion measures and conditional residuals. We use the conditional residuals to construct goodness-of-fit statistics for testing possible misspecification of the model assumptions. We develop specific strategies for incorporating missing data into the goodness-of-fit statistics in order to increase their power to detect model misspecification, and we employ a resampling method to approximate the p-values of these statistics. In the second paper, we formally set up a general local influence method for carrying out sensitivity analyses of minor perturbations to GLMs with missing covariate data. We examine two types of perturbation schemes (single-case and global) and show that the metric tensor of a perturbation manifold provides useful information for selecting an appropriate perturbation. We also develop several local influence measures to identify influential points and test for model misspecification. The second topic also consists of two thesis papers. The main objective of the first paper is to develop an adjusted exponentially tilted empirical likelihood (ETEL) procedure for the analysis of neuroimaging data. We propose a likelihood ratio statistic for testing hypotheses, construct goodness-of-fit statistics for testing possible model misspecification, and apply them to the classification of time-dependent covariates.
Our semiparametric method avoids standard parametric assumptions, and the adjustment to the ETEL method can dramatically improve finite-sample performance over the original ETEL. In the second paper, we develop a semiparametric framework for describing the variability of the medial representation (m-rep) of subcortical structures and its association with covariates in a Euclidean space. Because the elements of an m-rep do not form a vector space, classical multivariate regression techniques may be inadequate for establishing the association between an m-rep and covariates of interest. Our semiparametric model avoids specifying a probability distribution on a Riemannian manifold. We develop an estimation procedure based on the annealing evolutionary stochastic approximation Monte Carlo (AESAMC) algorithm to obtain parameter estimates, and we establish their limiting distributions. We use Wald statistics to carry out hypothesis tests.
Date of publication
Resource type
Rights statement
  • In Copyright
  • Ibrahim, Joseph
  • Open access
