Model Assessment for Models with Missing Data
Citation
MLA
Zhou, Xiaolei. Model Assessment for Models with Missing Data. Chapel Hill, NC: University of North Carolina at Chapel Hill Graduate School, 2015. https://doi.org/10.17615/9rwg-8r31
APA
Zhou, X. (2015). Model Assessment for Models with Missing Data. Chapel Hill, NC: University of North Carolina at Chapel Hill Graduate School. https://doi.org/10.17615/9rwg-8r31
Chicago
Zhou, Xiaolei. 2015. Model Assessment for Models with Missing Data. Chapel Hill, NC: University of North Carolina at Chapel Hill Graduate School. https://doi.org/10.17615/9rwg-8r31
- Last Modified
- March 19, 2019
- Creator
- Zhou, Xiaolei
- Affiliation: Gillings School of Global Public Health, Department of Biostatistics
- Abstract
- Missing data commonly occur in a variety of study settings. In this dissertation, we first investigate three likelihood-based models for missing data in longitudinal studies: mixed-effects models, pattern mixture models (PMMs), and selection models. Extensive simulations under ten missing-data mechanisms are performed, with a focus on the treatment effect. The results suggest that no model consistently outperforms the others across missing-data mechanisms. However, the PMM using the treatment-specific proportion and the selection model provide some correction of the estimate relative to the mixed-effects model in several missing-not-at-random situations, even when the missing-data mechanism does not exactly match the model assumption. Second, we focus on case-deletion diagnostic measures for general linear models (GLMs) with missing covariate data. Cook's distance is one of the most important diagnostic tools for identifying influential observations in parametric models. However, Cook's distances may not be directly comparable because their scale depends stochastically on the degree of perturbation. We define the degree of perturbation for GLMs with missing covariates. We then derive Cook's distance based on the likelihood function and compare it with Cook's distance based on the Q-function used in the EM algorithm for models with missing data. We further develop a scaled Cook's distance for GLMs with missing covariate data, which resolves the size issue of Cook's distance. Simulated data are used to illustrate the size issue in GLMs with missing covariates. The application of scaled Cook's distances in a formal influence analysis is examined in simulations and real data examples. Finally, we examine the connection between case-deletion measures and the cross-validation method for GLMs with missing covariates. Based on this connection, we develop case-deletion model complexity (CMC) measures for quantifying model complexity and case-deletion information criteria (CIC) for model selection. We develop these new measures and criteria based on the likelihood function and the Q-function, respectively. Some properties of CMC and CIC are investigated. Simulations and real data analyses show that CIC is a valuable tool for the analysis of models with missing data.
- Date of publication
- May 2015
- Rights statement
- In Copyright
- Advisor
- Bangdiwala, Shrikant
- Andrews, Elizabeth
- Li, Yun
- Sun, Wei
- Zhu, Hongtu
- Degree
- Doctor of Philosophy
- Degree granting institution
- University of North Carolina at Chapel Hill Graduate School
- Graduation year
- 2015
- Place of publication
- Chapel Hill, NC
- Access right
- There are no restrictions to this item.
- Date uploaded
- June 23, 2015
Items
Title | Date Uploaded | Visibility |
---|---|---|
Zhou_unc_0153D_15251.pdf | 2019-04-11 | Public |