Marginalized two-part models for semicontinuous data with application to medical costs

Smith, Valerie

Last Modified

March 19, 2019

Creator

Smith, Valerie
- Affiliation: Gillings School of Global Public Health, Department of Biostatistics

Abstract

In health services research, it is common to encounter semicontinuous data characterized by a point mass at zero followed by a right-skewed continuous distribution with positive support. Examples include health expenditures, in which the zeros represent a subpopulation of patients who do not use health services, while the continuous distribution describes the level of expenditures among health services users. Semicontinuous data are typically analyzed using two-part mixture models that separately model the probability of health services use and the distribution of positive expenditures among users. However, because the second part conditions on a nonzero response, conventional two-part models do not provide a marginal interpretation of covariate effects on the overall population of health service users and non-users, even though this is often of greatest interest to investigators. Here, we propose a marginalized two-part model that yields more interpretable effect estimates in two-part models by parameterizing the model in terms of the marginal mean. This model maintains many of the important features of conventional two-part models, such as capturing zero-inflation and skewness, but allows investigators to examine covariate effects on the overall marginal mean, a target of primary interest in many applications. Using a simulation study, we examine properties of the maximum likelihood estimators from this model. We illustrate the approach by evaluating the effect of a behavioral weight loss intervention on health care expenditures in the Veterans Affairs (VA) health care system. We then extend this marginalized two-part model to clustered or longitudinal data structures by incorporating random effects. This longitudinal marginalized two-part model is fit following a fully Bayesian approach with non-informative or weakly informative prior distributions, and we illustrate it by analyzing the effect of a copayment increase in the VA health system. 
Finally, using simulation studies, we compare the performance of the marginalized two-part model to commonly used one-part generalized linear models (GLMs) fit via quasi-likelihood estimation over a range of simulated data scenarios with varying percentages of zero-valued observations.
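
The parameterization described in the abstract can be illustrated with a minimal sketch. The abstract does not state the link functions or the distribution of the positive part, so the choices below (a logit link for the probability of use, a log link for the marginal mean, and a log-normal distribution for positive expenditures) are assumptions made for illustration, as are all function names and parameter values. The sketch shows that when the log-scale location of the positive part is set to log(nu) - log(pi) - sigma^2/2, the overall mean pi * E[Y | Y > 0] equals nu = exp(x'beta), so exp(beta) can be read as a multiplicative effect on the mean of the whole population, users and non-users together.

```python
import numpy as np

def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-t))

def mtp_lognormal_parts(X, alpha, beta, sigma2):
    """Components of a marginalized two-part model (illustrative assumptions:
    logit link for use, log link for the marginal mean, log-normal positives).

    Returns
    pi     : P(Y > 0), probability of any expenditure
    nu     : E[Y], marginal mean over users and non-users
    mu_log : log-scale location of Y given Y > 0, chosen so pi * E[Y | Y > 0] = nu
    """
    pi = expit(X @ alpha)
    nu = np.exp(X @ beta)
    mu_log = np.log(nu) - np.log(pi) - sigma2 / 2.0
    return pi, nu, mu_log

def simulate(X, alpha, beta, sigma2, rng):
    """Draw semicontinuous outcomes: zero with probability 1 - pi, else log-normal."""
    pi, _, mu_log = mtp_lognormal_parts(X, alpha, beta, sigma2)
    is_user = rng.random(len(pi)) < pi
    positive_draws = rng.lognormal(mean=mu_log, sigma=np.sqrt(sigma2))
    return np.where(is_user, positive_draws, 0.0)

# Hypothetical values: intercept plus one binary covariate (e.g. an intervention flag).
X = np.array([[1.0, 0.0],
              [1.0, 1.0]])
alpha = np.array([0.5, -0.3])   # coefficients for the probability of use
beta = np.array([7.0, 0.2])     # coefficients for the log marginal mean
sigma2 = 1.0                    # log-scale variance of positive values

pi, nu, mu_log = mtp_lognormal_parts(X, alpha, beta, sigma2)
implied_mean = pi * np.exp(mu_log + sigma2 / 2.0)  # pi * E[Y | Y > 0]
mean_ratio = nu[1] / nu[0]                         # equals exp(beta[1])
```

In a conventional two-part model the second-part coefficients describe only E[Y | Y > 0], so a ratio such as `mean_ratio` would have to be recovered by combining both parts; under the assumed marginalized form it follows directly from `beta`.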

Date of publication

May 2015

DOI

https://doi.org/10.17615/4r0v-wk40

Identifier

Smith_unc_0153D_15272.pdf

Resource type

Dissertation

Rights statement

In Copyright

Advisor

Koch, Gary
Herring, Amy
Preisser, John
Maciejewski, Matthew
Neelon, Brian

Degree

Doctor of Public Health

Degree granting institution

University of North Carolina at Chapel Hill Graduate School

Graduation year

2015

Language

English

Publisher

University of North Carolina at Chapel Hill Graduate School

Place of publication

Chapel Hill, NC

Access right

There are no restrictions to this item.

Date uploaded

June 23, 2015

Items

Title	Date Uploaded	Visibility
Smith_unc_0153D_15272.pdf	2019-04-09	Public
