Affiliation: Gillings School of Global Public Health, Department of Biostatistics
In many epidemiological studies, biomarkers and survival endpoints are collected to investigate the association between risk factors and disease incidence. The risk of disease may change when a certain risk factor exceeds a threshold. Finding this threshold value is important for individual risk prediction and disease prevention. In this dissertation, we develop semiparametric statistical approaches for the change point in both univariate and clustered survival data. Family studies are very popular in public health research because of their cost-efficiency in data collection. In Chapter 3, we propose a Cox-type marginal hazards model with an unknown change point for clustered event data. The marginal pseudo-partial likelihood functions are maximized to estimate the change point. We develop a supremum test and an $m$ out of $n$ bootstrap to make inferences about the change point. We establish the consistency and asymptotic distributions of the proposed estimators. We evaluate the small-sample performance of the proposed methods via extensive simulation studies. Finally, the Strong Heart Family Study dataset is analyzed to illustrate the methods. To improve performance in identifying high-risk individuals, we propose a change hyperplane model, which extends the change point model to a linear combination of multiple risk factors. We develop the Cox proportional hazards model and the Cox-type marginal hazards model with a change hyperplane for univariate and clustered event data in Chapters 4 and 5, respectively. To ensure identifiability, two different sets of criteria are shown to be equivalent and are proved to guarantee identifiability. A two-step procedure that uses a genetic algorithm is applied to maximize the partial or pseudo-partial likelihood function. We use the $m$ out of $n$ bootstrap to construct confidence intervals for the change hyperplane parameters.
Supremum tests based on score and robust score statistics are used to test for the existence of a change hyperplane in univariate and clustered data, respectively. The asymptotic properties of the proposed estimators are derived in both cases. The performance of the proposed approach is demonstrated via simulation studies and analyses of the Cardiovascular Health Study and the Strong Heart Family Study.