Affiliation: Gillings School of Global Public Health, Department of Biostatistics
Bayesian nonparametric (BNP or NP Bayes) methods have made great strides in recent years. BNP methods embody the belief that inference is best driven by the data itself, with minimal assumptions about the underlying model; this approach has motivated a wide variety of BNP techniques that have met with much success. In the first dissertation paper, we address a long-standing complaint about the nonparametric priors used in BNP analyses: that they do not necessarily reflect the analyst's prior belief or intention, and so are not really Bayesian. In fact, it can be demonstrated that a supposedly uninformative nonparametric prior framework is actually very informative about certain aspects of the distribution it models. We develop a novel method to incorporate prior information about functionals of the unknown distribution, replacing undesirable induced priors on those functionals with prior distributions that reflect real prior belief. We show that the new prior enjoys the support characteristics of the original prior, and we demonstrate with examples the effect of the marginal prior on the quality of inference. In the second and third dissertation papers, we address challenges in the analysis of high-dimensional data, with a focus on density regression. Many areas of inquiry, particularly in genetics research, are concerned with modeling a continuous physical trait as some function of a very large set of predictors. In most cases the number of predictors is much larger than the number of observations. In addition, the response to be modeled may have a nontrivial conditional distribution. In the second dissertation paper, we develop a solution to this problem in the context of uncorrelated observations and apply the technique to a problem in molecular epidemiology. In the third dissertation paper, we extend the method to address correlated observations.
We illustrate the utility of the proposed method in an application to family-based data from a whole-genome linkage analysis of a neurological condition.