Missing Data in Non-parametric Tests of Correlated Data

Last Modified
  • March 22, 2019
  • Howard, Annie G.
    • Affiliation: Gillings School of Global Public Health, Department of Biostatistics
  • Many public health studies are designed to test for differences in repeated measurements. Measurements on the same subject are not independent and therefore analysis methods must take correlation into account. A number of tests have been developed to analyze this type of data. Two prominent non-parametric methods, used often when one is not willing to make any distributional assumptions about the data, include Friedman's test and a variation on Friedman's test proposed by Koch and Sen that requires no assumptions to be made about the correlation between measurements. While both tests require complete and balanced data, in many studies missing data can arise for a variety of reasons. Researchers have developed a number of methods to adapt Friedman's test to situations involving missing data when it can be assumed that the missing data are missing completely at random. We propose applying these same adjustments to the test statistic proposed by Koch and Sen to adapt this test to deal with data that are missing completely at random. This method involves using the sum of the reduced ranks, rather than the average rank, across all subjects to allow for meaningful comparisons across subjects. An inflation factor is used to ensure the missing data do not result in a substantial loss of power. The assumption that the data are missing completely at random is often too strict an assumption for correlated data. Often the reason for the data to be missing is directly related to the outcome values. A new strategy is proposed for adjusting both Friedman's test and Koch and Sen's test to informative missing data scenarios. The method put forth in this paper involves the use of single imputation to impute missing rank values along with a weighting scheme which assigns smaller weight to individuals with more missing data.
Guidelines and suggestions are put forward as to when this new method would be preferred to the method currently used to address problems with missing completely at random data.
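For orientation, the classical Friedman statistic that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the standard textbook form for complete, balanced data only; it is not the adjusted statistic developed in the dissertation, and it does not cover the Koch and Sen variation, the reduced-rank adjustment, imputation, or weighting. The function names `midranks` and `friedman_statistic` are hypothetical, and no tie correction is applied.

```python
def midranks(values):
    """Rank values from 1..k, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the value at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for m in range(i, j + 1):
            ranks[order[m]] = avg
        i = j + 1
    return ranks


def friedman_statistic(data):
    """Classical Friedman chi-square statistic (assumes complete, balanced data).

    data: list of n subjects, each a list of k repeated measurements.
    Ranks are computed within each subject; no tie correction is applied.
    """
    n = len(data)
    k = len(data[0])
    if any(len(row) != k for row in data):
        raise ValueError("every subject needs all k measurements")
    # Sum the within-subject ranks for each of the k measurement occasions.
    rank_sums = [0.0] * k
    for row in data:
        for j, r in enumerate(midranks(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
```

With made-up data in which all four subjects order three measurements identically, `friedman_statistic([[1, 2, 3]] * 4)` reaches the maximum possible value n(k - 1) = 8. The length check shows why missing measurements are a problem for the unadjusted test: a subject with fewer than k values cannot be ranked on the same scale as the others.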
Rights statement
  • In Copyright
  • ... in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Biostatistics.
  • Bangdiwala, Shrikant
Degree granting institution
  • University of North Carolina at Chapel Hill
