Class-Sensitive Principal Components Analysis

Last Modified
  • March 19, 2019
  • Miao, Di
    • Affiliation: College of Arts and Sciences, Department of Statistics and Operations Research
  • Research in a number of fields requires the analysis of complex datasets. Principal Components Analysis (PCA) is a popular exploratory method. However, it is driven entirely by variation in the dataset and does not use any predefined class label information. Linear classifiers make up a family of popular discrimination methods. However, these often suffer from the data piling issue as the dimension of the dataset increases. In this dissertation, we first study the geometric representation of a dataset with strongly auto-regressive errors under the High Dimensional Low Sample Size (HDLSS) setting, and explain why Maximal Data Piling (MDP), proposed by Ahn et al. (2007), gives the best classification performance compared with several other commonly used linear discrimination methods. We then introduce Class-Sensitive Principal Components Analysis (CSPCA), a compromise between PCA and MDP that seeks new direction vectors for better class-sensitive visualization. The method is applied to the Thyroid Cancer dataset (see Agrawal et al. (2014)). Additionally, we investigate the asymptotic behavior of the sample and population MDP normal vectors and Class-Sensitive Principal Component directions under the HDLSS setting. Finally, we introduce the multi-class version of CSPCA (MCSP).
Date of publication
Resource type
Rights statement
  • In Copyright
  • Liu, Yufeng
  • Bair, Eric
  • Fine, Jason
  • Marron, James Stephen
  • Nobel, Andrew
  • Doctor of Philosophy
Degree granting institution
  • University of North Carolina at Chapel Hill Graduate School
Graduation year
  • 2015
Place of publication
  • Chapel Hill, NC
  • There are no restrictions to this item.
