The analysis and advanced extensions of canonical correlation analysis

Last Modified
  • March 22, 2019
  • Samarov, Daniel V.
    • Affiliation: College of Arts and Sciences, Department of Statistics and Operations Research
  • Drug discovery is the process of identifying compounds which have potentially meaningful biological activity. A problem that arises is that the number of compounds to search over can be quite large, sometimes numbering in the millions, making experimental testing intractable. For this reason, computational methods are employed to filter out those compounds which do not exhibit strong biological activity. This filtering step, also called virtual screening, reduces the search space, allowing the remaining compounds to be experimentally tested. In this dissertation I will provide an approach to the problem of virtual screening based on Canonical Correlation Analysis (CCA) and several extensions which use kernel and spectral learning ideas. Specifically, these methods will be applied to the protein-ligand matching problem. Additionally, theoretical results analyzing the behavior of CCA in the High Dimension Low Sample Size (HDLSS) setting will be provided.
Date of publication
Resource type
Rights statement
  • In Copyright
  • Marron, James Stephen
Degree granting institution
  • University of North Carolina at Chapel Hill
  • Open access
