Methods in literature-based drug discovery

Last Modified
  • March 21, 2019
  • Baker, Nancy C.
    • Affiliation: School of Information and Library Science
  • This dissertation implemented two literature-based methods for predicting new therapeutic uses for drugs, a task known as drug reprofiling (also called drug repositioning or drug repurposing). Both methods used data stored in ChemoText, a repository of MeSH terms extracted from Medline records and designed to support drug discovery algorithms. The first method was an implementation of Swanson's ABC paradigm that used explicit connections between disease, protein, and chemical annotations to find implicit connections between drugs and diseases that could represent new therapeutic uses. The validation approach in the ABC study divided the corpus into two segments at a cutoff year. Data from the earlier (baseline) period were used to generate the hypotheses, and data from the later period were used to validate them. Ranking approaches were used to place the likeliest drug reprofiling candidates near the top of the hypothesis set. The approaches reproduced Swanson's link between magnesium and migraine and identified other significant reprofiled drugs. The second literature-based discovery method used patterns in side effect annotations to predict drug molecular activity, specifically 5-HT6 binding and dopamine antagonism. Following a study design adopted from QSAR experiments, side effect information for chemicals with known activity was encoded as binary vectors and supplied to classification algorithms, which were trained to predict molecular activity. When the best validated models were applied to a large set of chemicals in a virtual screening step, they identified known 5-HT6 binders and dopamine antagonists based solely on side effect profiles. Both studies addressed research areas relevant to current drug discovery, and both incorporated rigorous validation steps. For these reasons, the text mining methods presented here, together with the ChemoText repository, have the potential to be adopted in the computational drug discovery laboratory and integrated into existing toolsets.
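The ABC paradigm with a year-cutoff validation, as summarized in the abstract, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record layout, the placeholder names (`drug_A`, `protein_1`, `disease_X`), the function names, and the ranking by number of shared protein annotations are hypothetical and are not taken from ChemoText or from the dissertation's actual ranking approaches.

```python
from collections import defaultdict
from itertools import product

# Invented toy records: (year, disease terms, protein terms, chemical terms).
# These are placeholders, not real ChemoText or Medline annotations.
RECORDS = [
    (1980, {"disease_X"}, {"protein_1", "protein_2"}, set()),
    (1982, set(), {"protein_1"}, {"drug_A"}),
    (1983, set(), {"protein_2"}, {"drug_A"}),
    (1984, set(), {"protein_2"}, {"drug_B"}),
    (1984, set(), {"protein_1"}, {"drug_C"}),
    (1985, {"disease_X"}, set(), {"drug_C"}),
    (1990, {"disease_X"}, set(), {"drug_A"}),
]


def build_hypotheses(records, cutoff):
    """Propose (chemical, disease) pairs linked only through shared proteins
    in the baseline period (year <= cutoff), ranked by shared-protein count.
    The ranking rule is an assumption chosen for simplicity."""
    disease_proteins = defaultdict(set)   # A -> B links
    chemical_proteins = defaultdict(set)  # C -> B links
    known_pairs = set()                   # explicit C-A links in the baseline
    for year, diseases, proteins, chemicals in records:
        if year > cutoff:
            continue
        for disease in diseases:
            disease_proteins[disease] |= proteins
        for chemical in chemicals:
            chemical_proteins[chemical] |= proteins
        known_pairs |= set(product(chemicals, diseases))

    scores = {}
    for chemical, disease in product(chemical_proteins, disease_proteins):
        shared = chemical_proteins[chemical] & disease_proteins[disease]
        if shared and (chemical, disease) not in known_pairs:
            scores[(chemical, disease)] = len(shared)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def validate(ranked, records, cutoff):
    """Flag each hypothesis that appears as an explicit co-annotation
    in the later period (year > cutoff)."""
    later_pairs = set()
    for year, diseases, _proteins, chemicals in records:
        if year > cutoff:
            later_pairs |= set(product(chemicals, diseases))
    return [(pair, score, pair in later_pairs) for pair, score in ranked]


if __name__ == "__main__":
    ranked = build_hypotheses(RECORDS, cutoff=1985)
    for pair, score, confirmed in validate(ranked, RECORDS, cutoff=1985):
        print(pair, score, confirmed)
```

In this toy setup a chemical already co-annotated with the disease in the baseline period is excluded from the hypothesis set, since it would not be a new prediction; only pairs that first co-occur after the cutoff count as validated.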
Date of publication
Resource type
Rights statement
  • In Copyright
  • Hemminger, Bradley M.
Degree granting institution
  • University of North Carolina at Chapel Hill
  • Open access