PIE - the protein inference engine

Last Modified
  • March 21, 2019
  • Jefferys, Stuart R.
    • Affiliation: School of Medicine, Curriculum in Genetics and Molecular Biology
  • Posttranslational modifications are vital to protein function but are hard to study, especially since several modification isoforms may be present simultaneously. Mass spectrometers are powerful tools for investigating modified proteins, but the data they generate are often incomplete, ambiguous, and difficult to interpret. Combining data from multiple experimental techniques provides complementary information. Having both top-down (intact protein mass) and bottom-up (peptide) data is especially valuable. Human experts interpret the combined data, in the context of background knowledge, to determine which modifications are present and where they are located. However, this process is arduous and, for high-throughput applications, needs to be automated. To explore a data integration methodology based on Markov chain Monte Carlo and simulated annealing, I developed the PIE (Protein Inference Engine). This Java application integrates information using a modular approach that allows different types of data to be considered simultaneously and new data types to be added as needed. Validation of the PIE was carried out using two realistically imperfect theoretical data sets. The first, based on the L7/L12 ribosomal protein, tested the limits of the PIE's performance as intact mass accuracy and peptide coverage decrease. The second set, based on a mix of two modification variants of the H23c Histone protein, tested the PIE's ability to handle isoform mixtures and up to eight simultaneous modifications. The PIE was then applied to the analysis of experimental data from an investigation of the modification state of the L7/L12 ribosomal protein. These data consisted of a set of peptides identified as associated with some L7/L12 modification variant and nine intact mass measurements, each identified as an L7/L12 modification variant. From these data, the PIE was able to make consistent predictions, comparable to expert manual interpretation.
Software, source code, user manuals, and demo projects replicating the analyses described in the following can be downloaded from http://pie.giddingslab.org/.
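
The general idea in the abstract (pluggable scoring modules combined under simulated annealing with Metropolis-style acceptance) can be illustrated with a minimal sketch. This is not the PIE source code: the class name AnnealingSketch, the Scorer interface, the four-entry modification table, the penalty values, and the geometric cooling schedule are all assumptions made for illustration only.

```java
import java.util.List;
import java.util.Random;

public class AnnealingSketch {

    /** One pluggable evidence source (hypothetical); lower scores mean better agreement. */
    interface Scorer {
        double score(int[] state);
    }

    // Illustrative modification table: 0 = unmodified, 1 = methyl, 2 = acetyl, 3 = phospho.
    // Values are monoisotopic mass shifts in daltons.
    static final double[] MOD_MASS = {0.0, 14.01565, 42.01056, 79.96633};

    /** Top-down evidence: gap between predicted and observed intact mass. */
    static Scorer intactMass(double baseMass, double observedMass) {
        return state -> {
            double predicted = baseMass;
            for (int mod : state) predicted += MOD_MASS[mod];
            return Math.abs(predicted - observedMass);
        };
    }

    /** Bottom-up evidence: fixed penalty per covered site that disagrees (-1 = site not covered). */
    static Scorer peptides(int[] observedMods, double penalty) {
        return state -> {
            double total = 0.0;
            for (int i = 0; i < state.length; i++) {
                if (observedMods[i] >= 0 && observedMods[i] != state[i]) total += penalty;
            }
            return total;
        };
    }

    /** Total energy is the sum over all scoring modules, so new data types only add a Scorer. */
    static double energy(List<Scorer> scorers, int[] state) {
        double total = 0.0;
        for (Scorer s : scorers) total += s.score(state);
        return total;
    }

    /** Simulated annealing with Metropolis acceptance; returns the best state visited. */
    static int[] anneal(List<Scorer> scorers, int[] start, double startTemp,
                        double cooling, int steps, Random rng) {
        int[] current = start.clone();
        int[] best = start.clone();
        double currentE = energy(scorers, current);
        double bestE = currentE;
        double temp = startTemp;
        for (int step = 0; step < steps; step++) {
            // Symmetric proposal: reassign one random site to a random modification.
            int site = rng.nextInt(current.length);
            int oldMod = current[site];
            current[site] = rng.nextInt(MOD_MASS.length);
            double candidateE = energy(scorers, current);
            double delta = candidateE - currentE;
            // Always accept improvements; accept worse states with probability exp(-delta / T).
            if (delta <= 0 || rng.nextDouble() < Math.exp(-delta / temp)) {
                currentE = candidateE;
                if (currentE < bestE) {
                    bestE = currentE;
                    best = current.clone();
                }
            } else {
                current[site] = oldMod; // reject: undo the proposal
            }
            temp *= cooling; // geometric cooling (an assumed schedule)
        }
        return best;
    }

    public static void main(String[] args) {
        double baseMass = 12000.0;                       // made-up unmodified mass
        int[] truth = {0, 1, 0, 2, 0, 0, 3, 0};          // made-up modification state
        double observed = baseMass;
        for (int mod : truth) observed += MOD_MASS[mod];
        int[] peptideEvidence = {-1, 1, 0, -1, 0, -1, 3, -1}; // partial peptide coverage
        List<Scorer> scorers = List.of(intactMass(baseMass, observed),
                                       peptides(peptideEvidence, 5.0));
        int[] best = anneal(scorers, new int[truth.length], 50.0, 0.999, 5000, new Random(42));
        System.out.println("best energy: " + energy(scorers, best));
    }
}
```

In this sketch each data type contributes one Scorer, and the total energy is simply the sum of module scores, so a further data type could be added without changing the search loop. With partial peptide coverage, several states can explain the intact mass equally well, which is the kind of ambiguity the abstract describes.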
Date of publication
Resource type
Rights statement
  • In Copyright
  • " ... in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Curriculum of Genetics and Molecular Biology."
  • Giddings, Morgan C.
Degree granting institution
  • University of North Carolina at Chapel Hill
Place of publication
  • Chapel Hill, NC
  • Open access
