Robert Chalkley
Reliably assigning sites of post-translational modification (PTM) is one of the more challenging aspects of mass-spectrometry-based proteomic analysis. Higher-quality data is generally required to identify a site of modification than to identify the peptide and its modification state, so it is common for a modification to be reliably assigned to a peptide while its exact location within the peptide remains ambiguous.
Data analysis is performed by one of a range of database search engines, but these are usually optimized for peptide identification rather than modification site assignment. For example, software normally applies an intensity threshold to the peak list to reduce the number of noise peaks, which raises the percentage of peaks that are matched; e.g. it is statistically more significant if 25 of the 30 most intense peaks match than if 30 of the 60 most intense peaks match. However, those five extra peak matches may be important for isolating the modification site. The second weakness of most search engines for modification site assignment is that they do not clearly indicate which site assignments are reliable and which are ambiguous; i.e. where the data is consistent with the site reported, but another site is equally well supported. Hence, further analysis of spectra identifying PTM sites is generally required. This is currently most commonly performed by ‘manual verification’, which can be reliable if the person is experienced in spectral analysis, but it is a subjective process and so will be inconsistent from one reviewer to the next. Alternatively, a second piece of software specifically designed for modification site assignment may be employed. Such software will not be as powerful and sophisticated as an expert manual reviewer, but it will provide more consistent results, and it may be possible to assign a measure of reliability to the site assignments.
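The peak-count comparison above can be made concrete with a simple binomial model. The following is only a minimal sketch: it assumes each peak has the same, independent chance of matching a fragment at random, and the value chosen for that chance (P_RANDOM) is purely illustrative. Real search engines use their own scoring schemes, which this does not attempt to reproduce.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Probability of at least k matches out of n peaks, if each peak
    matches by chance with probability p (X ~ Binomial(n, p))."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# ASSUMPTION: illustrative chance that any one peak matches at random.
# This number is not taken from the article or from any search engine.
P_RANDOM = 0.2

# 25 of the 30 most intense peaks matched (stricter intensity threshold).
p_30 = binomial_tail(30, 25, P_RANDOM)

# 30 of the 60 most intense peaks matched (looser intensity threshold).
p_60 = binomial_tail(60, 30, P_RANDOM)

print(f"P(>=25 of 30 by chance) = {p_30:.3e}")
print(f"P(>=30 of 60 by chance) = {p_60:.3e}")
print("Stricter threshold is more significant:", p_30 < p_60)
```

Under this assumed model the 25-of-30 result is far less likely to arise by chance than the 30-of-60 result, even though the latter contains five more matched peaks. That is the trade-off described above: thresholding helps the peptide-level score, while the discarded low-intensity matches may be the ones that distinguish one candidate site from another.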
Recognizing this weakness, the MCP journal publication guidelines for studies reporting PTM site assignments require authors to make available annotated spectra for all biological PTM site assignments, so that readers can view the spectra and decide whether they agree with the assignment. This raises the questions: how often do people disagree with other people’s interpretations and how reliable are PTM site assignments reported in the literature?
The ‘Proteome Informatics Research Group’ (iPRG) of ABRF is seeking to answer these questions in its 2010 study. The group is distributing a phosphopeptide dataset and asking participants to report modification site assignments and the methods they used to derive them. Results will be presented at the ABRF conference in March 2010 and at other conferences, and will be made available through the ABRF website. As a number of publications in the literature report hundreds or even thousands of phosphorylation sites, these results will be very interesting for assessing the reliability of already published sites.
Participation in this study is not restricted to members of ABRF, so anyone interested in taking part should send an e-mail to iPRG2010@gmail.com. Data will be distributed mid-November and results should be returned by Monday January 10th 2010.