UNIGE document Scientific Article

Peptide mass fingerprinting peak intensity prediction: Extracting knowledge from spectra

Published in Proteomics. 2002, vol. 2, no. 10, p. 1374-1391
Abstract Matrix‐assisted laser desorption/ionization‐time of flight mass spectrometry has become a valuable tool in proteomics. With the increasing acquisition rate of mass spectrometers, one of the major issues is the development of accurate, efficient and automatic peptide mass fingerprinting (PMF) identification tools. Current tools are mostly based on counting the number of experimental peptide masses that match theoretical masses. Almost all of them use additional criteria such as isoelectric point, molecular weight, post-translational modifications (PTMs), taxonomy or enzymatic cleavage rules to enhance prediction performance. However, these identification tools seldom use peak intensities as a parameter, as there is currently no model that predicts intensities from the physicochemical properties of peptides. In this work, we used standard datamining methods such as classification and regression to find correlations between peak intensities and the properties of the peptides composing a PMF spectrum. These methods were applied to a dataset comprising a series of PMF experiments involving 157 proteins. We found that the C4.5 method gave the most informative results for the classification task (prediction of the presence or absence of a peptide in a spectrum) and M5' for the regression task (prediction of the normalized intensity of a peptide peak). The C4.5 decision tree correctly classified 88% of the theoretical peaks, whereas the peak intensities predicted by M5' had a correlation coefficient of 0.6743 with the experimental peak intensities. These methods yielded decision and model trees that can be used directly for the prediction and identification of PMF results. This work lays the foundations of a method for analyzing the factors that influence peak intensity in PMF spectra. A simple extension of this analysis, using a larger dataset, could improve the accuracy of the results. Additional peptide characteristics, or even PMF experimental parameters, can also be taken into account in the datamining process to analyze their influence on peak intensity. Furthermore, this datamining approach can certainly be extended to tandem mass spectrometry or other mass spectrometry derived methods.
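The two figures reported in the abstract (the share of correctly classified theoretical peaks and the correlation coefficient between predicted and experimental intensities) can be illustrated with a minimal Python sketch. It assumes the correlation coefficient is the usual Pearson coefficient, which the abstract does not state explicitly. The function names and the toy values are hypothetical and are not taken from the article's dataset.

```python
import math


def accuracy(predicted, observed):
    """Fraction of theoretical peaks whose presence/absence label matches."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(predicted)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: True means the peptide peak is present in the spectrum.
predicted_presence = [True, True, False, False, True]
observed_presence = [True, False, False, False, True]

# Hypothetical normalized intensities for peaks present in the spectrum.
predicted_intensity = [0.10, 0.40, 0.35, 0.80]
observed_intensity = [0.20, 0.30, 0.50, 0.90]

if __name__ == "__main__":
    print("classification accuracy:", accuracy(predicted_presence, observed_presence))
    print("intensity correlation:", pearson(predicted_intensity, observed_intensity))
```

In this reading, the classifier would be scored with `accuracy` over all theoretical peptides, and the regression model with `pearson` over the normalized peak intensities.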
Keywords Datamining; Mass spectrometry; Peak intensity; Peptide mass fingerprinting; SWISS-PROT; Theoretical prediction
GAY, Steven Daryl et al. Peptide mass fingerprinting peak intensity prediction: Extracting knowledge from spectra. In: Proteomics, 2002, vol. 2, n° 10, p. 1374-1391. doi: 10.1002/1615-9861(200210)2:10<1374::AID-PROT1374>3.0.CO;2-D https://archive-ouverte.unige.ch/unige:103680


Deposited on : 2018-04-19
