Computational protocol: A New Parameterized Algorithm for Rapid Peptide Sequencing

Protocol publication

[…] To evaluate both the tree widths and the path widths of extended spectrum graphs, we generated simulated tandem mass spectra for 100,000 fully tryptic peptides digested from proteins in the yeast genome. We removed from the set the spectra of peptides that contain fewer than 5 or more than 24 amino acids. To obtain spectra similar to experimental ones, we added noisy mass peaks to the simulated spectra. These noisy peaks are generated in groups, and the mass differences between peaks in the same group equal the masses of single amino acids or of combinations of amino acids.
The accompanying table shows the distribution of path widths of the extended spectrum graphs in the presence of different amounts of noise. PW denotes the path width, and N/S is the ratio of the number of noisy peaks to the number of real peaks in the spectrum. The table shows that the path width is less than 6 for the majority of the extended spectrum graphs. In addition, the path width increases when more noisy peaks appear in a spectrum.
We then use the program to process each extended spectrum graph and read the amino acid sequence of the peptide from the longest antisymmetric path the program has found in the graph. We evaluate the accuracy of a sequencing result by the percentage of amino acids that the program determines correctly. The tables compare the sequencing accuracy and computation time of our program (PDS) with those of PepNovo; NovoHMM, a program that solves the de novo sequencing problem with a hidden Markov model; and our previous work (TDS), which computes the longest antisymmetric path in an extended spectrum graph from a tree decomposition of that graph.
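The simulation step described above (length filtering, then noisy peaks added in groups whose internal mass differences are amino acid masses) can be sketched roughly as follows. This is a minimal illustration and not the authors' generator: the function names, the choice of singly charged b- and y-ions as the "real" peaks, the chain-shaped noise groups, and the random starting masses are all assumptions. The residue masses are standard monoisotopic values.

```python
import random

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728
WATER = 18.01056


def keep_peptide(peptide, min_len=5, max_len=24):
    """Length filter from the text: keep peptides of 5 to 24 amino acids."""
    return min_len <= len(peptide) <= max_len


def theoretical_peaks(peptide):
    """Singly charged b- and y-ion m/z values (assumed model of 'real' peaks)."""
    masses = [RESIDUE_MASS[a] for a in peptide]
    total = sum(masses)
    peaks = []
    prefix = 0.0
    for m in masses[:-1]:  # one b/y pair per internal cleavage site
        prefix += m
        peaks.append(prefix + PROTON)                  # b ion
        peaks.append(total - prefix + WATER + PROTON)  # y ion
    return sorted(peaks)


def noise_group(start_mass, size, rng):
    """A chain of peaks whose consecutive gaps are single residue masses,
    so any two peaks in the group differ by a sum of residue masses."""
    group = [start_mass]
    choices = list(RESIDUE_MASS.values())
    for _ in range(size - 1):
        group.append(group[-1] + rng.choice(choices))
    return group


def add_noise(peaks, ratio, group_size, rng):
    """Add round(ratio * len(peaks)) noisy peaks, generated group by group.
    `ratio` plays the role of N/S; noisy peaks may exceed the largest real
    peak, which a fuller simulator would probably clip."""
    target = round(ratio * len(peaks))
    noisy = []
    upper = max(peaks)
    while len(noisy) < target:
        size = min(group_size, target - len(noisy))
        noisy.extend(noise_group(rng.uniform(50.0, upper), size, rng))
    return sorted(peaks + noisy)
```

For example, `add_noise(theoretical_peaks("PEPTIDEK"), 0.5, 3, random.Random(0))` would return the 14 b/y peaks plus 7 noisy peaks, assuming the hypothetical peptide passes `keep_peptide`.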
The tables show that the program based on PDS is slightly faster than PepNovo and significantly faster than both NovoHMM and TDS, since NovoHMM needs quadratic computation time and the running time of TDS depends on the tree width of the extended spectrum graph. Although the sequencing accuracy drops slightly when the amount of noise increases, all four programs achieve a sequencing accuracy above 95%. PDS achieves the same sequencing accuracy as TDS because both programs sequence a peptide by computing the longest antisymmetric path in the extended spectrum graph. […]
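The accuracy measure behind these comparisons, the percentage of amino acids determined correctly, might be computed along the following lines. The source does not say how predicted and true residues are matched, so the position-by-position comparison and the treatment of I and L as interchangeable (they have the same mass) are assumptions, and `sequencing_accuracy` is a hypothetical name.

```python
def sequencing_accuracy(predicted, true_peptide):
    """Percentage of residues in true_peptide reproduced at the same position.

    Assumptions (not from the source): residues are compared position by
    position, and isoleucine/leucine count as the same residue because a
    mass spectrum cannot distinguish them.
    """
    def normalize(seq):
        return seq.replace("I", "L")

    pred, true = normalize(predicted), normalize(true_peptide)
    correct = sum(1 for a, b in zip(pred, true) if a == b)
    return 100.0 * correct / len(true)
```

With this sketch, a prediction that gets 7 of 8 residues right, such as `sequencing_accuracy("PEPSIDEK", "PEPTIDEK")`, scores 87.5.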

Pipeline specifications

Software tools: PepNovo, NovoHMM
Application: MS-based untargeted proteomics
Diseases: Multiple Sclerosis