Database search software tools | Mass spectrometry-based untargeted proteomics

Protein identification and analysis by tandem mass spectrometry relies mostly on matching spectra to a database of protein sequences and scoring those matches.

Source text:
(Fournier et al., 2014) rTANDEM, an R/Bioconductor package for MS/MS protein identification. Bioinformatics.

1 - 50 of 88 results
Finds all the spectra that correspond to a specific compound across different databases. SPLASH is a database-independent spectrum identifier that contains separate blocks for defining different layers of information, separated by dashes. The software was developed and refined on a dataset of more than 563000 mass spectra from MassBank, GNPS, HMDB, ReSpect, FiehnLib and NIST. It can be used for cross-reference identification and also allows for coarse similarity comparisons.
TopPIC / TOP-Down Mass Spectrometry Based Proteoform Identification and Characterization
A software tool for identification and characterization of proteoforms at the whole proteome level by top-down tandem mass spectra using database search. TopPIC efficiently identifies proteoforms with unexpected mutations and post-translational modifications and accurately estimates statistical significance of identifications. It uses several techniques, such as indexes, spectral alignment, and a generating function method, to increase its speed, sensitivity, and accuracy.
MSPLIT-DIA / Mixture-Spectrum Partitioning using Libraries of Identified Tandem mass spectra
Permits untargeted and sensitive peptide identification in data-independent acquisition (DIA) data. MSPLIT-DIA is a spectral matching tool that uses spectrum projections to match library spectra to each DIA spectrum. This method also evaluates the similarity of the matched peaks between library spectra and multiplexed spectra across multiple consecutive DIA spectra. Assay libraries for targeted extraction tools are automatically generated by MSPLIT-DIA to facilitate coupling of sensitive identification with accurate quantification from DIA data.
An MS2 peak intensity prediction server that computes MS2 charge 2+ and 3+ spectra from peptide sequences for the most common fragment ions. The server integrates the Unimod public domain post-translational modification database for modified peptides. The prediction model is an improvement of the previously published MS2PIP model for Orbitrap-LTQ CID spectra. Predicted MS2 spectra can be downloaded as a spectrum file and can be visualized in the browser for comparisons with observations.
Associates uninterpreted tandem mass spectra of peptides with amino acid sequences (AAS). SEQUEST uses fragmentation patterns in tandem mass spectra to identify AAS from protein and nucleotide databases. It predicts the fragment ions of each candidate AAS and correlates them with the experimental spectrum. This lets users search databases directly with experimental data, correlating uninterpreted tandem mass spectra with sequences in the database.
Serves for site localization analysis of phosphorylation events using tandem mass spectrometry (MS/MS) data. LuciPHOr consists of a target-decoy framework specifically designed for phospho-site localization. It can be used for estimating false localization rate (FLR) for any PTMs of a fixed mass. Moreover, this program constructs a probability model of peak intensity and mass accuracy across all spectra for correct and incorrect localizations respectively. It also computes the likelihood ratio as the localization score for each candidate site.
Creates and searches spectrum libraries. BiblioSpec is a flexible spectrum comparison program that can also be used in other applications, such as rapid detection of specific peptides of interest or finding spectra in common between two experiments. It consists of several independent programs. The three main ones are (i) BlibBuild, which creates a spectral library of mass spectra, (ii) BlibFilter, which modifies an existing spectrum library to contain only one spectrum per peptide, and (iii) BlibSearch, which matches query spectra to library spectra.
Interfaces the X!Tandem protein identification algorithm. rTANDEM can run the multi-threaded algorithm on proteomic data files directly from R. It also provides functions to convert search parameters and results to/from R as well as functions to manipulate parameters and automate searches. This brings to proteomics the many advantages of building an analysis pipeline in the R/Bioconductor statistical platform: easy deployment on high-performance computing and cloud computing through Bioconductor Cloud Amazon Machine Image (AMI), fully open-source workflows, interconnectivity of annotation and analytic packages, full reproducibility of analysis, etc.
A software tool to significantly improve the accuracy and efficiency of mass spectral data analysis in top-down proteomics (TDP). The precursor mass offers crucial clues to infer the potential post-translational modifications co-occurring on the protein, the reliability of which relies heavily on its mass accuracy. Concentrating on detecting the precursors more accurately, a machine-learning model incorporating a variety of spectral features was trained online in pTop via a support vector machine (SVM). pTop employs the sequence tags extracted from the MS/MS spectra and a dynamic programming algorithm to accelerate the search speed, especially for those spectra with multiple post-translational modifications.
Analyzes the results of the user's preferred cross-link search engine. xiFDR is a search-tool-independent application that supports two modes of operation for cross-links: directional and nondirectional. The application also maximizes the number of returned hits for a desired false discovery rate (FDR). xiFDR applies a stepwise FDR: it first filters the peptide-spectrum matches (PSMs) to a specified FDR, then aggregates the PSMs to unique combinations of peptide pairs, classifies these again by an FDR, and accumulates the peptide pairs that pass the FDR to unique residue pairs.
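A single filtering step of this kind can be sketched with a generic target-decoy estimate. This is a minimal Python illustration, not xiFDR's implementation: the function name `filter_by_fdr`, the estimate FDR = decoys / targets, and the example scores are assumptions made here, and the cross-link-specific FDR formula and the later aggregation levels (peptide pairs, residue pairs) are not modeled.

```python
def filter_by_fdr(matches, max_fdr=0.05):
    """Keep target matches above the most permissive cutoff that satisfies max_fdr.

    matches: list of (score, is_decoy) tuples.
    The FDR at a cutoff is estimated as decoys / targets among the matches
    ranked at or above that cutoff (an assumed, simplified estimator).
    Tied scores are not treated specially in this sketch.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = 0
    decoys = 0
    best_cut = 0  # number of top-ranked matches to keep
    for i, (_score, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest rank at which the estimated FDR is still acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            best_cut = i
    return [m for m in ranked[:best_cut] if not m[1]]


# Hypothetical scores, chosen only to exercise the function.
example = [(9.0, False), (8.0, False), (7.0, False),
           (6.0, True), (5.0, False), (4.0, True)]
kept = filter_by_fdr(example, max_fdr=0.25)
```

Scanning the whole ranked list, rather than stopping at the first violation, is what lets the filter return the largest set of hits compatible with the requested FDR.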
Identifies peptides from a sequence database with tandem mass spectrometry data. PEAKS employs de novo sequencing as a subroutine and exploits the de novo sequencing results to improve both the speed and accuracy of the database search. Each protein obtains a score by adding its three highest peptide CAA scores, and the protein feature of a peptide is the maximum score of the proteins containing this peptide. PEAKS also provides a user-friendly interface to show each resultant peptide spectrum match from de novo sequencing.
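The protein score and protein feature described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not PEAKS code: the function names, the dictionary layout, and the example scores are hypothetical, and the peptide scores are taken as given inputs.

```python
import heapq


def protein_scores(peptide_scores, protein_to_peptides):
    """Score each protein as the sum of its three highest peptide scores."""
    return {
        protein: sum(heapq.nlargest(3, (peptide_scores[p] for p in peptides)))
        for protein, peptides in protein_to_peptides.items()
    }


def protein_feature(peptide, scores, protein_to_peptides):
    """Return the maximum score among the proteins that contain the peptide."""
    return max(
        (score for protein, score in scores.items()
         if peptide in protein_to_peptides[protein]),
        default=0.0,  # assumed fallback when no protein contains the peptide
    )


# Hypothetical peptide scores and protein memberships.
peptide_scores = {"PEPA": 5.0, "PEPB": 3.0, "PEPC": 2.0, "PEPD": 1.0}
protein_to_peptides = {
    "P1": ["PEPA", "PEPB", "PEPC", "PEPD"],
    "P2": ["PEPB", "PEPD"],
}
scores = protein_scores(peptide_scores, protein_to_peptides)
feature_d = protein_feature("PEPD", scores, protein_to_peptides)
```

In this toy input, the low-scoring peptide PEPD inherits the score of P1, the better-supported of the two proteins that contain it.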
MS Amanda
A scoring system to identify peptides out of tandem mass spectrometry data using a database of known proteins. This algorithm is especially designed for high resolution and high accuracy tandem mass spectra. One advantage of MS Amanda is the high speed of spectrum identification, as it can achieve on average 1.5 ms per spectrum on well-resourced workstations. In addition, MS Amanda is also very accurate, showing a high overlap of identified spectra with the gold-standard algorithms Mascot and SEQUEST.
Morpheus search algorithm
A software tool designed specifically for high-mass accuracy data, based on a simple score that is little more than the number of matching products. For a diverse collection of data sets from a variety of organisms (E. coli, yeast, human) acquired on a variety of instruments (quadrupole-time-of-flight, ion trap-orbitrap, and quadrupole-orbitrap) in different laboratories, Morpheus gives more spectrum, peptide, and protein identifications at a 1% false discovery rate (FDR) than Mascot, Open Mass Spectrometry Search Algorithm (OMSSA), and Sequest. Additionally, Morpheus is 1.5 to 4.6 times faster, depending on the data set, than the next fastest algorithm, OMSSA.
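The idea of a score based on the number of matching products can be sketched as follows. This is a minimal Python illustration, not the Morpheus implementation: the function name `count_matching_products`, the 10 ppm default tolerance, and the example m/z values are assumptions made here.

```python
from bisect import bisect_left


def count_matching_products(theoretical_mz, observed_mz, tol_ppm=10.0):
    """Count theoretical product m/z values with an observed peak within tol_ppm."""
    observed = sorted(observed_mz)
    matches = 0
    for mz in theoretical_mz:
        tol = mz * tol_ppm * 1e-6  # absolute tolerance at this m/z
        # First observed peak at or above the lower edge of the tolerance window.
        i = bisect_left(observed, mz - tol)
        if i < len(observed) and observed[i] <= mz + tol:
            matches += 1
    return matches


# Hypothetical m/z values, chosen only to exercise the function.
theoretical = [200.0, 300.0, 400.0, 500.0]
observed = [200.001, 300.05, 400.0, 499.998]
score = count_matching_products(theoretical, observed)
```

With high-mass-accuracy data the tolerance window is narrow, so a plain count of matches is already discriminative; the peak at 300.05 falls well outside the 10 ppm window around 300.0 and is not counted.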
Analyses the results from a number of different tandem mass spectrometry (MS/MS) search engines, based on decoy database searching. It accepts results in a number of different formats and outputs a list of candidate peptide and protein identifications in mzIdentML, tab-separated, and comma-separated formats. The program acts as a native-to-mzIdentML converter and can combine the results from different search engines to give a set of consensus results that have greater reliability and sensitivity than the results from any single search engine.
Cascaded search
An iterative procedure for incorporating information about peptide groups into the database search and confidence estimation procedure. Cascade search provides a principled and flexible way to assign peptides to observed spectra with high statistical power, as long as the user is willing to provide in advance a statistical confidence threshold and a series of appropriately ordered peptide databases. Cascade search is particularly valuable in studies that include increasingly diverse types of PTMs, and in large proteogenomics studies where unexpected sequence variants must be considered.
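The iterative structure can be sketched as a loop over the ordered databases. This is a simplified Python illustration, not the published procedure: it assumes that spectra accepted at one stage are withheld from later stages, it replaces the set-level confidence estimation with a per-spectrum confidence value, and the names `cascade_search` and `toy_search` as well as the example data are hypothetical.

```python
def cascade_search(spectra, databases, search, threshold):
    """Search ordered databases in turn, passing only unassigned spectra onward.

    search(spectrum, database) must return (peptide_or_None, confidence).
    Returns (assignments, unassigned), where assignments maps each accepted
    spectrum to (peptide, stage_index).
    """
    assignments = {}
    remaining = list(spectra)
    for stage, database in enumerate(databases):
        unmatched = []
        for spectrum in remaining:
            peptide, confidence = search(spectrum, database)
            if peptide is not None and confidence >= threshold:
                assignments[spectrum] = (peptide, stage)
            else:
                unmatched.append(spectrum)
        remaining = unmatched  # only these spectra reach the next database
    return assignments, remaining


def toy_search(spectrum, database):
    """Stand-in search engine: a dictionary lookup (illustration only)."""
    return database.get(spectrum, (None, 0.0))


# Hypothetical databases, ordered from most to least expected peptides.
databases = [
    {"s1": ("PEPTIDEA", 0.99)},
    {"s1": ("OTHER", 0.99), "s2": ("PEPTIDEB", 0.95)},
]
assignments, unassigned = cascade_search(
    ["s1", "s2", "s3"], databases, toy_search, threshold=0.9
)
```

Because spectrum s1 is accepted in the first stage, its alternative match in the second database is never considered, which is why the order of the databases matters.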
Assesses experimental m/z error and derives parameters to search a liquid chromatography-tandem mass spectrometry (LC−MS/MS) experiment. Param-Medic assumes that LC−MS/MS experiments are likely to make multiple observations of many peptide ions. It exploits repeated measurements to provide valuable information about the m/z tolerance characteristics of the experiment. The tool was tested on eight data sets from public repositories from a variety of organisms and instruments. The running time is much shorter than that of Preview.
Automatically performs multiple spectral library searching, class-specific false-discovery rate (FDR) control and result integration. Epsilon-Q demonstrates good performance in identifying and quantifying proteins by supporting standard mass spectrometry data formats and spectrum-to-spectrum matching. It can be a versatile tool for comparative proteome analysis based on multiple spectral libraries and label-free quantification.
hEIDI / (h) Exploitation et Integration des Donnees d'Identification
Manages and combines both identifications and semiquantitative data related to multiple liquid chromatography-tandem mass spectrometry (LC−MS/MS) analyses. hEIDI can be used to compile analyses and retrieve lists of nonredundant protein groups. It allows direct comparison of series of analyses, on the basis of protein groups, while ensuring consistent protein inference and also computing spectral counts. The tool was able to compare pools of analyses in projects for which up to 1500 search results had been combined.
PIPI / PTM-Invariant Peptide Identification
A method to achieve PTM-invariant peptide identification. PIPI first codes peptide sequences into Boolean vectors and converts experimental spectra into real-valued vectors. Then, it finds the top 10 peptide-coded vectors for each spectrum-coded vector. After that, PIPI uses a dynamic programming algorithm to localize and characterize modified amino acids. Simulations and real data experiments have shown that PIPI outperforms existing tools by identifying more peptide-spectrum matches (PSMs) and reporting fewer false positives. It also runs much faster than existing tools when the database is large.
DRIP / Dynamic bayesian network for Rapid Identification of Peptides
A tool which utilizes a dynamic Bayesian network (DBN) for rapid identification of peptides in tandem mass spectra. Given an observed spectrum, DRIP scores a peptide by aligning the peptide's theoretical spectrum and the observed spectrum, i.e., computing the most probable sequence of insertions (spurious observed peaks) and deletions (missing theoretical peaks). DBN inference is efficiently performed utilizing the Graphical Models Toolkit, which allows easy alteration of the model.
Builds an index for a peptide sequence database. PepID relies on clusters of peptide amino acid attribute vectors (PAAVs). It can help users detect peptides and proteins. This tool is able to remove redundant clusters with a user-defined distance termed the radius of elimination, and to iteratively attribute PAAVs to cluster centers. It starts by aggregating peptide vectors into large clusters and then applies a second clustering to group them into smaller clusters.
Provides an extensible search platform for shotgun proteomics. IdentiPy is a Python package responsible for peptide identification. It also includes a GUI encompassing peptide identification, post-search validation, protein inference and quantification. The user interface includes an authentication system and allows uploading tandem mass-spectrometry (MS/MS) data, protein databases and configuration files to the server, starting searches, as well as viewing the results of the completed searches.
MS Data Miner
Aims at minimizing the time required for the analysis, validation, data comparison, and presentation of data files generated in mass spectrometry (MS) software. MS Data Miner was developed to significantly decrease the time required to process large proteomic data sets for publication. This open source system includes a spectra validation system and an automatic screenshot generation tool for Mascot-assigned spectra. In addition, a Gene Ontology term analysis function and a tool for generating comparative Excel data reports are included.