Best bioinformatics software for MS-based proteomics analysis
Proteomics is the next step after genomics and transcriptomics in the study of biological systems. However, analyzing the proteome is much more difficult than analyzing the genome or transcriptome, because each cell expresses its own set of proteins. Mass spectrometry (MS) has emerged as the most important and popular tool to identify, characterize, and quantify proteins and their post-translational modifications with high throughput and on a large scale (Zhang et al.). To help you run your experiments under the best conditions, we asked the OMICtools community to choose the best MS-based untargeted proteomics analysis tools.
LC-MS based untargeted proteomics
Two-dimensional liquid chromatography (LC) coupled with mass spectrometry (LC-MS) is the leading technology for high-throughput proteomics. LC is used to separate proteins from different samples in parallel, followed by selection and staining of differentially expressed proteins, which are then identified by tandem mass spectrometry.
Untargeted proteomics is a discovery-based strategy, where the goal is usually to identify as many proteins as possible. A typical workflow consists of digesting proteins into peptides, followed by chromatographic separation and MS-based analysis, to yield a list of detected peaks characterized by their retention times, mass-to-charge ratio (m/z) values, and intensities (Tsai et al.).
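As a small illustration of the peak list described above, here is a minimal Python sketch. The `Peak` class, its field names, and the example values are hypothetical and chosen only for illustration; the one physical fact used is that a positive ion with charge z carries z extra protons, so m/z = (M + z × 1.007276) / z for a neutral monoisotopic mass M.

```python
from dataclasses import dataclass

PROTON_MASS = 1.007276  # mass of a proton in Da


@dataclass
class Peak:
    """One detected peak (hypothetical structure, for illustration only)."""
    retention_time: float  # seconds
    mz: float              # mass-to-charge ratio
    intensity: float       # arbitrary detector units


def mz_from_mass(neutral_mass: float, charge: int) -> float:
    """m/z of a positive ion formed by adding `charge` protons."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz: float, charge: int) -> float:
    """Inverse of mz_from_mass: recover the neutral mass."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS)


# Made-up peak list: the same peptide seen at charge 2 and charge 3.
peaks = [
    Peak(retention_time=1200.5, mz=mz_from_mass(1000.0, 2), intensity=4.0e6),
    Peak(retention_time=1201.1, mz=mz_from_mass(1000.0, 3), intensity=1.5e6),
]
strongest = max(peaks, key=lambda p: p.intensity)
```

Both peaks map back to the same neutral mass with `neutral_mass_from_mz`, which is why charge-state assignment matters before any database search.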
To help you choose among the available tools, we asked OMICtools members to vote for their favorite MS-based untargeted proteomics analysis tools. Here are the top three from this survey.
First position for MaxQuant
MaxQuant is a quantitative proteomics software package developed by the Max Planck Institute of Biochemistry and designed for analyzing large-scale mass-spectrometric data sets. It supports all main labeling techniques, such as SILAC, dimethyl, TMT and iTRAQ, as well as label-free quantification.
MaxQuant is a comprehensive software package that performs several analysis steps:
- Peak detection and scoring of peptides: MaxQuant detects the mass and intensity of peptide peaks in MS spectra and assembles them into 3D peak hills over the m/z-retention time plane, followed by filtering to identify isotope patterns.
- Mass calibration: It corrects systematic inaccuracies of measured peptide masses and corresponding retention times. High mass accuracy is achieved by weighted averaging and through mass recalibration.
- Database searches for protein identification: Peptide and fragment masses (in the case of an MS/MS spectrum) are searched in an organism-specific sequence database and are then scored by a probability-based approach termed peptide score.
- Protein quantification: Proteins are quantified using either the labeling techniques listed above or label-free quantification.
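The weighted averaging and mass recalibration mentioned in the list above can be sketched roughly as follows. This is not MaxQuant's actual implementation: the function names are hypothetical, the weighting is a plain intensity-weighted mean, and the recalibration assumes a single constant systematic error expressed in parts per million (ppm).

```python
def weighted_mean_mz(centroids):
    """Intensity-weighted mean m/z of one peak seen in several scans.

    `centroids` is a list of (mz, intensity) pairs.
    """
    total_intensity = sum(intensity for _, intensity in centroids)
    if total_intensity <= 0:
        raise ValueError("total intensity must be positive")
    return sum(mz * intensity for mz, intensity in centroids) / total_intensity


def ppm_error(measured_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def recalibrate(measured_mz, systematic_ppm):
    """Remove an assumed constant systematic error (in ppm) from a measurement."""
    return measured_mz / (1.0 + systematic_ppm * 1e-6)


# Made-up example: the more intense scan dominates the averaged m/z.
scans = [(500.0, 1.0), (500.2, 3.0)]
averaged = weighted_mean_mz(scans)

# A measurement that is 10 ppm too high is pulled back to its true value.
corrected = recalibrate(500.005, ppm_error(500.005, 500.0))
```

Weighting by intensity gives noisy low-abundance scans less influence on the final mass estimate, which is the general idea behind averaging a peak over its elution profile.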
The software is written in C# and freely available (download and installation guide).
Second position for PEAKS
PEAKS Studio performs LC-MS/MS data analysis and statistics according to the experimental design. Following the identification of peptides with MS/MS spectra, the resulting peptide sequences are used to determine the original protein components of the samples.
PEAKS Studio's main features include:
- Peptide/protein identification: de novo sequencing, database search, post-translational modification (PTM) search with 500+ modifications, sequence variant and mutation search.
- Protein quantification in complex biological samples: label-free and label-based (TMT (MS2, MS3), iTRAQ, SILAC, 18O labeling, ICAT).
- Supported fragmentation types: CID, HCD, ETD/ECD, EThcD, IRMPD, and UVPD.
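To give a feel for what de novo sequencing means, the toy sketch below reads amino acids from the mass differences between consecutive fragment ions of one ion series. It is not the PEAKS algorithm: the function name, the tolerance, and the idealized noise-free ladder are assumptions, and only a handful of residues are included (leucine and isoleucine have the same mass and cannot be told apart this way).

```python
# Monoisotopic residue masses in Da (subset, for illustration only).
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
    "T": 101.04768,
    "L": 113.08406,  # isoleucine has the same mass
}


def read_ladder(ion_mzs, tol=0.01):
    """Translate gaps between consecutive singly charged ions into residues.

    Returns one letter per gap, or "?" when no single residue in the
    table matches the gap within `tol` Da.
    """
    ordered = sorted(ion_mzs)
    letters = []
    for low, high in zip(ordered, ordered[1:]):
        gap = high - low
        matches = [aa for aa, mass in RESIDUE_MASS.items() if abs(gap - mass) <= tol]
        letters.append(matches[0] if len(matches) == 1 else "?")
    return "".join(letters)


# Idealized b-ion ladder (b1, b2, b3) for the tripeptide G-A-S.
b_ions = [58.028736, 129.065846, 216.097876]
sequence_tag = read_ladder(b_ions)
```

Real spectra contain missing and spurious peaks, mixed ion series, and modified residues, so practical tools score many candidate sequences rather than reading a single clean ladder.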
The software offers three main view modes: protein view, peptide view (figure 3), and quantification view (figure 4).
PEAKS Studio is licensed commercially by Bioinformatics Solutions Inc., and a free trial is available here: http://www.bioinfor.com/download-peaks-studio/.
Third position for OpenMS
OpenMS is an open-source C++ software library for LC/MS data management and analysis, developed at the Free University of Berlin, the University of Tübingen, and ETH Zurich. It provides a large number of command-line tools (200+) for analyzing proteomics datasets.
These tools can perform the following tasks:
- Import, export and conversion of vendor formats and several open community-driven XML formats.
- Preprocessing of spectra: filtering based on various properties, peak picking, baseline and noise filtering.
- MS2 spectrum identification: support for third-party peptide search engines, a customisable and extensible basic search engine of its own, indexing of peptides in custom protein databases with SeqAn, statistical validation via posterior error probability and FDR/q-value calculation, and combining results of different peptide search engines with ConsensusID.
- Working with MS1 maps.
- Protein inference with Fido.
- Visualisation of spectra (on all MS levels), features and peptide identifications in TOPPView.
- Finding RNA and protein-protein crosslinks.
- Identification of phosphorylation sites with Luciphor.
- Support for data independent acquisition via OpenSWATH integration.
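The FDR/q-value calculation mentioned in the list above is commonly based on a target-decoy search. The sketch below shows one simple version of that idea. It is not the OpenMS implementation: the function name, the input layout, and the plain decoys/targets estimate are assumptions made for illustration.

```python
def q_values(psms):
    """Assign a q-value to each peptide-spectrum match (PSM).

    `psms` is a list of (score, is_decoy) pairs, where a higher score is
    better. Returns (score, is_decoy, q_value) tuples sorted by
    descending score.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)

    # FDR at each score threshold: decoys accepted / targets accepted.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the lowest FDR at which this PSM would still be accepted,
    # computed as a running minimum from the worst score upwards.
    qs = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qs.append(running_min)
    qs.reverse()

    return [(score, is_decoy, q) for (score, is_decoy), q in zip(ranked, qs)]


# Made-up scores: three target hits and two decoy hits.
example = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, True)]
scored = q_values(example)
accepted = [s for s, is_decoy, q in scored if not is_decoy and q <= 0.01]
```

In this example only the two best-scoring target matches pass a 1% q-value cutoff, because the first decoy hit raises the estimated error rate for everything ranked below it.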
OpenMS is free software, available here under the three-clause BSD license, and runs on Windows, macOS and Linux.
Zhang et al. (2014). High-Throughput Proteomics. Annu. Rev. Anal. Chem.
Tsai et al. (2016). Preprocessing and Analysis of LC-MS-Based Proteomic Data. Methods Mol Biol.
Tyanova et al. (2015). Visualisation of LC-MS/MS proteomics data in MaxQuant. Proteomics.
Röst et al. (2016). OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nature Methods.
Zhang et al. (2012). PEAKS DB: De Novo Sequencing Assisted Database Search for Sensitive and Accurate Peptide Identification. Mol. Cell. Proteomics.