Libraries/Frameworks software tools | Mass spectrometry-based untargeted proteomics
Open-source frameworks and libraries play an important role in the development of new MS-based proteomics tools. They simplify the implementation of the basic features that most tools need, which lets developers focus on novel aspects rather than on basic functions and speeds up development. Both basic and complex functionalities are supported, including protein sequence digestion, sequence feature prediction, file format readers and converters, spectrum preprocessing, and peptide/protein post-processing.
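As an illustration of the kind of basic function these libraries provide, protein sequence digestion can be sketched in a few lines of Python. This is a minimal sketch and not the implementation of any tool listed here: it assumes the common simplified trypsin rule (cleave after K or R unless the next residue is P), and the function names `cleavage_fragments` and `tryptic_digest` are hypothetical.

```python
def cleavage_fragments(sequence):
    """Split a sequence at simplified trypsin sites (after K/R, not before P)."""
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


def tryptic_digest(sequence, missed_cleavages=0, min_length=1):
    """Return peptides, joining up to `missed_cleavages` adjacent fragments."""
    fragments = cleavage_fragments(sequence)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptide = "".join(fragments[i:j + 1])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


if __name__ == "__main__":
    # Toy sequence chosen for illustration only.
    print(tryptic_digest("MKRPAKGG", missed_cleavages=1))
```

Real libraries add further enzymes, cleavage exceptions, and modification handling, which is exactly the routine work that a shared framework saves.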
Manages and analyses liquid chromatography coupled to mass spectrometry (LC-MS) data. OpenMS is a programming library and tool collection that integrates into full-featured workflow systems, such as KNIME, Galaxy and WS-PGRADE, to facilitate bioinformatics research in the field of MS on all levels. The software provides pre-built, ready-to-use tools for the analysis of both proteomics and non-targeted metabolomics data.
An open source Java program for computational analysis of data independent acquisition (DIA) mass spectrometry-based proteomics data. DIA-Umpire enables untargeted peptide and protein identification and quantitation using DIA data, and also incorporates targeted extraction to reduce the number of cases of missing quantitation.
Manages proteomic mass spectrometry workflows and data analysis. Multiplierz provides a toolset of methods for peptide identification, quantitation and reporting, as well as tools for easily manipulating standard data formats. This software is a Python library that is compatible with new reporting formats and provides high-level tools for carrying out proteomic analyses. The architecture of the software environment integrates seamlessly with native data files via mzAPI.
Discovers the optimal weight factors for high-accuracy compound identification in gas chromatography-mass spectrometry (GC-MS). iOPT uses only a reference library, without any query library, and can therefore be considered an unsupervised learning approach. This tool focuses on the statistical characteristics of the distribution of mass spectral similarity scores among compounds in a reference library.
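To make the notion of weight factors in a mass spectral similarity score concrete, the sketch below computes a weighted cosine similarity between two spectra. It is an assumption-laden illustration and not iOPT's algorithm: it assumes the widely used form in which each peak is transformed to intensity^a × (m/z)^b before the cosine is taken, and the function name, parameter names and default exponents are hypothetical.

```python
import math


def weighted_cosine(spec_a, spec_b, intensity_weight=0.5, mz_weight=1.0):
    """Cosine similarity of two spectra after peak weighting.

    Each spectrum is a dict mapping an integer m/z value to an intensity.
    Peaks are transformed to intensity**intensity_weight * mz**mz_weight
    (assumed weighting scheme, chosen for illustration).
    """
    def weigh(spec):
        return {
            mz: (intensity ** intensity_weight) * (mz ** mz_weight)
            for mz, intensity in spec.items()
        }

    wa, wb = weigh(spec_a), weigh(spec_b)
    dot = sum(wa[mz] * wb[mz] for mz in wa.keys() & wb.keys())
    norm_a = math.sqrt(sum(v * v for v in wa.values()))
    norm_b = math.sqrt(sum(v * v for v in wb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty or all-zero spectrum matches nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy spectra for illustration only.
    reference = {41: 100.0, 55: 60.0, 69: 30.0}
    query = {41: 90.0, 55: 70.0, 83: 10.0}
    print(weighted_cosine(reference, query))
```

Under this form, choosing the two exponents changes how strongly high-mass or high-intensity peaks drive the score, which is why their values affect identification accuracy.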
A free, open-source library for developing computational proteomics tools and pipelines. The Application Programming Interface, written in Java, enables rapid tool creation by providing a robust, pluggable programming interface and common data model. The data model is based on controlled vocabularies/ontologies and captures the whole range of data types included in common proteomics experimental workflows, going from spectra to peptide/protein identifications to quantitative results. The library contains readers for three of the most used Proteomics Standards Initiative standard file formats: mzML, mzIdentML, and mzTab. In addition to mzML, it also supports other common mass spectra data formats: dta, ms2, mgf, pkl, apl (text-based), mzXML and mzData (XML-based). Also, it can be used to read PRIDE XML, the original format used by the PRIDE database, one of the world-leading proteomics resources.
An open source framework for rapid and interactive development of LCMS data analysis workflows in Python. The goal was to establish a unique framework with comprehensive basic functionalities that are easy to apply and allow for the extension and modification of the framework in a straightforward manner. eMZed supports the iterative development and prototyping of individual evaluation strategies by providing a computing environment and tools for inspecting and modifying underlying LC/MS data. The framework specifically addresses non-expert programmers, as it requires only basic knowledge of Python and relies largely on existing successful open-source software, e.g. OpenMS.
Provides an interface for basic analysis of mass spectrometry data. mzDesktop gives access to the proteomic analysis algorithms from the Multiplierz library. This software offers several of the functions available in Multiplierz, but through a graphical user interface. Users can visualize the degree of protein sequence coverage supported by a given set of peptide-spectrum matches (PSMs). It is compatible with mzReports and mzIdentML files.
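Protein sequence coverage, as mentioned above, is the fraction of residues in a protein that fall inside at least one identified peptide. The following is a minimal sketch of that calculation, not mzDesktop's or Multiplierz's code; the function name `sequence_coverage` is hypothetical, and peptides are assumed to be plain unmodified sequences matched by exact substring search.

```python
def sequence_coverage(protein, peptides):
    """Fraction of protein residues covered by at least one peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        # Mark every occurrence, including overlapping ones.
        start = protein.find(peptide)
        while start != -1:
            for k in range(start, start + len(peptide)):
                covered[k] = True
            start = protein.find(peptide, start + 1)
    return sum(covered) / len(protein)


if __name__ == "__main__":
    # Toy protein and peptide list for illustration only.
    print(sequence_coverage("MKRPAKGG", ["MK", "GG"]))
```

A graphical tool would typically render the `covered` mask over the protein sequence rather than report only the fraction.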
Performs complex multi-dimensional data analysis in terms of parameter variation. Ursgal is a unified interface that makes computational proteomics scriptable, offering the possibility to exchange or extend any processing step. It allows the rapid development of novel workflows that require scriptable, unified access to mass spectrometry (MS) analysis tools. This type of analysis allows researchers to optimize their workflow and MS setup, and thus potentially offers deeper insight into their biological questions.
Provides reproducible, interactive analysis of AP-MS data. APOSTL contains a number of tools woven together using Galaxy workflows, which let the user move intuitively from raw data to publication-quality figures within a single interface. APOSTL is an evolving software project with the potential to customize individual analyses with additional Galaxy tools and widgets using the R web application framework, Shiny.
An open-source Java library for the analysis of mass spectrometry data from large-scale proteomics and glycomics experiments. MzJava provides data structures and algorithms for representing and processing mass spectra and their associated biological molecules, such as metabolites, glycans and peptides. MzJava includes functionality to perform mass calculation, peak processing (e.g. centroiding, filtering, transforming), spectrum alignment and clustering, protein digestion, fragmentation of peptides and glycans, as well as scoring functions for spectrum-spectrum and peptide/glycan-spectrum matches. For data import and export, MzJava implements readers and writers for commonly used data formats. Many classes also support the Hadoop MapReduce and Apache Spark frameworks for cluster computing.
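Mass calculation, one of the functions listed above, can be illustrated with a short Python sketch. This is not MzJava's API (MzJava is a Java library); it is a minimal example that assumes unmodified peptides made of the 20 standard amino acids, and the function names `peptide_mass` and `peptide_mz` are hypothetical. The masses used are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565   # H2O added for the free N- and C-termini
PROTON_MASS = 1.007276   # mass of a proton, used for charged ions


def peptide_mass(sequence):
    """Monoisotopic neutral mass of an unmodified peptide.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


def peptide_mz(sequence, charge):
    """m/z of the peptide carrying `charge` protons (charge >= 1)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON_MASS) / charge


if __name__ == "__main__":
    print(round(peptide_mass("PEPTIDE"), 4))
    print(round(peptide_mz("PEPTIDE", 2), 4))
```

A full library additionally handles modifications, isotopic (average) masses and fragment ion series, which this sketch leaves out.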
Provides a tool for mass spectrometry data visualization, annotation, and notebooking. mzStudio allows researchers to corroborate peptide-spectrum matches and associated quantitative measures across large, multidimensional liquid chromatography tandem mass spectrometry (LC-MS/MS) data sets. Data from different instrument platforms and search engines are also covered. This software is well suited to describing novel modifications or surprising gas-phase fragmentation behavior.
Lets users organize and store their data in a relational database. Proline is a suite of software and components dedicated to mass spectrometry proteomics. It processes and analyzes these data to visualize and extract knowledge from mass spectrometry (MS) based proteomics results. The Proline process can be divided into several parts: import, validation, quantification, visualization and publication.
Supplies a framework dedicated to the analysis of proteomic data. PIPE is a program for biological inference with a focus on ID mapping and Gene Ontology enrichment tasks. The application is built around a modular architecture, allowing users to assemble features of interest according to their needs. Its functionalities include a network viewer, an ID mapper, a module for generating Venn diagrams, and a search engine to retrieve organisms and terms within the UniProt and Gene Ontology databases.