MIDAs / Molecular Isotopic Distribution Analysis
Allows users to determine theoretical isotopic distributions for molecules. MIDAs is available both as standalone software and through a web interface. The program comprises two algorithms that accept the target molecule in various formats and compute coarse-grained isotopic distributions (CGIDs) and fine-grained isotopic distributions (FGIDs). Users can adjust the parameters by choosing an algorithm or by setting the distribution mass accuracy.
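As a rough illustration of what a coarse-grained isotopic distribution is, the sketch below convolves per-element isotope abundances by nominal mass. It is not the MIDAs implementation; the function names, the pruning threshold and the small abundance table (approximate natural abundances for H, C, N and O only) are assumptions made for this example.

```python
from collections import defaultdict

# Approximate natural isotope abundances, keyed by nominal mass number.
# Only four elements are included; this table is an assumption for the sketch.
ISOTOPES = {
    "H": {1: 0.999885, 2: 0.000115},
    "C": {12: 0.9893, 13: 0.0107},
    "N": {14: 0.99636, 15: 0.00364},
    "O": {16: 0.99757, 17: 0.00038, 18: 0.00205},
}


def convolve(dist_a, dist_b):
    """Combine two independent nominal-mass distributions."""
    out = defaultdict(float)
    for mass_a, prob_a in dist_a.items():
        for mass_b, prob_b in dist_b.items():
            out[mass_a + mass_b] += prob_a * prob_b
    return dict(out)


def coarse_grained_distribution(formula, min_prob=1e-12):
    """Return {nominal mass: probability} for a formula such as {"C": 6, "H": 12, "O": 6}.

    Peaks below min_prob are dropped after each step to keep the table small.
    """
    dist = {0: 1.0}
    for element, count in formula.items():
        for _ in range(count):
            dist = convolve(dist, ISOTOPES[element])
            dist = {m: p for m, p in dist.items() if p >= min_prob}
    return dict(sorted(dist.items()))


if __name__ == "__main__":
    glucose = coarse_grained_distribution({"C": 6, "H": 12, "O": 6})
    for mass, prob in list(glucose.items())[:4]:
        print(mass, round(prob, 5))
```

A fine-grained distribution would instead keep exact isotopic masses rather than merging all isotopologues that share a nominal mass.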
Investigates information flow in protein-protein interaction (PPI) networks. ITM Probe allows users to choose between three types of models: emitting, absorbing or channel. The application lets users select interaction graphs and specify the nodes to exclude. Users can also set sinks, sources (only for the channel model), and dissipation criteria. It also includes a function for retrieving submitted queries by ID.
MENGA / Multimodal Environment for Neuroimaging and Genomic Analysis
Allows exploration of correlation patterns between neuroimaging data and Allen human brain database (ABA) mRNA gene expression profiles. MENGA was applied to six different imaging datasets that target the dopamine and serotonin receptor systems and the myelin molecular structure in the human brain. It is useful for comparing genomic and imaging data. This tool gives a quantitative assessment of the amount of variability in the image phenotype.
SAKE / Spectral Analysis for Kinetic Estimation
Conducts complex biological quantification. SAKE can be useful in spectral analysis. It processes positron emission tomography (PET) exam data. This tool allows visualization of data and provides several functions to view results.
NSR / Nonlinear stochastic regularization
Quantifies both cerebral blood flow (CBF) and mean transit time (MTT). NSR is based on a Gaussian prior approach. It formulates a filtered version of a log-normal prior for R(t). This tool uses a virtual grid to reconstruct the residue function from tissue concentration samples. It was tested on simulated data and compared with other methods in the presence and absence of bolus dispersion.
NSIT / novel sequence identification tool
Identifies novel sequences in a de novo human genome assembly. NSIT offers a graphical viewer to assist in studying the overlap of novel sequences with other sets of selected sequences. It permits quick visual inspection of sequence contamination. This tool follows three steps: (1) k-mer hash table construction, (2) chromosome assignment, and (3) query alignment. It is capable of aligning de novo sequences to the reference genome.
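The first two steps can be pictured with a toy example. The sketch below is not NSIT code: it assumes a simple voting scheme in which a query is assigned to the chromosome sharing the most k-mers with it, and the function names and tiny sequences are hypothetical.

```python
from collections import Counter, defaultdict


def build_kmer_table(reference, k):
    """Step 1 (sketch): map every k-mer to the set of chromosomes containing it."""
    table = defaultdict(set)
    for chrom, seq in reference.items():
        for i in range(len(seq) - k + 1):
            table[seq[i:i + k]].add(chrom)
    return table


def assign_chromosome(query, table, k):
    """Step 2 (sketch): pick the chromosome with the most shared k-mers, or None."""
    votes = Counter()
    for i in range(len(query) - k + 1):
        # .get avoids inserting empty entries into the defaultdict.
        for chrom in table.get(query[i:i + k], ()):
            votes[chrom] += 1
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    # Hypothetical miniature "reference genome".
    reference = {"chr1": "ACGTACGTGGA", "chr2": "TTTTCCCCAAAA"}
    table = build_kmer_table(reference, k=4)
    print(assign_chromosome("CCCCAA", table, k=4))
```

A query with no k-mer hits returns None, which in this toy setting would mark it as a candidate novel sequence before any alignment is attempted.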
ARIA / Automatic Root Image Analysis
Provides software for characterizing root system architecture. ARIA is based on a mathematically rigorous approach of converting root images into graphs. It automates phenotyping, with the potential to add further features. This tool completes large phenotyping experiments required for many quantitative genetic studies. It is able to analyze 2D flat plane images and 3D images of roots.
Predicts virulent proteins in both metagenomic and genomic datasets. MP3 is based on a support vector machine-hidden Markov model (SVM-HMM) approach. It can be used to compare the proportion of pathogenic proteins in a healthy and a diseased sample without the use of time-consuming homology-based alignment. This tool can predict pathogenic proteins in genomic datasets with high accuracy and sensitivity.
Permits metabolic network analysis. optGpSampler is software for uniform sampling of the steady-state solution space of metabolic networks. It exploits p processors and generates p chains in parallel. This tool is useful for constraint-based metabolic network analysis. It is based on the Artificial Centering Hit-and-Run algorithm but was implemented in a different manner.
Recognizes single nucleotide polymorphisms (SNPs) and other alterations with respect to a reference genome using sequence reads. galign can detect candidate sequence lesions leading to an observable mutant defect in the organism under investigation. It is useful for small and medium-sized genomes and has been extensively tested on the heavily annotated genome of Caenorhabditis elegans.
SiDCoN / Simulated DNA Copy Number
Allows interpretation of complex regions of change. SiDCoN can be useful for training researchers to accurately score whole-genome profiles in the presence of significant stromal contamination. It can be used to estimate the level of stromal contamination within a tumour sample or biopsy. This tool offers users a way to assess a wide variety of SNP-aCGH data interpretations.
Generates an evolutionary gene print (EvoP) of invariant DNA sequences as they appear in the reference DNA. EVOPRINTER does so by superimposing the readouts of individual reference-DNA versus test-genome alignments. It permits users to find multispecies-conserved sequences (MCSs) that are shared among three or more orthologous DNAs. This tool can be useful for understanding gene regulation in all animals.
Compares usage frequencies of the codons used to encode a protein sequence of interest to hypothetical mRNA sequences encoding the same protein with the most common or rare codons. %MinMax detects synonymous codon usage patterns most likely to affect co-translational folding. It removes the underlying usage frequency variations due to differences in amino acid usage. This tool assists in understanding local fluctuations in translation rate that might impact folding.
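The comparison can be sketched as a sliding-window score. The code below is a hedged illustration, not the tool's implementation: it assumes a score of +100 when every codon in the window is the most common synonym and -100 when every codon is the rarest, scaled relative to the average synonym usage. The usage table is hypothetical and covers only three amino acids.

```python
# Hypothetical codon usage frequencies (per thousand) for a few amino acids.
USAGE = {
    "K": {"AAA": 33.0, "AAG": 10.0},
    "F": {"TTT": 22.0, "TTC": 16.0},
    "L": {"CTG": 52.0, "TTA": 14.0, "CTA": 4.0},
}
CODON_TO_AA = {codon: aa for aa, codons in USAGE.items() for codon in codons}


def percent_min_max(codons, window=3):
    """Return one score per window position, in the range [-100, 100]."""
    scores = []
    for start in range(len(codons) - window + 1):
        actual = maximum = minimum = average = 0.0
        for codon in codons[start:start + window]:
            freqs = USAGE[CODON_TO_AA[codon]]
            actual += freqs[codon]              # codon actually used
            maximum += max(freqs.values())      # most common synonym
            minimum += min(freqs.values())      # rarest synonym
            average += sum(freqs.values()) / len(freqs)
        if actual >= average:
            span = maximum - average
            scores.append(100.0 * (actual - average) / span if span else 0.0)
        else:
            # actual < average implies average > minimum, so no division by zero.
            scores.append(-100.0 * (average - actual) / (average - minimum))
    return scores


if __name__ == "__main__":
    print(percent_min_max(["AAA", "TTT", "CTG", "CTA", "AAG", "TTC"]))
```

Because each window is scaled against the best and worst possible encodings of the same amino acids, differences in amino acid composition between windows do not shift the score.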
CoLIde / Co-expression based sRNA Loci Identification
Determines sRNA loci. CoLIde integrates dynamic sRNA expression levels and size class with genomic location to assist in finding distinct loci. It was applied to four data sets covering the plants Arabidopsis thaliana and Solanum lycopersicum and the animal Drosophila melanogaster. This tool can preserve patterns from the sRNA level to locus level. It tends to predict compact loci for which the probability of hitting two distinct annotations is low.
Facilitates visualization of genome structure. GMOL was built on Jmol, adding and modifying several functions to make genome structure visualization possible. This tool allows users to select any unit, at any scale, and scale it up to a lower resolution or down to a higher resolution. It supports querying from Ensembl and from a local database.
Provides a method for tracking real-time auditory attention from non-invasive M/EEG recordings. Real-time-Tracking-of-Selective-Auditory-Attention is software based on Bayesian filtering that works in three steps: (i) estimation of dynamic models of encoding and decoding in real time; (ii) extraction of an attention-modulated feature; and (iii) determination of the given feature using a state-space simulator and translation of the results into an evaluation of the attentional state.
Quaternary Structure Evaluation Tool
Allows users to detect incorrect quaternary structures in the Protein Data Bank (PDB) and assign the most probable quaternary structures to the presumed incorrect annotations. Quaternary Structure Evaluation Tool is a web application that combines four methods (clustering, text mining, PISA and EPPIC) and provides results that can be compared or visualized individually.
Determines pairwise orthology relations. OrthoGNC first calculates pairwise sequence similarities, then identifies homologous sequences, and finally infers orthologs according to gene neighborhood conservation. Various parameters can be set throughout the process according to the desired output. The software can also be used to investigate gene neighborhood conservation for the refinement and assessment of other orthology inference methods.
Determines lipid peroxidation products (LPPs) from liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) data-dependent acquisition (DDA) datasets. LPPtiger is open-source software that uses three algorithms coupled to a sample-specific native lipidome to predict phospholipid (PL)-bound LPPs and simulates a tandem mass spectra library for LPP identification. Moreover, the software can be customized to fit users’ goals.
Detects essential proteins in a protein-protein interaction (PPI) network using graph topological features as well as experimental data. DiffSLC is a computational method that merges node and edge centrality methods with gene expression data to improve the detection of protein essentiality in yeast protein interaction networks. The software is intended as an alternative to other centrality methods.
ADEPt / Adverse Drug Event annotation Pipeline
Allows users to detect and annotate temporally anchored mentions of adverse drug events (ADEs) in a clinical text corpus. ADEPt is a modular pipeline that first identifies ADE mentions, then organizes them, and finally refines the classification using contextual indicators provided by the source. The application also includes a way of targeting ADE-specific patterns in psychiatric clinical text and an expandable dictionary covering over 60 common ADEs.
Provides identification of natural organic matter (NOM) species using high-resolution mass spectra (HR MS). Formularity is software that couples two algorithmic search functions with a least-squares regression-based internal calibration function through a graphical interface. Formulas can be assigned for high-resolution mass spectra collected in positive or negative ionization mode, with proton or electron ion physics, charge states, and different molecular adducts.
SVAw / Surrogate Variable Analysis web
Allows surrogate variable analysis of high-throughput datasets. SVAw is a web and standalone application that enables researchers to use Surrogate Variable Analysis (SVA) when analyzing high-throughput genomic data. It aims to capture heterogeneities in the dataset that can potentially lead to biased analysis of the data. The software calculates probe/gene statistics such as the fold change and p-value for both pre-SVA (unadjusted) and post-SVA (adjusted with sva) analysis.
RECOT / REad COordinate Transformer
Transforms the coordinates of short reads between two species. RECOT is a set of programs that converts the alignment or mapping coordinates of short reads obtained from the query species to a comparison target species. The software allows comparative genomics and comparative transcriptomics between model and non-model organisms. It can be used for species in which the genome sequence is unavailable.
OCPAT / Online Codon-Preserved Alignment Tool
Aligns genes with the protein-coding frames preserved. OCPAT automates gene alignments on a genome-wide scale with the reading frame preserved for each set of putatively orthologous coding sequences. The software works in six steps: (1) submission, (2) ortholog extraction, (3) error correction, (4) determination of reading frame, (5) core alignment and (6) output. The alignments can be applied to evolutionary analyses using appropriate software.
GDT / Genome Display Tool
Allows examination of various types of data in the context of a whole bacterial genome. GDT aims to allow users to think about the properties of genes in a genome wide context. The software allows simultaneous visualization of the occurrence of multiple biological features relating to a bacterial genome, using different colors and shapes. It can also be used to display data relating to groups of genes, proteins, or groups of proteins.
Mutation Reporter Tool
Displays loci of interest and patterns of residues for any sequence data. Mutation Reporter Tool is an online tool developed to assist scientists with data analysis. The software allows users to analyze nucleotide or amino acid sequence data from any organism. It can be used for both genotyping and serotyping of hepatitis B virus (HBV) without requiring computer skills or knowledge of phylogenetics.
Fragment Merger Tool
Merges long overlapping sequence fragments. Fragment Merger Tool is a genome-agnostic, web-based, assembly software developed using hepatitis B virus (HBV) sequence data. It allows automated assembly of two to twelve long overlapping sequence fragments and enables assembly of sequence data from insertion or deletion mutants and recombinants, as a reference sequence is not used for assembly. The software can be used by researchers without specialist computer skills.
Allows generation and analysis of viral sequences with large-scale synonymous mutation. CodonShuffle allows users to generate synonymously mutated sequences and to analyze them for differences in dinucleotide frequency, codon usage, codon pair bias, and free energy of RNA folding. It can be useful in designing genomes for subsequent experimental studies of the fitness impacts of synonymous mutation.
Deep Threshold Tool
Allows exploration of ultra-deep pyrosequencing (UDPS) data. Deep Threshold Tool is a web application that examines the number of errors in each position (column) in an alignment, depending on the probability of error value. The software provides the researcher with detailed output of variation at different probabilities of error. It can process data for a project, so that a probability of error can be selected for that specific project, organism or assay.
Allows exploration of ultra-deep pyrosequencing (UDPS) data. Rosetta Tool was used to analyze sequence data at the amino acid level.
ZEOGS / Zebrafish Expression Ontology of Gene Sets
Calculates enrichment of zebrafish gene expression pattern features. ZEOGS is a web-based computational method that predicts the anatomical region(s) in which changes in expression patterns take place for a given set of input genes. The prediction is based on the expression information available in the Zebrafish Model Organism Database (ZFIN) and the Anatomical Ontology (AO) of zebrafish.
RecDraw / Recombinant HIV-1 Drawing Tool
Allows users to graphically represent and compare recombinant HIV-1 structures and breakpoints. RecDraw is a stand-alone application that was used to represent recombinants among CRF01_AE, and subtypes B and C sampled in Asia. Users can select the order in which the strains are depicted and the colors that represent the different subtypes, and positions of interest can be highlighted with a fine vertical line. The software can be useful for molecular epidemiology and virology.
Allows comprehensive evaluation of hypermutated genomes. HyperPack is a platform-flexible, stand-alone application based on knowledge of apolipoprotein B mRNA-editing catalytic polypeptides (APOBECs). The software consists of five modules: (1) bases, (2) substitutions, (3) context, (4) substratescan, and (5) hyperscan. The capacity to simultaneously compare a query sequence to a number of reference sequences is aimed at more effective discrimination of the hypermutation signal from background mutations.
Provides a version of the GeneXplorer software for visualization of microarray datasets, modified to work with the online program Stainfinder.
Allows visualization of microarray datasets. GeneXplorer was developed for use in web supplements of microarray publications whose raw data are housed within the Stanford Microarray Database (SMD), and for use as a tool to allow SMD users to browse their own data within SMD before publication. It has been used by many publications to provide access to microarray datasets through their web supplements, as well as the basis for visualization of fuzzy k-means cluster data.
Performs nonlinear regression analysis of DNA re-association kinetics data, in association with the package SAS. CotQuest is a suite of scripts implementing an algorithm that eliminates the need for input of parameter guesses. The software is available in two downloadable variations: (1) CotQuestU, which can be used with any SAS-compatible operating system, and (2) CotQuestG, with a graphical user interface (GUI) that guides users through the Cot analysis process. It has been tested on Cot data from eight species.
SRCP / Sequence Read Classification Pipeline
Allows characterization of genomes, based on sample shotgun sequencing. SRCP is a sequence read classification pipeline that calculates the fraction of base pairs in each category, and thus provides an overview of genome structure while facilitating initial annotation of query sequences. The software can be used to determine the efficiency of reduced-representation. It can also be coupled with other scripts that allow further utilization of the sequence data.
Allows integrative context-dependent analyses of diverse local and remotely hosted datasets. CruzDB is a parallelizable programmatic interface to the University of California, Santa Cruz (UCSC) genome browser that offers a syntax for common use cases, including annotation and spatial querying. The software can be used for any organism and version available in the UCSC database.
Allows integrated analysis of nucleosome positioning and transcription factor (TF) binding sites in the promoter regions of yeast genes. Ceres is a web-based software platform that provides analysis, visualization and mining tools. The software offers five features: (1) visualization, (2) chromatin viewer, (3) gene set analysis, (4) data mining, and (5) analysis suite. It also provides access to predicted, conserved and experimentally identified binding sites throughout the yeast genome for 105 distinct yeast TFs.
Designs selector probes for exon resequencing of a set of genes. Disperse is an integrated software system that generates a set of selector probe sequences, designed to select the largest possible portion of the targeted sequence. The software performs the design work in a pipeline fashion, where several steps are executed sequentially. Disperse depends on external data sources, programs and libraries.
CaDrA / Candidate Driver Analysis
Searches for the set of genomic alterations associated with a user-provided ranking of samples within a dataset. CaDrA is based on a stepwise heuristic search to recognize a subset of features whose union is maximally-associated with the observed sample ranking. It can carry out rigorous statistical significance testing based on sample permutation. This tool enables users to select sets of genomic features that drive certain oncogenic phenotypes in cancer.
YAPSA / Yet Another Package for Signature Analysis
Permits users to analyze somatic signatures. YAPSA gathers a set of functions and routines for this purpose. It supports signature analysis with known signatures (LCD = linear combination decomposition) and with a stratified mutational catalogue (SMC = stratify mutational catalogue). This tool provides a function to iteratively add information to an annotation data structure. It can group single nucleotide variants (SNVs) into six different categories.
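The six SNV categories are usually obtained by folding the twelve possible substitutions onto the pyrimidine strand. The sketch below assumes that common convention (C>A, C>G, C>T, T>A, T>C, T>G); the function name is hypothetical and this is not YAPSA code.

```python
# Watson-Crick complements used to fold purine references onto the pyrimidine strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def snv_category(ref, alt):
    """Return one of six categories, e.g. 'C>T', for a single nucleotide variant."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a valid SNV: {ref}>{alt}")
    if ref in ("A", "G"):
        # Report the change as seen on the opposite strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


if __name__ == "__main__":
    for ref, alt in [("C", "T"), ("G", "A"), ("A", "G"), ("T", "G")]:
        print(ref, alt, "->", snv_category(ref, alt))
```

With this folding, G>A and C>T fall into the same category because they describe the same base-pair change read from opposite strands.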
TPP / Thermal Proteome Profiling
Allows analysis of thermal proteome profiling (TPP) experiments. TPP handles experiments with varying temperatures (TR) or compound concentrations (CCR). It invokes routines for data import, data processing, fold change computation, median normalization, TPP-CCR curve fitting, plotting and production of the result table. This tool is able to remove zero sumionarea values and compute fold changes from raw data.
IHW / Independent Hypothesis Weighting
Finds associations in large datasets. IHW is based on the Benjamini-Hochberg procedure and uses weights derived from the data. It divides the tests into groups based on the covariate. This tool assigns low weight to covariate-groups with low signal. It is capable of avoiding loss of false discovery rate (FDR) control by employing randomization in the form of hypothesis splitting into k-folds.
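The core idea of weighting hypotheses inside the Benjamini-Hochberg procedure can be sketched as follows. This is a generic weighted Benjamini-Hochberg step with weights supplied by the caller, not the IHW package itself: learning the weights from a covariate and the k-fold splitting are deliberately left out, and the function name is hypothetical.

```python
def weighted_bh(pvalues, weights, alpha=0.1):
    """Weighted Benjamini-Hochberg: return a list of booleans (True = rejected).

    Weights are rescaled to average 1; each p-value is divided by its weight
    and the ordinary step-up procedure is applied to the adjusted values.
    """
    m = len(pvalues)
    if m == 0 or len(weights) != m:
        raise ValueError("pvalues and weights must be non-empty and equally long")
    mean_w = sum(weights) / m
    if mean_w <= 0:
        raise ValueError("at least one weight must be positive")
    adjusted = [
        min(1.0, p / (w / mean_w)) if w > 0 else 1.0
        for p, w in zip(pvalues, weights)
    ]
    order = sorted(range(m), key=lambda i: adjusted[i])
    last_passing_rank = 0
    for rank, i in enumerate(order, start=1):
        if adjusted[i] <= alpha * rank / m:
            last_passing_rank = rank  # step-up: keep the largest passing rank
    rejected = [False] * m
    for i in order[:last_passing_rank]:
        rejected[i] = True
    return rejected


if __name__ == "__main__":
    pvals = [0.04, 0.6]
    print(weighted_bh(pvals, [1.0, 1.0], alpha=0.05))  # equal weights
    print(weighted_bh(pvals, [1.8, 0.2], alpha=0.05))  # first test up-weighted
```

Shifting weight toward a group that is more likely to contain signal lets a borderline p-value in that group pass, at the cost of power in the down-weighted group.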
CESAM / Cis Expression Structural Alteration Mapping
Finds somatic copy-number alterations (SCNAs) mediating gene dysregulation in cis by integrating SCNAs, expression and chromatin interaction domain data. CESAM employs statistical concepts from expression quantitative trait locus mapping. It integrates SCNA breakpoint data with donor-matched transcriptome (mRNA-seq) data to recognize candidate genes in cis. This tool can be useful for uncovering genetic driver alterations in cancer genomes.
SIAMCAT / Statistical Inference of Associations between Microbial Communities And host phenoTypes
Aims to identify changes in community composition that are related to environmental factors. SIAMCAT analyses the relation between microbial communities and host phenotypes. It supports data pre-processing, statistical association testing, and statistical modelling. This tool provides functions for evaluation and interpretation of statistical models, such as cross validation, parameter selection, ROC analysis and diagnostic model plots.
Predicts exact RNA matching. ExpaRNA can compare RNAs by exact local matches. It computes the best arrangement of sequence-structure motifs common to two RNAs. This tool is able to find information about identical structural motifs. It determines the set of all possible exact pattern matches (EPMs) for two given RNAs in a pre-processing step. ExpaRNA is useful for comparative sequence analysis in biology and in high-throughput RNA analysis tasks.
CS-Rosetta / Chemical Shift Rosetta
Generates de novo protein structures. CS-Rosetta allows selection of database fragments that match the structure of the unknown protein. It was used to determine the structures of nine in-progress structural genomics targets with sizes in the 65-129 residue range, yielding structural models that are highly consistent with their independently solved experimental nuclear magnetic resonance (NMR) structures.
SAPP / Semantic Annotation Platform with Provenance
Automatically annotates genome sequences using standard tools. SAPP is a semantic framework for large scale comparative functional genomics studies. Annotation results and their provenance are stored in a Linked Data format, thus enabling the deployment of mining capabilities of the Semantic Web. This application supports periodic querying, comparison and linking of diverse annotation sources, resulting in up-to-date genome annotations.
Enables users to conveniently analyze various types of controllability of biomolecular networks. CytoCtrlAnalyser is a Cytoscape app that integrates nine recently developed algorithms to establish a comprehensive platform. This module offers comprehensive calculations and integrates algorithms for investigating the controllability of biomolecular networks as well as other various complex networks.
Allows accurate single nucleotide polymorphism (SNP)-aware alignment for allele-specific expression (ASE) analysis. ASElux is an approach that focuses on SNP-overlapping reads and combines the alignment and estimation of allelic expression into one step. It builds a personal allelic reference genome by using the individual’s existing genotype information to generate all possible ASE reads and pre-screen the RNA-seq data.
Offers a method to create and manage ad hoc pipelines with the capability of producing production quality pipelines. PipelineDog provides cross-platform compatibility, project management capabilities, code formatting and error checking functions. This pipeline is available as (i) a web interface with sufficient functionality for constructing and debugging pipelines and as (ii) simple scripts designed for users without extensive programming experience.
Assists users in removing biases in single-cell Hi-C data. scHiCNorm is a software package that uses zero-inflated and hurdle models. These models eliminate systematic biases in single-cell Hi-C data, which better reveals variations between cells in chromosomal structures. The method accounts for cutting sites, GC content, and mappability.
Provides methods for computing different p-value based correction methods. Myriads incorporates a simulator for two sample t-tests and Cochran-Armitage case-control tests. The simulator can be used to obtain estimates of the variance of the number of false discoveries jointly with the per family error rate (PFER) committed by any of the correction methods. This software presents three main aspects: (i) correction methods, (ii) dependence test and (iii) the incorporation of an autocorrelation test based on the generalized Durbin-Watson statistic.
Permits simulating and specifying complex systems biology models at multiple levels of organization. ML-Rules is a rule-based modeling language for dynamically nested biochemical reaction networks. This method is based on the Direct Method of the stochastic simulation algorithm. The state of an ML-Rules model is defined by a multi-set of nested and attributed entities.
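For readers unfamiliar with the Direct Method, the sketch below shows a minimal version of it for a flat (non-nested) reaction system. It is not ML-Rules syntax and does not handle nested or attributed entities; the function name and the way reactions are encoded (a propensity function plus a dictionary of state changes) are assumptions for illustration.

```python
import math
import random


def gillespie_direct(state, reactions, t_end, rng):
    """Simulate until t_end with the Direct Method; return the final state.

    reactions is a list of (propensity_function, change) pairs, where
    propensity_function(state) returns a non-negative rate and change maps
    species names to integer count differences.
    """
    state = dict(state)
    t = 0.0
    while True:
        propensities = [rate(state) for rate, _ in reactions]
        total = sum(propensities)
        if total <= 0:
            break  # nothing can fire any more
        # Exponential waiting time; 1 - random() lies in (0, 1], so log is defined.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Choose a reaction with probability proportional to its propensity.
        threshold = rng.random() * total
        cumulative = 0.0
        for propensity, (_, change) in zip(propensities, reactions):
            cumulative += propensity
            if threshold < cumulative:
                for species, delta in change.items():
                    state[species] += delta
                break
    return state


if __name__ == "__main__":
    # Hypothetical first-order conversion A -> B with rate constant 0.5.
    reactions = [(lambda s: 0.5 * s["A"], {"A": -1, "B": 1})]
    print(gillespie_direct({"A": 100, "B": 0}, reactions, 2.0, random.Random(1)))
```

Passing the random number generator in explicitly makes individual replications reproducible, which is convenient when many runs are executed side by side.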
Provides an alternative way of specifying simulation experiments. SESSL is a domain-specific language developed as a binding for ML-Rules and adds an easy-to-use interface for running simulations with the new simulation algorithms. This application allows simulation studies to be performed efficiently by exploiting the asynchronous, parallelized simulation interface to execute replications simultaneously. It is simulation-system agnostic and can also be used with other simulation systems.
Assists users in exploiting the alignment information contained in the SAM/BAM files. CALQ is a lossy compressor for quality values that computes a genotype certainty level per genomic locus to determine the acceptable coarseness of quality value quantization for all the quality values associated to that locus. It also uses the alignment information to determine the acceptable level of distortion for the quality values such that subsequent downstream analyses are presumably not affected.
QScomp / Quality Score compression
Provides quality score compression software. QScomp compresses quality scores extracted from FASTQ, SAM, or BAM files.
Provides a toolkit for handling sequence data from different platforms. It evolves with new input file specifications such that it will remain a universal standard for lossless pre-processing of sequencing data. It also offers (i) easy and extensible sequence read management, (ii) a common framework to process single- or paired-end datasets and (iii) consistent output following the standard SAM specifications.
Assists users in the creation of Circos plots. shinyCircos is an R/Shiny application that provides a graphical user interface for interactive creation of Circos plots. Various types of Circos plots can be easily generated and decorated with simple mouse clicks. It allows users to upload up to ten datasets to visualize in different tracks. For each track, six types of plots can be created: scatter plot, line plot, bar plot, rectangle, heatmap and chromosome ideogram.
Leverages global information at the document level generated by the attention mechanism to ensure consistent tagging across multiple instances of a single token in a document. Att-ChemdNER is a neural network approach to document-level chemical named entity recognition (NER) that aims to automatically detect the chemical mentions in biomedical literature, which is a fundamental step for further biomedical text mining.
Facilitates the development of web applications enabling real-time, interactive rendering of millions of data points on a wide range of devices. Fun is a framework that provides an additional route of access to big data, which otherwise can only be searched or summarized. It can be used for visual inspection of conceptual chemical spaces, helping to cope with big data in chemistry such as large databases of molecules.
protk / Proteomics toolkit
Provides a suite of tools for proteomics. Protk is an application that supports the following analysis tasks: (i) tandem mass spectrometry (MS) search with X!Tandem, Mascot, OMSSA and MS-GF+, (ii) peptide and protein inference with Peptide Prophet, iProphet and Protein Prophet, (iii) conversion of pepXML or protXML to tabular format and (iv) proteogenomics (mapping peptides to genomic coordinates).
OSSE / Online Sample Size Estimator
Estimates sample size for case-control association studies. OSSE is a sample size estimator that determines the necessary sample size in the setting of a pilot study, with unknown actual minor allele frequencies. The base values are set for the conventionally used significance level of 5% at 80% power. Users can choose to calculate the significance level or power instead by providing the other variables.
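A calculation of this kind can be sketched with the textbook two-proportion formula applied to allele frequencies in cases and controls. The source does not state which formula OSSE uses, so the formula, the function name and the parameter names below are assumptions; the defaults mirror the 5% significance level and 80% power mentioned above.

```python
import math
from statistics import NormalDist


def alleles_per_group(p_case, p_control, alpha=0.05, power=0.80):
    """Two-sided sample size (allele observations per group) for comparing
    two allele frequencies, using the standard normal approximation."""
    if not (0 < p_case < 1 and 0 < p_control < 1) or p_case == p_control:
        raise ValueError("frequencies must differ and lie strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_case + p_control) / 2.0
    numerator = (
        z_alpha * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_beta * math.sqrt(p_case * (1.0 - p_case) + p_control * (1.0 - p_control))
    ) ** 2
    return math.ceil(numerator / (p_case - p_control) ** 2)


if __name__ == "__main__":
    # Hypothetical minor allele frequencies from a pilot study.
    n = alleles_per_group(0.20, 0.30)
    print("alleles per group:", n, "| individuals per group:", math.ceil(n / 2))
```

Since each diploid individual contributes two alleles, the number of individuals per group is roughly half the allele count under this simplification.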
Assists in creating a virtual bioinformatics laboratory. GeneGrid is a Grid computing framework that accomplishes the seamless integration of a myriad of heterogeneous resources. It spans multiple administrative domains and locations and provides scientists with an integrated environment for streamlined access to a number of bioinformatics and other accessory programs through a simple and intuitive interface.
bicycle / BIsulfite-based methylCYtosine CalLEr
Analyzes whole genome bisulfite sequencing (WGBS) data. bicycle is a next-generation sequencing (NGS) bioinformatic pipeline that can process data from directional (Lister) and non-directional (Cokus) bisulfite sequencing protocols and from single-end and paired-end sequencing. It also performs methylation calls for cytosines in CG and non-CG contexts (CHG and CHH). It provides statistical methylcytosine calling and offers several filters to screen reads.
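The three sequence contexts mentioned above follow the usual definition in which H stands for A, C or T. The sketch below classifies each cytosine on the forward strand of a sequence; it is an illustration only, the function name is hypothetical, and it performs no methylation calling.

```python
def cytosine_contexts(seq):
    """Return (position, context) for every C on the forward strand.

    Context is 'CG', 'CHG' or 'CHH' (H = A, C or T), or None when the
    context cannot be determined (sequence end or ambiguous base).
    """
    seq = seq.upper()
    results = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        following = seq[i + 1:i + 3]
        if following[:1] == "G":
            context = "CG"
        elif len(following) == 2 and following[0] in "ACT":
            if following[1] == "G":
                context = "CHG"
            elif following[1] in "ACT":
                context = "CHH"
            else:
                context = None
        else:
            context = None
        results.append((i, context))
    return results


if __name__ == "__main__":
    print(cytosine_contexts("CGCAGCTTC"))
```

Cytosines on the reverse strand can be handled the same way by running the function on the reverse complement.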
ISPRED / protein Interaction Sites PREDiction
Provides a server for the prediction of interaction sites in proteins. ISPRED is a web application that requires PDB, DSSP and HSSP files as input.
Identifies disease-causing non-synonymous Single Nucleotide Variants (nsSNVs). Meta-SNP is a meta-predictor that couples some of the leading methodologies in prediction of nsSNV-disease (PhD-SNP) and nsSNV-function associations (PANTHER, SIFT, SNAP). It provides an accurate way of assessing disease-association of human variants. It also enables discrimination between disease associated and polymorphic variants in unconserved sites.
Predicts the effects of single amino acid polymorphisms (SAPs). Dr. Cancer is a machine learning approach that predicts whether a non-synonymous single nucleotide polymorphism (SNP) is related to cancer. It identifies deleterious single point mutations by combining, in a single framework, protein structure information (used to predict stability changes) with protein sequence, evolutionary and functional information.
LUX Score / Lipidome jUXtaposition score
Calculates the similarity between lipidomes. LUX Score is a single metric based solely upon an identity matrix for exchange values, which does not account for quantitative changes. It facilitates inter-species functional associations like those applied in comparative genomics. This workflow is customizable with regard to the complexity of the lipidome study. This approach is also compatible with high-throughput lipidomics.
MIRU-VNTRplus / Mycobacterial Interspersed Repetitive-Unit-Variable-Number Tandem-Repeat plus
Allows robust lineage identification based on the combination of different genotyping data. MIRU-VNTRplus is a system with a focus on evaluation of a reference database. It offers three main functions: (i) phylogenetic lineage identification by using a reference database; (ii) analysis and visualization of genotyping data; and (iii) access to the MLVA MtbC15-9 nomenclature service. Its features include minimum spanning trees (MST), geographic mapping and the nomenclature service.
Identifies repeat junctions and then designs repeat junction marker (RJM) primers. RJPrimers is a high-throughput computational tool that employs a BLASTN search and a repeat junction finding algorithm. It includes three contiguous operational steps: a BLASTN search against a repeat database, repeat junction identification and primer design. Its performance depends on the number of sequences and their sizes, the number of repeat junctions in the sequences, the size of the repeat database selected, and the speed of the computer.
Provides 3D models. SAM-T08 is a protein structure prediction server that offers a large number of intermediate results, which are often interesting in their own right: multiple sequence alignments (MSAs) of putative homologs, prediction of local structure features, lists of potential templates of known structure, alignments to templates and residue–residue contact predictions. This server has been validated as part of the CASP8 assessment of structure prediction.
CCR XP / Clusters of Conserved Residues XP
Explores clusters of conserved residues. CCR XP is a web application that automates the detection and analysis of such clusters in protein structures. It is composed of two input modules: (i) CCR XP lite, which extracts file from atom records, finds aligned sequences, calculates conservation scores, clusters conserved residues and reports structural properties of each residue; and (ii) CCR XP ADV, an advanced version that lets users choose among a number of filtering and clustering options.
ENM / Elastic Network Model
Allows users to calculate the dynamics of biomolecular systems. ENM is an online application that analyzes the dynamics of structurally resolved systems and lets users obtain information about the collective dynamics of any structure. It includes two different elastic network models (ENMs): (1) the Anisotropic Network Model (ANM) and (2) the Gaussian Network Model (GNM).
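As a minimal sketch of the GNM idea, the snippet below builds the Kirchhoff (connectivity) matrix in which every pair of nodes closer than a cutoff is joined by an identical spring. The function name, the 7 Å cutoff and the toy coordinates are assumptions made for illustration; they are not taken from the ENM server.

```python
import math

def kirchhoff_matrix(coords, cutoff=7.0):
    """Build the GNM Kirchhoff matrix for a list of 3D node coordinates.

    Off-diagonal entries are -1 for node pairs within `cutoff` (assumed 7 Å here),
    and each diagonal entry counts the contacts of that node.
    """
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma

# Hypothetical toy chain: four nodes spaced 3.8 Å apart along the x axis,
# so only sequence neighbours fall within the cutoff.
coords = [(3.8 * k, 0.0, 0.0) for k in range(4)]
gamma = kirchhoff_matrix(coords)
```

In a full GNM analysis the eigenvectors of this matrix with non-zero eigenvalues describe the collective modes; that step needs a linear algebra library and is left out here.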
Assists users in understanding association and interrelation of age-related disorders (ARDs) and associated proteins, pathways, and drugs. ARDnet allows construction of networks of ARDs associated proteins, drugs, and pathways and provides a methodology for analyzing and visualizing ARDs related data. This tool incorporates several age-related disorders and their associated proteins information as well as information about drugs and their ARDs protein targets.
Allows prediction of the disease potential of Kv-channel variants. KvSNP is an online application assisting users in forecasting the disease-causing ability of 172 uncharacterized non-synonymous single nucleotide polymorphisms (nsSNPs). This machine learning classifier is applied to Kv-channel nsSNPs to predict the likelihood that a variant causes disease. In addition, KvSNP is able to separate benign SNPs (bSNPs) and disease-causing SNPs (dcSNPs) in probability space.
SCENERY / Single CEll NEtwork Reconstruction sYstem
Allows analysis of cytometry data. SCENERY is a web application permitting users to upload their data and configure the study of their data. It provides a wide range of data analysis methods including: (1) basic pre-processing methods allowing users to transform, compensate and manually gate samples; (2) univariate analysis methods such as regression and factor analysis; and (3) advanced machine learning methods for association and causal network reconstruction (NR) that identify interactions between the measured quantities.
Allows processing and analysis of different types of omics data and combination of their results following the non-parametric combination (NPC) principles. omicsNPC is a program that can be used for the integration of different omics data. It is able to include covariates in the analysis and to process collections of datasets that share only part of their samples. This tool can yield biological insights beyond those obtained by analyzing each data modality in isolation.
MXM / Mens eX Machina
Compiles multiple feature selection algorithms for predictive or diagnostic models along with (Bayesian) network construction algorithms. MXM is a package including statistically equivalent signatures (SES) and an extension of the orthogonal matching pursuit (OMP) method. It can handle many types of response variable, such as continuous, binary, multiclass, ordinal, left censored or proportions, repeated measurements and more.
Brings protein-protein interaction (PPI) networks and gene expression data together to infer molecular networks that are dysregulated in patient samples. dysprog is an approach based on frequent subgraph mining (FSM) that can be used for both glioblastoma and breast cancer datasets collected with microarray and next generation sequencing (NGS) approaches. It also assists users in identifying personalized dysregulated signaling networks to support the diagnosis and treatment of patients.
Serves for the global alignment of protein-protein interaction (PPI) networks. INDEX is an algorithm employing two concepts for the depth traversal and stepwise growth of the alignment core. It allows computation of the scores of all the pairs of nodes and the neighbors’ score of the pairs. It creates an initial alignment based on the matching score strategy and selects a subset of the aligned proteins between the two networks as the alignment core.
Assists users in searching a given database for similar networks. NSSRF performs network search by considering the topology of the query network and the target network. It is an algorithm composed of two phases: (1) the offline model building phase, where it uses subgraph signatures and cosine similarity score as features; and (2) the similarity query phase, where each query is fed into the trained regression model, which returns the similarity score and the similar networks.
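The cosine similarity feature mentioned above can be sketched as follows. The representation of a subgraph signature as a plain vector of counts, and the example vectors themselves, are assumptions made for illustration rather than the NSSRF data format.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero signatures
    return dot / (norm_u * norm_v)

# Hypothetical subgraph-signature vectors (e.g. counts of small subgraph types).
query_sig = [3, 1, 0, 2]
target_sig = [6, 2, 0, 4]   # same profile as the query, twice as large
other_sig = [0, 0, 5, 0]    # shares no subgraph type with the query

sim_target = cosine_similarity(query_sig, target_sig)
sim_other = cosine_similarity(query_sig, other_sig)
```

Because the score depends only on the direction of the vectors, a target whose signature is a scaled copy of the query scores close to 1 regardless of network size.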
Allows exploration of the space of possible interactions between heterogeneous cell populations within a putative tumour. popFBA is an extension of flux balance analysis (FBA) able to cope with the presence of several subpopulations exchanging a defined set of metabolites. This tool can be applied to a model of several clones of the metabolic network of human central carbon metabolism, simulating a plasma supply of glucose, glutamine and oxygen.
PREMER / Parallel Reverse Engineering with Mutual information & Entropy Reduction
Allows users to recover and determine the strength and causality of interactions. PREMER allows incorporation of prior knowledge, imputation of missing data, and correction of outliers. This tool is a network inference toolbox, able to generate an inferred network with plots of the mutual information between variables for different possible time delays. PREMER uses several information-theoretic measures, which are estimated from data using an adaptive partitioning algorithm.
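The idea of scanning mutual information over time delays can be sketched as below. This is not PREMER's estimator: it swaps the adaptive partitioning step for a simple plug-in estimate on already discretized values, and the function names and toy series are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def lagged_mutual_information(xs, ys, max_lag):
    """Mutual information between x(t) and y(t + lag) for lag = 0..max_lag."""
    return {lag: mutual_information(xs[:len(xs) - lag], ys[lag:])
            for lag in range(max_lag + 1)}

# Hypothetical toy data: y is a copy of x delayed by two time steps.
x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
y = [0, 0] + x[:-2]
mi_by_lag = lagged_mutual_information(x, y, max_lag=3)
best_lag = max(mi_by_lag, key=mi_by_lag.get)
```

The lag with the highest mutual information suggests the delay at which one variable is most informative about the other, which is the kind of evidence a delay plot conveys.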
TOPAS / TOol for PArticle Simulation
Serves for particle simulation. TOPAS is a program assisting users in modelling a patient geometry based on computed tomography (CT) images, modelling a passive scattering treatment head, or scoring dose. It supplies detailed graphics and offers a four-dimensional (4D) interface to manage variations in beam delivery and patient geometry during treatment.
Allows users to perform proton beam calculations. BGware is a suite of programs offering users several types of functions. It includes: (1) BPW; (2) CPO2; (3) FITSCAN; (4) LAMINATE; (5) LOOKUP; (6) NEU; and (7) SCANFOR.
Provides structural controllability with consideration of node connection strength in biological networks. WDNfinder is composed of two algorithms: (1) maximum weight MDS (MWMDS) identification; and (2) MWMDS sampling and node classification. It uses its algorithms to find optimal prices for the dual problem of maximum weight MCM (MWMCM) linear programming by adding and assigning dummy edges with small weights. This tool can be used for the human cancer signaling network and the p53-mediated DNA damage response network.
Assists users in understanding the molecular drivers of synaptic and neurophysiologic dysfunction in Alzheimer’s disease (AD). This method allows identification of master regulators of synaptic and neurophysiologic dysfunction in AD, and applies data mining techniques to neuronal RNA expression data from AD brain tissue. It permits users to work on the protein ZCCHC17, also known as pNO40.
Permits amino acid confidence evaluation and modification site localization. pSite is based on a support vector machine (SVM) method to identify post-translational modifications and on a Bayesian model to evaluate the false amino-acid rate (FAR) at any given threshold. It follows five steps: (1) pre-processing tandem mass spectrometry (MS/MS) data, (2) enumerating the competitive sequences, (3) extracting features for each amino acid site, (4) estimating the confidence of each amino acid site, and (5) controlling the FAR of the reported amino acids.
OLMAT / Online MassSpec Data Analysis Tool
Allows analysis of mass fingerprinting data. OLMAT is able to transform large numbers of MassSpec files into Excel spreadsheets, compare multiple test samples and controls, and analyse data by subtracting and/or intersecting multiple experimental datasets. It can apply corrections such as removal of contaminants. This tool automates the retrieval of functional information for the proteins in a list.
Intersects published interactors of some queried gene products with the uploaded reference database.
BASALT / Biological Sequence Analysis Tool
Allows analysis of protein sequences for regular expression motifs.
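Scanning a protein sequence for a regular expression motif can be sketched with Python's standard `re` module. The helper name and the test sequence are hypothetical, and the pattern used is the familiar N-glycosylation sequon (N, any residue but P, S or T, any residue but P); none of this reflects BASALT's own interface.

```python
import re

def find_motif(sequence, pattern):
    """Return (1-based start, matched text) for every match, overlapping ones included."""
    # Wrapping the pattern in a lookahead lets matches that overlap each other be reported.
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]

# Hypothetical protein fragment containing three sequons, two of them overlapping,
# plus one "NPS" stretch that must be rejected because P follows N.
n_glyc = "N[^P][ST][^P]"
seq = "MKNGSANNTSALNPSG"
hits = find_motif(seq, n_glyc)
```

Reporting 1-based positions follows the usual convention for residue numbering in protein sequences.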
Identifies the optimum protease to be used in mass fingerprinting analysis for a given protein sequence. PeptiCut can identify the proteases that would not cut at the motif of interest (MOI). It returns a list ranked first by the number of optimum peptides and then by the number of preserved motifs.
Produces a consensus sequence. MotifGen takes stacked sequences of the same length as input.
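The ranking described above can be sketched as an in silico digest. Everything specific here is an assumption for illustration: the cleavage rules are simplified, "optimum peptides" is approximated by a peptide length window, and the sequence and motif are made up.

```python
import re

# Simplified cleavage rules (assumptions): zero-width regexes matching at the cut position.
PROTEASES = {
    "trypsin": r"(?<=[KR])(?!P)",  # after K or R, unless followed by P
    "Glu-C": r"(?<=E)",            # after E
    "Asp-N": r"(?=D)",             # before D
}

def cut_sites(seq, rule):
    """Internal positions at which the protease cleaves the sequence."""
    return [m.start() for m in re.finditer(rule, seq) if 0 < m.start() < len(seq)]

def digest(seq, rule):
    """Split the sequence into peptides at every cut site."""
    bounds = [0] + cut_sites(seq, rule) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]

def preserved_motifs(seq, rule, motif):
    """Count motif occurrences that contain no cut site."""
    cuts = cut_sites(seq, rule)
    count = 0
    for m in re.finditer(f"(?=({motif}))", seq):
        start, end = m.start(), m.start() + len(m.group(1))
        if not any(start < c < end for c in cuts):
            count += 1
    return count

def rank_proteases(seq, motif, min_len=4, max_len=12):
    """Rank proteases by peptides inside the length window, then by preserved motifs."""
    rows = []
    for name, rule in PROTEASES.items():
        n_good = sum(min_len <= len(p) <= max_len for p in digest(seq, rule))
        rows.append((name, n_good, preserved_motifs(seq, rule, motif)))
    return sorted(rows, key=lambda r: (r[1], r[2]), reverse=True)

# Hypothetical sequence with a single RGD motif of interest.
ranking = rank_proteases("MAGKSDELRTPEDKLLSRGDVA", "RGD")
```

Each row holds the protease name, the number of peptides in the window and the number of intact motifs, so a caller can additionally filter out proteases whose last value is zero.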
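One simple way to derive a consensus from stacked sequences is a per-column majority vote, sketched below. The majority rule and the tie-breaking behaviour are assumptions; the entry does not say which rule MotifGen applies.

```python
from collections import Counter

def consensus(sequences):
    """Return the most frequent symbol of each column of equal-length sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all sequences must have the same length")
    result = []
    for column in zip(*sequences):
        # On ties, the symbol seen first in the column wins (an arbitrary choice here).
        result.append(Counter(column).most_common(1)[0][0])
    return "".join(result)

# Hypothetical stack of four DNA sequences of length four.
stack = ["ACGT", "ACGA", "TCGT", "ACCT"]
cons = consensus(stack)
```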
Creates simulated DNA and protein sequences where the motifs within the amino acid sequences are permitted to mutate. rMotifGen can randomly build DNA or amino acid sequences using a simple background model. It is based on a position-specific scoring matrix (PSSM) approach. This tool is available as a command line version and through a graphical interface. It is flexible enough to support varied uses, including the construction of large-scale sets for simulations.
Predicts human promoters. FPROM is based on linear discriminant functions combining characteristics that describe functional motifs and oligonucleotide composition of potential start positions. It can automatically identify protein coding genes, pseudogenes and promoters in eukaryotic genomes. This tool is able to recognize 80% of TATA promoter sequences with one false positive prediction per 2,000 base pairs.
Identifies eukaryotic promoters. TSSG employs a different learning set of promoter sequences. It is able to recognize the human RNA polymerase II promoter region and the start of transcription.
Identifies functional motifs. Nsite searches for statistically non-random regulatory elements (REs) made of one or two boxes, using their sequences or consensuses, in a single query sequence or a set of query sequences. It can process human, animal and plant nucleotide sequences.
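The combination of a background model with a position-specific matrix can be sketched as follows: a random sequence is drawn from background frequencies, and a motif instance sampled column by column from a probability matrix is implanted at a random position. The function names, the uniform background and the matrix values are assumptions, not rMotifGen's actual parameters.

```python
import random

def sample_motif(ppm, rng):
    """Draw one motif instance, one symbol per column of a position probability matrix."""
    return "".join(rng.choices(list(col), weights=list(col.values()))[0] for col in ppm)

def simulate_sequence(length, background, ppm, rng):
    """Build a background sequence and implant one sampled motif at a random position."""
    letters, weights = zip(*background.items())
    seq = rng.choices(letters, weights=weights, k=length)
    motif = sample_motif(ppm, rng)
    start = rng.randrange(length - len(motif) + 1)
    seq[start:start + len(motif)] = motif  # overwrite background symbols with the motif
    return "".join(seq), start, motif

# Hypothetical DNA example: uniform background and a four-column motif model.
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
ppm = [{"T": 0.9, "A": 0.1}, {"A": 0.9, "T": 0.1}, {"T": 1.0}, {"A": 0.8, "T": 0.2}]
rng = random.Random(0)  # fixed seed so repeated runs give the same sequence
seq, start, motif = simulate_sequence(30, background, ppm, rng)
```

Columns with probabilities spread over several symbols are what let implanted motif instances differ from one simulated sequence to the next, while a column such as the third one stays fixed.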