Bioinformatics tools
Agos / Argonaute-binding domain screener
Identifies de novo tryptophan–glycine and/or glycine–tryptophan (WG/GW) Argonaute-binding domains in eukaryotic proteins. Agos provides a web server that can interpret a single protein or DNA sequence and screen for all regions containing WG/GW domains. It includes a double scoring system that measures the degree of compositional compatibility of the new domains with already-confirmed AGO-binding proteins.
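The idea of screening a protein for WG/GW-containing regions can be illustrated with a minimal sketch. This is not the Agos algorithm or its scoring system; the function names and the `max_gap` merging rule are assumptions made only for illustration.

```python
import re


def find_wg_gw(protein):
    """Return (position, motif) pairs for every WG or GW dipeptide (0-based)."""
    seq = protein.upper()
    # A lookahead lets overlapping motifs such as "GWG" report both GW and WG.
    return [(m.start(), m.group(1)) for m in re.finditer(r"(?=(WG|GW))", seq)]


def motif_regions(protein, max_gap=10):
    """Merge motif hits into regions (start, end), end exclusive.

    Hypothetical rule: a hit joins the previous region when it starts at most
    max_gap residues after that region's end.
    """
    regions = []
    for pos, _ in find_wg_gw(protein):
        if regions and pos - regions[-1][1] <= max_gap:
            regions[-1][1] = pos + 2
        else:
            regions.append([pos, pos + 2])
    return [tuple(r) for r in regions]
```

Calling `motif_regions` on a protein sequence returns candidate spans that a real screener would then score for compositional similarity to known AGO-binding domains.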
Classifies single nucleotide polymorphisms (SNPs) according to a genetic association interaction network. SNPrank supplies a network-based approach to detect important hub SNPs through conditional dependence with other SNPs and the phenotype. Each SNP is displayed by the algorithm in proportion to its contribution to the phenotype. The software is available in three different implementations: MATLAB, Python and Java.
Compares experimental evidence available for single nucleotide polymorphisms (SNPs) to identify those which can be functionally important. ASSIMILATOR queries information about SNPs from the UCSC Genome Browser’s public database and displays results in a simplified manner. The obtained file can be viewed in a standard web browser.
Analyzes copy number variation (CNV) data from large datasets. CNVineta is a package of tools to detect and visualize common CNVs and disease-associated rare CNVs for a phenotype of interest, supporting in-depth data mining of a given dataset. The software aims to provide an accurate interpretation of CNVs by combining CNV predictions with the raw data of the entire set.
Evaluates transcription factor (TF) activities according to gene-expression data combined with architectural information about the regulatory network. TFInfer allows users to visualize TF activity as a time-series activity profile with associated error bars, and to save graphs in a variety of formats. The software also handles both time-series data and data from several independent conditions, possibly with replicates.
CisGenome Browser
Improves genomic data visualization in a wide range of formats. CisGenome Browser supports searching by gene name or genomic region, panning, and customizing tracks. The software can be run in two ways: operated manually by a user to visualize results from an independent experiment, or integrated into a third-party program to display that program's results as if the two formed a single application.
Visualizes genotype cluster plots and is designed to be integrated into quality control workflows for genome-wide association studies (GWAS). Evoker provides a wide range of functionalities such as calling up plots for particular markers or viewing a set of single nucleotide polymorphisms (SNPs) showing evidence for association. The software also allows users to visualize the effect of excluding specific samples, and to view multiple collections side by side to compare genotype calls across sample sets.
COMPASSS / COMplex PAttern of Sequence Search Software
Detects the presence of complex elements by mining whole genomes. COMPASSS is based on the Wu-Manber multiple pattern matching algorithm and is intended to identify motifs across an entire sequence. The software allows users to search with both conserved and degenerate sequences. It was tested on three experimentally validated complex patterns, demonstrating its ability to distinguish both protein domains and cis-acting semi-conserved elements.
Compares and visualizes multiple genetic maps. CMap3D allows users to add genetic maps to the three-dimensional (3D) viewing space to detect correspondences between them, and to interact with the viewing space by manipulating objects and camera positions. It also makes it possible to move maps around, adjust zoom levels and show a map without redrawing the viewing space.
MAVEN / Management, Analysis, Visualization and rEsults sharing
Allows users to visualize genome-wide association study (GWAS) results. MAVEN provides four main functionalities: (i) it accepts and stores GWAS results based on single-locus analysis methods, not the raw data; (ii) it offers several types of filtering capabilities for retrieving single nucleotide polymorphisms (SNPs) and gene regions; (iii) it displays search results in a tabular format and/or a graphical format; and (iv) it provides a functional annotation of a selected SNP from the result table. Furthermore, the software is not restricted to a single phenotype or a single statistic.
Identifies local similarities between sequences by using short exact word matches. Bridges can perform multiple queries that are individually compared to the database. Users can compare several sequences to a database in a single run, configure twenty parameters to adapt a search, choose to look for similarities on the direct strand, the reverse-complemented strand or both, and ignore similarities residing on the diagonal of the alignment matrix.
Correlates users’ data with biological information from nine bioinformatics resources (Seattle SNPs, PharmGKB, IIDB, NCBI, OMIM, Genetic Association Database, dbSNP, KEGG, and UCSC Genome Browser). Path compares input data with information retrieved from the resources and conducts studies on the single nucleotide polymorphism (SNP)–SNP interactions according to user’s choices. Then, the imported data and results of the analysis are stored in a local database.
Permits users to run publicly available genetic software for extremely large genome-wide association studies (GWAS) on super-computing grid infrastructures. GRIMP aims to shorten the learning curve for new users and to reduce human error in the management of large databases. It gives biomedical researchers with extreme computational demands access to distributed computing, with or without prior experience.
Allows rapid testing of hypotheses about codon usage in sequenced genes and genomes. CodonExplorer can reveal patterns associated with gene expression changes, mutational biases and horizontal gene transfer (HGT). It allows convenient discovery of genes by genome, function or orthology and visualization of the composition of these genes. This tool employs Monte Carlo techniques for testing the statistical significance of differences in codon usage or nucleotide sequence composition.
Constructs structure-based alignments of an extensive dataset of eukaryotic SECIS (a stem-loop structure in the 3' UTR of selenoprotein mRNA transcripts) sequences. SECISaln can be applied to sequences which are known to contain a SECIS element. It provides a large, manually curated collection of eukaryotic SECISes. This tool is useful for the analysis and understanding of SECIS elements.
Merges large amounts of genotype data and provides purpose-built functions. IGG aims to facilitate whole-genome imputation. It integrates HapMap genotypes into local projects by using an annotation-guided method. This tool can encode single nucleotide polymorphism (SNP) genotypes into binary codes to save storage space. It is able to generate complete sets of input files for six popular genotype imputation tools (Plink, Merlin, IMPUTE, MACH, BEAGLE, and fastPhase).
Facilitates access to the literature relevant to the kinetic modelling of metabolism. KiPar is able to retrieve documents that are likely to contain a value of a given parameter applicable to a given reaction. It aims to reduce the time involved in the kinetic modelling of metabolic pathways. This tool allows users to put multiple reactions in a single search request. It can serve to identify patterns for extracting information regarding kinetic parameters.
Enables a range of flexible query mechanisms for Allen Brain Atlas (ABA). ALLENMINER allows users to define custom regions of interest, search for genes that are graded or patterned in regions of interest, and view 3D ABA data on platforms where the BrainExplorer is not available. It can serve for identification of genes or combinations of genes that express in a specific region or cell type of the mouse brain.
ABWGAT / Anchor-Based Whole Genome Analysis Tool
Automates identification and listing of genomic variations in a pair-wise manner. ABWGAT identifies insertions, deletions, single nucleotide variants (SNVs), repeat expansions and inversions. It uses a pair of completely sequenced genomes as input. This tool can be used for comparative genomics analysis without prior knowledge of bioinformatics or computational methods. It can also report the presence of tandem repeats and their copy numbers.
Permits querying functional annotations from a database and displaying them in context with a phylogeny. TreeQ-VISTA is able to display a phenotype, a gene’s properties and a genomic presence/absence profile of a gene family, domain or pathway. It permits users to query microbial phenotypes and genotypes simultaneously. This tool allows researchers to perform a whole exploratory routine, combined with a visual presentation, in an efficient, user-friendly manner.
MAVIANT / Multipurpose Alignment Viewing and Annotation Tool
Provides contig and DNA chromatogram views and allows visual inspection of the polymorphic sites. MAVIANT generates views built from HTML, PNG image and JavaScript files and is platform independent. It employs predictions based on contig clusters originating from a few hundred thousand sequences. This tool allows online annotation of single nucleotide polymorphisms (SNPs), collaboration on SNP evaluation, and annotation or selection of candidates.
Estimates recombination rates, offers sophisticated interpolation methods and allows complex queries. MareyMap is integrated with physical and genetic maps from different organisms: one vertebrate, Homo sapiens; two invertebrates, Drosophila melanogaster and Caenorhabditis elegans; and one plant, Arabidopsis thaliana. It includes a sliding-window method that estimates the local rate as the slope of the best-fit line to the data within a local window. The software is also available as a simplified web service that returns recombination rates from personalized data or a public database.
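The sliding-window estimate described for MareyMap can be sketched as a least-squares slope of genetic position against physical position within each window. The function names, the `(physical_mb, genetic_cm)` input layout and the `min_markers` cutoff below are illustrative assumptions, not MareyMap's actual interface.

```python
def best_fit_slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def local_rates(markers, window_mb, min_markers=3):
    """Estimate a local recombination rate (cM/Mb) around each marker.

    markers: list of (physical_mb, genetic_cm) pairs (assumed layout).
    Returns (centre, rate) pairs; rate is None when the window holds too few
    markers or fewer than two distinct physical positions.
    """
    rates = []
    for centre, _ in markers:
        inside = [(p, g) for p, g in markers if abs(p - centre) <= window_mb / 2]
        distinct_positions = {p for p, _ in inside}
        if len(inside) < min_markers or len(distinct_positions) < 2:
            rates.append((centre, None))
        else:
            rates.append((centre, best_fit_slope(inside)))
    return rates
```

Windows near chromosome ends contain fewer markers, which is why the sketch returns `None` there instead of an unstable slope.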
Allows analysis of genomic sequences, concentrating on pairwise local alignments. AuberGene identifies orthologous fragments such as exons or regulatory elements to align regions that are difficult to align. It permits the identification of false-positive, non-homologous alignments which can be corrected based on the new information provided by intermediate sequences. This tool follows three steps: (1) segment decomposition, (2) constructing a weighted bipartite graph from transitive alignments and (3) generating the collective alignment.
START / Sequence Tag Analysis and Reporting Tool
Allows comprehensive analysis of serial analysis of chromatin occupancy technique (SACO) data. START is applicable to experiments performed in the yeast, fruit fly, mouse, rat and human genomes. It permits investigating the genome-wide mapping of a transcription factor (TF) of interest in an unbiased, yet cost-effective and sensitive manner. This tool integrates a wide range of genomic annotations and resources and TF binding site prediction data for a large number of genomes.
Converts Affymetrix genotype data into linkage and haplotype information. ALOHOMORA can perform a comprehensive quality control of the data. It facilitates linkage analysis with chip data. This tool is applicable in small and large families with any genetic model. It provides functions to employ different genetic maps or ethnicity-specific allele frequencies.
Allows automated image acquisition and real-time signal quantification. LabArray aims to simplify the process of microarray image analysis in non-equilibrium dissociation studies. It is able to perform real-time monitoring. This tool makes use of the covariance of an image containing periodic objects to calculate the distance between any two given objects or probe spots. It can identify spots that were misaligned during the printing process.
Scores dinucleotide composition differences of individual sequence entries with a chosen representative host genome sequence. Deltarho-web compares the dinucleotide composition of an input sequence with the composition of a selectable complete genome sequence. It allows whole-genome composition analysis with a selectable window size, to visualize large anomalous gene clusters in a prokaryotic genome.
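One common way to score dinucleotide composition differences is the relative-abundance measure: for each dinucleotide xy, rho_xy = f_xy / (f_x * f_y), and the distance between two sequences is the mean absolute difference of rho over the 16 dinucleotides. The sketch below is a single-strand simplification; whether Deltarho-web computes exactly this, or uses strand-symmetrized frequencies, is an assumption here.

```python
from itertools import product

BASES = "ACGT"


def rho(seq):
    """Dinucleotide relative abundances rho_xy = f_xy / (f_x * f_y)."""
    seq = seq.upper()
    n = len(seq)
    mono = {b: seq.count(b) / n for b in BASES}
    pairs = [seq[i:i + 2] for i in range(n - 1)]
    out = {}
    for x, y in product(BASES, repeat=2):
        f_xy = pairs.count(x + y) / len(pairs)
        expected = mono[x] * mono[y]
        # A dinucleotide whose bases are absent gets rho = 0 (illustrative choice).
        out[x + y] = f_xy / expected if expected > 0 else 0.0
    return out


def delta_rho(seq_a, seq_b):
    """Mean absolute difference of rho over all 16 dinucleotides."""
    rho_a, rho_b = rho(seq_a), rho(seq_b)
    return sum(abs(rho_a[k] - rho_b[k]) for k in rho_a) / 16
```

Applying `delta_rho` to successive windows of a genome against the whole-genome composition would highlight compositionally anomalous regions.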
vConTACT / Viral CONTigs Automatic Clustering and Taxonomy
Allows classification of double-stranded DNA viruses that infect Archaea and Bacteria. vConTACT analyses are based on gene sharing network methods. It displays the extent of shared genes between genomes as edges. This tool enables large-scale, automated virus classification. vConTACT is integrated in a virus ecology-focused set of tools named iVirus. It enables any user to run the application simply by providing viral sequences alongside.
Accommodates the variation among next-generation sequencing protocols when detecting copy number alterations (CNAs). SynthEx uses a “synthetic-normal” strategy to correct for sample-specific bias in target regions due to pre-analytical variation between tumor-normal matched pairs. It employs a synthetic normal to mimic the technical bias of the tumor to be assayed. This tool utilizes whole exome sequencing (WES) data with improved precision and accuracy.
Prioritizes deleterious synonymous single-nucleotide variants (sSNVs). regSNPs-splicing shows good results when the set of pathogenic variants contains a substantial number of variants. It includes protein structure features, which markedly increase its ability to identify disease-causing sSNVs. This tool was tested on a training data set that includes both disease-causing and neutral sSNVs.
Improves reproducibility and research timelines and reduces researchers' working time. DoriTool combines different bioinformatics algorithms and public databases to perform a complete functional in silico assessment, taking as its starting point a list of variants derived from genome-wide association studies (GWAS) or next-generation sequencing (NGS). It offers a way to find ontologies and pathways, shedding light on the underlying biology.
Permits analysis of multiplexed barcode-seq data. Barcas maps sequenced reads based on the trie data structure for fast and efficient imperfect matching. It constructs a trie data structure from the barcode library sequences and maps input reads to sequences in the trie. This tool can find over-represented or depleted target barcodes through comparing intensity between case samples and control samples.
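Trie-based imperfect matching of reads to a barcode library can be sketched as follows. This is a minimal illustration that tolerates substitutions only and assumes reads have the same length as the barcodes; the function names and data layout are hypothetical and do not reproduce Barcas itself.

```python
def build_trie(barcodes):
    """Build a nested-dict trie from {name: sequence}; "$" marks a barcode end."""
    root = {}
    for name, seq in barcodes.items():
        node = root
        for base in seq:
            node = node.setdefault(base, {})
        node["$"] = name
    return root


def match(trie, read, max_mismatches=1):
    """Return (barcode_name, mismatches) for barcodes within max_mismatches."""
    hits = []

    def walk(node, pos, used):
        if pos == len(read):
            if "$" in node:
                hits.append((node["$"], used))
            return
        for base, child in node.items():
            if base == "$":
                continue
            cost = 0 if base == read[pos] else 1
            # Prune branches that already exceed the mismatch budget.
            if used + cost <= max_mismatches:
                walk(child, pos + 1, used + cost)

    walk(trie, 0, 0)
    return sorted(hits, key=lambda hit: hit[1])
```

Because shared prefixes are stored once, a mismatch budget exhausted early in a branch rules out every barcode below it in a single step.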
Enables users to retrieve mass spectrometry based proteomics data. PGMiner provides the main steps of proteogenomics in a fully automated manner. It can perform peptide identification by using multi-algorithm support. This tool supports machine aided assessment of gene models. It is able to map identified peptides and propose new gene models. Users can choose which databases they want to use for processing assessment.
BPP / Bacterial Proteogenomic Pipeline
Allows proteogenomics analyses with an emphasis on the visualization of results. BPP employs tandem mass spectrometry (MS/MS) data, except for the peptide identification step, which is done by search engines. It was tested on datasets of Bradyrhizobium japonicum samples grown in cowpea nodules and of Synechocystis sp. PCC 6803 samples, which were cultivated under different environmental conditions.
Allows interpretation of non-invasive prenatal testing (NIPT) results obtained by genome-wide methods. NIPTRIC calculates the personalised a posteriori risk (PPR) according to the a priori risk. It takes into account both test and patient characteristics. This tool can be useful for cell-free foetal DNA screening providers and healthcare professionals. It aims to facilitate understanding of NIPT results and their implications in clinical practice.
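How an a priori risk is updated into an a posteriori risk can be illustrated with Bayes' rule. The sketch assumes the only test characteristics are sensitivity and specificity; the function name and parameters are hypothetical, and NIPTRIC's actual model may use other inputs.

```python
def posterior_risk(prior, sensitivity, specificity, positive=True):
    """Posterior probability of the condition given a test result (Bayes' rule)."""
    if positive:
        true_signal = sensitivity * prior
        false_signal = (1 - specificity) * (1 - prior)
    else:
        true_signal = (1 - sensitivity) * prior
        false_signal = specificity * (1 - prior)
    return true_signal / (true_signal + false_signal)
```

The same test result yields very different posterior risks for different priors, which is why patient characteristics matter when interpreting a result.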
CUT / Codon UTilization tool
Allows analysis of all genes or transcripts banked for a species. CUT is based on a standard “Model View Controller” (MVC). It methodically counts the codon content in gene or transcript sequences beginning with the start codon. This tool was applied for analysis of yeast, mouse and rat genome sequence data. It provides a database offering details of all 1-, 2-, 3-, 4- and 5-codon combinations for all genes or transcripts in yeast, mice and rats.
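Counting codon combinations from the start codon onward can be sketched as below. The function names are illustrative; counting overlapping k-codon windows, and reading through stop codons, are simplifying assumptions and not necessarily how CUT defines its combinations.

```python
from collections import Counter


def codons_from_start(seq):
    """Split a sequence into codons, beginning at the first ATG."""
    seq = seq.upper()
    start = seq.find("ATG")
    if start == -1:
        return []
    # Drop trailing bases that do not form a complete codon.
    usable_end = len(seq) - (len(seq) - start) % 3
    return [seq[i:i + 3] for i in range(start, usable_end, 3)]


def count_combinations(seq, k=1):
    """Count runs of k consecutive codons (k=1 gives plain codon usage)."""
    codons = codons_from_start(seq)
    return Counter(tuple(codons[i:i + k]) for i in range(len(codons) - k + 1))
```

Summing the resulting counters over every transcript of a species, for k from 1 to 5, would give the kind of table the description refers to.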
Automates reference-assisted building of genomic chromosomes. Chromosomer employs alignments between fragments to construct draft chromosomes. It maps fragments to a reference genome by using results of pairwise alignments between fragments and chromosomes of reference genome. This tool offers two parameters that influence assembly process: the alignment score ratio threshold and the insertion size.
Generates virtual clone maps. BACCardI allows the projection of read pair information as obtained from positioning of end sequences onto the genome assembly. It offers two different modes: (1) the circular mode, representing a whole genome assembly of a prokaryotic genome, and (2) the linear mode, allowing a detailed analysis of a specific genomic region. This tool permits genome comparison via mapping of large insert clone libraries onto related genomes and finishing support.
GEM / Gene Environment and Methylation
Explores the associations of Gene, Environment and Methylation. GEM offers linear regression models to facilitate analyses in epigenome wide association studies (EWAS). It permits users to test millions of hypotheses in epigenetics. This tool can produce a “segregation scatter plot” for methylation corresponding to environment in different genotype groups.
HATODAS / Heavy-atom Database System
Supports the heavy-atom-derivatization process of a target protein. HATODAS is a heavy-atom software/database system originally developed for protein crystallography. It suggests potential heavy-atom reagents for derivatization experiments based on the amino-acid sequence of the target protein and its crystallization conditions. It includes potentiality scoring to prioritize the heavy-atom reagents suggested for experiments.
Galaxy / Glycoanalysis by the three axes of MS and Chromatography
Provides a method for the structural analysis of N-glycans. Galaxy is a 2D/3D mapping method developed for the structural determination of asparagine-linked oligosaccharides (N-glycans) in glycoproteins. The structure of a sample PA-glycan can be estimated by comparing its elution position on the map with the positions of the known reference N-glycans plotted on the 2D map. This application can also be used as a means of isolating large-scale samples for nuclear magnetic resonance (NMR) spectroscopy or mass spectrometry (MS).
Allows characterization of hotspots of epigenetic variability across different cell types. Haystack can be applied to epigenetic marks and supplies a method to study cell-type identity and the mechanisms underlying epigenetic switches during development. It simplifies biologists’ efforts at analyzing epigenetic data without the burden of coding, and enables researchers to integrate their own sequencing data with information from the public domain.
B-LORE / Bayesian multiple LOgistic REgression
Allows users to study the effect size of single nucleotide polymorphisms (SNPs) modeled by a two-component Gaussian mixture. B-LORE is a Bayesian method using multiple logistic regression of the case-control binary variable and a prior distribution. This tool combines the advantages of multiple logistic regression and meta-analysis and incorporates functional information of the SNPs.
Enables sharing of linkage disequilibrium (LD) information needed for accurate fine-mapping in the era of biobank-scale datasets. LDstore serves for estimation, storage, and seamless sharing of LD information. It uses parallel computing and sparse storage of LD information to achieve small file sizes. This method is useful for collecting LD information for trait-associated genomic regions, enabling accurate fine-mapping from summary statistics that allows for multiple causal variants, without time-consuming communication and repeated analysis efforts across the participating cohorts.
Allows detection of individual regulatory single nucleotide polymorphisms (SNPs) from genotypes. DeepWAS identifies individual regulatory SNPs by investigating genomic location and sequence alterations. It also identifies single deepSNPs with a predicted allele-specific regulatory effect in a functional unit, a cell type and a chromatin feature. This tool includes putative regulatory mechanisms in the genome-wide association study (GWAS) analysis from the start. It can control false discoveries by reducing the multiple-testing burden.
Serves for modeling spatial management problems, for designing and analyzing policies and for comparing given policies by simulation. GMDPtoolbox is a toolbox dedicated to the Graph-Based Markov Decision Process (GMDP) framework. It is composed of two algorithms providing local policies by approximating the optimal solution of a GMDP. This toolbox offers users a structure to encode GMDP problems, as well as modeling tools, solution algorithms, and analysis tools for evaluating and comparing policies.
Calculates several basic parameters of population genetics. GENETIX lets users assess the significance of a statistic by permutation sampling. It is useful for the determination of Wright's F-statistics and linkage disequilibrium D. This approach is an alternative to bootstrapping and jack-knifing or to exact probability tests. It can evaluate the probability value of departure from the null hypothesis.
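The permutation approach to evaluating departure from the null hypothesis can be sketched with a generic two-sample test. The absolute difference in group means stands in for the statistic here; GENETIX applies the idea to population-genetic statistics such as F-statistics, and the function name and parameters below are illustrative assumptions.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_perm=1000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        # Shuffling the pooled values breaks any link between value and group.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(group_a)], pooled[len(group_a):]
        if abs(mean(perm_a) - mean(perm_b)) >= observed:
            hits += 1
    # Adding one to both counts keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Unlike bootstrapping, which resamples with replacement to estimate variability, this procedure reshuffles labels to generate the distribution of the statistic under the null hypothesis.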
SIPPOM-WOSR / Simulator for Integrated Pathogen Population Management adapted to study blackleg on Winter OilSeed Rape
Assesses and ranks Integrated Crop Management (ICM) strategies. SIPPOM-WOSR is a simulator for integrated pathogen population management. It is a model that links epidemiological, population and crop model approaches, simulating both quantitative (size) and qualitative evolution (genetic structure) of the L. maculans population. It is useful for designing and assessing cropping systems that seek to control phoma stem canker while both preserving the efficacy of specific resistance and meeting requirements of ICM.
Quantifies nematode motility/growth. invappParagon assists in the identification of classes of anthelmintics such as the dihydrobenzoxazepinones. It allows users to investigate parasitic diseases and to determine the growth and motility of C. elegans. It could also be applied to the study of other human diseases modelled in C. elegans.
Scores C. elegans on agar plates for mortality, size, and fecundity. WormScan is an automated software system using photo scanners. It can replace manual counting of worm populations and manual survival scoring. An archival record is automatically stored as a compressed image file on which identified worms are highlighted.
Parallel Worm Tracker
Records the centroid position of tens of worms in sequential video frames. Parallel Worm Tracker can determine worm speed and derive a measure of the fraction of worms that are paralyzed by drug application. This tool can produce a consistent measurement of locomotion allowing direct comparison of results across experiments and experimenters.
Reaction Balance
Allows calculation of chemical reaction stoichiometries. Reaction Balance poses automatic reaction balancing as a mixed integer linear programming (MILP) problem and defines constraints across the reaction for each element and for charge. This tool aims to bring automatic balancing to the widest possible audience.
Pancancer cluster assignment
Serves for studying tumour mutational profiles. Pancancer cluster assignment infers the dependency structure of mutations and performs fully Bayesian inference to capture the uncertainty in the network structure learned from mutational profile data. This tool is useful to cluster patient samples into groups with different interactions among mutated genes.
Allows detection of genomic regions identical by descent (IBD) for pairs of haploid samples. hmmIBD was benchmarked on simulated data against a previously published method for detecting IBD within populations. This tool is conceived to infer IBD segments shared between pairs of haploid genomes and to estimate two quantities: (1) the marginal posterior probability of the IBD state; and (2) the rate at which the genomes transition between states.
Allows users to work on High Chromosome Contact map (Hi-C) analysis at dynamic scales. SHAMAN is useful to produce randomized matrices conserving the empirical number of contacts per restriction fragment and the empirical distribution of genomic distances over contacts. It permits users to reanalyze Hi-C data on mouse embryonic stem cells and human cancer cell lines.
Machaon CVE / Machaon Cluster Validation Environment
Offers multiple clustering and cluster validity methods for DNA microarray data analysis. Machaon CVE is a data mining framework intended for (1) performing clustering on microarray data and (2) evaluating the quality of the clusters obtained, which may also be used for estimating the ‘correct’ number of clusters. This method allows the application of various validation methods to multiple datasets, which may be clustered by third-party tools.
RAT / Recombination Analysis Tool
Helps in finding viral recombinants. RAT is a cross-platform, Java-based application intended for high-throughput recombination analysis of both DNA and protein multiple sequence alignments, in any one of seven different file formats. It uses the distance-based method of recombination detection. Its application indicated that large-scale recombination occurs in the noroviruses. Users can employ this method to examine sequences individually using the Single-sequence viewer, or use the Auto Search option, which searches for recombination given a user-defined search criterion.
Reduces markedly the human effort required to integrate information from physical maps and whole-genome shotgun (WGS) assembly. MapLinker allows users to manipulate and save the data output. It also provides a number of features that help the user to improve the analysis. This application can also be useful for physical mapping when draft sequences are available.
GSIT / Genomic Signature Identification Tool
Recognizes and validates genomic signatures in a group of similar DNA sequences of microorganisms. GSIT is a web application that employs a comparative genomics technique to identify genomic signatures among subtypes/strains of the same species. It also provides a list of significant genomic signatures for many applications, including pathogen characterization and epidemiology.
MICO / Mutation Information COllector
Presents an unbiased view of all possible predictions on the effects of a given mutation. MICO is a web application that contains six leading prediction tools: Condel, MutationAssessor, Mutation Taster, PolyPhen2, SIFT, and CADD. This method speeds up the understanding of the genetic basis of human diseases. It also enhances research in computational biology and bioinformatics.
Sequence Maneuverer
Extracts sequences from large datasets in a few simple steps. Sequence Maneuverer consists of three modules named annotator, FASTA line generator and sequence extractor. These modules can be used independently or in combination depending upon users’ objectives. This application can efficiently extract multiple sequences of any desired length from a genome of any organism.
Identifies the putative orthologous proteins between two proteomes. OrFin makes use of Reciprocal Best BLAST Hits (RBBHs) to identify pairs of orthologous proteins for a given set of two proteomes. It allows users to alter the criteria used to retrieve the RBBHs. This application provides a web interface that may assist the study of features associated with orthologous proteins.
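The reciprocal-best-hit criterion can be sketched in a few lines: two proteins are paired only when each is the other's top-scoring hit. The `(query, subject, bitscore)` tuple layout and the function names are assumptions for illustration; OrFin's own filtering criteria are not reproduced.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, bitscore) tuples (assumed layout).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)
```

Altering the retrieval criteria would amount to filtering the hit tuples, for example by a minimum score, before calling `best_hits`.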
Bacterial Genome Mapper
Helps in finishing and annotating a draft bacterial genome sequence. Bacterial Genome Mapper is a user-friendly web-based program that can be used even by microbiologists with little or no professional informatics training. This application is used to map contigs and construct comparative genome maps by wide alignments between target bacterial genome contigs and one or two reference bacterial genome sequences.
MAP Kinase analyser / Mitogen Activated Protein Kinase analyser
Identifies P-site (phosphorylation site) consensus sequences and domains of Mitogen Activated Protein Kinases (MAPKs) in plant genomes. MAP Kinase analyser can recognize whether a given protein sequence is a MAP kinase on the basis of the presence of the specific MAPK domain. This method also identifies possible kinase substrates by analysis of the P-site consensus sequence pattern.
IntergenicS / Intergenic Sequence
Offers a method for biologists who are interested in regulatory elements in the noncoding regions of microbial genomes. IntergenicS is a user-friendly tool designed for sequence extraction and for calculating guanine-cytosine content. This method extracts intergenic regions of bacterial genomes in silico. It also allows the user to specify intergenic regions of a particular size range.
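Extracting intergenic regions within a size range and reporting their GC content can be sketched as follows. The sketch assumes a linear genome string and gene coordinates given as 0-based, half-open `(start, end)` pairs; these conventions and the function names are illustrative, not those of IntergenicS.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def intergenic_regions(genome, genes, min_len=1, max_len=None):
    """Return (start, end, gc) for gaps between genes within a size range."""
    gaps = []
    cursor = 0
    for start, end in sorted(genes):
        if start > cursor:
            gaps.append((cursor, start))
        # max() keeps the cursor correct when genes overlap.
        cursor = max(cursor, end)
    if cursor < len(genome):
        gaps.append((cursor, len(genome)))
    selected = []
    for start, end in gaps:
        length = end - start
        if length >= min_len and (max_len is None or length <= max_len):
            selected.append((start, end, gc_content(genome[start:end])))
    return selected
```

A circular bacterial chromosome would additionally need the gap spanning the origin to be joined, which this sketch does not attempt.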
Provides a simple position weight matrix (PWM)-based strategy for the discovery of intergenic small RNAs (sRNAs) genes. sRNAscanner is a method that predicts large numbers of potential sRNA genes in diverse bacterial species. It also hints at the broader power of customized PWMs as a generic strategy for detection of defined genomic features in diverse bacterial genomes.
Assists in the determination of regions of suspected RNAs using information obtained from the RNA of interest. PsRNA is a computing engine for locating all such functionally important regions in prokaryotic genomes. It can be used to fish out regions of interest based on the KEGG Orthology (KO) information collected from positive training data.
Accurately detects miRNAs in the human genome. Mi-Discoverer is a computational approach that relies on multiple sequence alignment to predict human miRNAs and has been successfully applied to identify miRNAs. This application also provides a comprehensive database of human miRNAs, including the latest available information about pre-miRNA sequence, length of the stem-loop region, and function.
Detects putative binding sites in DNA or RNA sequences. xFITOM applies more than 10 different detection methods based on information theory, which allow an approximate calculation of binding affinity for a specific site. Results can be filtered by using a user-specified threshold. It also allows users to carry out non-standard analyses, affinity ranking of collection sites or annotated searches of unfinished genome assemblies.
Builds weight matrices based on user-defined motif sequences and width. D-MATRIX generates several types of matrices such as alignment, frequency and weight matrices. The software can also convert a weight matrix into different file formats and detect conserved motifs in co-regulated genes or a whole genome.
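The progression from aligned motif sequences to a frequency matrix and then a weight matrix can be sketched as below. The pseudocount, the uniform background of 0.25 and the log2-odds weighting are common choices assumed here for illustration; they are not taken from D-MATRIX.

```python
import math

BASES = "ACGT"


def frequency_matrix(motifs, pseudocount=1.0):
    """Per-position base frequencies from equal-length motif sequences."""
    width = len(motifs[0])
    if any(len(m) != width for m in motifs):
        raise ValueError("all motif sequences must have the same width")
    matrix = []
    for i in range(width):
        column = [m[i].upper() for m in motifs]
        total = len(column) + 4 * pseudocount
        matrix.append({b: (column.count(b) + pseudocount) / total for b in BASES})
    return matrix


def weight_matrix(freqs, background=0.25):
    """Convert frequencies to log2-odds weights against a uniform background."""
    return [{b: math.log2(p / background) for b, p in col.items()} for col in freqs]


def score(pwm, site):
    """Sum the weights of the bases observed at each position of a site."""
    return sum(col[base] for col, base in zip(pwm, site.upper()))
```

Scanning a promoter with `score` over every window of the motif width, and keeping windows above a threshold, is the usual way such a matrix is used to detect conserved motifs.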
Performs regulatory network reconstruction applied to Rhodobacter sphaeroides. Rhodobase provides tools for transcriptome meta-analysis, regulatory network visualization and optimized cross-species transcription factor binding site (TFBS) matching. The software also allows users to choose the desired network complexity level. The platform furnishes a network visualization engine and a search engine for TFBS analysis.
Analyzes regulatory networks in higher eukaryotes. CONVIRT integrates whole genome alignments, gene annotation, repeats and sequence information into a virtual chromosome, which allows the location of a chosen region at any distance from a specific gene. The software can also generate all possible transcription factor (TF) combinations as well as the statistical significance of each combination in a given gene set. It currently covers seven organisms.
Handles analysis of multivariate data and sample locality information in geographic space. mvMapper offers an interface for exploring statistical analyses in ordination and geographical space. The software is composed of a statistical panel and a mapping panel, both interactive with tools for panning, zooming in and out, and saving, plus a metadata panel that allows rapid exploration of metadata with regard to population structure.
Facilitates the design and testing of baits for hybridization capture. BaitsTools generates high-quality oligonucleotide baits from reference sequences in a wide range of formats, including across the break in linearized circular sequences. Optionally, the software can analyze and filter previously generated baits. It has no additional dependencies and does not require local compilation.
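Tiling baits along a reference, including across the break of a linearized circular sequence, can be illustrated with a short sketch. This is not BaitsTools code or its command-line interface: the function names, the bait length, the tiling offset and the GC-content filter are hypothetical choices for the example.

```python
def tile_baits(sequence, bait_length=120, offset=60, circular=False):
    """Return (start, bait) pairs tiled along a reference sequence.

    With circular=True, baits starting near the end wrap around to the
    beginning, so the artificial break of a linearized circle is covered.
    """
    seq = sequence.upper()
    n = len(seq)
    if n < bait_length:
        return []
    if circular:
        # Append the first bait_length - 1 bases so windows can wrap.
        source = seq + seq[:bait_length - 1]
        starts = range(0, n, offset)
    else:
        source = seq
        starts = range(0, n - bait_length + 1, offset)
    return [(s, source[s:s + bait_length]) for s in starts]

def gc_content(bait):
    """Fraction of G and C bases in a bait."""
    return (bait.count("G") + bait.count("C")) / len(bait)

def filter_baits(baits, min_gc=0.3, max_gc=0.7):
    """Keep baits whose GC content lies within [min_gc, max_gc]."""
    return [(s, b) for s, b in baits if min_gc <= gc_content(b) <= max_gc]

reference = "ACGTACGTAC"
linear = tile_baits(reference, bait_length=4, offset=3)
wrapped = tile_baits(reference, bait_length=4, offset=3, circular=True)
kept = filter_baits(wrapped)
```

In the toy example the circular tiling yields one extra bait, starting at position 9, that spans the break; the filter then drops it because its GC content is 0.75.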
Calculates “K” estimators. StructureSelector is a web-based software that helps users select and visualize the best estimators for a given input file. The software includes the MedMedK, MedMeaK, MaxMedK and MaxMeaK estimators as well as two others. It can also generate graphical representations of the results to ease data submission and the rapid import of graphical plots.
MBN / motif-based network algorithm
Generates network structures with high motif occurrences. MBN is built upon controlling the occurrences of the basic building blocks of network connectivity and allows users to interpret model parameters. The algorithm lets users choose the order of edge addition according to rules that facilitate the formation of certain structures.
Enables transformation of high-throughput sequencing data into Human Genome Variation Society (HGVS)-compliant variant descriptions. VariantValidator is an online platform that lets users produce complete descriptions in the format “genomic reference sequence”. It was designed to give users informative advice on errors in variant descriptions. It stores regularly updated RefSeq data and displays the corresponding descriptions of transcript reference sequences.
Displays the 3D structure of the genome with diverse genomic features. Delta is a web-based 3D genome visualization platform. This tool can help researchers infer novel and valid hypotheses by visually integrating multiple datasets. It can also be used to understand the principles behind the nuclear architecture from various perspectives.
Simplifies the 3D exploratory analysis of high-throughput chromosome conformation capture (Hi-C) data. HiC-3DViewer is an interactive chromatin visualization tool that maps genome-scale interactions to identify structural characteristics and interactions between all chromosomes. It allows users to highlight any specific genomic region through an interactive 2D Hi-C contact map or by uploading a BED file of given regions.
Allows efficient decomposition of any type of genomic data represented as a numerical matrix. Bratwurst provides functionalities to identify patterns in different types of omics data with non-negative matrix factorization (NMF). It selects characteristics very specific to the different patterns and integrates the patterns extracted from the different omics layers. This software can be applied to any omics data type and is useful for multi-omics integration.
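Non-negative matrix factorization approximates a non-negative matrix V by the product of two smaller non-negative matrices W and H. The sketch below uses the classic multiplicative update rules in plain Python to show the idea on a toy matrix. It is not Bratwurst's implementation: the helper names, the random initialization and the iteration count are assumptions for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(row) for row in zip(*m)]

def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize V (n x m) into W (n x rank) and H (rank x m), all >= 0."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

# Toy matrix with two block patterns (rows = features, columns = samples).
v = [[1, 1, 0, 0], [2, 2, 0, 0], [0, 0, 3, 3], [0, 0, 1, 1]]
w0, h0 = nmf(v, rank=2, iterations=0)   # random starting point
w, h = nmf(v, rank=2, iterations=200)   # same start, after the updates
```

The columns of W can be read as patterns and the rows of H as the exposure of each sample to those patterns; comparing `frobenius_error(v, w, h)` with `frobenius_error(v, w0, h0)` shows how far the updates improved the fit.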
VASC / deep Variational Autoencoder for scRNA-seq data
Analyzes and visualizes single-cell RNA sequencing (scRNA-seq) data. VASC is a deep variational autoencoder that can capture non-linear variations and automatically learn a hierarchical representation of the input data. One of its purposes is to simplify visualization of scRNA-seq datasets. VASC has three major parts: (1) the encoder network, (2) the decoder network and (3) the zero-inflated layer.
FATHMM-indel / Functional analysis through hidden Markov models indel
Discovers and analyzes differences in non-coding mutation loads across populations. FATHMM-indel is an integrative computational model for predicting indel pathogenicity. It can be useful to prioritize causative variants, like those identified in genome-wide association studies (GWASs), or for downstream studies to analyze the phenotypic impact of non-coding indels.
Characterizes the spiking and bursting properties of large-scale neuronal networks. SpiCoDyn contains features for: i) conversion and pre-processing of raw data; ii) statistical analysis of the dynamics in terms of spiking and bursting features; and iii) estimation of network connectivity and characterization of the topological properties.
Produces interactive 3D representations of modelled chromatin conformations. TADkit allows users to study the relationship between the 3D structure of the genome and its biological function. This tool is available as a web version and a local version.
Produces numerous annotations and metadata collections. tximeta's purpose is to preserve a signature of the transcriptome sequence itself using a hash function.
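The idea of identifying a transcriptome by a hash of its sequences can be sketched as below. This illustrates the general technique only: the choice of SHA-256, the decision to hash sequences without their names, and the function name are assumptions, not a description of tximeta's actual procedure.

```python
import hashlib

def transcriptome_signature(records):
    """Return a hex digest computed over the transcript sequences.

    records: iterable of (name, sequence) pairs. Only the sequences are
    hashed (an assumption of this sketch), so renaming transcripts leaves
    the signature unchanged while any base change alters it.
    """
    digest = hashlib.sha256()
    for _name, sequence in records:
        digest.update(sequence.upper().encode("ascii"))
        digest.update(b"\n")  # separator keeps record boundaries distinct
    return digest.hexdigest()

reference = [("tx1", "ATGGCGTAA"), ("tx2", "ATGCCCTGA")]
renamed = [("ENST_A", "ATGGCGTAA"), ("ENST_B", "ATGCCCTGA")]
edited = [("tx1", "ATGGCGTAA"), ("tx2", "ATGCCCTGG")]

same = transcriptome_signature(reference) == transcriptome_signature(renamed)
changed = transcriptome_signature(reference) != transcriptome_signature(edited)
```

A stored signature can then be used as a lookup key: quantifications made against the same sequences resolve to the same annotation and metadata, whatever the file was called.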
MPTM / Mining Protein Post-Translational Modifications
Extracts and organizes comprehensive post-translational modification (PTM) information from literature in PubMed. MPTM provides a literature mining service for PTMs such as hydroxylation and myristoylation. It extracts basic PTM information, including substrates and modification sites, as well as comprehensive PTM-related information such as enzymes, Gene Ontology (GO) terms, organisms, diseases and crosstalk.
kBET / k-nearest neighbour Batch Effect Test
Quantifies batch effects in single-cell RNA-sequencing (scRNA-seq) data. kBET allows users to study high-dimensional data without prior assumptions regarding statistical properties. It can be applied to any type of next-generation sequencing (NGS) data given a reasonable sample size per batch. The software was evaluated on simulated data with three different degrees of batch effects.
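A k-nearest-neighbour batch effect test can be illustrated with a simplified sketch: for each sample, the batch labels among its k nearest neighbours are compared with the global batch proportions using Pearson's chi-squared statistic, and the fraction of rejected neighbourhoods is reported. This is not the kBET package: the function name, the exhaustive neighbour search, testing every sample and the fixed critical value (3.841, the 0.05 level for two batches, one degree of freedom) are assumptions for the example.

```python
import math

def knn_batch_rejection_rate(points, batches, k=10, critical_value=3.841):
    """Fraction of samples whose neighbourhood batch mix deviates from
    the global mix according to a chi-squared test.

    points: list of coordinate tuples; batches: list of batch labels.
    critical_value must match the number of batches minus one degrees of
    freedom (3.841 assumes exactly two batches at the 0.05 level).
    """
    labels = sorted(set(batches))
    n = len(points)
    expected = {b: k * batches.count(b) / n for b in labels}
    rejected = 0
    for i, point in enumerate(points):
        # Exhaustive search for the k nearest other samples.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(point, points[j]))
        observed = {b: 0 for b in labels}
        for j in order[:k]:
            observed[batches[j]] += 1
        statistic = sum((observed[b] - expected[b]) ** 2 / expected[b]
                        for b in labels)
        if statistic > critical_value:
            rejected += 1
    return rejected / n

# Well-mixed data: batches alternate along a line.
mixed_points = [(float(i), 0.0) for i in range(40)]
mixed_batches = ["A" if i % 2 == 0 else "B" for i in range(40)]

# Strong batch effect: the two batches form distant clusters.
split_points = ([(float(i), 0.0) for i in range(20)]
                + [(1000.0 + i, 0.0) for i in range(20)])
split_batches = ["A"] * 20 + ["B"] * 20

mixed_rate = knn_batch_rejection_rate(mixed_points, mixed_batches)
split_rate = knn_batch_rejection_rate(split_points, split_batches)
```

A rejection rate near 0 indicates well-mixed batches and a rate near 1 indicates a strong batch effect; in the two toy datasets the rates are 0.0 and 1.0 respectively.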
Porcupine / PORcupine Creates Ur PipelINE
Allows creation of neuroimaging pipelines by means of a graphical user interface (GUI). Porcupine is a graphical workflow editor that automatically produces analysis code from a graphically composed pipeline. The software provides two independent functionalities: (i) a GUI for the visual design of analysis pipelines and (ii) a framework for the automated creation of docker images to execute and share the designed analysis.
Creates and manages processing pipelines. Fastr is an image processing workflow framework that speeds up the development cycle for creating workflows and minimizes the introduction of errors. The software is designed to build workflows that are agnostic to (i) where the input data are stored, (ii) where the resulting output data should be stored, (iii) where the steps in the workflow will be executed, and (iv) what information about the data and processing needs to be logged for data provenance.
Nipype / Neuroimaging in Python: Pipelines and Interfaces
Interfaces with existing software for analysis of neuroimaging data and comparative development of algorithms. Nipype is an open-source, community-developed, Python-based software package that consists of three components: (1) interfaces to external tools providing a unified way for setting inputs, executing, and retrieving outputs; (2) a workflow engine for creating analysis pipelines; and (3) plug-ins that execute workflows either locally or in a distributed processing environment.
CBS Tools
Allows ultra-high-resolution brain segmentation at 7 Tesla (T). CBS Tools is an automated computational framework for brain segmentation and cortical reconstruction at the ultra-high resolution of 0.4 mm, based on quantitative T1 images acquired at 7 T with the MP2RAGE sequence. The software is implemented as a plug-in for the MIPAV and JIST medical image processing platforms. It supports many image formats, and includes a user-friendly interface for image visualization and editing as well as a graphical pipeline engine for large-scale processing.
iSVP / integrated Structural Variant calling Pipeline
Allows detection of structural variants (SV) from next-generation sequencing (NGS) data. iSVP is a pipeline that combines existing SV detection methods. The software was applied to human whole genome sequence data from a HapMap NA12878 sample and detected numerous SVs that were biologically explainable. It is applicable to high-coverage whole genome sequencing (WGS) data with reasonable computational resources, and thus can enhance the genome-wide detection of SVs for the identification of disease-causing variants.
M2EFM / Methylation-to-Expression Feature Model
Builds prognostic models. M2EFM is a data-integrated modeling approach that predicts risk based on data integrated from multiple sources, taking advantage of tried-and-true prognostic factors while incorporating data on relevant relationships between molecular data types at the individual level. This approach allows identification of biologically relevant pathways and possible therapeutic targets and genes involved in cancer progression. It was used to build models of overall survival (OS), distant recurrence-free survival (DRFS), and pathologic complete response (pCR) in breast cancer.
RSCU_RS / Relative Synonymous Codon Usage for Ribo-Seq data
Computes a precise and direct measure of codon usage bias (CUB). RSCU_RS is a method based on ribosome sequencing (Ribo-seq) that makes it possible to compute CUB directly from all translated genes.
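Relative synonymous codon usage divides the observed count of a codon by the mean count of all codons encoding the same amino acid, so a value of 1 means no bias. The sketch below computes this standard quantity over a set of coding sequences. It is not the published RSCU_RS formula: the optional per-gene `weights` argument, imagined here as translation levels derived from Ribo-seq, is a hypothetical addition for illustration.

```python
from collections import Counter, defaultdict

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def rscu(coding_sequences, weights=None):
    """Return {codon: RSCU} for amino acids observed in the input.

    weights: optional per-sequence weights (hypothetical, e.g. a
    translation level per gene); defaults to 1 for every sequence.
    """
    if weights is None:
        weights = [1.0] * len(coding_sequences)
    counts = Counter()
    for sequence, weight in zip(coding_sequences, weights):
        seq = sequence.upper().replace("U", "T")
        for pos in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[pos:pos + 3]
            if codon in CODON_TABLE:
                counts[codon] += weight
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":  # stop codons are excluded
            families[amino_acid].append(codon)
    result = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid not observed
        for codon in codons:
            # observed count / (total / number of synonymous codons)
            result[codon] = counts[codon] * len(codons) / total
    return result

alanine_bias = rscu(["GCTGCTGCCGCA"])
weighted = rscu(["GCT", "GCC"], weights=[3, 1])
```

In the first call alanine is encoded twice by GCT and once each by GCC and GCA, so GCT gets an RSCU of 2.0, GCC and GCA get 1.0, and the unused GCG gets 0.0.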
Allows users to visualize and interact with both the static structure and the dynamics of proteins by using virtual reality (VR). The software pipeline enables nonexpert researchers to embed protein structures into VR programs using a combination of widely available software and custom-built code. The use of VR allows a researcher to change the level of detail accessible when analyzing protein-ligand interactions or conformational changes.
SSEalign / Secondary Structure Element alignment
Allows homology identification of hypothetical proteins. SSEalign is a web server that can be useful for re-annotating proteins with unknown functions, especially bacterial proteins.
TELS / Transcribed Enhancer Landscape Search
Identifies predictive short motif signatures of transcribed enhancers (TrEn). TELS is a machine-learning algorithm that applies logistic regression (LR) coupled with dimensionality reduction techniques to identify systematically the most informative combinations of short sequence motifs of TrEn in the human genome. The software first identifies candidate combinations of sequence motifs that characterize the class of interest and then assesses, for every candidate combination of motifs, its significance.
Allows different types of sentence extraction. BioIE supports extraction by predefined categories of interest relating to proteins and custom extraction around different entities and concepts, together with statistical feedback on the source and extracted text. The five predefined categories are: structure, function, diseases and therapeutic compounds, localization, and familial relationships.
Constructs protein reports from sets of related SWISS-PROT entries. PRECIS offers annotation placed in the context of the family to which the query sequence belongs. It returns a structured report detailing the function and structure of the most similar protein family or superfamily, the diseases with which they are associated, relevant literature references and a list of keywords.
METIS / Multiple Extraction Techniques for Informative Sentences
Builds protein reports from related entries in Swiss-Prot. METIS employs data in the Swiss-Prot entries to find relevant literature, or to find search terms with which to seek this out. It reduces the time required to seek out and read relevant literature. This tool is able to extract pertinent sentences from the biomedical literature.