JSpeciesWS / JSpecies Web Server

Calculates in silico the extent of identity between two genomes. JSpeciesWS is able to determine overall genome relatedness indices (OGRI). It allows rapid comparisons against the reference database offered by the tool, providing a list of the most similar genomes based on their resulting Tetra-nucleotide signature correlation index. This database is composed of NCBI’s genomic sequence data and includes all primary submissions of assembled genome sequences and their associated annotation data.
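The tetranucleotide signature correlation mentioned above can be illustrated with a minimal sketch. It assumes raw relative frequencies of overlapping 4-mers and a plain Pearson correlation; the function names are hypothetical, and the tool itself may normalise the counts differently (for example with z-scores), so this is only an approximation of the idea.

```python
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]

def tetra_profile(seq):
    """Relative frequency of each tetranucleotide over overlapping windows."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in counts:  # skip windows containing N or other ambiguity codes
            counts[word] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return [counts[k] / total for k in KMERS]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined for a constant profile
    return cov / sqrt(vx * vy)

def tetra_correlation(seq_a, seq_b):
    """Correlation between the tetranucleotide profiles of two sequences."""
    return pearson(tetra_profile(seq_a), tetra_profile(seq_b))
```

Ranking a query genome against a reference set then amounts to sorting the references by `tetra_correlation` in descending order.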


Enables efficient microbial core genome alignment and single-nucleotide polymorphism (SNP) detection. Parsnp utilizes a Directed Acyclic Graph (DAG) data structure, called a Compressed Suffix Graph (CSG), to index the reference genome for efficient identification of multi-maximal unique matches (multi-MUMs). The software is suited for outbreak analyses of infectious diseases and offers a highly efficient method for aligning the core genome of thousands of closely related species. The tool is part of the Harvest suite.


Improves sequence alignment accuracy by inferring substitution and gap scores that fit the frequencies of substitutions, insertions, and deletions in a given dataset. LAST-TRAIN uses a standard iterative approach: it first aligns the sequences using some initial score parameters, then infers better score parameters from the alignments, then re-aligns and repeats until the parameters stop changing. It achieves adequate speed by an X-drop heuristic, depletes paralogs using LASTSPLIT, and allows different insertion and deletion parameters and non-strand-symmetric substitution parameters.


Allows SNP identification and phylogenetic analysis without requiring genome alignment or reference genomes. kSNP annotates single nucleotide polymorphisms (SNPs) by automatically downloading the input files. The software provides two annotation modes: standard annotation and full annotation. The kSNP3.0 package also includes a number of new tools and utilities that facilitate the downloading of genomes for analysis, preparation of input files, and a variety of post-kSNP run analyses of the output files.
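One common way to find SNPs without alignment is to compare odd-length k-mers that share both flanks but differ at the central base. The sketch below illustrates that general idea only; it is not kSNP's code, the function name `central_snp_loci` is hypothetical, and it ignores reverse complements for brevity.

```python
from collections import defaultdict

def central_snp_loci(genomes, k=5):
    """Find k-mers whose flanks match across genomes but whose centre differs.

    genomes: dict mapping a genome name to its sequence.
    Returns a dict mapping each locus (k-mer with '.' at the centre)
    to a dict of {genome name: central base}.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that a central base exists")
    mid = k // 2
    loci = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            key = kmer[:mid] + "." + kmer[mid + 1:]
            loci[key][name].add(kmer[mid])

    snps = {}
    for key, by_genome in loci.items():
        # Discard loci that are ambiguous within a single genome.
        if any(len(bases) != 1 for bases in by_genome.values()):
            continue
        alleles = {name: next(iter(bases)) for name, bases in by_genome.items()}
        # Keep only loci where at least two different alleles are observed.
        if len(set(alleles.values())) > 1:
            snps[key] = alleles
    return snps
```

For two toy genomes `AACGTTA` and `AACCTTA` with `k=5`, the only reported locus is `AC.TT`, with alleles `G` and `C`.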

SPANDx / Synergised Pipeline for Analysis of NGS Data in Linux

Consolidates several well-validated, open-source packages into a single tool, mitigating the need to learn and manipulate individual next-generation sequencing (NGS) programs. SPANDx is an all-in-one tool for comprehensive haploid whole genome sequencing (WGS) analysis. The software incorporates Burrows-Wheeler Aligner (BWA) for alignment of raw NGS reads against a reference genome or pan-genome, followed by data filtering, variant calling and annotation using Picard, GATK, SAMtools and SnpEff. It produces single-nucleotide polymorphism (SNP) and indel matrices for downstream phylogenetic analyses. Annotated, genome-wide SNPs and indels can also be identified if specified, and are output in human-readable format. A presence/absence matrix is also generated to identify the core/accessory genome content across all the genomes.


A web app for defining the relationship between bacterial strains and contributing to the classification and identification of bacterial species using genome data. ANItools helps users directly get average nucleotide identity (ANI) values from online sources. Currently, ANItools web is being used to compare bacterial strains at the genus and species levels. This provides further clues for defining bacterial strains at the genome level and graphically represents the complex relationships among strains, which is helpful for finding a cluster of strains with high similarity (candidate pathogen strains causing an outbreak) in an epidemic study.


Improves the specificity in genome alignments by accurately detecting and removing local alignments that obscure the evolutionary history of genomic rearrangements. chainCleaner improves the alignment of numerous orthologous genes and exposes alignments between exons of orthologous genes that were masked before by alignments to pseudogenes. It also recovers hundreds of kilobases in local alignments that otherwise would fall below a minimum score threshold. The tool can be applied to improve the sensitivity and specificity of genome alignments.

Alpha / ALignment of PHAges

A mathematical model based on partial order graphs for performing multiple alignment of bacteriophage whole genomes, along with algorithms to operate on the model. Relying exclusively on the equality relation, the model is almost parameter free, greatly reducing the need to calibrate the aligner, yet delivers biologically meaningful results. The model has been implemented in the form of an interactive aligner that can perform multiple alignments of dozens of genomes and present the result in an attractive format. We also showed that Alpha, used on bacteriophage genomes, produces biologically meaningful alignments, while avoiding the high rate of misalignments of complex heuristics such as progressiveMauve.


Enables fast and sensitive comparison of large sequences with arbitrarily nonuniform composition. LAST can handle big sequence data, e.g., comparing two vertebrate genomes or aligning billions of DNA reads to a genome. It indicates the reliability of each aligned column and uses sequence quality data properly. LAST compares DNA to proteins with frameshifts, compares position-specific scoring matrices (PSSMs) to sequences, calculates the likelihood of chance similarities between random sequences and does split and spliced alignment. Furthermore, it trains alignment parameters for unusual kinds of sequence (e.g. nanopore). LAST is available as a web application or can be downloaded for local use.


A comprehensive pipeline for computationally screening putative long non-coding RNA (lncRNA) transcripts over large multimodal datasets. The main objective of lncRNA-screen is to facilitate the computational discovery of lncRNA candidates to be further examined by functional experiments. lncRNA-screen provides a fully automated, easy-to-run pipeline which performs data download, RNA-seq alignment, assembly, quality assessment, transcript filtration, novel lncRNA identification, coding potential estimation, expression level quantification, histone mark enrichment profile integration, differential expression analysis, annotation with other types of segmented data (copy number variations (CNVs), single nucleotide polymorphisms (SNPs), Hi-C, etc.) and visualization. Importantly, lncRNA-screen generates an interactive report summarizing all interesting lncRNA features including genome browser snapshots and lncRNA-mRNA interactions based on Hi-C data. In summary, the lncRNA-screen pipeline provides a comprehensive solution for lncRNA discovery and an intuitive interactive report for identifying promising lncRNA candidates.


Compares genome sequences (draft and completed). Gegenees was primarily developed for bacterial genomes but can also be used on viruses and smaller eukaryotes. Gegenees fragments the genomes and compares all pieces against all genomes. Based on this all-against-all comparison, phylogenetic data can be extracted. It is also possible to define a "target group" and search for genomic regions that have high specificity for the target group. This is referred to as a "genomic signature". The genomic signature regions can be used to find candidate regions for the design of primers and probes for highly specific diagnostic assays. There is also a built-in primer/probe verification system that compares new candidate or existing primers and probes to the genomic database and to the defined target groups.
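The fragment-and-compare idea can be sketched as follows. This is an illustration under stated assumptions, not the tool's method: shared k-mer content stands in for the real sequence comparison step, and the function names and the `size`, `step` and `k` parameters are hypothetical.

```python
def fragments(seq, size, step):
    """Cut a sequence into fixed-size fragments taken every `step` bases."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]

def kmers(seq, k):
    """Set of all overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def fragment_score(fragment, target_kmers, k):
    """Fraction of the fragment's k-mers that also occur in the target genome."""
    words = kmers(fragment, k)
    if not words:
        return 0.0
    return len(words & target_kmers) / len(words)

def similarity_matrix(genomes, size=8, step=4, k=4):
    """All-against-all comparison: mean fragment score for each genome pair.

    genomes: dict mapping a genome name to its sequence.
    Returns a dict keyed by (query name, target name).
    """
    names = list(genomes)
    index = {name: kmers(genomes[name], k) for name in names}
    matrix = {}
    for query in names:
        frags = fragments(genomes[query], size, step)
        for target in names:
            scores = [fragment_score(f, index[target], k) for f in frags]
            matrix[(query, target)] = sum(scores) / len(scores) if scores else 0.0
    return matrix
```

A matrix of this kind can be converted to distances for tree building, and fragments scoring high against a target group but low against all other genomes are the sort of regions the description calls a genomic signature.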

GECKO-MGV / Multi Genome Viewer

A web app to display, browse and analyse the results of genome comparison produced by GECKO and GECKO-CSB. This tool is composed of several independent modules that execute specific tasks for the application: files, canvas and matrices. GECKO-MGV integrates functionality for zooming, filtering and selecting fragments. A novel feature of this tool is the integration of the evolutionary events timeline. GECKO-MGV is part of the GECKO software suite.

GW-CALL / Genome-Wide variant CALLer

Calls variants within the genome based on all mappable reads. GW-CALL exploits information from all reads in a genome-wide decision-making process: in particular, it partitions the genome into several independent regions called clusters and incorporates an efficient algorithm to use all reads belonging to a cluster in calling variants within that cluster. This tool first calls variants quickly within less ambiguous genomic loci and then processes the remaining ambiguous locations, called critical points, in a more computationally expensive procedure.


Unifies in one package multiple capabilities necessary to carry out various types of comparative analysis of genomic sequences and whole-genome assemblies. GenomeVISTA aligns sequences in both finished and draft format, so it can be used for multiple applications such as genome assembly, mapping newly sequenced reads to the reference genome and calculating syntenic regions on complete genome assemblies. Importantly, it also gives access to the results of the alignment through a highly interactive interface that makes comparative analysis of genomic data fast and efficient.

GECKO / GEnome Comparison with K-mers Out-of-core

A package to identify collections of high-scoring segment pairs by pairwise genome comparison procedures, which can then be used to obtain gapped fragments. GECKO facilitates massive comparisons of genome-sized sequences, as well as more complex evolutionary studies. The tests demonstrated that GECKO is capable of generating high-quality results with a linear-time response and controlled memory consumption, being comparable to or faster than the current state-of-the-art methods.


Allows analysis of genomic sequences, concentrating on pairwise local alignments. AuberGene identifies orthologous fragments such as exons or regulatory elements to align regions that are difficult to align. It permits the identification of false-positive, non-homologous alignments which can be corrected based on the new information provided by intermediate sequences. This tool follows three steps: (1) segment decomposition, (2) constructing a weighted bipartite graph from transitive alignments and (3) generating the collective alignment.


A tool for rapidly visualizing and aligning the most highly conserved regions in multiple (typically prokaryote) genomes. M-GCAT is based upon a highly efficient approach to anchor-based multiple genome comparison using a compressed suffix graph and thus can construct multiple genome alignment frameworks in closely related species usually in a few minutes. A couple of important limitations include (1) input sequences MUST be assembled, and (2) the comparison is reference-sequence biased.

Dagr / Directed Acyclic GRaphs

Allows users to write and execute bioinformatics pipelines as directed acyclic graphs. Dagr is a task and pipeline execution system for working data scientists, programmers and bioinformaticians, which supports scientific, and in particular genomic, analysis workflows. The software contains a small set of predefined genomic analysis tasks and pipelines and supports pre-compiled pipelines and pipelines from Scala script files that are compiled on the fly.
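Running a pipeline as a directed acyclic graph means executing each task only after all of its prerequisites have finished. The sketch below shows that general technique in Python with the standard-library `graphlib` module (Python 3.9 or later); it is not Dagr's API, and `run_pipeline` and the task names in the usage note are hypothetical.

```python
from graphlib import TopologicalSorter

def run_pipeline(tasks, dependencies):
    """Run tasks in an order that respects their dependencies.

    tasks: dict mapping a task name to a callable that receives the
           dict of results produced so far and returns its own result.
    dependencies: dict mapping a task name to the set of task names
                  that must finish before it starts.
    Returns the execution order and the results of every task.
    Raises graphlib.CycleError if the graph contains a cycle.
    """
    graph = {name: set(dependencies.get(name, ())) for name in tasks}
    order = list(TopologicalSorter(graph).static_order())
    results = {}
    for name in order:
        results[name] = tasks[name](results)
    return order, results
```

For example, with tasks named `align`, `sort` and `call`, where `sort` depends on `align` and `call` depends on `sort`, the tasks run in exactly that order regardless of how the `tasks` dict is written.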


Processes alignments of user-submitted sequences. mVISTA performs pairwise alignments of DNA sequences up to megabases long from two or more species. This software allows visualization of alignments with annotations, displaying global alignments of genomic sequences from different species. It can also assess percentage identity and length cutoffs for identifying a level of active non-coding conservation by comparing all pairwise sequence alignments of three or more species.


Compares database sequences. The JESAM alignment algorithm uses dynamic programming only for the final alignments, relying on the gross overall overlap being easy to find, because the goal is only to discover potentially overlapping subsequences, not distant homologues mutated apart through millennia. The tool uses CORBA interfaces to model organism alignment data. It offers great potential for future enhancements through new public software components interacting over the Internet, or combining public and private data behind corporate firewalls.