

Searches and retrieves data from microbial genomes and compares genomic sequences. EDGAR provides extended phylogenetic analysis features such as AAI (Average Amino Acid Identity) and ANI (Average Nucleotide Identity) matrices. This web tool provides genome set size statistics and visualizations. It offers an interface that shows evolutionary relationships between microbial genomes and helps in obtaining new biological insights into their differential gene content.

ITEP / Integrated Toolkit for Exploration of microbial Pan-genomes

A suite of scripts and Python libraries for the comparison of microbial genomes. ITEP includes tools for de novo protein family prediction by clustering, ortholog detection, analysis of functional domains, identification of core and variable genes and gene regions, sequence alignments and tree generation, cluster curation, and the integration of cross-genome analysis and metabolic networks for study of metabolic network evolution.

RPAN / Rice PAN-genome Browser

A comprehensive dataset of rice sequences. RPAN presents analysis results from the 3K rice genome data, which provide a new perspective for rice researchers and breeding experts. RPAN includes the following data: basic information on the rice accessions, sequences and gene annotations for the rice pan-genome, gene presence or absence variations and genome-wide expression profiles. RPAN also provides search and visualization features, a tree browser and a genome browser.


Permits visualization of pangenome analysis. PanViz allows visualization of changes in gene group classification as different subsets of pangenomes are selected, as well as comparisons of individual genomes to pangenomes with gene ontology based navigation of gene groups. It allows for rich and complex visual querying of gene groups in the pangenome. PanViz helps researchers in understanding how different pangenomes differ on a functional level rather than simply in terms of shared gene groups.


Enables automatic large-scale eukaryotic pan-genome analyses. EUPAN can detect gene presence/absence variations (PAVs) at a relatively low sequencing depth. It can be directly applied to current re-sequencing projects that primarily focus on single nucleotide polymorphisms (SNPs). The capabilities of the tool were demonstrated in a pan-genome analysis of 453 rice genomes, which revealed widespread gene PAVs among individual rice genomes.


An implementation of a pan-genome representation based on the Neo4j graph database, focusing on the application to large sets of complex eukaryotic genomes. PanTools allows for the construction of pan-genome databases of many genomes, and contains extensions such as adding sequences, genes and orthology annotations, using relatively modest computational resources. PanTools offers a good starting point for developing various pan-genomic applications, such as multi-genome read mapping, pan-genome exploration (visualization, browsing), structure-based variation detection and comparative genomics.

DeNoGAP / De-Novo Genome Analysis Pipeline

Performs reference-assisted and de novo gene prediction, homologous protein family assignment, ortholog prediction, functional annotation, and pan-genome analysis. DeNoGAP integrates bioinformatics tools and databases for comparative analysis of a large number of genomes. It scales linearly since the homology assignment is based on iteratively refined hidden Markov models. The pipeline offers tools and algorithms for annotation and analysis of completed and draft genome sequences.


Allows users to view and examine prokaryotic genomes in a circular or linear context. GView is a research and visualization tool. This software is a complete rewrite of the CGView program that fully supports the CGView XML format, has a public application programming interface (API) that facilitates its incorporation into other Java programs and has a command-line interface that allows it to be run from scripts. The GView Server assists users in performing commonly desired comparative genomics analyses and rendering the corresponding genome maps.

BGDMdocker / Bacterial Genome Data Mining docker-based

Allows analysis and visualization of bacterial pan-genomes and biosynthetic gene clusters. BGDMdocker consists of three integrated toolkits: Prokka v1.11, panX, and antiSMASH3.0. It can build a Docker image and run a container for analysing the pan-genome of a total of 44 Bacillus amyloliquefaciens strains. The visualized pan-genomic data include alignments and phylogenetic trees; for each gene cluster, mutations within that cluster are mapped to the branches of the tree, and loss and gain of genes are inferred on the core-genome phylogeny. The tool can also be used for pan-genome analysis and visualization of other species.


A tool that rapidly builds large-scale pan genomes, identifying the core and accessory genes. Roary makes construction of the pan genome of thousands of prokaryote samples possible on a standard desktop without compromising on the accuracy of results. Using a single CPU Roary can produce a pan genome consisting of 1000 isolates in 4.5 hours using 13 GB of RAM, with further speedups possible using multiple processors.
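
The core/accessory split mentioned above can be illustrated with a small Python sketch. This is not Roary's implementation; the function name `split_core_accessory`, the input layout (genome name mapped to a set of gene cluster identifiers) and the `core_fraction` parameter are assumptions made for illustration only.

```python
from typing import Dict, Set, Tuple


def split_core_accessory(
    presence: Dict[str, Set[str]], core_fraction: float = 1.0
) -> Tuple[Set[str], Set[str]]:
    """Split gene clusters into core and accessory sets.

    presence maps a genome name to the set of gene cluster ids found in it
    (hypothetical input layout). A cluster counts as core when it occurs in
    at least core_fraction of the genomes; everything else is accessory.
    """
    n_genomes = len(presence)
    counts: Dict[str, int] = {}
    for clusters in presence.values():
        for cluster in clusters:
            counts[cluster] = counts.get(cluster, 0) + 1
    core = {c for c, k in counts.items() if k >= core_fraction * n_genomes}
    accessory = set(counts) - core
    return core, accessory


# Toy example with three genomes and three gene clusters.
toy = {"g1": {"a", "b"}, "g2": {"a", "c"}, "g3": {"a", "b"}}
core, accessory = split_core_accessory(toy)
```

Lowering `core_fraction` below 1.0 gives a "soft core" definition, which tolerates genes missing from a few draft assemblies.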


A tool for efficiently supporting comparative analysis of multiple bacterial strains within the same species. eCAMBer is a highly optimized revision of CAMBer, scaling it up for significantly larger datasets comprising hundreds of bacterial strains. eCAMBer works in two phases. First, it transfers gene annotations among all considered bacterial strains. In this phase, it also identifies homologous gene families and annotation inconsistencies. Second, eCAMBer tries to improve the quality of annotations by resolving the gene start inconsistencies and filtering out gene families arising from annotation errors propagated in the previous phase.

BFT / Bloom Filter Trie

An alignment-free, reference-free and incremental data structure for storing a pan-genome as a colored de Bruijn graph. The data structure stores and compresses a set of colored k-mers and supports efficient traversal of the graph. Bloom filter trie was used to index and query different pangenome datasets. Compared to another state-of-the-art data structure, BFT was up to two times faster to build while using about the same amount of main memory. For querying k-mers, BFT was about 52-66 times faster while using about 5.5-14.3 times less memory.
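
To make the notion of a colored de Bruijn graph concrete, here is a minimal Python sketch that stores each k-mer with the set of genomes ("colors") containing it and walks to successor k-mers. It uses a plain dictionary rather than a Bloom filter trie, so it shows only the abstract structure, not BFT's compression or performance; the function names `colored_kmers` and `successors` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def colored_kmers(genomes: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map every k-mer to the set of genome names (colors) that contain it."""
    colors: Dict[str, Set[str]] = defaultdict(set)
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            colors[seq[i:i + k]].add(name)
    return dict(colors)


def successors(colors: Dict[str, Set[str]], kmer: str) -> List[str]:
    """Return the k-mers in the graph that overlap kmer by k-1 characters."""
    suffix = kmer[1:]
    return [suffix + base for base in "ACGT" if suffix + base in colors]


# Two toy genomes that share a prefix and then diverge.
graph = colored_kmers({"A": "ACGTA", "B": "ACGTC"}, k=3)
branch = successors(graph, "CGT")
```

In this toy graph the k-mer `CGT` is shared by both genomes, and the two successors carry one color each, which is where the genomes diverge.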


An open-source software package that builds on popular orthology-calling approaches making highly customizable and detailed pangenome analyses of microorganisms accessible to nonbioinformaticians. GET_HOMOLOGUES can cluster homologous gene families using the bidirectional best-hit, COGtriangles, or OrthoMCL clustering algorithms. Clustering stringency can be adjusted by scanning the domain composition of proteins using the HMMER3 package, by imposing desired pairwise alignment coverage cutoffs, or by selecting only syntenic genes.
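
The bidirectional best-hit idea named above can be sketched in a few lines of Python. This is a simplified illustration, not GET_HOMOLOGUES code: the score dictionaries stand in for parsed similarity-search results, and the names `best_hits` and `bidirectional_best_hits` are assumptions.

```python
from typing import Dict, List, Tuple

Scores = Dict[Tuple[str, str], float]  # (query, subject) -> similarity score


def best_hits(scores: Scores) -> Dict[str, str]:
    """Return the highest-scoring subject for every query."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy scores: a3 also hits b1, but b1 prefers a1, so a3 is not paired.
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 50, ("a2", "b2"): 90, ("a3", "b1"): 120}
b_vs_a = {("b1", "a1"): 195, ("b1", "a3"): 100, ("b2", "a2"): 88}
pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
```

Coverage cutoffs or domain-composition filters, as described above, would be applied to the hits before this pairing step.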


Computes a graphical representation of the pan-genome by exploiting the deep relationships between suffix trees and compressed de Bruijn graphs. Maximal exact matches (MEMs) are readily identified in a suffix tree and through the splitMEM algorithm are efficiently transformed into the nodes and edges of a compressed de Bruijn graph. This algorithm effectively unifies the most widely used sequence data structures in genomics into a single family containing suffix trees, suffix arrays, FM-indexes and now compressed de Bruijn graphs.

BPGA / Bacterial Pan Genome Analysis pipeline

A Perl-based pipeline to exploit protein clustering data for complete pan-genome analysis of bacterial species. In addition to all types of routine pan-genomic analyses, BPGA includes a number of novel features like exclusive gene family analysis, atypical GC content analysis and subset analysis. BPGA can process outputs of three major clustering tools (USEARCH, CD-HIT and OrthoMCL) to obtain pan-genome profiles of bacterial gene pools.


Allows visualisation of a phylogenetic tree, associated metadata and genomic information such as recombination blocks, pan-genome contents or GWAS results. Phandango is an interactive web app designed around a phylogeny and a linearized genome. It offers a viewer for populations of bacterial genomes linked by a phylogeny. Users can drag data files directly onto the browser. Phandango accepts phylogenetic trees, genome annotations, Gubbins, BratNextGen and Roary pan-genome outputs, and scatterplots (Manhattan plots).

PGAT / Prokaryotic Genome Analysis Tool

A web-based database tool originally developed as a platform by which to provide rapid analysis of bacterial genomes sequenced using next generation technologies. Multi-strain comparison of microbial genome sequences is dependent upon consistency of annotation across the set of genomes in order to accurately assess the presence and absence of genes, and the state of functional operons or pathways. The PGAT application consists of a multi-genome annotation pipeline for bacterial genomes, a genome database and a web interface that supports the identification of the pan-genome, provides a mechanism for manual community annotation and enables database queries by genome, metabolic pathway or a user defined set of genes.

LS-BSR / Large-Scale BLAST Score Ratio

Rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed. This matrix can be easily parsed in order to identify genetic relationships between bacterial genomes. The results of the pipeline can be used to identify specific markers between user-defined phylogenetic groups, and to identify the loss and/or acquisition of genetic information between bacterial isolates.
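
A BLAST score ratio is conventionally the bit score of a sequence against a target genome divided by the bit score of that sequence against itself, giving a value between 0 (absent) and 1 (identical). The Python sketch below builds such a matrix from already-computed scores. It is an illustration under that assumption, not the LS-BSR pipeline itself; `bsr_matrix` and its input layout are hypothetical.

```python
from typing import Dict, List, Tuple


def bsr_matrix(
    self_scores: Dict[str, float],
    hit_scores: Dict[Tuple[str, str], float],
    genomes: List[str],
) -> Dict[str, Dict[str, float]]:
    """Build a CDS-by-genome matrix of score ratios.

    self_scores: CDS id -> bit score of the CDS aligned against itself.
    hit_scores: (CDS id, genome) -> best bit score of the CDS in that genome;
                missing pairs are treated as no hit (score 0).
    """
    matrix: Dict[str, Dict[str, float]] = {}
    for cds, reference in self_scores.items():
        matrix[cds] = {
            genome: round(hit_scores.get((cds, genome), 0.0) / reference, 3)
            for genome in genomes
        }
    return matrix


# Toy scores for two CDSs across two genomes.
self_scores = {"cds1": 200.0, "cds2": 100.0}
hit_scores = {("cds1", "gA"): 200.0, ("cds1", "gB"): 100.0, ("cds2", "gA"): 80.0}
matrix = bsr_matrix(self_scores, hit_scores, ["gA", "gB"])
```

Rows with high ratios in one user-defined group and low ratios in another are candidates for the group-specific markers described above.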


A web-based automated pipeline for the annotation of closely related genomes that are well suited to pan-genome studies, aiming to reduce the manual work of generating reports and corrections for various genome strains. PANNOTATOR achieved 98% and 76% correctness for gene name and function, respectively, as the result of an annotation transfer with a similarity cut-off of 70%, compared with a gold standard annotation for the same species. PANNOTATOR provides fast and reliable pan-genome annotation, thereby allowing researchers to keep the research focus on the main genotype differences between strains.


Provides an application for pan-genome analysis. PanFunPro offers several functionalities: (i) homology detection and genome annotation by three HMM collections, (ii) pan-/core-genome calculation within a set of proteomes, (iii) pairwise pan-/core-genome analysis, (iv) specific genome estimation for different sets of genomes as well as pairwise analysis of specific proteomes, (v) basic statistics for the output proteins from the pan-/core-/specific-genome calculation, and (vi) analysis of available Gene Ontology (GO) information for the output proteins from the pan-/core-/specific-genome calculation.

H3ABioNet / Human Heredity and Health in Africa Bioinformatics Network

Helps manage and analyse large-scale genomic and biomedical data. H3ABioNet aims to free African scientists from their dependence on collaborators in developed countries for the analysis of data collected on the African continent. The web site provides a small practice dataset, helps answer bioinformatics-related questions and provides support to various H3Africa and non-H3Africa projects.


Allows easy comparison of many sequenced genomes to a defined reference strain. The BLASTatlas is useful for mapping and visualizing whole genome homology of genes and proteins within a reference strain compared to other strains or species of one or more prokaryotic organisms. This method is able to scale down to each individual nucleotide or amino acid residue. However, it is unable to deal with sequences (or parts thereof) that are not found in the reference genome.