dbNSFP / database for nonsynonymous SNPs' functional predictions

Eases the process of filtering and prioritizing the presumably functional single nucleotide variants (SNVs) from a long list of SNVs identified in a typical whole exome sequencing (WES) study. dbNSFP can work as a local and self-sustaining database without need for internet connection. The database provides more than 82 800 000 non-synonymous SNVs (nsSNVs) and splice site SNVs (ssSNVs).

Genewindow

Permits ‘genecentric’ annotation of the human genome for laboratory and analytical work carried out at the Core Genotyping Facility (CGF) of the National Cancer Institute. Genewindow integrates data available in the public databases with internal annotations from sequence data generated by our laboratory. It is configured for the human genome and can be applied to other genomes and integrated with the analysis, storage and archiving of data generated in any laboratory setting.

PAIDB / PAthogenicity Island DataBase

A comprehensive relational database of all the reported pathogenicity islands (PAIs) and potential PAI regions which were predicted by a method that combines feature-based analysis and similarity-based analysis. PAIDB v2.0 contains 223 types of PAIs with 1331 accessions, and 88 types of REIs with 108 accessions. With an improved detection scheme, 2673 prokaryotic genomes were analyzed to locate candidate PAIs and REIs. With additional quantitative and qualitative advancements in database content and detection accuracy, PAIDB will continue to facilitate pathogenomic studies of both pathogenic and non-pathogenic organisms.

DoriC / Database of oriC regions in Bacterial and Archaeal genomes

Stores origin of replication (oriC) regions in bacterial and archaeal genomes. DoriC is an online database that contains oriCs in more than 1520 bacterial and more than 80 archaeal genomes. It also includes detailed information about repeats in oriCs identified by the REPuter program, and URLs that link to the NCBI Map Viewer and the UCSC Archaeal Genome Browser, which are useful for exploring the conserved features around the oriC region.

ACE / Assessing Changes to Exons

Identifies possible functional changes in gene structure that may result from sequence variants. ACE converts phased genotype calls to a collection of explicit haplotype sequences, maps transcript annotations onto them, detects gene-structure changes and their possible repercussions, and identifies several classes of possible loss of function. The design of ACE’s computational model makes it directly applicable to nonhuman species with minimal re-training, enabling studies of other model and non-model animal and plant species.
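The first step described above, turning phased genotype calls into explicit haplotype sequences, can be illustrated with a minimal sketch. It is not ACE's implementation: the function name, the tuple layout for variants and the restriction to single-nucleotide substitutions are assumptions made for illustration only.

```python
def build_haplotypes(reference, phased_variants):
    """Build two haplotype sequences from a reference and phased SNVs.

    phased_variants is a list of (pos, ref, alt, genotype) tuples, where
    pos is 0-based and genotype is a phased call such as "0|1".
    This tuple layout is a hypothetical, simplified stand-in for VCF records.
    """
    haplotypes = [list(reference), list(reference)]
    for pos, ref, alt, genotype in phased_variants:
        # Guard against variants that do not match the reference base.
        if reference[pos] != ref:
            raise ValueError(f"reference mismatch at position {pos}")
        # "0|1" means: first haplotype keeps ref, second carries alt.
        for hap_index, allele in enumerate(genotype.split("|")):
            if allele == "1":
                haplotypes[hap_index][pos] = alt
    return "".join(haplotypes[0]), "".join(haplotypes[1])


reference = "ATGGCCTAA"
variants = [(3, "G", "T", "0|1"), (5, "C", "A", "1|1")]
hap_a, hap_b = build_haplotypes(reference, variants)
print(hap_a)
print(hap_b)
```

Transcript annotations would then be mapped onto each haplotype separately; that step, and the handling of insertions and deletions, is left out of this sketch.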

Gene ORGANizer

Links human genes to the body parts they affect. Gene ORGANizer is built upon an exhaustive curated database that links more than 7,000 genes to approximately 150 anatomical parts using more than 150,000 gene-organ associations. The tool offers user-friendly platforms to analyze the anatomical effects of individual genes, and identify trends within groups of genes. Gene ORGANizer can be used to make new discoveries and is expected to be useful in a variety of evolutionary, medical and molecular studies aimed at understanding the phenotypic effects of genes.

VariantValidator

Enables translation of high-throughput sequencing output into Human Genome Variation Society (HGVS)-compliant variant descriptions. VariantValidator is an online platform that lets users produce complete descriptions in the format “genomic reference sequence”. It was designed to give users informative advice on errors in variant descriptions. It stores regularly updated RefSeq data and displays the corresponding descriptions for transcript reference sequences.

FEB3 / Finite Element Bioengineering in 3d

Encompasses solid tumour growth and tumour induced angiogenesis. FEB3 is a validated three-dimensional mathematical and computational multiscale framework. The proposed in-silico model of dynamically coupled angiogenic tumour growth is specified to in-vivo and in-vitro data, chosen, where possible, to provide a physiologically consistent description. The model is then validated against in-vivo data from murine mammary carcinomas, with particular focus placed on identifying the influence of mechanical factors.

DASACT / Decision Aiding Software for Axiomatic Consensus Theory

Assists practitioners in choosing the most appropriate consensus function for generating consensus trees. The decision aid is based on the user’s preferences regarding nine axiomatic properties, each of which can be marked as desirable, undesirable, or indifferent. The user is then presented with a list of consensus functions sorted by relevance to these preferences, with the best-matching function listed first. The user may then select any of the consensus functions to compute the respective consensus trees for his/her profile of trees. This advising is made possible via two different processing steps. A weighted average that may be customized by the user is provided to emphasize the importance of the previous five distances.

GeneBase

Generates a fully structured local database with an intuitive user-friendly graphic interface for personal computers. GeneBase is a full parser of the National Center for Biotechnology Information (NCBI) Gene database. It allows users to do original searches, calculations and analyses of the main information about genes which are fully annotated with the ‘Gene Table’ section in NCBI Gene. Furthermore, for a subset of gene records, it integrates nucleotide sequences useful for additional elaboration with the corresponding gene-associated meta-information.

Selenoprotein prediction

Provides a web app designed for the prediction of eukaryotic selenocysteine insertion sequence (SECIS) elements and selenoprotein genes. Selenoprotein prediction server offers 2 modules: (i) SECISearch3, a method based on the Infernal suite (INFERence of RNA ALignment) that has at its core a manually curated alignment of more than a thousand eukaryotic SECIS elements and (ii) Seblastian, a pipeline to predict selenoprotein genes in nucleotide sequences which employs the identification of SECIS elements.

SPMM / Shifted Poisson Mixture Model

A mathematical tool for modeling early HIV-1 evolution within a subject whose infection originates from either a single viral variant or multiple viral variants. SPMM provides a quantitative guideline for segregating viral lineages, which in turn makes it possible to assess when a subject was infected. This tool provides a functional approach to understanding early genetic diversity, one of the most important parameters for deciphering HIV-1 transmission and predicting the rate of disease progression.

ClockstaR

An R package for phylogenetic molecular clock analyses of multi-gene data sets. ClockstaR uses the patterns of among-lineage rate variation for the different genes to select the clock-partitioning strategy. The method uses a phylogenetic tree distance metric and an unsupervised machine learning algorithm to identify the optimal number of clock-partitions, and which genes should be analysed under each of the partitions. The partitioning strategy selected in ClockstaR can be used for subsequent molecular clock analysis with programs such as BEAST, MrBayes, PhyloBayes and others. This method will be particularly useful for improving molecular-clock analyses of phylogenomic data, which are often hindered by their computational requirements.

Ferret

A user-friendly tool to quickly extract human genetic variation data from the latest release of the 1000 Genomes (1KG) Project. Ferret was developed as a straightforward Java application to be accessible even for non-specialists who are not adept at bioinformatics. By converting the 1KG vcf files to a format that can be read by popular pre-existing tools (e.g. Plink and HaploView), Ferret offers easy manipulation and visualization of the 1KG SNP and indel data, easy access to allelic frequency, linkage disequilibrium and haplotype information, and eventually tagSNP design.

Master Regulator Identification

Identifies the master regulator transcription factor in a genome. Master Regulator Identification is advantageous in terms of narrowing down the search space for potential candidate transcription factor biomarkers that can be targeted for drug development of complex diseases. Also, the fact that our method uses only a single data source, e.g. gene expression data, for accurately identifying the master regulator transcription factor makes it very useful in case there is limitation in data sources and data from multiple platforms are not available.

TreeFix

Combines sequence data and species tree information to improve gene tree reconstructions. TreeFix consists of three basic components: (i) a test of statistical equivalence to filter out gene tree topologies that are suboptimal, (ii) a gene tree and species tree reconciliation method to compute the reconciliation cost, and (iii) a tree search to explore the space of alternative gene tree topologies. The authors compared its performance with that of several other gene tree reconstruction methods. They find that TreeFix shows drastic improvement over existing sequence-only and hybrid approaches, with performance comparable to the most sophisticated species tree aware Bayesian approaches.

p53retriever

Scans DNA sequences, identifies p53 response elements (REs) and classifies them based on predicted transactivation potential. p53retriever was used to search genome wide for high affinity p53 REs and to map naturally occurring single nucleotide polymorphisms (SNPs) that can impact the DNA binding affinity of p53. It provides features for scoring interactions among groups of mismatches, non-canonical 3Q sites and half-site p53 REs, weighing the impact of consensus mismatches according to their position within the full site RE sequence.
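To make the idea of scanning a sequence for p53 REs concrete, here is a minimal sketch that looks for exact matches to the canonical full-site consensus: two RRRCWWGYYY half sites (R = A/G, W = A/T, Y = C/T) separated by a spacer of 0 to 13 bp. It is not the tool's own algorithm; the function name is hypothetical, and mismatch scoring, 3Q sites and the transactivation grading described above are deliberately left out.

```python
import re

# Canonical p53 half site RRRCWWGYYY written as a regular expression.
HALF_SITE = "[AG]{3}C[AT]{2}G[CT]{3}"

# Full site: two half sites separated by a 0-13 bp spacer. The lookahead
# lets overlapping candidate sites be reported; the lazy quantifier picks
# the shortest spacer for each start position.
FULL_SITE = re.compile(f"(?=({HALF_SITE})([ACGT]{{0,13}}?)({HALF_SITE}))")


def find_full_sites(sequence):
    """Return (start, left_half, spacer_length, right_half) for each exact match."""
    hits = []
    for match in FULL_SITE.finditer(sequence.upper()):
        left, spacer, right = match.groups()
        hits.append((match.start(), left, len(spacer), right))
    return hits


# Toy sequence with one perfect full site and no spacer.
example = "TT" + "GGGCATGTCC" + "AGACTTGCCT" + "AA"
for hit in find_full_sites(example):
    print(hit)
```

Only the forward strand is scanned here; a fuller scan would also check the reverse complement and tolerate consensus mismatches.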

HIVheritability

Estimates the extents to which viral load is heritable either via the viral genotype (from donor to recipient) or via the host’s Human Leukocyte Antigen (HLA) genotype. HIVheritability uses linear mixed models (LMMs) to explain inter-patient differences in spVL while taking into account host and viral genetic relatedness. The method uses the pairwise relatedness of individuals with respect to a large set of features (rather than the individual data points) to estimate the fraction of phenotypic variance attributable to those features.

DICT / DAVID Gene ID Conversion Tool

Provides a comprehensive means for batch translations. DICT is a web based application which covers dozens of commonly used types of gene and protein identifiers. It provides (i) enhanced translation capability, (ii) extensive ID type coverage, (iii) a batch mode interface in support of one-to-one, one-to-many and many-to-many ID relationships, (iv) hyperlinks to in-depth information about genes to examine any potential translation errors, (v) a summary table of the overall translation which is generated for quality control purposes and (vi) capability to handle a mixture of ID types as well as a ‘not sure’ type.
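The batch mode and summary table listed above can be pictured with a small sketch of one-to-many identifier translation. This is an illustration only, not the DICT service or its API: the function names are hypothetical and the identifiers (SRC_1, TGT_A and so on) are made-up placeholders rather than real gene IDs.

```python
from collections import defaultdict


def build_index(pairs):
    """Index (source_id, target_id) pairs; one source may map to many targets."""
    index = defaultdict(set)
    for source_id, target_id in pairs:
        index[source_id].add(target_id)
    return index


def translate_batch(ids, index):
    """Translate a batch of IDs and return the hits, the misses and a QC summary."""
    converted = {}
    unmapped = []
    for identifier in ids:
        targets = sorted(index.get(identifier, ()))
        if targets:
            converted[identifier] = targets
        else:
            unmapped.append(identifier)
    summary = {
        "submitted": len(ids),
        "converted": len(converted),
        "unmapped": len(unmapped),
    }
    return converted, unmapped, summary


# Placeholder mapping table: SRC_2 has a one-to-many relationship.
pairs = [("SRC_1", "TGT_A"), ("SRC_2", "TGT_B"), ("SRC_2", "TGT_C")]
index = build_index(pairs)
converted, unmapped, summary = translate_batch(["SRC_1", "SRC_2", "SRC_9"], index)
print(converted)
print(unmapped)
print(summary)
```

Reporting the unmapped identifiers alongside the counts is what makes the summary usable for quality control.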

NPS / Network Purifying Selection

Detects cancer vulnerability genes from cancer genomes. NPS is a statistical approach that identifies genes whose network is significantly depleted for mutations, indicating that the gene itself is likely to be a vulnerability gene. It works by aggregating weak signals of negative selection across a gene’s first order protein-protein interaction network. This method eliminates the possibility of any bias from the alteration rate of an index gene and complements previously established cancer vulnerability detection methods.

iMADS / integrative Modeling and Analysis of Differential Specificity

Analyzes non-coding variants. iMADS is a general framework that contains high-throughput data and computational models. It allows users to apply the models for each transcription factor (TF) or TF pair to make predictions on any genomic or custom DNA sequence. This method shows that genomic sites differentially preferred by TF paralogs have different sequence features and DNA shape profiles, and that they are involved in distinct biological functions.

DiseaseEnhancer

Provides a manually curated collection of disease-associated enhancers. DiseaseEnhancer is a database that makes all disease-associated enhancer information publicly available in one location, providing an important and live-updated resource that facilitates the understanding of regulatory mechanisms in disease pathogenesis. It also provides a mutation map plot to show the mutations mapped in the disease-associated enhancers to help understand the roles of enhancers in diseases.

MatchMiner

Allows computerized matching of patient-specific genomic profiles to precision clinical trials in cancer medicine. MatchMiner automatically matches the patient-specific genomic events to clinical trials, and makes the results available to trial investigators and clinicians via a Health Insurance Portability and Accountability Act (HIPAA) compliant web-based platform. It allows researchers to use the genomic data of their patient to retrieve trial matches for that specific patient or find trials using a robust search interface.

JRC GMO-Amplicons

Contains and makes available results of a bioinformatic pipeline that regularly screens public nucleotide sequence databanks. JRC GMO-Amplicons includes patents and available whole plant genomes, through in silico determination of PCR amplification. It can be queried by control laboratories to evaluate results of screening/identification analysis or for developing new detection methods and assessing in silico primers specificity and genetically modified organism (GMO) coverage.

ODIN / Oracle for Disorder prediction

Serves to make an accurate prediction for a subset of affected cases while having virtually zero false positive predictions for unaffected samples. ODIN predicts an input/test sample to be an affected case when two conditions hold: (1) the input sample is “close” to many affected case samples, and (2) the input sample is far from any unaffected control sample. This tool can also be extended to take into account not only likely gene disruptive (LGD) mutations but also missense mutations to increase the power of the model in predicting a higher percentage of affected cases.
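The two conditions above amount to a conservative decision rule, which the following minimal sketch illustrates. It is not ODIN's actual model: the Euclidean distance, the single radius threshold, the min_close_cases parameter and the toy two-dimensional points are all assumptions chosen for illustration.

```python
import math


def predict_affected(sample, cases, controls, radius, min_close_cases):
    """Call a sample 'affected' only if both conditions hold.

    (1) at least min_close_cases affected samples lie within radius, and
    (2) every unaffected control lies farther away than radius.
    Distance metric and thresholds are hypothetical choices.
    """
    close_cases = sum(1 for case in cases if math.dist(sample, case) <= radius)
    far_from_all_controls = all(
        math.dist(sample, control) > radius for control in controls
    )
    return close_cases >= min_close_cases and far_from_all_controls


# Toy feature vectors standing in for per-sample mutation profiles.
cases = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
controls = [(5.0, 5.0)]

print(predict_affected((0.2, 0.2), cases, controls, radius=1.5, min_close_cases=2))
print(predict_affected((4.5, 4.5), cases, controls, radius=1.5, min_close_cases=2))
```

Requiring distance from every control is what trades sensitivity for a very low false positive rate: a sample near even one control is never called affected.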

CardioClassifier

Supports interpretation of variants identified in genes associated with inherited heart conditions. CardioClassifier is an automated and interactive web tool that offers disease specific interpretation of genetic variants in genes associated with Inherited Cardiac Conditions (ICCs), according to guidelines released by the American College of Medical Genetics and Genomics. This web app draws on expert gene-specific knowledge to produce fast and high-quality results for diagnosing inherited heart problems.

HANDS / HSP base Assignment using NGS data through Diploid Similarity

Characterizes homeolog-specific polymorphisms (HSPs) in polyploid genomes. HANDS involves comparative alignments of next-generation sequencing reads from polyploid and diploid progenitor-relatives onto a suitable reference sequence. It provides the ability to relate, on a genome-wide basis, specific homeoallelic variants to particular agronomic traits. This tool increases the precision of crop-breeding solutions to address the challenge of global food security.

PhyDesign

A platform-independent online application that implements the Townsend phylogenetic informativeness analysis, providing a quantitative prediction of the utility of loci to solve specific phylogenetic questions. PhyDesign has an easy-to-use interface which facilitates uploading of alignments and ultrametric trees to calculate and depict profiles of informativeness over specified time ranges, and provides rankings of locus prioritization for epochs of interest. PhyDesign facilitates locus prioritization, increasing the efficiency of sequencing for phylogenetic purposes compared to traditional studies with more laborious and low capacity screening methods, as well as increasing the accuracy of phylogenetic studies.

BASID2CS / The basidiomycetes Two Component Systems repository

A pipeline web server that extends the analysis to the complete genome sequences of basidiomycetes. BASID2CS has been specifically designed for the identification, classification and functional annotation of putative TCS proteins from any predicted proteome. This pipeline is specifically designed and implemented for the bioinformatic screening and extraction of all the putative TCS proteins from each predicted proteome of basidiomycetes in a single step. All the TCS proteins in Agaricus remain to be characterized and the present genomic analysis paves the way for future TCS functional studies in this basidiomycete fungus.