Encrypts a genetic sequence of interest and permits users to search databases in a confidential manner. SIG-DB compares the encrypted sequence against each item in the chosen database and computes an encrypted similarity score. It allows users to make private sequence-to-sequence comparisons. This tool provides a solution for the secure multi-party exchange of information. It is composed of two distinct parts: a Querier and a Database Owner.
Models the multivariate genetic architecture of constellations of traits and incorporates genetic covariance structure into multivariate genome-wide association studies (GWAS) discovery. GenomicSEM is a Two-Stage Structural Equation Modeling approach that allows the user to specify and compare a range of different hypothesized multivariate genetic architectures. The software can be applied to a broad spectrum of traits, including gene expression, hormones, metabolites, brain structure and functions, behaviors, and psychiatric disorders and medical diseases.
Enables users to explore large, integrated genomic datasets. IGV visualizes next-generation sequencing (NGS) data and provides features for identifying sequencing and analysis artifacts that lead to errant single-nucleotide variant (SNV) calls, as well as support for viewing large-scale structural variants (SVs) detected by paired-end read technology. The software also includes features to support third-generation long-read sequencing technologies. Several IGV features have been developed to aid manual review of aligned reads.
Provides data standards for high-throughput biological experiments. FuGE has been developed to facilitate data management and the creation of repositories, software or standards. It can also produce modular data transfer standards and complete workflow formats. It aims to address the challenge of developing data standards that are resilient to evolution in technology and to provide a single format for laboratory workflows.
Offers a component-based architecture that allows users to add new functionality in the form of plug-in modules. geWorkbench includes many computational resources that eliminate many steps requiring programming skills. It simplifies the use of multi-module analysis pipelines. The tool’s modules consist of wrapped versions of pre-existing third-party software tools.
Allows users to visualize and analyze plant co-function networks. PlaNet integrates genomics, transcriptomics, phenomics, and ontology analyses across seven plant species important both for research and human circumstances. For comparative analyses, it implements NetworkComparer, a pipeline that compares and displays commonalities and differences between the co-expressed node vicinity networks (NVNs) simultaneously across selected species. This platform includes several databases, such as the famNet and ensembleNet databases.
A framework that provides a collection of rigorously validated tools for the manipulation and analysis of genome biology data sets. PyCogent is a fully integrated and thoroughly tested framework for controlling third-party applications; devising workflows; querying databases; conducting novel probabilistic analyses of biological sequence evolution; and generating publication-quality graphics. It is distinguished by many unique built-in capabilities (such as true codon alignment) and the frequent addition of entirely new methods for the analysis of genomic data.
Enhances query processing speed on genomic databases, especially for high-throughput molecular profiling data. CGDM is a collaborative genomic data model that speeds up query processing and generates three collaborative global clustering index tables (CGCITs) to deal with the velocity and variety issues without needing much extra volume.
Provides a high-risk pedigrees (HRP) method. SGS recognizes segregating chromosomal segments of statistical merit. It accounts for intra-familial heterogeneity and multiple testing. This tool identifies all genomic segments shared identical-by-state between a defined set of cases. It has the potential to reinvigorate the utilization of extended HRPs in the identification of risk variants that contribute to common, complex disease.
Enables users to build custom genome browsers. GIVe is a programming library for creating platforms that release datasets produced by national and international consortia. With some experience in MySQL and HTML, a user can build a genome browser hosting diverse types of genomic datasets, including RNA sequencing, ChIP sequencing, and genome interaction datasets.
Checks transcript annotations in multiple repositories. UGAHash proposes an open source algorithm based on cryptographic hash functions which aims to provide a standardized annotation of transcripts. The application consists of two platforms: (i) a web application that allows users to query various versions of the annotations provided by public databases by GTF or by accession number and (ii) a local Python implementation to classify various genomic locations.
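The idea of a hash-based transcript identifier can be illustrated with a minimal sketch. It is not UGAHash's actual scheme: the function name `transcript_hash`, the canonical string layout, and the choice of SHA-256 are all assumptions made for illustration. The point is only that the same exon structure yields the same digest regardless of which repository or input order it came from.

```python
import hashlib

def transcript_hash(chrom, strand, exons):
    """Return a hex digest identifying a transcript by its exon structure.

    Hypothetical helper: the canonical form below is an assumption,
    not the format used by UGAHash.
    """
    # Sort exons so the order in which they are listed does not matter.
    ordered = sorted((int(start), int(end)) for start, end in exons)
    # Build one canonical string from chromosome, strand and exon coordinates.
    canonical = "|".join([chrom, strand] + [f"{s}-{e}" for s, e in ordered])
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Same structure listed in a different order gives the same identifier.
a = transcript_hash("chr1", "+", [(100, 200), (300, 400)])
b = transcript_hash("chr1", "+", [(300, 400), (100, 200)])
# A different strand gives a different identifier.
c = transcript_hash("chr1", "-", [(100, 200), (300, 400)])
```

Comparing digests then reduces cross-repository annotation checks to string equality.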
Permits rigorous statistical investigation of genomic data. Genomic HyperBrowser can serve for a range of genomic investigations that query characteristics of individual tracks or relations between pairs of tracks along the genome. It is able to differentiate 15 types of tracks at the generic level. This tool provides programs for customizing data into forms that ease subsequent analyses.
A tool for estimating the significance of overlap between multiple sets of genomic intervals. GAT implements a null model that the two sets of intervals are placed independently of one another, but allows each set's density to depend on external variables, for example, isochore structure or chromosome identity. GAT estimates statistical significance based on simulation and controls for multiple tests using the false discovery rate.
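Simulation-based significance of interval overlap can be sketched as follows. This is a simplified illustration, not GAT's implementation: it assumes a single linear workspace, places segments uniformly at random, and ignores conditioning on isochores or chromosome identity as well as false discovery rate control. The function names and parameters are hypothetical.

```python
import random

def overlap_bp(a, b):
    """Sum of pairwise overlaps, in bases, between two lists of
    half-open (start, end) intervals."""
    return sum(max(0, min(e1, e2) - max(s1, s2))
               for s1, e1 in a for s2, e2 in b)

def simulate_p_value(segments, annotations, workspace_len, n_sim=1000, seed=0):
    """Empirical p-value for the observed overlap under random placement."""
    rng = random.Random(seed)
    observed = overlap_bp(segments, annotations)
    hits = 0
    for _ in range(n_sim):
        shuffled = []
        for start, end in segments:
            length = end - start
            # Keep each segment's length, draw a new start inside the workspace.
            new_start = rng.randint(0, workspace_len - length)
            shuffled.append((new_start, new_start + length))
        if overlap_bp(shuffled, annotations) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_sim + 1)

segments = [(100, 200), (500, 600)]
annotations = [(150, 250), (550, 650)]
observed, p_value = simulate_p_value(segments, annotations, workspace_len=10_000)
```

A small `p_value` indicates that the observed overlap is rarely matched when segments are placed at random in the workspace.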
Facilitates translation of biomedical research questions to language amenable for computational analysis. GROK supports various deep sequencing (DS)-related operations such as preprocessing, filtering, file conversion, and sample comparison. It supports major genomic file formats and allows storing custom genomic regions in efficient data structures such as red-black trees and SQL databases. The tool can facilitate answering biomedical research questions and establish experimentally testable predictions.
Enhances the interpretation required for big data science in genomics. XGR allows ontology tree-aware enrichment and similarity analysis. It can also be useful for cross-disease network and annotation analysis. This tool is able to interpret genomic summary data resulting from modern genetic studies such as differential expression, genome-wide association study (GWAS), or expression quantitative trait loci (eQTL) mappings.
Offers a method to identify metastasis signatures. DGS is an approach based on the combination of two denoising auto-encoders (DAEs) used on both labelled and unlabelled cohorts. The application first trains a DAE with a large unlabelled gene expression dataset and transfers the results to the second DAE, which is trained with a smaller labelled dataset. Then, the program selects genes with high weights, which are used to train a classifier that finally identifies the genes of the labelled dataset with non-zero coefficients as the signature.
A Java-based software package that is designed to integrate genomic and transcriptomic data generated from next-generation sequencing with proteomic data generated from protein mass spectrometry. PG Nexus allows users to covisualize peptides in the context of genomes or genomic contigs, along with RNA-seq reads. This is done in the Integrative Genomics Viewer (IGV). A Results Analyzer reports the precise base position where LC-MS/MS-derived peptides cover genes or gene isoforms, on the chromosomes or contigs where this occurs. In prokaryotes, the PG Nexus pipeline facilitates the validation of genes, where annotation or gene prediction is available, or the discovery of genes using a "virtual protein"-based unbiased approach.
A tool that matches de novo predicted amino acid sequences to the genomic DNA sequence of an organism. The matching procedure is error-tolerant to accommodate possible de novo sequencing errors due to individual missing fragment ions in the MS/MS scan. In addition, spliced peptides can be deduced from genomic DNA.
Predicts drug combination effects. DIGRE models the drug response dynamics and gene expression changes after individual drug treatments. It takes into account the sequential effect of the treatment, in agreement with many observations that the sequencing of drug treatment matters to patients’ outcomes. The tool uses a mathematical model of the drug response to estimate the genomic residual effect.
A visualization tool that displays the locations of genomic islands (GIs) in a genome, as well as the corresponding supporting feature information for GIs, including: 1) a sequence-composition-based feature, interpolated variable order motifs (IVOM), computed by the third-party software Alien_hunter; 2) mobile gene information for integrases; 3) mobile gene information for transposases; 4) tRNA genes; 5) phage information; 6) gene density; 7) intergenic distance; and 8) highly expressed genes (HEGs).
Provides overview statistics of tRNA genes within each analyzed genome, including information by isotype and genetic locus, easily downloadable primary sequences, graphical secondary structures and multiple sequence alignments. Direct links for each gene to UCSC eukaryotic and microbial genome browsers provide graphical display of tRNA genes in the context of all other local genetic information. The database can be searched by primary sequence similarity, tRNA characteristics or phylogenetic group.
A database that offers gene annotation for cucurbits, members of the Cucurbitaceae family. It provides the genomes of melon (Cucumis melo), cucumber (Cucumis sativus), watermelon (Citrullus lanatus), and pumpkin (Cucurbita maxima). The Cucurbitaceae consist of 98 proposed genera with 975 species, mainly in tropical and subtropical regions. All species are sensitive to frost. Most of the plants in this family are annual vines, but some are woody lianas, thorny shrubs, or trees (Dendrosicyos).
A manually curated database of conditions with known genetic causes, focusing on medically significant genetic data with available interventions. All conditions with identified genetic causes are included in the CGD. For each entry, the database includes the gene symbol, condition(s), allelic conditions, inheritance, age (pediatric or adult) in which interventions are indicated, clinical categorization, and a general description of interventions/rationale. The contents are not intended to serve as nor substitute for comprehensive clinical guidelines, but are rather intended to briefly describe the types of interventions that might be considered.
Includes structural assignments for the proteins encoded within the genomes of over five eukaryotes and 100 prokaryotes. GTD is an online database that presents several options: (1) BLAST search; (2) keyword search; (3) summary of predictions; and (4) download of GTD lists. This repository helps users assign proteins with distant sequence homology to known folds.
Extends the functionality of the Ensembl Regulatory Build for three species: human, mouse and rat. NGD is a database containing information on conserved non-coding sequences and on genome-wide occurrences of transcription factor binding site (TFBS) motifs from public Jaspar and Transfac Professional.
Stores prokaryotic genomic islands. Pre_GI is a database that contains more than 20 000 islands found in bacterial and archaeal chromosomes and plasmids. The concept of ‘island’ relevant to this database is defined as a horizontally acquired fragment of DNA in a bacterial/archaeal genome, including those which may have been acquired by ancestral organisms and transferred vertically to their progeny. Users can browse current islands or search and compare their newly predicted islands against Pre_GI records.
Provides a comprehensive summary of structural variation in the human genome. Structural variation is defined here as genomic alterations that involve segments of DNA larger than 50 bp. The database represents only structural variation identified in healthy control samples. The Database of Genomic Variants provides a useful catalog of control data for studies aiming to correlate genomic variation with phenotypic data. The database is continuously updated with new data from peer-reviewed research studies.
Compiles information about blueberries. BBGD454 is a free database that compiles about 400 transcriptome sequences from nine cDNA libraries, accompanied by related specifications such as taxonomy, tissue specificity or gene ontology. The database includes several functionalities that allow users to search by contig number or by sequence description, to consult a table including the full samples and to compare gene expression between two different libraries.
Promotes sesame functional genomics research. SesameFG is an integrated repository of genotype-phenotype information that provides comprehensive genetic information, phenotypic information and bioinformatics analysis for Sesamum indicum L. functional genomics research. It was constructed using large-scale genetic and phenotypic sesame resources that came from public databases, literature, and sesame functional genomics consortium inputs.
Contains all available genome and expressed sequence tag (EST) sequences, genetic maps, and transcriptome profiles for cucurbit species. CuGenDB is an online repository that offers the possibility to analyze and display comparative genomics and expression datasets of different cucurbit species. It provides a feature page for each EST or unigene to list the related sequence and annotation information.
A repository that provides archiving, accessioning and distribution of publicly available genomic structural variants, in all species. The DGVa accepts direct submissions from researchers and performs manual curation from the literature. The DGVa also exchanges data on a regular basis with dbVar.
Contains the full results from all published PGC studies. The results files of the PGC Database are available along with an LD-pruned version suitable for polygenic profile scoring. The purpose of the Psychiatric Genomics Consortium (PGC) is to unite investigators around the world to conduct meta- and mega-analyses of genome-wide genomic data for psychiatric disorders. The PGC includes over 800 investigators from 38 countries and represents the largest consortium and the largest biological experiment in the history of psychiatry.
Gathers information about reference sequences designed for the reporting of diagnostically relevant variants. Locus Reference Genomic (LRG) is a manually curated database which provides for each record a stable reference sequence including genomic DNA, transcript and protein sequences and an updated section accompanied by the most recent biological information about LRG.
A platform for genome functional annotations and multi-dimensional network analyses in Sorghum (Sorghum bicolor [L.] Moench). SorghumFDB encompasses a wide range of information, such as various annotations of whole genome assemblies, miRNA sequences and target genes, common gene families, network constructions using transcriptome data, PPI data and miRNA-target pairs, as well as multiple gene function annotation elements. Visualization tools (Gbrowse, Cytoscape and open-flash-chart) and four analysis-based tools (BLAST, GSEA, motif significance analysis and pattern set) are provided to support functional prediction.
Provides archival, data accessioning and distribution services for genomic structural variation (GSV). dbVar is a comprehensive resource that includes data originating from the 1000 Genomes project, the Wellcome Trust Sanger Institute Mouse Genomes project, the COSMIC project and numerous clinical genetics studies. Users can navigate to particular studies or can perform text-based searches using the standard NCBI Entrez search interface.
Serves as a central repository for raw gene expression data derived from the public tomato cDNA microarray. TFGD contains tomato metabolite and sRNA data sets, as well as profiles of numerous flavor- and nutrition-related metabolites. It includes a tool to correlate metabolite and transcript profiles, based on the Pearson or Spearman rank correlation coefficient, and to estimate the similarity of profiles.
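The two correlation measures mentioned for profile comparison can be sketched in plain Python. This is an illustrative sketch only, not TFGD's code: the function names and the sample profiles are made up, and the inputs are assumed to be equal-length numeric lists with non-zero variance. Spearman's coefficient is computed here as the Pearson coefficient of the ranks, with tied values given their average rank.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up profiles: the transcript level rises monotonically but not linearly.
metabolite = [1.0, 2.0, 3.0, 4.0, 5.0]
transcript = [1.0, 4.0, 9.0, 16.0, 25.0]
r_pearson = pearson(metabolite, transcript)
r_spearman = spearman(metabolite, transcript)
```

For a monotonic but non-linear relationship such as this one, the Spearman coefficient reaches 1 while the Pearson coefficient stays below it, which is why a tool may offer both.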
A public database, developed to organize a large data set of confocal images generated from the maize marker lines, for studying native gene expression in specific cell types and subcellular compartments using fluorescent proteins. Maize Cell Genomics Database represents two types of data: (i) information which describes fluorescent-tagged gene constructs used for maize marker line generation; and (ii) confocal images representing spatial and temporal expression of the fluorescent markers in the maize marker lines.
Provides search and analysis tools for bioinformatics analyses of gene function or regulatory modules. SIFGD was designed to integrate existing data from publications, to improve the proportion of gene annotation, and to provide popular functional analysis tools in a convenient format for use by Setaria researchers. Functional analysis modules, major components of SIFGD, are useful for studying biological processes, such as regulation, signaling, and metabolism.
Collects functional genomics data about cotton. CottonFGD is a repository composed of three main features: (i) a search function allows users to retrieve cotton genes by genomic region, sequence similarity, or gene properties; (ii) the profile page provides information about a specific entry including multiple properties such as gene structure, and expression or sequence variation data; and (iii) an analysis module can generate relevant information and lists.