Cis-regulatory DNA element databases | Genome annotation
Transcription factors (TFs) influence gene expression by binding to specific cis-acting elements in a genomic sequence. Thus, accurate models for describing the binding properties of TFs are essential in modeling transcription. From a set of known transcription factor binding sites (TFBSs) for a given TF, the binding preference is generally represented in the form of a position weight matrix (PWM) (also called position-specific scoring matrix) derived from a position frequency matrix (PFM). A PFM is essentially an occurrence table, summarizing the number of each nucleotide observed at each position of a set of aligned TFBSs. Compared with simpler models like consensus sequences, PWMs allow for an additive probabilistic description of binding preferences.
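As a minimal sketch of the PFM-to-PWM relationship described above, the following Python builds a PFM from a handful of aligned sites and converts it to a log-odds PWM. The function names, the pseudocount of 1, the uniform background of 0.25 per base, and the example sites are all assumptions made for illustration; individual databases use their own conventions.

```python
import math

BASES = "ACGT"

def pfm_from_sites(sites):
    """Count each nucleotide at each position of equal-length aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all aligned sites must have the same length")
    pfm = {base: [0] * length for base in BASES}
    for site in sites:
        for i, base in enumerate(site.upper()):
            pfm[base][i] += 1
    return pfm

def pwm_from_pfm(pfm, pseudocount=1.0, background=0.25):
    """Convert counts to log2 odds against a uniform background (assumed)."""
    length = len(pfm["A"])
    n_sites = sum(pfm[base][0] for base in BASES)
    total = n_sites + 4 * pseudocount  # pseudocount is added once per base
    return {
        base: [
            math.log2(((pfm[base][i] + pseudocount) / total) / background)
            for i in range(length)
        ]
        for base in BASES
    }

def score_sequence(pwm, seq):
    """Additive score: sum the per-position weights of the observed bases."""
    return sum(pwm[base][i] for i, base in enumerate(seq.upper()))

# Hypothetical aligned sites, used only to exercise the functions.
sites = ["TATAAT", "TATGAT", "TACAAT", "GATAAT"]
pfm = pfm_from_sites(sites)
pwm = pwm_from_pfm(pfm)
```

Because the score is a sum over positions, each position contributes independently, which is the additive property that distinguishes a PWM from a consensus sequence: `score_sequence(pwm, "TATAAT")` is higher than the score of a sequence that mismatches at every position.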
A wiki-based database of transcription factor-binding data generated by the ENCODE consortium. In its first release, Factorbook contains 457 ChIP-seq datasets on 119 TFs in a number of human cell lines, together with the average profiles of histone modifications and nucleosome positioning around the TF-binding regions, the sequence motifs enriched in those regions, and the distance and orientation preferences between motif sites.
Provides non-redundant curated binding models. HOCOMOCO is a comprehensive, hand-curated collection of transcription factor binding site (TFBS) models with reduced redundancy in the association of models with individual transcription factors (TFs). The website provides a system of interactive filters that makes it easier to browse the tables of the collection. To facilitate practical application, all models are linked to gene and protein databases.
A database that uncovers the molecular basis of TF binding in the human genome through regulatory motif analysis of all transcription factors (TFs), grouped by family. It allows browsing of all known motifs for each factor, curated from TRANSFAC, JASPAR, and protein binding microarray (PBM) experiments, along with their enrichment and instances within the corresponding TF binding experiments. It also provides a list of novel regulatory motifs discovered by systematic application of several motif discovery tools (including MEME, MDscan, Weeder, and AlignACE) and evaluated by their enrichment relative to control motifs within TF-bound regions. ENCODE-motifs also provides a genome-wide map of regulatory motif instances in the human genome for both known and novel motifs.
Compiles data on experimentally validated, naturally occurring transcription factor binding sites (TFBSs) across the Bacteria domain, placing a strong emphasis on the transparency of the curation process, the quality and availability of the stored data, and fully customizable access to its records. CollecTF integrates multiple sources of data automatically and openly, allowing users to dynamically redefine binding motifs and their experimental support base.
A freely available online database of inferred transcription factor (TF) binding specificities. CIS-BP currently incorporates data from >300 species covering >250 TF families, totaling >160,000 TFs (of which >65,000 have at least one DNA binding motif). CIS-BP collects data from >25 sources, including other databases such as TRANSFAC, JASPAR, HOCOMOCO, FactorBook, UniProbe, and Fly Factor Survey, and dozens of additional publications. In addition to housing these “directly determined” DNA binding motifs, CIS-BP also includes “inferred” motifs. Inferences are made by mapping motifs across and within species, using DNA binding domain similarity thresholds established separately for each TF family (see publication for details). In other words, if a mouse TF has a known motif, the motif of its human ortholog can be inferred, provided that the ortholog’s DNA binding domain is “similar enough”.
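The inference rule above can be illustrated with a small sketch: a motif is transferred from a characterized TF to a query TF only when their DNA binding domains meet a family-specific similarity threshold. Everything here is hypothetical, including the function names, the use of simple percent identity over pre-aligned sequences as the similarity measure, the threshold values, and the example sequences; none of it is taken from CIS-BP itself.

```python
# Hypothetical per-family identity thresholds (illustrative values only,
# not taken from CIS-BP).
FAMILY_THRESHOLDS = {"Homeodomain": 0.70, "bHLH": 0.60}

def aligned_identity(a, b):
    """Fraction of identical residues over gap-free columns of two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    return sum(x == y for x, y in columns) / len(columns)

def infer_motif(query_dbd, family, known_tfs):
    """Return (name, identity, motif) of the most similar characterized TF
    whose identity meets the family threshold, or None if there is none.

    known_tfs is a list of (name, aligned_dbd, motif) tuples.
    """
    threshold = FAMILY_THRESHOLDS[family]
    best = None
    for name, dbd, motif in known_tfs:
        identity = aligned_identity(query_dbd, dbd)
        if identity >= threshold and (best is None or identity > best[1]):
            best = (name, identity, motif)
    return best

# Made-up aligned domain fragments and motifs, used only to exercise the code.
known = [
    ("mouse_tf_1", "RRRKRTAFTR", "TAATTA"),
    ("mouse_tf_2", "AAAAAAAAAA", "CACGTG"),
]
hit = infer_motif("RRRKRTAYTR", "Homeodomain", known)
```

Keeping the thresholds in a per-family table mirrors the idea that "similar enough" is defined separately for each TF family rather than by a single global cutoff.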
A database on composite elements. TRANSCompel focuses on so-called composite elements, consisting of two (or more) neighboring binding sites, characterized by synergistic or antagonistic effects between the two transcription factors binding to them. It contains, in addition to the COMPEL table, a separate table for detailed information on the experimental EVIDENCE on which the composite elements are based.
A public database of known binding sites identified in promoters of orthologous vertebrate genes, manually curated from the literature. ABS lists 650 experimental binding sites from 68 transcription factors and 100 orthologous target genes in human, mouse, rat, or chicken genome sequences. Computational predictions and promoter alignment information are also provided for each entry. A simple, easy-to-use web interface facilitates data retrieval and offers different views of the information. In addition, release 1.0 of ABS includes a customizable generator of artificial datasets based on the known sites in the collection, and an evaluation tool to aid in training and assessing motif-finding programs.