Transposable element databases | Genome annotation
Transposable elements (TEs) are segments of DNA that self-replicate in a genome. DNA segments that originated from TE duplications may or may not remain transpositionally active but are herein referred to simply as TEs. TEs form vast families of interspersed repeats and constitute large parts of eukaryotic genomes, for example, over half of the human genome and over four fifths of the maize genome. The repetitive nature of TEs confounds many types of studies, such as gene prediction, variant calling (i.e., the identification of sequence variants such as SNPs or indels), RNA-Seq analysis, and genome alignment.
Gathers information about transposable elements (TEs) and other types of repeats in eukaryotic genomes. Repbase is an online database that can be used for eukaryotic genome sequence analyses and in studies concerning the evolution of TEs and their impact on genomes. This repository contains more than 38,000 sequences of different families or subfamilies.
Allows easy access to all known cases of horizontal transfer of transposable elements (HTT) reported along with rich information about each case. Moreover, it allows the user to generate tables and graphs based on searches using TEs and/or host species classification and export them in several formats.
Collects and classifies mobile genetic elements (MGEs), including phages and plasmids, from various sources. ACLAME provides a platform for analyzing MGE diversity from a global scale down to specific groups of MGEs, and tools for detecting new MGEs integrated in bacterial genomes. The BLAST search interface allows simple querying of the ACLAME sequences and returns, for each hit sequence, its functional annotation and the MGE, host(s), and protein families it belongs to.
A research project devoted to the evolutionary dynamics of viruses and transposable elements based on their phylogenetic classification (per lineage and protein domain). GyDB is a long-term project that is continuously progressing and that, owing to the high molecular diversity of mobile elements, needs to be completed in several stages. GyDB 2.0 includes a wiki that allows other researchers to participate in the project. The current database stage and scope are long terminal repeat (LTR) retroelements and relatives.
Contains putatively active LINE-1 (L1) insertions residing in human and rodent genomes: a) full-length L1s intact in the two open reading frames (ORFs) (FLI-L1s) and b) L1s with intact ORF2 but disrupted ORF1 (ORF2-L1s). L1Base also includes full-length (>6000 bp) non-intact L1s (FLnI-L1s). L1Base can be searched via its MySQL-driven query system using criteria such as conservation of the functional sites important for activity, chromosomal localization, and family. The database can also be searched by running Blastn-based queries with a user-specified L1 sequence.
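The three L1 categories described above can be illustrated with a minimal Python sketch. This is not L1Base code: the `L1Element` record, its field names, and the `classify_l1` function are hypothetical, and the sketch assumes that ORF intactness has already been determined by some upstream analysis. It also assumes that the ORF2-L1 category does not require full length, which the description does not state.

```python
from dataclasses import dataclass

# Length threshold quoted in the description (>6000 bp).
FULL_LENGTH_MIN_BP = 6000


@dataclass
class L1Element:
    """Hypothetical record for one annotated L1 insertion."""
    length_bp: int
    orf1_intact: bool
    orf2_intact: bool


def classify_l1(element: L1Element) -> str:
    """Assign an L1 to one of the categories named in the description."""
    full_length = element.length_bp > FULL_LENGTH_MIN_BP
    if full_length and element.orf1_intact and element.orf2_intact:
        return "FLI-L1"    # full length, both ORFs intact
    if element.orf2_intact and not element.orf1_intact:
        return "ORF2-L1"   # intact ORF2, disrupted ORF1 (length not checked: assumption)
    if full_length:
        return "FLnI-L1"   # full length but not intact
    return "other"         # outside the categories described here


if __name__ == "__main__":
    print(classify_l1(L1Element(length_bp=6100, orf1_intact=True, orf2_intact=True)))
```

The order of the checks matters: a full-length element with a disrupted ORF1 and intact ORF2 is reported as ORF2-L1 rather than FLnI-L1, which is one possible reading of the categories.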
An open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). Dfam families include retrotransposons, DNA transposons, interspersed repeats of unknown origin, and a number of non-transposable element (TE) entries used to annotate satellites or to avoid annotating noncoding RNA genes as TEs.
Provides complex information on and analysis of retroviral elements found in the human genome. HERVd can be used for searches of individual HERV families, identification of HERV parts, graphical output of HERV structures, comparison of HERVs and identification of retrovirus integration sites.
A database of transposed elements (TEs) located within protein-coding genes of 7 organisms: human, mouse, chicken, zebrafish, fruit fly, nematode and sea squirt. TranspoGene contains information regarding the specific type and family of each transposed element, its genomic and mRNA location, sequence, supporting transcript accession and alignment to the TE consensus sequence. The database also contains host gene specific data: gene name, genomic location, Swiss-Prot and RefSeq accessions, diseases associated with the gene and splicing pattern.
Provides a curated and comprehensive summary of L1-HS insertion polymorphisms identified in healthy or pathological human samples and published in peer-reviewed journals. euL1db is intended to help in understanding the link between L1 retrotransposon insertion polymorphisms and phenotype or disease.
Aims to provide a comprehensive resource for Arabidopsis as a genome model that serves as a primary reference for research in plants and is suitable for transfer of knowledge to other plants, especially crops. The MAtDB web interface allows access to the data through graphical or list browsing, searching by keywords, names or sequences, and through precompiled tables that summarize general, functional, structural or comparative features. All data from the Arabidopsis Genome Initiative (AGI) have been integrated into MAtDB.
Aims to bring some order to the system of short interspersed elements (SINEs) and to set a basis for further studies on these genomic elements. SINEBase introduces a set of formal definitions about SINEs. The database lists more than 170 families (from animals, flowering plants, and green algae), classified according to the modular structure of their nucleotide sequences. The website can be used in two ways: (1) exploring the database, (2) analyzing candidate SINE sequences.
A series of databases was constructed to host all MITE sequences from 41 plant genomes. The databases are available for sequence similarity searches (BLASTN), and MITE sequences can be downloaded by family or by genome. The databases can be used to study the origin and amplification of MITEs, MITE-derived small RNAs and the roles of MITEs in gene and genome evolution.
Provides taxon-specific primate Alu elements. AluHunter offers information that can be reused in phylogeny and population genetics. It characterizes all Alu elements in GenBank sequences and isolates those that may be informative at the generic or subgeneric level. The database aspires to be a classification of all Alu elements ever sequenced and deposited in GenBank.
A catalogue of transposable element (TE) exaptation events, including information on the TE copy but also on the affected gene. C-GATE is interactive and allows users to include missed or new TE exaptation data. C-GATE provides a graphic representation of the entire library, which may be used for future statistical analysis of TE impact on host gene expression.
Contains a collection of P-element insertions on an isogenic genetic background that permits the construction of molecularly defined chromosomal aberrations in Drosophila. The DrosDel collection can be used to generate genetically and molecularly verified deletions and to delete genomic regions with single-gene resolution.
An integrated and interactive database of human retrotransposon insertion polymorphisms (RIPs). Users can query the database by a variety of means and have access to the detailed information related to a RIP, including detailed insertion sequences and genotype data.
Assists in storing and retrieving information describing retroviral integration sites. The Retrovirus Integration Database (RID) is an online resource containing information about retrovirus integration sites in host genomes. Users can query all available integration sites or restrict the analysis to integration sites in specific chromosomes, genes or tissues. RID also includes tools to show the distribution of integration sites along a chromosome.
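The idea of showing a distribution of integration sites along a chromosome can be sketched as simple binning. The Python example below is illustrative only and does not reflect how RID implements its tools; the function name `bin_integration_sites`, the 1 Mb default bin size, and the example coordinates are all assumptions.

```python
from collections import Counter


def bin_integration_sites(positions, bin_size=1_000_000):
    """Count integration sites per fixed-width bin along one chromosome.

    positions: iterable of 0-based integer coordinates (hypothetical input).
    Returns a dict mapping each non-empty bin's start coordinate to its count,
    ordered by position.
    """
    counts = Counter(pos // bin_size for pos in positions)
    return {bin_index * bin_size: counts[bin_index] for bin_index in sorted(counts)}


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    example_sites = [10, 999_999, 1_000_000, 2_500_000]
    for start, count in bin_integration_sites(example_sites).items():
        print(f"{start:>10} {'#' * count}")
```

Empty bins are omitted from the result, which keeps the output compact for sparse data; a plotting routine would need to fill them in with zeros.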
A collective and systematic resource of Sireviruses in plants. MASiVEdb is unlike any other transposable element database, providing a multitude of highly curated and detailed information on a specific genus across its hosts, such as a complete set of coordinates, insertion age, and an analytical breakdown of the structure and gene complement of each element. All data are readily available through basic and advanced query interfaces, batch retrieval, and downloadable files. A purpose-built system is also offered for detecting and visualizing similarity between user sequences and Sireviruses, as well as for coding domain discovery and phylogenetic analysis. MASiVEdb is currently the most comprehensive directory of Sireviruses, and as such complements other efforts in cataloguing plant transposable elements and elucidating their role in host genome evolution.