Accurate gene structure prediction plays a fundamental role in functional annotation of genes. The main focus of gene prediction methods is to find patterns in long DNA sequences that indicate the presence of genes.
Gives access to many free software tools for sequence analysis. EMBOSS aims to serve the molecular biology community and supports the creation and release of software in an open source spirit. It integrates a range of existing packages and tools for sequence analysis into a seamless whole. It is free of charge and open source.
Identifies complete exon/intron structures of genes in genomic DNA. GENSCAN uses a homogeneous fifth-order Markov model for noncoding regions and a three-periodic (inhomogeneous) fifth-order Markov model for coding regions. The program can predict multiple genes in a sequence, handle partial as well as complete genes, and predict consistent sets of genes occurring on either or both DNA strands.
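The difference between a homogeneous and a three-periodic Markov model can be illustrated with a minimal sketch. This is not GENSCAN's implementation: the function names, the add-one smoothing, and the low model order used in the toy call are assumptions made for illustration only. A homogeneous model uses one transition table for every position, while a three-periodic model keeps a separate table for each codon position.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"

def train_markov(seqs, order, periods=1):
    """Return a function prob(phase, context, base).

    periods=1 gives a homogeneous model (one table for all positions);
    periods=3 gives a three-periodic model (one table per codon position).
    Add-one smoothing is an assumption of this sketch.
    """
    counts = [defaultdict(lambda: defaultdict(int)) for _ in range(periods)]
    for seq in seqs:
        for i in range(order, len(seq)):
            context = seq[i - order:i]
            counts[i % periods][context][seq[i]] += 1

    def prob(phase, context, base):
        table = counts[phase % periods][context]
        total = sum(table.values())
        return (table[base] + 1) / (total + len(ALPHABET))

    return prob

def log_likelihood(seq, prob, order, periods=1):
    """Sum of log transition probabilities along the sequence."""
    return sum(
        math.log(prob(i % periods, seq[i - order:i], seq[i]))
        for i in range(order, len(seq))
    )

def coding_log_odds(seq, coding_prob, noncoding_prob, order):
    """Positive values favour the three-periodic (coding) model."""
    return (log_likelihood(seq, coding_prob, order, periods=3)
            - log_likelihood(seq, noncoding_prob, order, periods=1))

# Toy usage with order 2 to keep the tables small; GENSCAN uses order 5.
coding_model = train_markov(["ATGGCC" * 20], order=2, periods=3)
noncoding_model = train_markov(["ATATTTAATTATAAAT" * 8], order=2, periods=1)
score = coding_log_odds("ATGGCC" * 5, coding_model, noncoding_model, order=2)
```

With real data the models would be trained on annotated coding and noncoding regions, and the sequence would have to be scored in each of the three possible reading frames.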
Predicts genes in eukaryotic genomic sequences. AUGUSTUS evaluates hints about potentially protein-coding regions using a Generalized Hidden Markov Model (GHMM) that takes both intrinsic and extrinsic information into account. The software models protein families as block profiles, where a block corresponds to an ungapped, highly conserved section of a multiple sequence alignment (MSA).
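As a rough illustration of the block profile idea, the sketch below turns an ungapped block of aligned protein segments into position-specific log-odds scores and slides it along a candidate protein. It is not AUGUSTUS code: the function names, the uniform background, the pseudocount, and the toy block are all assumptions chosen for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_block_profile(block_rows, pseudocount=1.0):
    """Build per-column log-odds scores from an ungapped alignment block.

    block_rows: equal-length strings, one per aligned sequence.
    A uniform background distribution is assumed for simplicity.
    """
    width = len(block_rows[0])
    if any(len(row) != width for row in block_rows):
        raise ValueError("all rows of an ungapped block must have equal length")
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for col in range(width):
        column = [row[col] for row in block_rows]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile

def best_block_hit(profile, protein):
    """Return (start, score) of the best-scoring ungapped window, or None."""
    width = len(profile)
    best = None
    for start in range(len(protein) - width + 1):
        score = sum(profile[j][protein[start + j]] for j in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best

# Toy block of four aligned segments (hypothetical data).
toy_profile = build_block_profile(["GDSGG", "GDSGG", "GDAGG", "GDSGS"])
hit = best_block_hit(toy_profile, "MKTAGDSGGPLV")
```

The sketch only accepts the 20 standard amino acid letters; ambiguous residues such as X would need extra handling.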
Predicts multiple genes in genomic DNA sequences. FGENESH is well suited to plant gene identification, especially for coding exons and introns. This ab initio gene prediction software is based on a hidden Markov model (HMM) and has a practically linear run time.
Predicts gene structure using similar protein sequences. GeneWise is heavily used by the Ensembl annotation system. It was developed from a principled combination of hidden Markov models (HMMs). GeneWise can provide accurate and complete gene structures when used with the correct evidence.
Builds transcriptomes from RNA-seq data. Trinity is standalone software composed of three main components: (i) Inchworm, which first generates transcript contigs; (ii) Chrysalis, which clusters them and constructs a complete de Bruijn graph for each cluster; and (iii) Butterfly, which processes the individual graphs in parallel to reconstruct the transcript sequences.
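For readers unfamiliar with de Bruijn graphs, the sketch below builds one from a few short reads and spells out a sequence by following nodes that have a single successor. It is a toy illustration, not Trinity's algorithm: the function names, the reads, and the value of k are hypothetical, and real assemblers also handle sequencing errors, coverage, and branching paths.

```python
from collections import defaultdict

def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some k-mer."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def follow_unique_successors(graph, start):
    """Extend a sequence from `start` while the current node has exactly one successor.

    Stops at a dead end, at a branch, or when a node would be revisited.
    """
    sequence = start
    node = start
    seen = {start}
    while len(graph.get(node, ())) == 1:
        (next_node,) = graph[node]
        if next_node in seen:
            break
        sequence += next_node[-1]
        seen.add(next_node)
        node = next_node
    return sequence

# Three overlapping toy reads (hypothetical data).
toy_reads = ["ATGGCGT", "GGCGTGC", "CGTGCAA"]
toy_graph = build_de_bruijn(toy_reads, k=4)
toy_transcript = follow_unique_successors(toy_graph, "ATG")
```

In this toy case every (k-1)-mer occurs once, so the walk is unambiguous; alternative splicing or repeats would create branches where a single walk like this one stops.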