Frameshift (FS) prediction is important for the analysis and biological interpretation of metagenomic sequences. Because the genomic context of a short metagenomic sequence is rarely known, there is usually not enough data to estimate the parameters of species-specific statistical models of protein-coding and non-coding regions.
A gene prediction method that combines sequencing error models and codon usage in a hidden Markov model to improve the prediction of protein-coding regions in short reads. The performance of FragGeneScan was comparable to that of Glimmer and MetaGene for complete genomes. For short reads, however, FragGeneScan consistently outperformed MetaGene (accuracy improved by ∼62% for reads of 400 bases with 1% sequencing errors, and by ∼18% for error-free reads of 100 bases). When applied to metagenomes, FragGeneScan recovered substantially more genes than MetaGene predicted (>90% of the genes identified by homology search), as well as many novel genes with no homologs in current protein sequence databases.
Detects and corrects frameshift errors caused by insertions and deletions in DNA sequences. RDP FrameBot uses a dynamic programming technique similar to that used for pairwise protein alignment but uses three sets of two-dimensional (2D) matrices, one for each potential reading frame. The software corrects frameshift errors in query reads and determines their closest matching protein sequences in a set of reference sequences.
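The dynamic programming idea can be illustrated with a minimal sketch. This is not FrameBot's implementation: it uses a single matrix indexed by nucleotide position (so the reading frame is tracked implicitly by the offset, rather than in three separate sets of matrices), and the function name, the scoring values and the simple match/mismatch scheme are all assumptions chosen for illustration.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def align_with_frameshifts(dna, protein, match=2, mismatch=-1, gap=-3, shift=-4):
    """Globally align an uppercase DNA read to a protein, allowing frameshifts.

    Hypothetical scoring: an in-frame codon scores match/mismatch, a skipped
    extra base ("insertion") or a two-base codon ("deletion") costs `shift`,
    and an unpaired codon or residue costs `gap`.
    Returns (score, [(dna_position, kind), ...]) for the frameshift events.
    """
    n, m = len(dna), len(protein)
    score = [[float("-inf")] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            options = []
            if i >= 3 and j >= 1:  # in-frame codon paired with one residue
                aa = CODON_TABLE.get(dna[i - 3:i], "X")
                s = match if aa == protein[j - 1] else mismatch
                options.append((score[i - 3][j - 1] + s, (3, 1, "codon")))
            if i >= 1:  # one extra base in the read: skip it
                options.append((score[i - 1][j] + shift, (1, 0, "insertion")))
            if i >= 2 and j >= 1:  # one base missing: two bases stand for a codon
                options.append((score[i - 2][j - 1] + shift, (2, 1, "deletion")))
            if i >= 3:  # codon with no matching residue
                options.append((score[i - 3][j] + gap, (3, 0, "codon_gap")))
            if j >= 1:  # residue with no matching codon
                options.append((score[i][j - 1] + gap, (0, 1, "residue_gap")))
            score[i][j], back[i][j] = max(options, key=lambda o: o[0])

    # Trace back from the end to collect where frameshifts were inferred.
    events = []
    i, j = n, m
    while (i, j) != (0, 0):
        di, dj, kind = back[i][j]
        if kind in ("insertion", "deletion"):
            events.append((i - di, kind))
        i, j = i - di, j - dj
    events.reverse()
    return score[n][m], events


if __name__ == "__main__":
    # Reference protein MKTAY; the read carries one extra G after the second codon.
    print(align_with_frameshifts("ATGAAAGACCGCGTAT", "MKTAY"))
```

When several placements of a frameshift score equally (for example, inside a run of identical bases), the reported position depends on tie-breaking, so only the number and kind of events should be relied on in this sketch.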
A program that predicts coding regions in prokaryotic and matured eukaryotic sequences. Initially targeted at gene prediction in GC-rich bacterial genomes, the gene model used in FrameD also allows genes to be predicted in the presence of frameshifts and partially undetermined sequences. This makes it well suited to gene prediction and frameshift correction in unfinished sequences such as EST and EST cluster sequences.
Addresses the challenging question of predicting frameshifts in protein-coding regions of metagenomic sequences without extrinsic knowledge. An advantage of the ab initio approach is the ability to detect frameshifts in genes of orphan proteins that have no known homologs. Multiple test sets show that the FS detection performance of MetaGeneTack is comparable to or better than that of the earlier program FragGeneScan.
Permits the identification of potential frameshifts or point mutations in a given open reading frame (ORF). BER can process precomputed blastp results and can handle both wu-blastp and NCBI blastp. It includes several functions: converting raw BLAST output to the internal btab format, filtering btab hits by user-supplied cutoffs, and creating a nucleotide database containing the corresponding nucleotide sequence for each query protein whose hits pass the cutoffs.
Searches genomic or mRNA sequences for frameshifting sites. FSFinder is a web-based program capable of finding -1 frameshift sites for most known genes and +1 frameshift sites for two genes: protein chain release factor (prfB) and ornithine decarboxylase antizyme (oaz). The software can be useful for discovering unknown genes that use alternative decoding as well as for analyzing frameshift sites.
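As a rough illustration of what a search for candidate -1 frameshift sites can look like, the sketch below scans a DNA sequence for the commonly described "slippery" heptamer X XXY YYZ (X any base, Y being A or T, Z anything but G). It is an assumption that this motif alone is a useful stand-in: the description above does not state which signals FSFinder uses, and the function name is hypothetical. Reading frame and downstream structure are ignored here.

```python
import re

# X XXY YYZ on the DNA alphabet: three identical bases, then three identical
# A/T bases, then any base except G. The lookahead allows overlapping hits.
SLIPPERY_HEPTAMER = re.compile(r"(?=(([ACGT])\2\2([AT])\3\3[ACT]))")


def find_slippery_sites(dna):
    """Return (start_index, heptamer) for every candidate site in `dna`.

    The input is upper-cased and U is treated as T, so mRNA strings also work.
    """
    seq = dna.upper().replace("U", "T")
    return [(m.start(), m.group(1)) for m in SLIPPERY_HEPTAMER.finditer(seq)]


if __name__ == "__main__":
    for start, site in find_slippery_sites("GGTTTAAACGG"):
        print(start, site)
```

A hit from this scan is only a candidate: a real analysis would also check that the heptamer sits in the expected frame relative to the upstream gene and look for additional signals near the site.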