An RNA motif identification program that takes an RNA sequence alignment as an input and identifies related sequences using a profile-based dynamic programming algorithm. ERPIN differs from other RNA motif search programs in its ability to capture subtle biases in the training set and produce highly specific and sensitive searches, while keeping CPU requirements at a practical level.
A method that uses a support vector machine for poly(A) site prediction. This program takes a file containing DNA/RNA sequences in FASTA format as input, and 1) predicts putative mRNA polyadenylation sites [poly(A) sites] and/or 2) reports the occurrences of different cis-elements. The program is implemented in Perl and runs under UNIX/Linux systems.
A web-based software toolbox to recognize functional sites in nucleic acid sequences. Currently in this toolbox, two software tools are provided: TIS Miner and Poly(A) Signal Miner. The TIS Miner can be used to predict translation initiation sites in vertebrate DNA/mRNA/cDNA sequences, and the Poly(A) Signal Miner can be used to predict polyadenylation [poly(A)] signals in human DNA sequences.
A poly(A) motif prediction method based on properties of human genomic DNA sequence surrounding a poly(A) motif. These properties include thermodynamic, physico-chemical and statistical characteristics.
A machine-learning method for poly(A) motif prediction by marrying generative learning (hidden Markov models) and discriminative learning (support vector machines). The program is able to predict the 12 main variants of human poly(A) motifs, i.e., AATAAA, ATTAAA, AAAAAG, AAGAAA, TATAAA, AATACA, AGTAAA, ACTAAA, GATAAA, CATAAA, AATATA, and AATAGA.
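The 12 hexamers listed above can be used to enumerate candidate positions in a sequence before any model scores them. The following is a minimal sketch of that enumeration step only; it is not the HMM/SVM method of the program, and the function name `find_pas_candidates` is a hypothetical name chosen for illustration.

```python
# The 12 human poly(A) motif variants named in the description above.
PAS_VARIANTS = frozenset({
    "AATAAA", "ATTAAA", "AAAAAG", "AAGAAA", "TATAAA", "AATACA",
    "AGTAAA", "ACTAAA", "GATAAA", "CATAAA", "AATATA", "AATAGA",
})


def find_pas_candidates(seq):
    """Return (0-based position, motif) for every hexamer match.

    Overlapping matches are all reported. RNA input is accepted by
    mapping U to T first. This only lists candidates; it does not
    decide whether a candidate is a functional poly(A) motif.
    """
    seq = seq.upper().replace("U", "T")
    hits = []
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer in PAS_VARIANTS:
            hits.append((i, hexamer))
    return hits
```

A predictor would then score each candidate using its surrounding sequence context, since most hexamer matches in genomic DNA are not functional signals.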
Identifies artefactual polyadenylation sites due to internal priming in homopolymeric stretches of adenines. cleanUpdTSeq classifies flanking 3’ ends derived from oligo-dT-based sequencing as true or false/internally primed. It is highly accurate, outperforms previous heuristic filters and facilitates identification of novel polyadenylation sites. The naïve Bayes classifier recalled 92.2% of True Negatives and 93.8% of True Positives, while it incorrectly categorized only 3.2% of predicted positives.
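For context, the kind of heuristic filter that such a classifier is compared against can be sketched in a few lines. This is an illustrative assumption, not the naïve Bayes model used by cleanUpdTSeq: the function name, the window length, and both thresholds below are hypothetical values picked for the example.

```python
def looks_internally_primed(downstream, window=10, min_a=7, min_run=6):
    """Heuristic flag for internal priming (illustrative thresholds).

    `downstream` is the genomic sequence immediately 3' of a putative
    cleavage site. The site is flagged when the first `window` bases
    contain at least `min_a` adenines, or a run of `min_run`
    consecutive adenines, since oligo-dT can prime on such stretches.
    """
    region = downstream.upper()[:window]
    a_rich = region.count("A") >= min_a
    has_run = ("A" * min_run) in region
    return a_rich or has_run
```

Fixed thresholds like these are easy to apply but cannot weigh the surrounding sequence context, which is the gap a trained classifier is meant to close.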
Predicts alternative polyadenylation patterns from transcript sequences. Polyadenylation Code is a versatile model that generalizes to multiple tasks it was not trained on. It can also classify variants near a polyadenylation site (PAS) and can be used for PAS discovery. It provides analysis of which sequences increase or decrease the strength of a PAS, and identifies features associated with tissue-specific and constitutive PAS.
Recognizes poly(A) signals (PAS) in a way that is agnostic to the PAS motif variant. DeeReCT-PolyA is based on a deep neural network that automates feature extraction. It can extract important patterns of polyadenylation by learning from data. The tool needs less training data in the target dataset, which helps users address the problem of insufficient data in many species.
A pipeline for an RNA-seq method used to study polyadenylation. SAPAS performs a systematic search and evaluation of protocols for typical steps to investigate to what extent these can facilitate RNA-seq data analysis. 29 open-source interfaces and 6 of the more widely used interfaces were evaluated in detail. SAPAS processes the sequencing results using the SAPAS method, including quality control, mapping to the genome with bowtie, generating cleavage sites, handling internal priming, and clustering cleavage sites.
Predicts potential PAS-strong, PAS-weak and PAS-less cleavage/poly(A) sites in human sequences using a linear discriminant function (LDF) that combines characteristics describing functional motifs (polyadenylation signal [PAS]; cleavage site [CS] motif; GU/U motif) with the oligonucleotide composition upstream and/or downstream of these sites. In tests, POLYAR predicts PAS-strong poly(A) sites with high accuracy; its efficiency in finding PAS-weak and PAS-less poly(A) sites is not very high but is comparable to that of other available programs.
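A linear discriminant function of this kind reduces to a weighted sum of feature values compared against a threshold. The sketch below shows only that general form; the feature names, weights, bias, and threshold are hypothetical placeholders and are not the parameters used by POLYAR.

```python
def ldf_score(features, weights, bias=0.0):
    """Weighted sum of features: bias + sum(w_k * x_k).

    `features` and `weights` are dicts keyed by feature name. Every
    weighted feature must be present in `features`.
    """
    missing = set(weights) - set(features)
    if missing:
        raise KeyError(f"missing features: {sorted(missing)}")
    return bias + sum(weights[name] * features[name] for name in weights)


def is_predicted_site(features, weights, bias=0.0, threshold=0.0):
    """Call a candidate a poly(A) site when its score exceeds the threshold."""
    return ldf_score(features, weights, bias) > threshold


# Hypothetical weights for illustration only (not fitted to any data).
example_weights = {"pas_motif": 2.0, "gu_rich_downstream": 1.0}
```

In practice the weights are fitted on labeled sites, and separate functions can be fitted for different site classes such as PAS-strong and PAS-weak.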
A program for detection of human polyadenylation signals. To avoid training on possibly flawed data, the development of polyadq began with a de novo characterization of human mRNA 3' processing signals. This information was used in training two quadratic discriminant functions that polyadq uses to evaluate potential polyA signals.
Allows users to perform empirical Bayesian linear modelling. Fitnoise supports differential expression testing on measurements associated with genes or genomic features. The software is available both as a standalone application and as part of the nesoni software, and provides four regular and two experimental noise models. The package can analyze PAT-Seq data to detect differential poly(A) tail length.
A web server for poly(A) site prediction in plants and algae. Currently, PASPA can predict poly(A) sites for ten species: Arabidopsis, rice, Medicago truncatula, the spikemoss Selaginella moellendorffii, the moss Physcomitrella patens, the red alga Cyanidioschyzon merolae, two green algae (C. reinhardtii and Ostreococcus lucimarinus), and two diatoms (Thalassiosira pseudonana and Phaeodactylum tricornutum).