High-Throughput (HT) SELEX combines SELEX (Systematic Evolution of Ligands by EXponential Enrichment), a method for aptamer discovery, with massively parallel sequencing technologies. This emerging technology provides data for a global analysis of the selection process and for simultaneous discovery of a large number of candidates but currently lacks dedicated computational approaches for their analysis.
An easy-to-use and universally compatible toolkit designed for bench scientists to address the primary sequence analysis needs from high-throughput sequencing of combinatorial selection populations. FASTAptamer performs the simple tasks of counting, normalizing, ranking and sorting the abundance of each unique sequence in a population, comparing sequence distributions for two populations, clustering sequences into sequence families based on Levenshtein edit distance, calculating fold-enrichment for all of the sequences present across populations, and searching degenerately for nucleotide sequence motifs. While FASTAptamer was originally developed for analysis of high-throughput sequencing data from aptamer selections, it offers broad utility for those working on ribozyme or DNAzyme selections, surface display (phage display, mRNA display, etc.) selections, in vivo SELEX, protein mutagenesis selection, or any biocombinatorial selection that results in a DNA-encoded library for sequencing.
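The counting, normalizing, ranking, edit-distance and fold-enrichment tasks listed above can be illustrated with a minimal Python sketch. This is not FASTAptamer's own code or interface: the function names, the reads-per-million normalization, and the simple ordinal ranking with alphabetical tie-breaking are assumptions made for illustration only.

```python
from collections import Counter


def count_and_rank(reads):
    """Return (sequence, count, reads_per_million, rank) tuples,
    most abundant first. Ties are broken alphabetically (an assumption)."""
    counts = Counter(reads)
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        (seq, n, n * 1_000_000 / total, rank)
        for rank, (seq, n) in enumerate(ranked, start=1)
    ]


def levenshtein(a, b):
    """Levenshtein edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if bases match)
            ))
        prev = curr
    return prev[-1]


def fold_enrichment(rpm_x, rpm_y):
    """Ratio rpm_y / rpm_x for every sequence present in both populations.
    Inputs are hypothetical dicts mapping sequence -> reads per million."""
    return {seq: rpm_y[seq] / rpm_x[seq] for seq in rpm_x.keys() & rpm_y.keys()}
```

Sequences within a small edit distance of an abundant "seed" sequence could then be grouped into one family, and the fold-enrichment ratio compares normalized abundance between two selection rounds.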
Provides functions to assist in discovering transcription factor DNA binding specificities from SELEX-seq experimental data. SELEX is an R package that offers functions to calculate and return the affinities, with their standard errors, of K-mers of length k, and to count and return the number of times each K-mer appears within the sample’s variable regions.
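The K-mer counting step can be sketched as follows. This is a Python illustration under stated assumptions, not the API of the SELEX R package: the function name `count_kmers` is hypothetical, and the variable regions are assumed to have already been extracted from the reads as plain strings.

```python
from collections import Counter


def count_kmers(variable_regions, k):
    """Count every overlapping K-mer of length k across all variable regions.
    `variable_regions` is a hypothetical list of already-trimmed sequences."""
    counts = Counter()
    for region in variable_regions:
        # Slide a window of width k over the region, one base at a time.
        for start in range(len(region) - k + 1):
            counts[region[start:start + k]] += 1
    return counts
```

Regions shorter than k contribute nothing, because the window range is empty in that case.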
Supports the design and analysis of structured pools for in vitro selection. RAGPOOLS is an online application that assists in: (1) design of structured RNA pools with a target motif distribution; (2) analysis of structural distributions of RNA pools; and (3) research into novel RNAs via combined experimental and theoretical pool design. It is composed of two tools, RNA Pool Designer and RNA Pool Analyzer.
A meta-motif based statistical framework and pipeline to predict SELEX-derived binding aptamers. Briefly, MPBind calculates four kinds of one-sided p-values for each motif, each representing a different feature. Using human embryonic stem cell SELEX-Seq data, MPBind achieved high prediction accuracy for binding potential. Further analysis showed that MPBind is robust to both polymerase chain reaction amplification bias and incomplete sequencing of aptamer pools, two biases that usually confound aptamer analysis.
A computational tool to identify target-specific aptamers from HT-SELEX data and secondary structure information. APTANI builds on the AptaMotif algorithm (Hoinka et al., 2012), originally implemented to analyze SELEX data; extends the applicability of AptaMotif to HT-SELEX data; and introduces new functionalities, such as the ability to identify binding motifs, to cluster aptamer families, or to compare output results from different HT-SELEX cycles. Tabular and graphical representations facilitate the downstream biological interpretation of results.
An approach for the identification of sequence-structure binding motifs in HT-SELEX derived aptamers. AptaTRACE leverages the experimental design of the SELEX protocol and identifies sequence-structure motifs that show a signature of selection. Because of this approach, AptaTRACE can uncover motifs even when they are present in only a minuscule fraction of the pool. The method can therefore help to reduce the number of selection cycles required to produce aptamers with the desired properties, reducing the cost and time of this rather expensive procedure. Its performance on simulated and real data indicates that AptaTRACE can detect sequence-structure motifs even in highly challenging data.
Selects the optimal motif length and calculates the confidence intervals of estimated parameters. BEESEL uses the expectation maximization (EM) algorithm to iteratively find both the optimal position weight matrix (PWM) and the most likely binding position on each sequence read. The tool allows the sequences to be much longer than the binding sites, which requires the simultaneous estimation of the binding site locations and the specificity model.
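One half of that joint estimation, locating the most likely binding position on a read when the PWM is held fixed, can be sketched in Python as below. This is not BEESEL's implementation and does not perform the EM iteration itself: the function name, the uniform background frequency of 0.25, the log-odds scoring, and the PWM representation (a list of base-to-probability dicts with no zero entries) are all assumptions made for illustration.

```python
import math


def best_site(read, pwm, background=0.25):
    """Return (start, log_odds_score) of the highest-scoring window on `read`.

    `pwm` is a hypothetical list with one dict per motif position, mapping
    each base to a strictly positive probability (e.g. after pseudocounts).
    """
    width = len(pwm)
    best_start, best_score = None, -math.inf
    # The read may be much longer than the site, so every window is scored.
    for start in range(len(read) - width + 1):
        score = sum(
            math.log(pwm[j][read[start + j]] / background)
            for j in range(width)
        )
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score
```

In an EM-style procedure, a step like this (or a soft, probability-weighted version of it) would alternate with re-estimating the PWM from the sites it selects.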