MCAST / Motif Cluster Alignment and Search Tool
Uses a motif-based hidden Markov model to scan for clusters of motifs. Its key features include a scoring scheme based on p-values and a method for calibrating the resulting scores to obtain statistical confidence estimates. The new version of MCAST offers improved graphical output, a dynamic background model, statistical confidence estimates based on false discovery rate estimation and, most significantly, the ability to predict CRMs while taking into account epigenomic data such as DNase I sensitivity or histone modification data. We demonstrate the validity of MCAST's statistical confidence estimates and the utility of epigenomic priors in identifying CRMs.
Scans a set of sequences (e.g. promoters) from co-regulated or co-expressed genes with motifs describing the binding specificity of known transcription factors and assesses which motifs are significantly over- or under-represented. This indicates which transcription factors could be common regulators of the genes studied and where their candidate binding sites lie in the sequences. Pscan does not resort to comparisons with orthologous sequences, and experimental results show that it compares favorably to other tools for the same task in terms of false positive predictions and computation time.
Finds clusters of pre-specified motifs in DNA sequences. The Cluster-Buster program can analyze multiple sequences at once, including sequences longer than 100 kb. The main application is detection of sequences that regulate gene transcription, such as enhancers and silencers, but other types of biological regulation may be mediated by motif clusters too. This tool includes a web version and a desktop version. Compared to the web version, the desktop version does not understand the GenBank format for sequences and does not produce graphics for the output.
COMPASSS / COMplex PAttern of Sequence Search Software
Detects the presence of complex elements by mining whole genomes. COMPASSS is based on the Wu-Manber multiple pattern matching algorithm and is intended to identify motifs across an entire sequence. The software allows users to search with both conserved and degenerate sequences. It was tested on three experimentally validated complex patterns, demonstrating its capacity to distinguish both protein domains and cis-acting semi-conserved elements.
Provides functionality to compute the statistics related to motif matching and counting of motif matches in DNA sequences. motifcounter provides functions to investigate the per-position and per-strand log-likelihood scores between the position frequency matrix (PFM) and the background model (BGM) across a given sequence or set of sequences. Furthermore, the package facilitates motif matching based on an automatically derived score threshold. To this end, the distribution of scores is efficiently determined and the score threshold is chosen for a user-prescribed significance level.
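The per-position, per-strand scoring described above can be illustrated with a minimal Python sketch. It is not motifcounter's API (motifcounter is an R package): the function names `log_odds_matrix`, `scan` and `scan_both_strands` are hypothetical, the background is assumed to be a simple order-0 model given as base frequencies, and the pseudocount value is an arbitrary choice for the illustration.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def log_odds_matrix(pfm, background, pseudocount=0.01):
    """Convert a PFM (list of {base: frequency} columns) into log-likelihood
    ratios against an order-0 background (assumption for this sketch)."""
    matrix = []
    for column in pfm:
        total = sum(column[b] + pseudocount for b in BASES)
        matrix.append({
            b: math.log(((column[b] + pseudocount) / total) / background[b])
            for b in BASES
        })
    return matrix


def scan(sequence, matrix):
    """Score every window of the sequence; index i is the window start."""
    width = len(matrix)
    return [
        sum(matrix[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    ]


def scan_both_strands(sequence, matrix):
    """Return (forward, reverse) score lists, both indexed by the window
    start on the forward strand."""
    forward = scan(sequence, matrix)
    revcomp = sequence.translate(COMPLEMENT)[::-1]
    # Reversing the list maps reverse-strand windows back to forward coordinates.
    reverse = scan(revcomp, matrix)[::-1]
    return forward, reverse


# Toy motif "ACG" with a uniform background (illustrative values only).
pfm = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 1.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
]
background = {b: 0.25 for b in BASES}
forward, reverse = scan_both_strands("TTACGTT", log_odds_matrix(pfm, background))
```

A score threshold for a chosen significance level would then be derived from the distribution of these scores under the background model, a step this sketch leaves out.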
Assists users in collecting summary statistics. Goldilocks is a package that was developed to identify shifts in variation. It can also discover outlier regions, or locate and extract interesting regions from one or more arbitrary genomes for further analysis, for a user-provided definition of interesting. It must also be supplied with a desired census strategy defining the criteria of interest, such as occurrences of individual nucleotide bases, motifs, deviations from a reference or GC-content.
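The idea of a census strategy applied across a genome can be sketched as follows. This is a conceptual illustration only and does not use Goldilocks' own interface: the names `census`, `gc_fraction`, `motif_count` and `closest_to` are hypothetical, and fixed-size windows with a fixed stride are assumed.

```python
def census(sequence, window, stride, strategy):
    """Apply a strategy (a function of a region string) to each window and
    return (start, value) pairs."""
    return [
        (start, strategy(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, stride)
    ]


def gc_fraction(region):
    """Census strategy: fraction of G and C bases in the region."""
    return (region.count("G") + region.count("C")) / len(region)


def motif_count(motif):
    """Census strategy factory: count (overlapping) occurrences of a motif."""
    def strategy(region):
        return sum(
            region.startswith(motif, i)
            for i in range(len(region) - len(motif) + 1)
        )
    return strategy


def closest_to(results, target):
    """One user-provided definition of 'interesting': the window whose census
    value lies closest to a target (the first such window on ties)."""
    return min(results, key=lambda item: abs(item[1] - target))


sequence = "AAAAGGGGCCCCAAAA"
gc_windows = census(sequence, window=4, stride=4, strategy=gc_fraction)
gg_windows = census(sequence, window=4, stride=4, strategy=motif_count("GG"))
```

Swapping the strategy function changes what is counted, while the windowing and the selection of regions stay the same.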
TFM-Explorer / Transcription Factor Matrix Explorer
A program for analysing regulatory regions in eukaryotic genomes. TFM-Explorer takes a set of coregulated gene sequences and searches for locally overrepresented transcription factor binding sites. It scans the sequences to detect all potential transcription factor binding sites, using weight matrices from JASPAR or TRANSFAC, and extracts significant clusters (regions of the input sequences associated with a factor) by calculating a score function.
Visualises occurrences of oligonucleotide patterns and sequence motifs across a large set of sequences centred at a common reference point and sorted by a user-defined feature. seqPattern is an R package that offers functions to find positions of specified sequence patterns in a list of sequences of the same length ordered by a provided index. Sequence patterns can be consensus sequences of variable length and can contain IUPAC ambiguity codes. The position of each pattern occurrence is specified in a two-dimensional matrix, i.e. the first coordinate provides the ordinal number of the sequence and the second coordinate gives the position within the sequence where the pattern occurs.
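The two-coordinate representation of pattern occurrences can be sketched in Python as below. This does not reproduce seqPattern's R functions: `pattern_occurrences` is a hypothetical name, IUPAC codes are translated into a regular expression as one possible matching approach, and 1-based coordinates are assumed to mirror R conventions.

```python
import re

# Standard IUPAC nucleotide ambiguity codes as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def pattern_occurrences(sequences, pattern):
    """Return (sequence ordinal, position) pairs, both 1-based, for every
    occurrence of an IUPAC consensus pattern, including overlapping ones."""
    body = "".join(IUPAC[code] for code in pattern.upper())
    # A lookahead makes each match zero-width, so overlapping hits are found.
    regex = re.compile("(?=" + body + ")")
    hits = []
    for ordinal, sequence in enumerate(sequences, start=1):
        for match in regex.finditer(sequence.upper()):
            hits.append((ordinal, match.start() + 1))
    return hits


sequences = ["TATAAA", "GTATAT", "CCCCCC"]  # already sorted by some index
hits = pattern_occurrences(sequences, "TATAW")
```

Because the sequences are pre-sorted and of equal length, plotting the second coordinate against the first gives the kind of position-by-sequence view described above.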
