Low complexity region prediction software tools | Protein sequence data analysis
Local compositionally biased and low complexity regions (LCRs) in amino acid sequences initially attracted the interest of researchers because of their role in generating artifacts in sequence database searches. Evidence is now accumulating for the biological significance of LCRs in both physiological and pathological situations.
Helps follow the evolution of homorepeats based on orthology information, using a sensitive but tunable cutoff to aid the identification of emerging homorepeats. dAPE is organized both as a database and as a web server. It allows users to locate polyX regions in query proteins, to place those regions in the context of the sequence, to detect X-rich regions and, importantly, to study the evolution of polyX regions using orthology data.
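To make the idea of locating polyX regions concrete, here is a minimal Python sketch that reports uninterrupted runs of a single residue. It is an illustration only and not dAPE's method: the function name `find_polyx` and the `min_length` parameter are hypothetical, and the sketch accepts perfect runs only, whereas the tool described above uses a tunable cutoff whose details are not given here.

```python
import re


def find_polyx(sequence, min_length=4):
    """Return (residue, start, end) for each run of one repeated residue.

    Coordinates are 0-based and `end` is exclusive. `min_length` is a
    hypothetical threshold chosen for illustration, not a dAPE default.
    """
    if min_length < 2:
        raise ValueError("min_length must be at least 2")
    # (.) captures one residue; \1{n,} requires at least n more copies of it.
    pattern = re.compile(r"(.)\1{%d,}" % (min_length - 1))
    return [(m.group(1), m.start(), m.end()) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    for residue, start, end in find_polyx("MAQQQQQQPLAAAAK"):
        print(f"poly{residue}: positions {start}-{end}, length {end - start}")
```

A run-based search like this is the strictest definition of a homorepeat. Tolerating occasional mismatches (X-rich regions) would need a windowed count instead of a regular expression.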
Offers tools for displaying low complexity regions (LCRs) from the UniProt/SwissProt knowledgebase in combination with other relevant protein features, whether predicted or experimentally verified. Users may also perform powerful queries against a custom-designed sequence/LCR-centric database.
Facilitates comprehensive protein sequence analysis. MESSA gathers structural and functional predictions for a protein of interest. It exploits a number of select tools to predict local sequence properties, such as secondary structure, structurally disordered regions, coiled coils, signal peptides and transmembrane helices. This application also detects homologous proteins and assigns the query to a protein family.
Assists users in the fast discovery of protein compositional biases (CB). fLPS is an application that annotates both short, highly biased tracts and regions that have a compositional skew. It can handle large databases such as those arising from metagenomics projects. It can also be applied to searching for proteins with similar CB regions, and to making functional inferences about CB regions for a protein of interest.
A service for annotation of compositionally biased regions and for searching for similar regions in other sequences. The algorithm defines compositional bias through a thorough search for lowest-probability subsequences (LPSs), i.e., the least likely sequence regions in terms of composition. Users can (i) initially annotate compositionally biased (CB) regions in input protein or nucleotide sequences of interest, and then (ii) query a database of more than 1,500,000 pre-calculated protein CB regions to investigate further functional hypotheses and inferences about the specific CB regions that were discovered and their protein disorder propensities.
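The notion of a lowest-probability subsequence can be sketched in a few lines of Python. The example below scores every window by the binomial probability of seeing at least the observed count of one residue, given a background frequency, and keeps the least likely window. This is a simplified illustration under stated assumptions, not the service's implementation: the binomial scoring, the function names, the single-residue restriction, and the `min_len`/`max_len` limits are all hypothetical choices made for the sketch.

```python
from math import comb


def binomial_tail(k, n, p):
    """Probability of k or more successes in n trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def lowest_probability_window(sequence, residue, background,
                              min_len=5, max_len=50):
    """Exhaustively find the window least likely to arise by chance.

    Returns (probability, start, end) with 0-based, end-exclusive
    coordinates, or None if the sequence is shorter than min_len.
    `background` is the assumed frequency of `residue` in typical proteins.
    """
    best = None
    for start in range(len(sequence)):
        longest_end = min(len(sequence), start + max_len)
        for end in range(start + min_len, longest_end + 1):
            window = sequence[start:end]
            prob = binomial_tail(window.count(residue), len(window), background)
            if best is None or prob < best[0]:
                best = (prob, start, end)
    return best


if __name__ == "__main__":
    seq = "MKTAY" + "Q" * 9 + "LVDE"
    prob, start, end = lowest_probability_window(seq, "Q", background=0.04)
    print(f"Q-biased region {start}-{end}: {seq[start:end]} (P = {prob:.3g})")
```

The exhaustive double loop is quadratic in the window range, which is fine for a single protein but would need pruning or precomputed counts before being applied to a large database.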