Unlock your biological data



1 - 50 of 67 results
Scooby-domain / Sequence hydrophobicity predicts domains
A fast and simple method to identify globular domains in a protein sequence, based on the observed lengths and hydrophobicities of domains from proteins with known tertiary structure. The prediction method identifies sequence regions that will form a globular structure and those that are likely to be unstructured. Because it does not rely on homology searches, it can identify previously unknown domains for structural elucidation.
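The idea of scoring sequence windows by hydrophobicity can be illustrated with a minimal sketch. This is not the Scooby-domain implementation: the Kyte-Doolittle scale, the fixed window length and the fixed threshold are all assumptions made for illustration, whereas the method described above derives its expectations from domains of known structure. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy scale (assumed here; the tool's own scale may differ).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydrophobicity(seq, window):
    """Mean hydropathy of every window of the given length along the sequence."""
    values = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def candidate_regions(seq, window, threshold):
    """Return (start, end, mean) for windows whose mean hydropathy reaches the threshold.

    Positions are 0-based and the end index is exclusive. The threshold is a
    hypothetical cut-off, not a value taken from the tool.
    """
    return [
        (i, i + window, mean)
        for i, mean in enumerate(window_hydrophobicity(seq, window))
        if mean >= threshold
    ]


if __name__ == "__main__":
    for region in candidate_regions("IIIIDDDD", window=4, threshold=4.0):
        print(region)
```

A fuller treatment would repeat the scan over a range of window lengths, since the method relies on domain length as well as hydrophobicity.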
A method to predict the domain boundaries of a multidomain protein from its amino acid sequence using a fuzzy mean operator. Using the nr-sequence database together with a reference protein set (RPS) containing known domain boundaries, the operator assigns each residue of the query sequence a likelihood of belonging to a domain boundary. This procedure robustly identifies contiguous boundary regions. For a dataset with a maximum sequence identity of 30%, the average domain prediction accuracy of the method is 97% for one-domain proteins and 58% for multidomain proteins.
Fast H-DROP / Fast Helical-Domain linker pRediction using OPtimal features
Allows prediction of helical linkers. Fast H-DROP is an accelerated version of H-DROP, a support vector machine (SVM)-based tool that specifically predicts helical linkers. The software was tested using an independent dataset consisting of 76 visually inspected multidomain proteins containing helical linkers and a set of sequences classified as single-domain proteins according to SCOP 1.73. It can assist users in analyzing novel domains connected by helical linkers.
RepeatedHighDim
Allows users to conduct global tests in proteomics experiments. RepeatedHighDim is based on a mixed linear model combined with a permutation procedure and missing-value imputation. It is able to detect differences between experimental groups that may be missed when using standard protein-wise tests in proteomics experiments. The tool aims to facilitate the biological interpretation of a proteomics experiment. It permits the ranking of Gene Ontology (GO) terms related to certain protein sets.
ThreaDom / Threading-based Protein Domain Prediction
A template-based algorithm for protein domain boundary prediction. Given a protein sequence, ThreaDom first threads the target through the PDB library to identify protein templates that have a similar structural fold. A domain conservation score (DCS) is then calculated for each residue, combining information from template domain structure, terminal and internal gaps, and insertions. Finally, the domain boundary information is derived from the DCS profile distributions. ThreaDom is designed to predict both continuous and discontinuous domains.
DROP / Domain linker pRediction using OPtimal features
A support vector machine (SVM)-based domain linker predictor trained with 25 optimal features. The optimal combination of features was identified from a set of 3000 features using a random forest algorithm complemented with stepwise feature selection. DROP demonstrated a prediction sensitivity and precision of 41.3% and 49.4%, respectively. These values were over 19.9% higher than those of control SVM predictors trained with non-optimized features, strongly suggesting the efficiency of the feature selection method.
DoMosaics
An application that unifies protein domain annotation, domain arrangement analysis and visualization in a single tool. DoMosaics simplifies the analysis of protein families by consolidating disjunct procedures based on often inconvenient command-line applications and complex analysis tools. It provides a simple user interface with access to domain annotation services such as InterProScan or a local HMMER installation, and can be used to compare, analyze and visualize the evolution of domain architectures.
DAMA / Domain Annotation by a Multi-objective Approach
Treats protein domain architecture prediction as a multi-objective optimization problem. By taking into account known architectural solutions, DAMA identifies them within the protein sequence and integrates new domains into them whenever possible. DAMA has been evaluated on a benchmark of protein sequences extracted from the Protein Data Bank (PDB), on the genome of the poorly annotated malaria parasite Plasmodium falciparum, and on two datasets of known sequences characterized by large domain architectures and repeated blocks of domains. The results show that, for all datasets, DAMA outperforms existing computational methods and detects domain architectures presenting co-occurrences.
SANDPUMA / Specificity of AdenylatioN Domain Prediction Using Multiple Algorithms
Assists in prioritization and dereplication of nonribosomal peptide synthetases (NRPSs) within large datasets. SANDPUMA can prioritize novel scaffolds and analogs within superfamilies of interest, which greatly increases the power of genomic natural product discovery efforts. It was designed for automatic retraining, so that its training data remains comprehensive as more NRPS biosynthetic gene clusters (BGCs) are experimentally characterized in the Minimum Information about a Biosynthetic Gene Cluster (MIBiG) database.
TreeDom
A web tool for graphically analysing the evolutionary history of domains in multi-domain proteins. Individual domains on the same protein chain may have distinct evolutionary histories, which is important to grasp in order to understand protein function. For instance, it may be important to know whether a domain was duplicated recently or long ago, to know the origin of inserted domains, or to know the pattern of domain loss within a protein family. TreeDom uses the Pfam database as the source of domain annotations, and displays these on a sequence tree. An advantage of TreeDom is that the user can limit the analysis to N sequences that are most similar to a query, or provide a list of sequence IDs to include. Using the Pfam alignment of the selected sequences, a tree is built and displayed together with the domain architecture of each sequence.
CLADE / CLoser sequences for Annotations Directed by Evolution
Identifies domains in proteins by using all known Pfam domains and the large quantity of available genomic data spanning a large panel of species. CLADE is based on a Support Vector Machine (SVM) method that assigns a confidence score to each domain prediction. Tests on several sequence datasets show that “multi-source” domain modeling is more appropriate than “mono-source” domain modeling for capturing remote homology.
RNHtool
Predicts the RNase H (RNH) domain of retroviruses. RNHtool is an RNH model that annotates new putative RNHs (np-RNHs) in retroviruses. It predicts RNH domains by recognizing their start and end sites separately with an SVM method. The classification accuracy rates are 100%, 99.01% and 97.52% for the jack-knife, 10-fold cross-validation and 5-fold cross-validation tests, respectively. Applied to sequences without RNH annotations, the model discovered 14,033 np-RNHs.
DOMAINATION
Allows users to infer domains and their boundaries in a query sequence from local gapped alignments generated using PSI-BLAST. DOMAINATION uses the distribution of the aligned positions of N- and C-termini from PSI-BLAST local sequence alignments to identify potential domain boundaries. It incorporates an iterative strategy for chopping and joining domains and domain segments in an attempt to track a protein’s “evolutionary pathway” from its loss and gain of domains. This allows the recognition of both continuous and discontinuous domains.
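The boundary signal described above, positions where many local alignments start or stop, can be sketched as follows. This is only an illustration under stated assumptions: the alignments are supplied as hypothetical (start, end) query coordinates rather than parsed from PSI-BLAST output, the cut-off values are invented, and the iterative chopping and joining step is not shown.

```python
def terminus_counts(alignments, query_length):
    """Count how many local alignments start or end at each query position.

    `alignments` is a list of (start, end) pairs in 0-based query coordinates
    with an inclusive end. This input format is an assumption of the sketch.
    """
    counts = [0] * query_length
    for start, end in alignments:
        counts[start] += 1
        counts[end] += 1
    return counts


def candidate_boundaries(alignments, query_length, min_count, margin):
    """Positions where at least `min_count` alignment termini pile up.

    Positions within `margin` residues of either end of the query are skipped,
    because alignment termini there reflect the ends of the sequence itself
    rather than a boundary between domains. Both parameters are hypothetical.
    """
    counts = terminus_counts(alignments, query_length)
    return [
        pos
        for pos, count in enumerate(counts)
        if count >= min_count and margin <= pos < query_length - margin
    ]


if __name__ == "__main__":
    hits = [(0, 99), (0, 101), (100, 249), (102, 249), (0, 100)]
    print(candidate_boundaries(hits, query_length=250, min_count=2, margin=10))
```

In practice the termini scatter over neighbouring positions, so a real implementation would likely smooth the counts before looking for peaks.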
DoBo / Domain Boundary
A method to integrate the classification power of machine learning with evolutionary signals embedded in protein families in order to improve protein domain boundary prediction. The method first extracts putative domain boundary signals from a multiple sequence alignment between a query sequence and its homologs. The putative sites are then classified and scored by support vector machines in conjunction with input features such as sequence profiles, secondary structures, solvent accessibilities around the sites and their positions.
HMM-FRAME
Classifies protein domains based on an augmented Viterbi algorithm that can incorporate error models from different sequencing platforms. HMM-FRAME corrects sequencing errors and classifies putative gene fragments into domain families. The method provides a complementary protein domain classification tool to conventional profile HMM-based methods for data sets containing frameshifts. HMM-FRAME can accept any error model trained on data from high-throughput sequencing technologies and thus achieve high detection sensitivity while maintaining a low false positive rate.
PIDA / Pattern Island Detection Algorithm
Determines patterns within sequences containing “islands” or subsequences of perfect sequence conservation separated by “water” or intervening regions of arbitrarily long unmatched sequence. PIDA compares two sequences of any alphabet and finds patterns which contain islands of matching sequence separated by arbitrary amounts of unmatched sequence, or water. It simplifies attacking both nucleic acid and amino acid motif finding problems.
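The island-and-water picture can be illustrated with Python's standard library. The sketch below is not the PIDA algorithm: it swaps in `difflib.SequenceMatcher`, which returns ordered, non-overlapping exact matches between two sequences, and treats every match of at least a chosen length as an island. The function name and the `min_island` parameter are hypothetical.

```python
from difflib import SequenceMatcher


def find_islands(seq_a, seq_b, min_island=3):
    """Return ordered exact-match islands shared by two sequences.

    Each island is (position in seq_a, position in seq_b, matched text),
    with 0-based positions. Whatever lies between consecutive islands is
    the unmatched "water". Works for any alphabet, since only equality of
    symbols is used.
    """
    # autojunk=False stops difflib from discarding frequent symbols,
    # which matters for small alphabets such as nucleotides.
    matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    return [
        (block.a, block.b, seq_a[block.a:block.a + block.size])
        for block in matcher.get_matching_blocks()
        if block.size >= min_island
    ]


if __name__ == "__main__":
    for island in find_islands("ACGTTTTTGGCA", "ACGTCCGGCA"):
        print(island)
```

Because `difflib` picks the longest match first and then recurses on either side, it returns one consistent chain of islands rather than every possible pattern.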
SBASE
A project initiated to detect known domain types and predict domain architectures using sequence similarity searching. SBASE uses a curated collection of domain sequences and standard similarity search algorithms, followed by postprocessing based on simple statistics of the domain similarity network. It is especially useful in detecting rare, atypical examples of known domain types, which are sometimes missed even by more sophisticated methodologies. This approach does not require multiple alignment or machine learning techniques, and can be a useful complement to other domain detection methodologies.
patmatmotifs
Scans a protein sequence with motifs from the PROSITE database. patmatmotifs writes a standard EMBOSS report file with details of the location and score of any matching motifs. The full documentation for matching patterns is given in the report. PROSITE is a method of determining the function of uncharacterized proteins translated from genomic or cDNA sequences. It consists of a database of biologically significant sites and patterns formulated in such a way that, with appropriate computational tools, it can rapidly and reliably identify the known protein family to which a new sequence belongs.
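Scanning a sequence with a PROSITE-style pattern can be sketched by translating the pattern into a regular expression. This is an illustration of the pattern syntax only, not how patmatmotifs is implemented, and it does not read the PROSITE database or produce an EMBOSS report. The function names are hypothetical, and the translation covers only the common elements (single residues, `x`, `[...]`, `{...}`, repeat counts and `<` / `>` anchors).

```python
import re

# One pattern element: a residue, x, [allowed set] or {forbidden set},
# optionally followed by a repeat count such as (3) or (2,4).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE pattern such as 'N-{P}-[ST]-{P}' into a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = "^" if element.startswith("<") else ""
        suffix = "$" if element.endswith(">") else ""
        match = _ELEMENT.fullmatch(element.lstrip("<").rstrip(">"))
        if match is None:
            raise ValueError("unsupported pattern element: " + element)
        core, low, high = match.groups()
        if core == "x":
            core = "."                          # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"      # any residue except these
        if low and high:
            core += "{" + low + "," + high + "}"
        elif low:
            core += "{" + low + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def scan(sequence, pattern):
    """Return (1-based start, matched text) for every hit, including overlapping ones."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # N-glycosylation site pattern (PROSITE PS00001).
    print(scan("MNGTANPSA", "N-{P}-[ST]-{P}"))
```

The lookahead wrapper in `scan` is a design choice that lets overlapping hits be reported; a plain `finditer` over the translated pattern would skip any hit that starts inside a previous one.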
GlobPlot
A web service that allows the user to plot the tendency within the query protein for order/globularity and disorder. GlobPlot successfully identifies inter-domain segments containing linear motifs, and also apparently ordered regions that do not contain any recognized domain. GlobPlot may be useful in domain hunting efforts. The plots indicate that instances of known domains may often contain additional N- or C-terminal segments that appear ordered. Thus GlobPlot may be of use in the design of constructs corresponding to globular proteins, as needed for many biochemical studies, particularly structural biology.