A classification system designed for metagenomics experiments that assigns taxonomic labels to short DNA reads. PhymmBL combines two components: (i) composition-directed taxonomic predictions from Phymm and (ii) basic local alignment search tool (BLAST)-based homology results. It merges the two to label each input sequence with its best guess as to the taxonomy of the source organism. Input sequences as short as 100 base pairs can be phylogenetically classified with PhymmBL more accurately than with any other existing method. PhymmBL predicts species, genus, family, order, class and phylum for each read, allowing users to arrange results according to the levels of specificity relevant to their research goals.
Implements Bayesian models for splice site prediction. The predictions are implicitly based on three variables: (i) the degree of matching to the splice site consensus, (ii) local compositional contrast and (iii) an assessment of 3-base periodicity in coding regions.
Provides a convenient way to run analyses without command-line operation. gVolante is a web server designed to make the best use of the core vertebrate genes (CVG) reference gene set for more accurate completeness assessment for vertebrates. It allows standardized scoring of completeness in a uniform computational environment. In addition, an analysis on gVolante produces a concise report of length-based metrics and base composition.
A set of programs aimed at simulating ancient DNA (aDNA) fragments. Gargammel can simulate the most common features of aDNA sequences, including post-mortem DNA damage and base misincorporations. It simulates base compositional bias due to the molecular tools used in library preparation, sequencing bias against GC-rich fragments and errors introduced by the sequencing platform. Gargammel gives researchers the opportunity to evaluate the robustness of various analyses to aDNA properties.
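As a rough illustration of what simulating post-mortem damage can look like, the sketch below applies C-to-T substitutions that are most likely near the 5' end of a fragment and G-to-A substitutions that are most likely near the 3' end. This is not Gargammel's own model or interface: the function name `simulate_deamination`, the parameters `max_rate` and `decay`, and the geometric decay itself are assumptions made purely for illustration.

```python
import random


def simulate_deamination(seq, max_rate=0.3, decay=0.7, rng=None):
    """Toy damage model (hypothetical, not Gargammel's implementation).

    C->T is applied with probability max_rate * decay**i at distance i
    from the 5' end; G->A uses the same curve measured from the 3' end.
    """
    rng = rng or random.Random()
    bases = list(seq.upper())
    n = len(bases)
    for i, base in enumerate(bases):
        p_five_prime = max_rate * decay ** i
        p_three_prime = max_rate * decay ** (n - 1 - i)
        if base == "C" and rng.random() < p_five_prime:
            bases[i] = "T"
        elif base == "G" and rng.random() < p_three_prime:
            bases[i] = "A"
    return "".join(bases)


if __name__ == "__main__":
    # A fixed seed makes the simulated fragment reproducible.
    fragment = "CCGATTACAGGCTAGCTAGGCCG"
    print(simulate_deamination(fragment, rng=random.Random(42)))
```

Passing a seeded `random.Random` instance keeps runs reproducible, which is useful when comparing how downstream analyses respond to different damage rates.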
Generates graphical maps of circular genomes that show sequence features, base composition plots, analysis results and sequence similarity plots. Sequences can be supplied to CGView in different formats (raw, FASTA, GenBank or EMBL). BLAST is used by the server to compare the sequence sets. The results are then converted into a graphical map that shows the entire sequence. The web tool includes different options to control which types of features are displayed and how the features are drawn. The server can be used to aid in the identification of conserved or diverged genome segments, instances of horizontal gene transfer, and differences in gene copy number.
Allows for the annotation of Watson-Crick and non-Watson-Crick base pairs, annotation of features (e.g. stems and loops), collapsing of features (horizontal) and sequences (vertical), along with 2D display of sequences (using VARNA) and base composition given a secondary structure (using KiNG). BoulderALE is a lightweight editor for editing and assessing the quality of small RNA alignments (fewer than ~1000 nts and ~1000 sequences). BoulderALE was developed to evaluate structure-backed RNA alignments and can collapse the alignment horizontally to hide gapped regions.
Identifies replication origins (oriCs) in bacterial genomes. Ori-Finder 1 is an online system based on an integrated method comprising the analysis of base composition asymmetry using the Z-curve method, the distribution of DnaA boxes, and the occurrence of genes frequently found close to oriCs. The program can also handle unannotated sequences by integrating the gene-finding program ZCURVE 1.02. Users can define their own DnaA boxes or origin recognition box (ORB) elements.
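The base composition asymmetry mentioned above can be illustrated with the three cumulative Z-curve components: purine versus pyrimidine (x), amino versus keto (y), and weak versus strong hydrogen bonding (z). The sketch below only computes these components for a sequence; it does not reproduce Ori-Finder's integrated method, and the function name `z_curve` is a hypothetical choice for this example.

```python
# Per-base increments for the Z-curve components:
#   x: purines (A, G) minus pyrimidines (C, T)
#   y: amino bases (A, C) minus keto bases (G, T)
#   z: weak H-bond bases (A, T) minus strong H-bond bases (G, C)
STEPS = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "T": (-1, -1, 1),
}


def z_curve(seq):
    """Return cumulative (xs, ys, zs) lists, one value per position.

    Illustrative helper only; ambiguous bases such as N add nothing.
    """
    x = y = z = 0
    xs, ys, zs = [], [], []
    for base in seq.upper():
        dx, dy, dz = STEPS.get(base, (0, 0, 0))
        x, y, z = x + dx, y + dy, z + dz
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return xs, ys, zs


if __name__ == "__main__":
    xs, ys, zs = z_curve("ATGCGCGTTAAC")
    print(xs)
    print(ys)
    print(zs)
```

Plotting the components against genome position is one way to look for shifts in compositional asymmetry; the DnaA box and gene-context evidence described above would have to be handled separately.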
A computational and graphical representation tool for gene identification and sequence annotation. NPACT identifies sequence segments of any length that have statistically significant 3-base compositional periodicities and are associated with ORF structures. NPACT produces graphical representations that allow uninterrupted genome-wide visual comparison of compositional profiles, pre-annotated genes and sequence segments of three-base periodicity with ‘Newly Identified ORFs’, enabling frame analysis on a genomic scale.
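One simple way to see what a 3-base compositional periodicity is: tally base counts separately for each of the three frame positions and measure how far they depart from the counts expected if composition were independent of position. The sketch below uses a plain chi-square statistic on the resulting 4 x 3 table. This is a stand-in chosen for illustration, not NPACT's statistic, and the names `frame_composition` and `periodicity_chi2` are hypothetical.

```python
from collections import Counter


def frame_composition(seq):
    """Count A/C/G/T separately at positions 0, 1 and 2 modulo 3."""
    counts = [Counter() for _ in range(3)]
    for i, base in enumerate(seq.upper()):
        if base in "ACGT":
            counts[i % 3][base] += 1
    return counts


def periodicity_chi2(seq):
    """Chi-square statistic for base-by-frame-position independence.

    Larger values mean composition depends more strongly on the
    position within a 3-base repeat (illustrative measure only).
    """
    counts = frame_composition(seq)
    total = sum(sum(c.values()) for c in counts)
    if total == 0:
        return 0.0
    chi2 = 0.0
    for base in "ACGT":
        base_total = sum(c[base] for c in counts)
        for c in counts:
            frame_total = sum(c.values())
            expected = base_total * frame_total / total
            if expected > 0:
                chi2 += (c[base] - expected) ** 2 / expected
    return chi2


if __name__ == "__main__":
    print(periodicity_chi2("ACG" * 10))  # strongly periodic segment
    print(periodicity_chi2("A" * 30))    # no positional signal
```

A real analysis would also need a significance threshold and a way to scan segments of varying length, neither of which is attempted here.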
Provides a specialized knowledge base for proteomics research. PINdb is an online resource that focuses on multiprotein nuclear complexes from human or yeast cells that have been biochemically and/or functionally characterized. The information collected for each protein complex includes common names and aliases, methods of isolation, tissue source, and compositional, enzymatic and functional properties.
Jordi Martorell-Marugan, bioinformatician in the Bioinformatics Unit of Genyo. I perform analyses of different omics data and develop new tools to analyze such data.
Pfizer-University of Granada-Junta de Andalucía Centre for Genomics and Oncological Research