1 - 50 of 164 results
PBcR / PacBio Corrected Reads
An approach that utilizes short, high-identity sequences to correct the error inherent in long, single-molecule sequences. PBcR, implemented as part of the Celera Assembler, trims and corrects individual long-read sequences by first mapping short-read sequences to them and computing a highly accurate hybrid consensus sequence: improving read accuracy from as low as 80% to over 99.9%. The corrected, “hybrid” PBcR reads may then be de novo assembled alone, in combination with other data, or exported for other applications.
A single-cell assembler for capturing and sequencing “microbial dark matter” that forms small pools of randomly selected single cells (called a mini-metagenome) and further sequences all genomes from the mini-metagenome at once. SPAdes is intended for both standard isolates and single-cell MDA bacteria assemblies. It works with Illumina or IonTorrent reads and is capable of providing hybrid assemblies using PacBio, Oxford Nanopore and Sanger reads. Additional contigs can also be provided to be used as long reads. SPAdes supports paired-end reads, mate-pairs and unpaired reads and can take as input several paired-end and mate-pair libraries simultaneously.
MHAP / MinHash Alignment Process
A reference implementation of a probabilistic sequence overlapping algorithm. MHAP is designed to efficiently detect all overlaps between noisy long-read sequence data. It efficiently estimates Jaccard similarity by compressing sequences to their representative fingerprints composed of min-mers (minimum k-mers). MHAP is included within the Canu assembler, which is a fork of the Celera Assembler designed for high-noise single-molecule sequencing (such as the PacBio RSII or Oxford Nanopore MinION).
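The fingerprint idea described above can be illustrated with a minimal MinHash sketch. This is not MHAP's implementation: the function names, the k-mer size, the number of hash functions and the use of seeded BLAKE2b hashes are all assumptions chosen for illustration.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all k-mers (substrings of length k) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _seeded_hash(kmer, seed):
    """Hash a k-mer under a given seed (stand-in for a family of hash functions)."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_sketch(seq, k=5, num_hashes=64):
    """Fingerprint: for each hash function, keep the minimum hash over all k-mers."""
    ks = kmers(seq, k)
    if not ks:
        raise ValueError("sequence is shorter than k")
    return [min(_seeded_hash(km, seed) for km in ks) for seed in range(num_hashes)]


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of positions where the two fingerprints share the same min-mer hash."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


def exact_jaccard(seq_a, seq_b, k=5):
    """Exact Jaccard similarity of the two k-mer sets, for comparison."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    return len(a & b) / len(a | b)
```

Comparing `estimate_jaccard(minhash_sketch(a), minhash_sketch(b))` with `exact_jaccard(a, b)` on two overlapping sequences shows how the fixed-size fingerprint approximates the set similarity; more hash functions give a tighter estimate.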
A variant caller and small genome assembler. The heart of DISCOVAR is a de novo genome assembler, one that is accurate enough to produce assemblies that can be used for variant calling given a reference sequence. DISCOVAR can also generate de novo assemblies for small genomes, but consider using DISCOVAR de novo instead which can assemble genomes up to mammalian size. DISCOVAR provides a more complete inventory of an individual’s genetic variants than had been previously possible. As such, it adds to the tools that can be used to probe the genetic basis of disease. It may be particularly useful in cases where targeted or exome sequencing fails to find causal mutations.
Allows de novo genome assembly and multisample variant calling. Cortex is a modular set of multi-threaded programs for manipulating assembly graphs. The Linked de Bruijn Graph (LdBG) data structure and its associated algorithms are implemented as part of the software. It was used for two tasks where long-range information is likely to be beneficial: finding large differences from a reference and analysis of genomic context for drug resistance genes, which was validated using a PacBio reference assembled for the sample.
MECAT / Mapping Error Correction and de novo Assembly Tool
Employs novel alignment and error correction algorithms that are much more efficient than state-of-the-art aligners and error correction tools. MECAT can be used for effective de novo assembly of large genomes. It achieves superior computing efficiency to current assembly pipelines. In particular, MECAT takes only about 7600 CPU core hours to assemble a high-quality human CHM1 genome using 54x SMRT data on a single 32-thread computing node with 2.0 GHz CPUs, which is 34 times faster than the current PBcR-MHAP pipeline. It makes it possible to de novo assemble large genomes using Single Molecule Real Time (SMRT) reads at a computational cost similar to that of assembling Next Generation Sequencing (NGS) reads.
Offers a powerful and comprehensive suite of next generation sequencing (NGS) analysis tools. Through an intuitive and user-friendly interface, Geneious provides visual sequence alignment and editing, sequence assembly, comprehensive molecular cloning and phylogenetic analysis. Users can also import and convert a vast range of data types and customize the platform with their own algorithms, plugins or workflows. Furthermore, Geneious increases process efficiency and improves data organisation. This bioinformatics software platform also offers high interoperability, with a good API to link LIMS and other tools. First released in 2005, Geneious is one of the world’s leading bioinformatics software platforms, used by over 2,500 universities, institutes and commercial companies in more than 65 countries.
Provides a whole‐genome shotgun assembler that can generate high‐quality genome assemblies using short reads (~100bp) such as those produced by the new generation of sequencers. The ALLPATHS-LG assemblies are not necessarily linear, but instead are presented in the form of a graph. This graph representation retains ambiguities, such as those arising from polymorphism, uncorrected read errors, and unresolved repeats, thereby providing information that has been absent from previous genome assemblies. ALLPATHS‐LG requires high sequence coverage of the genome in order to compensate for the shortness of the reads. The precise coverage required depends on the length and quality of the paired reads, but typically is of the order 100x or above.
Enables the assembly of a human genome, using short reads from a high-throughput sequencing platform. ABySS consists of a parallelized sequence assembler that allows parallel computation of the assembly algorithm across a network of commodity computers. This algorithm proceeds in two stages: (1) it generates all possible substrings of length k (termed k-mers) from the sequence reads; and (2) it uses mate-pair information to extend contigs by resolving ambiguities in contig overlaps.
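Stage (1) above, generating every substring of length k from the reads, can be sketched in a few lines. This is an illustrative single-process version, not ABySS code; the function names are hypothetical, and the distributed computation and the mate-pair stage are not shown.

```python
from collections import Counter


def kmers_from_read(read, k):
    """List every substring of length k in a read, in order of position."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def count_kmers(reads, k):
    """Count how often each k-mer occurs across a collection of reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers_from_read(read, k))
    return counts
```

For example, `kmers_from_read("ACGTA", 3)` yields `["ACG", "CGT", "GTA"]`. A read of length L contributes L - k + 1 k-mers, so a read shorter than k contributes none.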
Provides a de novo assembler for short DNA sequence reads. SSAKE is designed to help leverage the information from short sequence reads by assembling them into contigs and scaffolds that can be used to characterize novel sequencing targets. SSAKE assembles whole reads (not k-mers) and, as such, is well suited for structural variant assembly/detection. SSAKE is written in Perl and runs on Linux. SSAKE cycles through short sequence reads stored in a hash table and progressively searches through a prefix tree for extension candidates. The algorithm has assembled 25 to 300 bp (genome, transcriptome, amplicon) reads from viral, bacterial and fungal genomes. SSAKE is lightweight, robust, and simple to set up and run.
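The whole-read extension idea can be illustrated with a small greedy sketch. It is a simplification rather than SSAKE's algorithm: a dictionary keyed on fixed-length read prefixes stands in for the prefix tree, only 3' extension is shown, and `build_prefix_index`, `extend_right` and `min_overlap` are hypothetical names.

```python
from collections import defaultdict


def build_prefix_index(reads, min_overlap):
    """Index reads by their first min_overlap bases (a stand-in for a prefix tree)."""
    index = defaultdict(list)
    for read in reads:
        if len(read) >= min_overlap:
            index[read[:min_overlap]].append(read)
    return index


def extend_right(seed, reads, min_overlap=3):
    """Greedily extend seed at its 3' end with whole reads that overlap its suffix."""
    index = build_prefix_index(reads, min_overlap)
    max_len = max((len(r) for r in reads), default=0)
    unused = set(reads) - {seed}
    contig = seed
    extended = True
    while extended:
        extended = False
        # Scan contig suffixes from longest to shortest so longer overlaps win.
        first = max(0, len(contig) - max_len + 1)
        for start in range(first, len(contig) - min_overlap + 1):
            suffix = contig[start:]
            for read in index.get(suffix[:min_overlap], []):
                if read in unused and len(read) > len(suffix) and read.startswith(suffix):
                    contig += read[len(suffix):]  # append only the new bases
                    unused.discard(read)
                    extended = True
                    break
            if extended:
                break
    return contig
```

Each read is used at most once, and extension stops when no unused read overlaps the current contig end by at least `min_overlap` bases.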
Allows integrative investigation of next generation sequencing (NGS) microbiology data. Orione supports the whole life cycle of microbiology research data from production and annotation to publication and sharing. It can be used for a variety of microbiological projects including bacteria resequencing, de novo assembling and microbiome investigations. This tool is implemented on the Galaxy web platform.
Aims to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. Assemblathon is a set of periodic collaborative efforts that all help improve methods of genome assembly. It offers a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This project is an international effort that aims to produce a genomic zoo – sequences that represent the genomes of 10,000 vertebrate species.
MIRA / Mimicking Intelligent Read Assembly
A Swiss army knife of sequence assembly, developed and used over the past 16 years to get assembly jobs done efficiently and, especially, accurately. MIRA is a whole genome shotgun (WGS) and EST sequence assembler for Sanger, 454, Solexa (Illumina), IonTorrent and PacBio data. It supports ancillary data in TRACEINFO format (from NCBI), marks places of interest with tags so that these can be found quickly in finishing programs and has a single nucleotide polymorphism (SNP) analysis pipeline for sequencing data of viruses and prokaryotes.
Designed to process individually barcoded Restriction-site associated DNA sequencing (RADseq) data (with double cut sites) into informative single nucleotide polymorphisms (SNPs)/Indels for population-level analyses. dDocent uses data reduction techniques and other stand-alone software packages to perform quality trimming and adapter removal, de novo assembly of RAD loci, read mapping, SNP and Indel calling, and baseline data filtering. Double-digest RAD data from population pairings of three different marine fishes were used to compare dDocent with Stacks, the first generally available, widely used pipeline for analysis of RADseq data. dDocent consistently identified more SNPs shared across greater numbers of individuals and with higher levels of coverage.
Builds genetic maps and conducts population genomics and phylogeography. Stacks is a software system developed to work with restriction enzyme-based data, such as RAD-seq. The software produces core population genomic summary statistics and single nucleotide polymorphism (SNP)-by-SNP statistical tests. It aims to be a key resource to empower researchers to efficiently perform ecological and evolutionary genomic studies in model organisms and particularly in organisms with minimal or no genomic resources.
Assembles reads obtained with new sequencing technologies (Illumina, 454, SOLiD) using MPI 2.2. Ray reduces the number of contigs and the number of errors. It can serve as a basis for developing a universally applicable assembler. The tool can calculate assemblies in parallel using the message passing interface. Ray performs very well on mixed datasets and helps to assemble genomes using high-throughput sequencing.
An algorithm which extracts paths from a de Bruijn graph for genome assembly. EPGA uses a score function to evaluate extension candidates based on the distributions of reads and insert size. The distribution of reads can solve problems caused by sequencing errors and short repetitive regions. Through assessing the variation of the distribution of insert size, EPGA can solve problems introduced by some complex repetitive regions. EPGA2 updates some modules in EPGA to improve memory efficiency in genome assembly.
An open-source hybrid error correction algorithm. Nanocorr was specifically developed for Oxford Nanopore reads, because existing packages were incapable of assembling the long read lengths (5-50 kbp) at such high error rates (between around 5% and 40% error). With this method, users are able to perform a hybrid error correction of the nanopore reads using complementary MiSeq data and produce a de novo assembly that is highly contiguous and accurate. The contig N50 length is more than ten times greater than an Illumina-only assembly (678 kbp versus 59.9 kbp) and has >99.88% consensus identity when compared to the reference.
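The contig N50 quoted above is a standard contiguity statistic: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. A minimal sketch, with a hypothetical function name and made-up contig lengths:

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of the total length.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Illustrative lengths only: total is 54, and 10 + 9 + 8 = 27 reaches half of it.
example = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example))  # prints 8
```

A higher N50 means that more of the assembly sits in long contigs, which is why it is used to compare the hybrid and Illumina-only assemblies.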
Creates true diploid de novo assemblies. Supernova can separate homologous chromosomes over long distances, in this sense capturing the true biology of a diploid genome. The Supernova approach was demonstrated on seven human samples. These assemblies used identical code, with the same parameters, as a ‘pushbutton’ process that ran in two days on a single server. This approach yields much longer phase blocks than previous diploid human assemblies. The diploid human assemblies are the first to be validated using finished sequence from the same sample and the first whose phasing accuracy has been validated using parental sequences.
MITObim / MITOchondrial Baiting and Iterative Mapping
An in silico approach for the reconstruction of complete mitochondrial genomes of non-model organisms directly from next-generation sequencing (NGS) data: mitochondrial baiting and iterative mapping. MITObim is capable of reconstructing mitochondrial genomes without the need of a reference genome of the targeted species by relying solely on (a) mitochondrial genome information of more distantly related taxa or (b) short mitochondrial barcoding sequences (seeds), such as the commonly used cytochrome-oxidase subunit 1 (COI), as a starting reference. MITObim appeared superior to existing tools in terms of accuracy, runtime and memory requirements and fully automatically recovered mitochondrial genomes exceeding 99.5% accuracy from total genomic DNA derived NGS data sets in <24h using a standard desktop computer.
Celera assembler
Identifies allelic variation given a Whole Genome Shotgun (WGS) assembly of haploid sequences. Celera assembler is an algorithm to produce a set of haploid consensus sequences rather than a single consensus sequence. It uses a dynamic windowing approach and detects alleles by simultaneously processing the portions of aligned reads spanning a region of sequence variation. Celera assembler also assigns reads to their respective alleles, phases adjacent variant alleles and generates a consensus sequence corresponding to each confirmed allele.
A pipeline that specifically deals with the assembly of heterozygous genomes by introducing a step to recognise and selectively remove alternative heterozygous contigs. Redundans consists of three main steps: (i) detection and selective removal of redundant contigs from an initial standard assembly, (ii) scaffolding of the resulting non-redundant assembly using paired-end, mate-pair and/or fosmid-based reads and (iii) gap closing. The resulting assembly represents a chimeric reference genome in which each heterozygous region results from a random sorting of the haplotypes. We tested our pipeline on simulated and naturally-occurring heterozygous genomes and compared its accuracy to other existing tools.
Assembles de novo short read sequencing data. An assembler for the de novo assembly of large genomes using short sequence reads via jumping extension and read remapping. JR-Assembler extends a read by other whole reads, that is, it makes a jump. It uses a dynamic back trimming process to avoid extension termination due to sequencing errors. The tool uses less memory and central processing unit time than most current assemblers when the read length is 150 bp or longer.
Allows variable read lengths while tolerating a significant level of sequencing error. MaSuRCA combines the efficiency of the de Bruijn graph and Overlap-Layout-Consensus (OLC) approaches. It transforms large numbers of paired-end reads into a much smaller number of longer ‘superreads’. The tool can significantly improve its assemblies when the original data are augmented with long reads. It has been used to assemble de novo a variety of genomes, sometimes improving on published genomes using added data, sometimes creating the first publicly available draft genome for the species.
A pipeline to assemble de novo RADseq loci with the aim of optimizing coverage across phylogenetic datasets. PyRAD uses a wrapper around an alignment-clustering algorithm, which allows for indel variation within and between samples, as well as for incomplete overlap among reads (e.g. paired-end). pyRAD is intended for use with any type of restriction-site associated DNA. It currently supports RAD, ddRAD, PE-ddRAD, GBS, PE-GBS, EzRAD, PE-EzRAD, 2B-RAD, nextRAD, and can be extended to other types.
Allows transcriptome assembly, short read mapping and coding sequence annotation of 454 and Illumina data sequences. The PopPhyl project aims at characterizing within- and between-species molecular variations in a substantial number of metazoan taxa thanks to next-generation sequencing (NGS) technology, with the hope of linking genome evolutionary patterns to species biology and ecology. The PopPhyl project will explore the molecular diversity of neglected taxa (molluscs, annelids, nemertians, cnidarians), emblematic animals (Galapagos tortoise, king penguin, bath sponge), and species showing remarkable life history or ecology (hydrothermal annelids, 400-year old bivalves, social insects), with the hope of discovering new genes and adaptations, and providing information relevant to conservation biology and environmental sciences.
A mapper, minimap, and a de novo assembler, miniasm, for efficiently mapping and assembling single molecule real-time (SMRT) and Oxford Nanopore Technologies (ONT) reads without an error correction stage. Miniasm implements the ‘O’ and ‘L’ steps in the overlap-layout-consensus (OLC) assembly paradigm. It confirms that long noisy reads can be assembled without an error correction stage, and that without this stage the assembly process can be greatly accelerated and simplified, while achieving comparable contiguity and large-scale accuracy to existing pipelines, at least for genomes without excessive repetitive sequences. Together, the two tools can often assemble a sequencing run of bacterial data into a single contig in a few minutes, and assemble 45-fold C. elegans data in 9 minutes, orders of magnitude faster than the existing pipelines, though the consensus sequence error rate is as high as that of the raw reads.
A package for phylogenomic analyses of data collected from conserved genomic loci using targeted enrichment. PHYLUCE allows the assembly of raw read data to contigs, the identification of ultra-conserved element (UCE) contigs, parallel alignment generation, alignment trimming, and alignment data summary methods in preparation for analysis, as well as alignment and SNP calling using UCE or other types of raw-read data. As it stands, the PHYLUCE package is useful for analyzing both data collected from UCE loci and data collected from other types of loci for phylogenomic studies at the species, population, and individual levels.
