Focuses on variant discovery and genotyping. GATK is a toolkit, developed at the Broad Institute, composed of several tools and able to support projects of any size. The application bundles an assortment of command-line tools for analyzing high-throughput sequencing (HTS) data in various formats such as SAM, BAM, CRAM or VCF. The website includes extensive documentation to guide users.
Allows users to interact with high-throughput sequencing data. SAMtools permits the manipulation of alignments in the SAM/BAM/CRAM formats: reading, writing, editing, indexing, viewing and converting files in these formats. It caps the mapping quality of reads with excessive mismatches and applies base alignment quality to fix alignment errors. This tool can sort and merge alignments, remove polymerase chain reaction (PCR) duplicates or generate per-position information.
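As an illustration of the alignment records such tools operate on, here is a minimal Python sketch that reads one SAM text line and checks its duplicate flag. It does not use or reproduce SAMtools itself; the names `SamRecord`, `parse_sam_line` and `is_duplicate` are hypothetical, and only the first six of the eleven mandatory SAM columns are kept.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    """First six mandatory SAM columns (hypothetical helper type)."""
    qname: str
    flag: int
    rname: str
    pos: int    # 1-based leftmost mapping position
    mapq: int
    cigar: str


def parse_sam_line(line: str) -> SamRecord:
    """Parse one tab-separated SAM alignment line (not a header line)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has at least 11 columns")
    return SamRecord(fields[0], int(fields[1]), fields[2],
                     int(fields[3]), int(fields[4]), fields[5])


def is_duplicate(record: SamRecord) -> bool:
    """True if the PCR/optical duplicate bit (0x400) is set in FLAG."""
    return bool(record.flag & 0x400)


# Toy record invented for illustration only.
example = "read1\t1040\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII"
record = parse_sam_line(example)
print(record.rname, record.pos, is_duplicate(record))
```

For real BAM or CRAM files, a dedicated library or the SAMtools command line is the appropriate choice; this sketch only handles the plain-text SAM layout.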
Performs peak finding and downstream analysis of next-generation sequencing data. HOMER offers several tools and methods to make use of ChIP-Seq, GRO-Seq, RNA-Seq, DNase-Seq, Hi-C and other types of functional genomics sequencing data sets. This software supports UCSC visualization, peak annotation, quantification of transcripts and repeats, and analysis of differential features, enrichment and expression.
Gives access to many free software tools for sequence analysis. EMBOSS aims to serve the molecular biology community. It permits the creation and the release of software in an open source spirit. It integrates tools for sequence analysis into a seamless whole. It is free of charge and open source.
A software suite for the comparison, manipulation and annotation of genomic features in the Browser Extensible Data (BED) and General Feature Format (GFF) formats. BEDTools also supports the comparison of sequence alignments in BAM format to both BED and GFF features. The tools are extremely efficient and allow the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks. The BEDTools utilities can be combined with one another as well as with standard UNIX commands, thus facilitating routine genomics tasks as well as pipelines that can quickly answer intricate questions of large genomic datasets.
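The idea of comparing two sets of genomic features can be sketched in a few lines of Python. This is a naive pairwise comparison written for illustration, not the algorithm BEDTools uses; the function name `intersect` and the tuple layout are assumptions, with coordinates taken as 0-based, half-open intervals as in BED.

```python
from typing import List, Tuple

# (chromosome, start, end), 0-based half-open as in BED
Interval = Tuple[str, int, int]


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping segment for every overlapping pair (a_i, b_j).

    Quadratic in the input sizes; fine for a toy example, too slow for
    genome-scale data.
    """
    hits = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # half-open: touching intervals do not overlap
                hits.append((chrom_a, start, end))
    return hits


# Toy features invented for illustration only.
peaks = [("chr1", 100, 200)]
genes = [("chr1", 150, 250), ("chr2", 150, 250)]
print(intersect(peaks, genes))
```

Sorting both inputs and sweeping through them once would avoid the quadratic cost; the nested loop is kept here only because it makes the overlap test itself easy to read.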
Performs gene and isoform level quantification from RNA-Seq data. RSEM is a software package that quantifies gene and isoform abundances from single-end (SE) or paired-end (PE) RNA-Seq data. The software enables visualization of its output through probabilistically-weighted read alignments and read depth plots. It does not require a reference genome and thus can be useful for quantification with de novo transcriptome assemblies.
A Galaxy-based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straightforward, yet highly customizable manner.
A flexible toolkit for exploring datasets generated by MinION nanopore sequencing devices for the purposes of quality control and downstream analysis. Poretools operates directly on the native FAST5 file format (an application of the HDF5 standard) produced by Oxford Nanopore Technologies (ONT) and provides a wealth of format conversion utilities and data exploration and visualization tools.
Helps map various types of identifiers (IDs) to each other. Onto-Translate gives users a non-redundant and complete mapping from any type of identification system to any other type. This software exploits the custom-designed Onto-Tools database, which incorporates 20 publicly available biological databases such as KEGG or GenBank. It converts the identifiers of individual genes from one format into another.
Builds mapping assemblies from short reads generated by next-generation sequencing machines. Maq is particularly designed for the Illumina-Solexa 1G Genetic Analyzer, and has preliminary functions to handle ABI SOLiD data. Maq first aligns reads to reference sequences and then calls the consensus. At the mapping stage, Maq performs ungapped alignment. For single-end reads, Maq is able to find all hits with up to 2 or 3 mismatches, depending on a command-line option; for paired-end reads, it always finds all paired hits with one of the two reads containing up to 1 mismatch. At the assembling stage, Maq calls the consensus based on a statistical model.
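To make "ungapped alignment with a bounded number of mismatches" concrete, the following Python sketch slides a read along a reference and reports every offset where the two differ at no more than a given number of positions. It is a brute-force illustration of the concept only, not Maq's indexing or scoring scheme; the function name `ungapped_hits` and its parameters are hypothetical.

```python
from typing import List, Tuple


def ungapped_hits(read: str, reference: str,
                  max_mismatches: int = 2) -> List[Tuple[int, int]]:
    """Return (offset, mismatch_count) for every ungapped placement of
    `read` on `reference` with at most `max_mismatches` mismatches.

    No insertions or deletions are considered, so each placement is a
    plain position-by-position comparison.
    """
    hits = []
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = 0
        for read_base, ref_base in zip(read, window):
            if read_base != ref_base:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many mismatches, abandon this offset
        else:
            # Loop finished without exceeding the mismatch budget.
            hits.append((offset, mismatches))
    return hits


# Toy sequences invented for illustration only.
print(ungapped_hits("ACGT", "ACGTACCT", max_mismatches=1))
```

Scanning every offset costs time proportional to the read length times the reference length, which is why practical short-read mappers rely on an index rather than on a scan like this.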
Allows users to manipulate, organize, summarize and visualize MinION nanopore sequencing data. poRe enables users to organize MinION FAST5 files into run folders, extract FASTQ, gather statistics on each run and plot a number of key graphs, such as read-length histograms and yield-over-time. Two graphical user interfaces (GUIs) for MinION data processing, organization and extraction are available through the package.
Improves the design and use of polymerase chain reaction (PCR)-based methylation assays. methPrimer was developed to store and retrieve validated methylation assays. This resource is intended to be a search portal for validated methylation assays. It also aims to establish a certain level of standardization and uniformity in the use of PCR-based methylation assays. Each primer set is given a unique identifier so that it can be accessed directly or referred to in a publication.
Permits users to parse, analyze and manipulate VCF files. VCFtools is a software package composed of two modules: the first is a general API that allows various operations to be performed on VCF files, including format validation, merging, comparing, intersecting, making complements and basic overall statistics; the second module analyzes single-nucleotide polymorphism (SNP) data in VCF format, helping researchers estimate allele frequencies, levels of linkage disequilibrium and various quality control (QC) metrics.
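As a rough illustration of what estimating an allele frequency from VCF genotypes involves, the Python sketch below counts alternate alleles in a list of GT strings for a single biallelic site. It is not VCFtools code; the function name `alt_allele_frequency` is hypothetical, and the sketch assumes genotypes use `0` for the reference allele, `1` for the alternate allele and `.` for a missing call.

```python
from typing import List, Optional


def alt_allele_frequency(genotypes: List[str]) -> Optional[float]:
    """Estimate the alternate allele frequency at one biallelic site.

    `genotypes` holds VCF GT strings such as "0/1" (unphased) or "1|1"
    (phased). Missing alleles (".") are skipped. Returns None when no
    allele was called at all.
    """
    alt_count = 0
    called = 0
    for gt in genotypes:
        # Phased and unphased separators are treated the same way here.
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                continue
            called += 1
            if allele == "1":
                alt_count += 1
    if called == 0:
        return None
    return alt_count / called


# Toy genotypes invented for illustration only.
print(alt_allele_frequency(["0/0", "0/1", "1|1", "./."]))
```

Multiallelic sites (alleles `2`, `3`, and so on) would need a per-allele tally instead of the single counter used here.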
Examines epigenomic and transcriptomic next generation sequencing (NGS) data. Octopus-toolkit can be used for antibody- or enzyme-mediated experiments and studies for the quantification of gene expression. It can accelerate the data mining of public epigenomic and transcriptomic NGS data for basic biomedical research. This tool provides a private and a public mode: one to process the user’s own data, and the other to analyze public NGS data by retrieving raw files from the GEO database.
Enables reading of sequencing files from the SRA database and writing files into the same format. The NCBI SRA Toolkit is provided in the form of the SRA SDK, and can be compiled with GCC. It allows users to programmatically access data housed within SRA and convert it from the SRA format to the following formats: ABI SOLiD native, fasta, fastq, sff, sam, Illumina native. The toolkit is available for all common platforms.
A statistical framework for calling SNPs, discovering somatic mutations, inferring population genetic parameters and performing association tests directly based on sequencing data. BCFtools can manipulate variant calls in the variant call format (VCF) and its binary counterpart BCF. It can also discover somatic and germline mutations with appropriate input data, efficiently estimate site allele frequency, allele frequency spectrum and linkage disequilibrium, and test Hardy–Weinberg equilibrium and association.
Assists users in manipulating high-throughput sequencing (HTS) data and formats. Picard is a Java toolkit that provides a set of command-line tools. It comprises Java-based utilities that manipulate SAM files, and a Java API for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported. It also works with next-generation sequencing (NGS) data.
Advances the automation and visualization of RNA-seq data analysis results. QuickRNASeq is a pipeline that significantly reduces data analysts’ hands-on time, which results in a substantial decrease in the time and effort needed for the primary analyses of RNA-seq data before proceeding to further downstream analysis and interpretation. It provides a dynamic data sharing and interactive visualization environment for end users and enables non-expert end users to interact easily with the RNA-seq data analysis results.
Converts a WIG file into a BIGWIG file, a format that allows the results of next-generation sequencing experiments to be viewed as tracks in the UCSC Genome Browser. wigToBigWig is a command-line utility that converts a text file to an indexed binary format. In addition to the text file, the conversion utility requires a chrom.sizes input file that describes the chromosome (or contig) sizes in a two-column format (chromosome name and chromosome size).
Converts a BED file into a BIGBED file, a format that allows the results of next-generation sequencing experiments to be viewed as tracks in the UCSC Genome Browser. bedToBigBed is a command-line utility that converts a text file to an indexed binary format. In addition to the text file and an optional .as file, the conversion utility requires a chrom.sizes input file that describes the chromosome (or contig) sizes in a two-column format (chromosome name and chromosome size).
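The two-column chrom.sizes layout required by these conversion utilities is simple enough to read with a few lines of Python. The sketch below is an illustration only and is not part of the UCSC tools; the function name `parse_chrom_sizes` is hypothetical, and the sample names and sizes are made-up values rather than a real genome assembly.

```python
from typing import Dict, Iterable


def parse_chrom_sizes(lines: Iterable[str]) -> Dict[str, int]:
    """Read chrom.sizes content: one 'name<whitespace>size' pair per line.

    Blank lines are ignored. The result maps each chromosome (or contig)
    name to its length in bases.
    """
    sizes = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, size = line.split()[:2]
        sizes[name] = int(size)
    return sizes


# Made-up content for illustration; a real file would come from the
# genome assembly being used.
sample = ["chrA\t1000\n", "chrB\t500\n", "\n"]
print(parse_chrom_sizes(sample))
```

Because the function accepts any iterable of lines, an open file handle can be passed in directly in place of the `sample` list.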
Topics (10): WES analysis; WGS analysis; Central Nervous System Neoplasms; Nervous System Neoplasms; Brain Diseases; Breast Neoplasms; Breast Diseases; Neoplasms; Neoplasms, Connective and Soft Tissue; Genetic Diseases, Inborn