A software suite for the comparison, manipulation and annotation of genomic features in browser extensible data (BED) and general feature format (GFF) formats. BEDTools also supports the comparison of sequence alignments in BAM format to both BED and GFF features. The tools are extremely efficient and allow the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks. The BEDTools utilities can be combined with one another as well as with standard UNIX commands, thus facilitating routine genomics tasks as well as pipelines that can quickly answer intricate questions of large genomic datasets.
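To make the idea of feature comparison concrete, the sketch below finds overlapping intervals between two small BED-style inputs. It is an illustration only and not how BEDTools is implemented: the function names `parse_bed` and `overlaps` and the sample coordinates are invented for this example, and the quadratic loop is chosen for readability rather than speed. It relies on the BED convention of 0-based, half-open coordinates.

```python
from typing import List, Tuple

# (chrom, start, end) using BED's 0-based, half-open coordinates
Interval = Tuple[str, int, int]


def parse_bed(text: str) -> List[Interval]:
    """Parse the first three columns of BED-formatted text (hypothetical helper)."""
    intervals = []
    for line in text.splitlines():
        # Skip blank lines, comments and optional track/browser header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def overlaps(a: List[Interval], b: List[Interval]) -> List[Tuple[Interval, Interval]]:
    """Return every pair (x, y), x from a and y from b, sharing at least one base."""
    hits = []
    for x in a:
        for y in b:
            # Same chromosome, and each interval starts before the other ends.
            if x[0] == y[0] and x[1] < y[2] and y[1] < x[2]:
                hits.append((x, y))
    return hits


# Made-up example data: three "reads" and two "genes".
reads = parse_bed("chr1\t100\t200\nchr1\t500\t600\nchr2\t100\t200\n")
genes = parse_bed("chr1\t150\t300\nchr2\t300\t400\n")
result = overlaps(reads, genes)
```

Here only the first read overlaps a gene, so `result` holds a single pair. Tools built for large datasets avoid the all-against-all loop, typically by sorting or indexing the intervals first.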
Permits users to parse, analyze and manipulate VCF files. VCFtools is a software package composed of two modules: the first is a general API that allows various operations to be performed on VCF files, including format validation, merging, comparing, intersecting, making complements and basic overall statistics; the second module analyzes single-nucleotide polymorphism (SNP) data in VCF format, helping researchers estimate allele frequencies, levels of linkage disequilibrium and various quality control (QC) metrics.
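As a rough picture of what estimating an allele frequency from VCF data involves, the sketch below counts non-reference alleles in the GT field of one VCF data line. It is not VCFtools code and does not reproduce its options or output; the function name `alt_allele_frequency` and the sample line are assumptions made for this example, and all non-reference alleles are pooled into one count.

```python
def alt_allele_frequency(vcf_line: str) -> float:
    """Fraction of called alleles on one VCF data line that are non-reference."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Column 9 (index 8) is FORMAT; locate the genotype key within it.
    gt_index = fields[8].split(":").index("GT")
    alt = total = 0
    for sample in fields[9:]:
        gt = sample.split(":")[gt_index]
        # Treat phased ("|") and unphased ("/") genotypes the same way.
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":  # missing call, not counted
                continue
            total += 1
            if allele != "0":  # "0" is the reference allele
                alt += 1
    return alt / total if total else float("nan")


# Made-up record with four samples, one of them uncalled.
line = "1\t1000\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:12\t1|1:9\t./.:0\t0/0:15"
freq = alt_allele_frequency(line)
```

With three non-reference alleles among six called alleles, `freq` is 0.5 for this record.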
Assists users in manipulating high-throughput sequencing (HTS) data and formats. Picard is a Java toolkit that provides a set of command line scripts. It comprises Java-based utilities that manipulate SAM files, and a Java API for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported. It also works with next generation sequencing (NGS) data.
Assesses any pipeline results against a simulated dataset to obtain an understanding of the pipeline's performance characteristics in answering a particular biological question. BenchCT is a program that benchmarks the output of a bioinformatics pipeline that has been run against a SimCT dataset with the simulated genomic and transcriptional variations it contains. BenchCT is a component of SimBA, a software suite designed to evaluate the performance of an entire RNA-Seq pipeline in the context of a specific biological question.
Implements a flexible command-line toolkit providing specific support to the management, filtering, comparison and annotation of genomic position (GP) files produced by next generation sequencing (NGS) experiments. PileLine consists of a set of command-line utilities that are easy to integrate in custom workflows or user-friendly frameworks like Galaxy. The tools comprising PileLine are focussed on two different but complementary activities: (i) processing and annotation, implementing simple but reusable operations over input GP files and (ii) analysis, giving support to more advanced and specific requirements. PileLine contains 10 command-line utilities that have been designed to be memory efficient by performing on-disk operations over sorted GP files.
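The memory benefit of working over sorted position files can be sketched as a streaming merge: because both inputs are in the same order, a comparison needs only one record from each in memory at a time. The code below illustrates that general technique and is not taken from PileLine; the function name `shared_positions` and the data are hypothetical, and it assumes both inputs are sorted by the same key (sequence name, then coordinate) with no duplicate positions.

```python
from typing import Iterable, Iterator, Tuple

Position = Tuple[str, int]  # (sequence name, coordinate)


def shared_positions(a: Iterable[Position], b: Iterable[Position]) -> Iterator[Position]:
    """Yield positions present in both sorted streams, one record per stream in memory."""
    it_a, it_b = iter(a), iter(b)
    x, y = next(it_a, None), next(it_b, None)
    while x is not None and y is not None:
        if x == y:
            yield x
            x, y = next(it_a, None), next(it_b, None)
        elif x < y:
            x = next(it_a, None)  # a is behind, advance it
        else:
            y = next(it_b, None)  # b is behind, advance it


# Made-up positions; in practice these would be read lazily from files on disk.
sample = [("chr1", 10), ("chr1", 25), ("chr2", 7)]
control = [("chr1", 25), ("chr1", 40), ("chr2", 7), ("chr2", 9)]
common = list(shared_positions(sample, control))
```

Because the inputs are consumed as iterators, the same function works unchanged on generators that read very large files line by line.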
A quick and extremely permissive method to read and write VCF files. vcflib provides a variety of functions for VCF manipulation: comparison, format conversion, filtering and subsetting, annotation, samples, ordering, variant representation, genotype manipulation, interpretation and classification of variants. Piping provides a convenient method to interface with other libraries (vcf-tools, BedTools, GATK, htslib, bcftools, freebayes) which interface via VCF files, allowing the composition of an immense variety of processing functions.
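The piping model described above depends on each step reading VCF text from one stream and writing VCF text to the next. The sketch below shows a filter written in that style; it is not part of vcflib, and the function name `filter_by_qual`, the threshold and the records are assumptions for illustration. Header lines pass through unchanged so that the output remains a valid input for the next step.

```python
import io
from typing import Iterable, Iterator


def filter_by_qual(lines: Iterable[str], min_qual: float) -> Iterator[str]:
    """Pass header lines through; keep data lines whose QUAL is at least min_qual."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        qual = line.split("\t")[5]  # QUAL is the sixth VCF column
        if qual != "." and float(qual) >= min_qual:
            yield line


# Made-up VCF text standing in for a real input stream.
vcf_text = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t100\t.\tA\tG\t12.5\tPASS\t.\n"
    "1\t200\t.\tC\tT\t48\tPASS\t.\n"
    "1\t300\t.\tG\tA\t.\tPASS\t.\n"
)
kept = list(filter_by_qual(io.StringIO(vcf_text), min_qual=30))
```

To use such a filter in a shell pipeline, the same function can be given `sys.stdin` as input, with each yielded line written to `sys.stdout`.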
