A software suite for the comparison, manipulation and annotation of genomic features in browser extensible data (BED) and general feature format (GFF) formats. BEDTools also supports the comparison of sequence alignments in BAM format to both BED and GFF features. The tools are extremely efficient and allow the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks. The individual BEDTools utilities can be combined with one another as well as with standard UNIX commands, which facilitates routine genomics tasks as well as pipelines that can quickly answer intricate questions about large genomic datasets.
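The kind of feature comparison described above can be illustrated with a minimal Python sketch. It is not how BEDTools is implemented: the function names (`parse_bed_line`, `overlaps`, `intersect`) are hypothetical, only the first three BED columns are read, and the quadratic scan is chosen for clarity rather than speed.

```python
from typing import Iterable, List, Tuple

# (chrom, start, end) using BED's 0-based, half-open coordinates
Interval = Tuple[str, int, int]


def parse_bed_line(line: str) -> Interval:
    """Read the first three tab-separated BED columns of one line."""
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[1]), int(fields[2])


def overlaps(a: Interval, b: Interval) -> bool:
    """True if both intervals lie on the same chromosome and share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def intersect(features_a: Iterable[Interval],
              features_b: Iterable[Interval]) -> List[Interval]:
    """Return every feature in A that overlaps at least one feature in B.

    Naive O(len(A) * len(B)) comparison, for illustration only.
    """
    b_list = list(features_b)
    return [a for a in features_a if any(overlaps(a, b) for b in b_list)]
```

Because BED intervals are half-open, two features that merely touch (one ends at 200, the next starts at 200) are not reported as overlapping.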
Allows users to interact with high-throughput sequencing data. SAMtools permits the manipulation of alignments in the SAM/BAM/CRAM formats: reading, writing, editing, indexing, viewing and converting files in these formats. It caps the mapping quality of reads with excessive mismatches and applies base alignment quality to compensate for alignment errors. This tool can sort and merge alignments, remove polymerase chain reaction (PCR) duplicates or generate per-position information.
Assists users in manipulating high-throughput sequencing (HTS) data and formats. Picard is a Java toolkit that provides a set of command-line tools. It comprises Java-based utilities that manipulate SAM files, and a Java API for creating new programs that read and write SAM files. Both the SAM text format and its binary counterpart (BAM) are supported. It is intended for use with next-generation sequencing (NGS) data.
Focuses on variant discovery and genotyping. GATK is a toolkit, developed at the Broad Institute, composed of several tools and able to support projects of any size. The application bundles an assortment of command-line tools for analyzing high-throughput sequencing (HTS) data in various formats such as SAM, BAM, CRAM or VCF. The website includes extensive documentation to guide users.
Enables users to process sequences in FASTA and FASTQ formats. Seqtk parses both FASTA and FASTQ files, which may optionally be compressed with gzip. The application can also convert Illumina FASTQ files to FASTA and mask low-quality bases. In addition, it includes an option to extract sequences by name or from specified regions.
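As a rough illustration of FASTQ parsing and quality masking, the Python sketch below reads four-line FASTQ records and replaces low-quality bases with N. It is not Seqtk's code: the helper names, the default threshold of 20, the Phred+33 offset and the choice to detect gzip input by the `.gz` suffix are all assumptions made for the example.

```python
import gzip
from typing import IO, Iterator, Tuple


def open_maybe_gzip(path: str) -> IO[str]:
    """Open a text file, transparently decompressing names ending in .gz."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_fastq(handle: IO[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) from strict four-line FASTQ records."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line is ignored
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual


def mask_low_quality(seq: str, qual: str,
                     min_q: int = 20, offset: int = 33) -> str:
    """Replace each base whose Phred score is below min_q with 'N'."""
    return "".join(
        base if ord(q) - offset >= min_q else "N"
        for base, q in zip(seq, qual)
    )


def fastq_to_fasta(handle: IO[str], min_q: int = 20) -> Iterator[str]:
    """Yield FASTA-formatted records with low-quality bases masked."""
    for name, seq, qual in read_fastq(handle):
        yield f">{name}\n{mask_low_quality(seq, qual, min_q)}\n"
```

Older Illumina files that encode qualities as Phred+64 would need `offset=64` passed to `mask_low_quality`.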
Facilitates the design, optimization, and tracking of barcoded oligonucleotides. XSTK is useful for projects that require highly multiplexed polymerase chain reaction (PCR) and DNA sequencing. It builds a list of all possible DNA sequences of a specified length and then progressively culls sequences that may interfere with primary PCR amplification and/or sequencing steps.
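The enumerate-then-cull strategy described above can be sketched in Python as follows. The source does not state which criteria XSTK applies, so the three filters used here (homopolymer runs, GC content and minimum Hamming distance to already accepted barcodes), their default thresholds and all function names are assumptions chosen only to illustrate the general idea.

```python
from itertools import product
from typing import List


def has_homopolymer(seq: str, max_run: int) -> bool:
    """True if any base repeats more than max_run times in a row."""
    run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return True
    return False


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def design_barcodes(length: int, max_run: int = 2,
                    gc_min: float = 0.4, gc_max: float = 0.6,
                    min_distance: int = 2) -> List[str]:
    """Enumerate all DNA sequences of the given length, then cull them.

    The filters are hypothetical examples, not XSTK's actual rules.
    """
    kept: List[str] = []
    for bases in product("ACGT", repeat=length):
        seq = "".join(bases)
        if has_homopolymer(seq, max_run):
            continue  # long single-base runs are assumed to be undesirable
        if not gc_min <= gc_fraction(seq) <= gc_max:
            continue  # keep GC content within the assumed window
        if any(hamming(seq, other) < min_distance for other in kept):
            continue  # too similar to a barcode that was already accepted
        kept.append(seq)
    return kept
```

Full enumeration grows as 4 to the power of the barcode length, so this greedy approach is only practical for short barcodes; the result also depends on the enumeration order.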
A quick and extremely permissive method to read and write VCF files. vcflib provides a variety of functions for VCF manipulation: comparison, format conversion, filtering and subsetting, annotation, samples, ordering, variant representation, genotype manipulation, interpretation and classification of variants. Piping provides a convenient way to connect it with other tools (vcf-tools, BEDTools, GATK, htslib, bcftools, freebayes) that exchange data as VCF, allowing the composition of an immense variety of processing functions.
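To show why a line-oriented format such as VCF suits piping, here is a minimal Python sketch of a streaming filter. It is unrelated to vcflib's own code: `filter_vcf` and its `min_qual` parameter are hypothetical, and dropping records with a missing QUAL value is an assumption made for the example.

```python
from typing import Iterable, Iterator


def filter_vcf(lines: Iterable[str], min_qual: float) -> Iterator[str]:
    """Pass header lines through and keep records with QUAL >= min_qual.

    QUAL is the sixth tab-separated column of a VCF record. Records whose
    QUAL is missing ('.') are dropped, which is an assumption of this sketch.
    """
    for line in lines:
        if line.startswith("#"):
            yield line  # meta-information and column header lines
            continue
        qual = line.rstrip("\n").split("\t")[5]
        if qual != "." and float(qual) >= min_qual:
            yield line


# Typical use inside a pipeline, reading stdin and writing stdout:
#   import sys
#   sys.stdout.writelines(filter_vcf(sys.stdin, min_qual=30.0))
```

Because the function consumes and produces plain VCF lines, it can sit between any two tools that read and write VCF on standard input and output.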