Allows users to interact with high-throughput sequencing data. SAMtools permits the manipulation of alignments in the SAM/BAM/CRAM formats: reading, writing, editing, indexing, viewing and converting between these formats. It caps the mapping quality of reads with excessive mismatches and applies base alignment quality to fix alignment errors. This tool can sort and merge alignments, remove polymerase chain reaction (PCR) duplicates or generate per-position information.
Gives access to many free software tools for sequence analysis. EMBOSS aims to serve the molecular biology community. It permits the creation and release of software in an open source spirit, and integrates tools for sequence analysis into a seamless whole. It is free of charge and its source code is openly available.
A software suite for the comparison, manipulation and annotation of genomic features in the Browser Extensible Data (BED) and General Feature Format (GFF) formats. BEDTools also supports the comparison of sequence alignments in BAM format to both BED and GFF features. The tools are extremely efficient and allow the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks. The BEDTools utilities can be combined with one another as well as with standard UNIX commands, thus facilitating routine genomics tasks as well as pipelines that can quickly answer intricate questions about large genomic datasets.
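The kind of feature comparison described above can be illustrated with a minimal Python sketch. It is not BEDTools code: the names `Interval`, `overlaps` and `intersect` are hypothetical, and the naive pairwise scan is assumed only for clarity. The one convention it relies on is that BED coordinates are 0-based and half-open.

```python
from typing import List, Tuple

# (chrom, start, end) with 0-based, half-open coordinates, as in BED
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def intersect(features: List[Interval], annotations: List[Interval]) -> List[Interval]:
    """Return the shared segment of every overlapping pair (naive O(n*m) scan)."""
    hits = []
    for f in features:
        for g in annotations:
            if overlaps(f, g):
                hits.append((f[0], max(f[1], g[1]), min(f[2], g[2])))
    return hits
```

Because the end coordinate is exclusive, an interval ending at 200 and one starting at 200 do not overlap. Real tools avoid the quadratic scan by working on sorted or indexed input.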
A high-performance, robust tool and library for working with SAM, BAM and CRAM sequence alignment files, the most common file formats for aligned next-generation sequencing (NGS) data. Sambamba is a faster alternative to samtools that exploits multi-core processing and dramatically reduces processing time. Sambamba is being adopted at sequencing centers, not only because of its speed, but also because of additional functionality, including coverage analysis and powerful filtering capability.
A package for collating and searching across thousands of next-generation sequencing (NGS) samples. Vancouver Short Read Analysis provides a database that can be installed easily to rapidly access and store genetic variation information, compare data from any sequencing platform and perform aggregate analyses. The schema of the database makes rapid and insightful queries simple and enables easy annotation of novel or known genetic variations. Filtering can be done by utilizing annotations, matched pair datasets or datasets marked as non-cancer for separating polymorphisms from putative variants.
A flexible and easy to use interface that programmers of many levels of experience can use to access information in the popular and common SAM/BAM format. bio-samtools 2 provides new classes for describing genomic regions and genetic variants, allows the easy addition of newly developed SAMtools features and can produce publication-quality visualizations of data with minimal effort by the coder.
Handles multiple sequences and alignments in batch mode. FasParser provides a platform able to perform several common tasks such as: (i) building alignments in batch; (ii) concatenating, merging, extracting and filtering sequences; (iii) converting alignment formats; (iv) designing polymerase chain reaction (PCR) primers, and more. Additionally, the application supplies an editor dedicated to visualizing and editing the analyzed sequences.
Assists users in manipulating high-throughput sequencing (HTS) data and formats. Picard is a Java toolkit that provides a set of command line scripts. It comprises Java-based utilities that manipulate SAM files, and a Java API for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported. It also works with next-generation sequencing (NGS) data.
A software suite for programmers and end users that facilitates research analysis and data management using BAM files. BamTools provides both the first C++ API publicly available for BAM file support as well as a command-line toolkit. The BamTools C++ API/library has been successfully integrated into a variety of applications. It provides the BAM file support for several utilities in the BEDtools suite.
Offers an assortment of tools suited for sequence analysis. Japsa is an open source package that gathers more than 20 tools, including a Java library and an API. The application provides a wide range of functionalities that allow users to split multi-sequence files, to perform real-time identification of antibiotic resistance genes with Oxford Nanopore sequencing, and to normalize the branch lengths of a phylogeny.
Permits quality control of next-generation sequencing (NGS) tumor-normal experiments. NGS-Bits is separated into four steps: (1) gather information from raw reads, (2) map reads, (3) extract variant lists, and (4) combine the results of the preceding steps and add quality control (QC) metrics for tumor-normal experiments. This tool includes all stages of single-sample NGS data analysis and adds special QC metrics for DNA sequencing of tumor-normal pairs.
Enables genotyping and variant annotation of resequencing data produced by next-generation sequencing (NGS) technologies. CoVaCS is an automated system that provides tools for variant calling and annotation along with a pipeline for the analysis of whole genome shotgun (WGS), whole exome sequencing (WES) and targeted resequencing (TGS) data. The software allows non-specialists to perform all steps from quality trimming to variant annotation.
Allows users to analyze, filter, annotate or transform biological sequence data. FAST can perform automated sampling, permutation and bootstrapping of sequences and sites, and compute population genetic statistics. It can empower non-biologist programmers to develop and communicate bioinformatics workflows for scientific investigations and publishing.
Allows users to filter, convert and combine multiple data files produced by high-throughput technologies. HTDP aims to aid global, real-time processing of large data sets through a GUI. The software provides unlimited filtering and data reduction capabilities, and can also read itemized filtering conditions from external files. It can be used for conversion between different standard formats that are commonly used for high-throughput data.
Produces sorting results that can be correctly rendered by JBrowse while saving a significant amount of time. GFF3sort is an efficient script written in Perl to sort GFF3 files for tabix indexing. It can help with processing and visualizing genome annotation data. It is more accurate and runs faster than similar, existing tools.
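As background for the tabix use case above: tabix expects feature lines ordered by sequence name and then by start position. The Python sketch below shows only that coordinate ordering. It does not reproduce GFF3sort's own logic, and the function name `sort_gff3_lines` is hypothetical.

```python
from typing import List


def sort_gff3_lines(lines: List[str]) -> List[str]:
    """Sort GFF3 feature lines by (seqid, start); header and comment lines stay first.

    Python's sort is stable, so features with identical seqid and start
    keep their original relative order.
    """
    headers = [ln for ln in lines if ln.startswith("#")]
    features = [ln for ln in lines if ln.strip() and not ln.startswith("#")]

    def sort_key(ln: str):
        cols = ln.split("\t")
        # GFF3 column 1 is the seqid, column 4 is the 1-based start position
        return (cols[0], int(cols[3]))

    return headers + sorted(features, key=sort_key)
```

A plain coordinate sort like this can place child features ahead of their parents when they share a start position, which is one reason a dedicated sorter may be preferable for browser rendering.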
Open-source software written in Clojure, a functional programming language that runs on the Java Virtual Machine. Cljam can process and analyze SAM/BAM files in parallel and at high speed. The execution time with cljam is almost the same as with SAMtools. The Clojure code of cljam has fewer lines than SAMtools and Picard, two similar tools, while offering equivalent performance.
Provides several programs allowing users to perform both common and uncommon tasks with FASTQ files. fastq-tools is a toolkit that provides tools for (1) finding reads matching a regular expression, (2) counting k-mer occurrences, (3) performing local alignment against every FASTQ sequence, (4) sampling reads with or without replacement, (5) sorting FASTQ files and (6) filtering reads with identical sequences.
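To make the k-mer counting task concrete, here is a minimal Python sketch. It is not taken from fastq-tools: the function names are hypothetical, and it assumes plain four-line FASTQ records with unwrapped sequence lines.

```python
from collections import Counter
from typing import Iterable, Iterator


def fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line (second line) of each four-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def count_kmers(lines: Iterable[str], k: int) -> Counter:
    """Count every overlapping substring of length k across all reads."""
    counts: Counter = Counter()
    for seq in fastq_sequences(lines):
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts
```

Passing an open file handle as `lines` streams the reads one at a time, so memory use depends on the number of distinct k-mers rather than on file size.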
A quick and extremely permissive method to read and write VCF files. vcflib provides a variety of functions for VCF manipulation: comparison, format conversion, filtering and subsetting, annotation, sample handling, ordering, variant representation, genotype manipulation, and interpretation and classification of variants. Piping provides a convenient method to interface with other libraries (vcf-tools, BedTools, GATK, htslib, bcftools, freebayes) which interface via VCF files, allowing the composition of an immense variety of processing functions.
Provides utility modules for bioinformatics. UBU permits users to translate from genome to transcriptome coordinates, to filter reads from a paired-end SAM or BAM file, to convert the content of a SAM/BAM file to FASTQ, to format a single FASTQ file or to count splice junctions in a SAM or BAM file. It also outputs summary statistics per reference for a SAM/BAM file.
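The SAM-to-FASTQ conversion mentioned above can be sketched in a few lines of Python. This is an illustration only, not UBU's implementation: the function name `sam_record_to_fastq` is hypothetical, and the sketch ignores mate suffixes and secondary or supplementary alignments. It relies on the SAM column layout (read name in column 1, flag in column 2, sequence in column 10, qualities in column 11) and on the 0x10 flag bit, which marks reads stored as their reverse complement.

```python
# Translation table for complementing DNA bases (N stays N)
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_record_to_fastq(sam_line: str) -> str:
    """Convert one SAM alignment line to a four-line FASTQ record."""
    fields = sam_line.rstrip("\n").split("\t")
    name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
    if flag & 0x10:
        # Reverse-strand alignment: restore the read as it was sequenced
        seq = seq.translate(_COMPLEMENT)[::-1]
        qual = qual[::-1]
    return f"@{name}\n{seq}\n+\n{qual}\n"
```

Header lines, which start with `@` in SAM text, should be skipped before calling this function.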