Assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. Cufflinks assembles individual transcripts from RNA-seq reads that have been aligned to the genome. Because a sample may contain reads from multiple splice variants of a given gene, the software must infer the splicing structure of each gene. Transcript abundances can also be quantified against a reference annotation instead of assembling the reads.
Gives access to many free software tools for sequence analysis. EMBOSS aims to serve the molecular biology community. It permits the creation and release of software in an open source spirit, and it integrates a range of packages and tools for sequence analysis into a seamless whole. It is free of charge and open source.
A software suite for the comparison, manipulation and annotation of genomic features in browser extensible data (BED) and general feature format (GFF) files. BEDTools also supports the comparison of sequence alignments in BAM format to both BED and GFF features. The tools are highly efficient and allow the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks. The BEDTools utilities can be combined with one another as well as with standard UNIX commands, which facilitates routine genomics tasks as well as pipelines that can quickly answer intricate questions about large genomic datasets.
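The kind of feature comparison described above can be illustrated with a minimal Python sketch. It is not BEDTools code and does not use its algorithm: the `intersect` function and the `Interval` type are hypothetical names, and the nested loop is a naive approach chosen for readability, assuming BED-style 0-based, half-open coordinates.

```python
from typing import List, Tuple

# (chrom, start, end) with 0-based, half-open coordinates, as in BED.
Interval = Tuple[str, int, int]


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping portion of every pair of intervals from a and b.

    Naive O(len(a) * len(b)) comparison, for illustration only.
    """
    overlaps = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            # Half-open intervals overlap only if start < end.
            if start < end:
                overlaps.append((chrom_a, start, end))
    return overlaps
```

With half-open coordinates, two features that merely touch (one ending at 200, the other starting at 200) do not count as overlapping.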
Permits users to parse, analyze and manipulate VCF files. VCFtools is a software package composed of two modules: the first is a general API that allows various operations to be performed on VCF files, including format validation, merging, comparing, intersecting, making complements and basic overall statistics; the second module analyzes single-nucleotide polymorphism (SNP) data in VCF format, helping researchers estimate allele frequencies, levels of linkage disequilibrium and various quality control (QC) metrics.
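To make the allele-frequency estimate mentioned above concrete, here is a minimal Python sketch that works on one tab-separated VCF data line. It is not VCFtools code: `alt_allele_frequency` is a hypothetical helper, and it assumes that the line has a FORMAT column containing `GT` and that all non-reference alleles are pooled into a single count.

```python
import re


def alt_allele_frequency(vcf_line: str) -> float:
    """Fraction of called alleles that are non-reference in one VCF record."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Column 9 (index 8) is FORMAT; sample columns follow it.
    gt_index = fields[8].split(":").index("GT")
    alt_count = 0
    total = 0
    for sample in fields[9:]:
        genotype = sample.split(":")[gt_index]
        # Alleles are separated by "/" (unphased) or "|" (phased).
        for allele in re.split(r"[/|]", genotype):
            if allele == ".":
                continue  # missing call, not counted
            total += 1
            if allele != "0":
                alt_count += 1
    if total == 0:
        raise ValueError("no called alleles in this record")
    return alt_count / total
```

Missing calls (`.`) are excluded from the denominator, so the frequency reflects only the alleles that were actually called.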
Assists users in manipulating high-throughput sequencing (HTS) data and formats. Picard is a Java toolkit that provides a set of command line tools. It comprises Java-based utilities that manipulate SAM files, and a Java API for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported. It also works with next generation sequencing (NGS) data.
Analyzes or annotates VCF files and organizes tools that perform diverse analyses using VCF files. VCF-kit adds essential utilities to process and analyze VCF files, including primer generation for variant validation, dendrogram production, genotype imputation from sequence data in linkage studies, and additional tools. It can be used to produce a phylogenetic tree from a VCF. The tool centralizes a collection of tools and scripts using variant call format.
Supports pair-wise comparison of binary sequence alignment/map (BAM) files. BAM-matcher compares the sample genotypes at predetermined genomic locations. It can be employed at early stages of processing pipelines, limits genotype calling to the predetermined positions, and can compare different types of next generation sequencing (NGS) data, including whole-genome sequencing (WGS), whole-exome sequencing (WES) and RNA-seq data.
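The idea of comparing genotypes at predetermined positions can be sketched in a few lines of Python. This is an illustration only, not BAM-matcher's implementation: `genotype_concordance` is a hypothetical function, and it assumes the genotypes have already been called and stored as dictionaries keyed by `(chrom, pos)`.

```python
from typing import Dict, Optional, Tuple

Site = Tuple[str, int]          # (chrom, position)
Genotype = Tuple[str, str]      # two alleles, order not significant


def genotype_concordance(
    sample_a: Dict[Site, Genotype], sample_b: Dict[Site, Genotype]
) -> Optional[float]:
    """Fraction of shared sites at which both samples have the same genotype.

    Returns None when the two samples share no genotyped site.
    """
    shared = [site for site in sample_a if site in sample_b]
    if not shared:
        return None
    matches = sum(
        1
        for site in shared
        # Sort alleles so that ("A", "G") and ("G", "A") compare as equal.
        if sorted(sample_a[site]) == sorted(sample_b[site])
    )
    return matches / len(shared)
```

A concordance close to 1 at many sites suggests that two files derive from the same individual; the threshold to apply is a separate design decision that this sketch does not address.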
Permits quality control of next-generation sequencing (NGS) tumor-normal experiments. NGS-Bits is separated into four steps: (1) gather information from raw reads, (2) map reads, (3) extract variant lists, and (4) combine the results of the preceding steps and add quality control (QC) metrics for tumor-normal experiments. This tool covers all stages of single-sample NGS data analysis and adds special QC metrics for DNA sequencing of tumor-normal pairs.
Provides a collection of software to investigate and compare RNA-seq data obtained by next generation sequencing (NGS) technologies. RACKJ can count the reads that map to exons and introns and identify splice junctions (SJs) from the alignment of reads to the genome. It allows the comparison of two samples and the identification of the genes with the most significant differences at the exon (splicing) level.
Provides a variety of tools for manipulating, comparing, and analyzing VCF files beyond the functionality of existing tools. In addition, Variant Tool Chest (VTC) was written to be easily extended with new tools. It brings important functionality that complements and integrates well with existing software.
Assists users in managing, filtering, comparing and annotating genomic position (GP) files. PileLine is a flexible command-line toolbox that provides several functionalities, including (i) full standard annotation with human dbSNP, HGNC Gene Symbol and Ensembl IDs, (ii) custom annotation through standard files, (iii) generation of SIFT, Firestar and PolyPhen compatible outputs, and (iv) a genotyping quality control (QC) test for estimating performance metrics on detecting homo/heterozygote variants.
Assesses pipeline results against a simulated dataset to characterize the pipeline's performance in answering a particular biological question. BenchCT is a program that benchmarks the output of a bioinformatics pipeline that has been run on a SimCT dataset, using the simulated genomic and transcriptional variations the dataset contains. BenchCT is a component of SimBA, a software suite designed to evaluate the performance of an entire RNA-Seq pipeline in the context of a specific biological question.
Enables manipulation and comparison of multiple VCF files, as well as processing of other common next-generation sequencing (NGS) data formats. RTG Tools provides functions that handle confounding differences in variant representation. The utilities also perform a global optimization that minimizes discrepancies between a call set and the baseline, and they support receiver operating characteristic (ROC) curve analysis, filtering and annotation of variant calls.
Compares and reports features of two similar sequences. diffseq reads two sequences which typically are very similar or almost identical. It finds regions of overlap between the two sequences and reports on differences between the features of the two sequences within these regions. The start and end positions of the regions of overlap are reported. The differences are also reported for each input sequence as two separate feature table output files.
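As a rough illustration of reporting the differences between two nearly identical sequences, the following Python sketch uses the standard library's `difflib.SequenceMatcher`. This is a substitute technique, not the algorithm diffseq itself uses, and `sequence_differences` is a hypothetical helper; it reports differing regions only and does not handle feature tables.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def sequence_differences(seq_a: str, seq_b: str) -> List[Tuple[str, int, int, int, int]]:
    """List the regions where seq_a and seq_b differ.

    Each entry is (tag, a_start, a_end, b_start, b_end) with 0-based,
    half-open coordinates; tag is 'replace', 'delete' or 'insert'.
    """
    # autojunk=False disables the heuristic that ignores frequent elements,
    # which matters for long sequences over a four-letter alphabet.
    matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    return [
        (tag, a_start, a_end, b_start, b_end)
        for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes()
        if tag != "equal"
    ]
```

Each reported region carries coordinates for both inputs, which mirrors the idea of describing every difference from the point of view of each sequence.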
Provides utility modules for bioinformatics. UBU permits users to translate from genome to transcriptome coordinates, to filter reads from a paired end SAM or BAM file, to convert a SAM/BAM file content to FASTQ, to format a single FASTQ file or to count splice junctions in a SAM or BAM file. It also outputs summary statistics per reference for a SAM/BAM file.
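Counting splice junctions in a SAM file, one of the tasks listed above, can be sketched in plain Python by reading the `N` (skipped region) operations in each CIGAR string. This is not UBU code: the function names `splice_junctions` and `count_junctions` are hypothetical, and the sketch assumes uncompressed SAM text lines and ignores flags, mapping quality and strand.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

Junction = Tuple[str, int, int]  # (chrom, intron_start, intron_end), 1-based inclusive


def splice_junctions(chrom: str, pos: int, cigar: str) -> List[Junction]:
    """Return the introns implied by the N operations of one alignment.

    pos is the 1-based leftmost mapping position, as in the SAM POS field.
    """
    junctions = []
    ref_pos = pos
    for length_text, op in CIGAR_OP.findall(cigar):
        length = int(length_text)
        if op == "N":
            junctions.append((chrom, ref_pos, ref_pos + length - 1))
        # Only these operations consume reference bases.
        if op in "MDN=X":
            ref_pos += length
    return junctions


def count_junctions(sam_lines: Iterable[str]) -> Counter:
    """Tally how many alignments support each junction."""
    counts: Counter = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        chrom, pos, cigar = fields[2], int(fields[3]), fields[5]
        if cigar == "*":
            continue  # no alignment information
        counts.update(splice_junctions(chrom, pos, cigar))
    return counts
```

Reading a BAM file would additionally require a library that decodes the binary format; the CIGAR logic stays the same.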
Allows users to interact with files associated with next-generation sequencing (NGS). qMule is composed of three modules: Aligner Compare compares two BAM files aligned from the same FASTQ and separates out the reads that differ between them; BamMismatchCounts provides a tally of how many mismatches were in each read for reads that mapped full-length; and MafFilter searches for QCMG-annotated MAF files.
A quick and extremely permissive method to read and write VCF files. vcflib provides a variety of functions for VCF manipulation: comparison, format conversion, filtering and subsetting, annotation, samples, ordering, variant representation, genotype manipulation, interpretation and classification of variants. Piping provides a convenient method to interface with other libraries (vcf-tools, BedTools, GATK, htslib, bcftools, freebayes) which interface via VCF files, allowing the composition of an immense variety of processing functions.