Aligns short reads, with a focus on mammalian re-sequencing. Bowtie uses a Burrows-Wheeler index built on the full-text minute-space (FM) index. Alignment proceeds in two steps: an initial, ungapped seed-finding stage that benefits from the speed and memory efficiency of the FM index, and a gapped extension stage that uses dynamic programming and benefits from the single-instruction multiple-data (SIMD) parallel processing available on modern processors.
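
The seed-finding idea can be illustrated with a minimal Python sketch of exact-match counting by backward search over an FM index. This is not Bowtie's implementation: the function names are hypothetical, the Burrows-Wheeler transform is built naively from sorted rotations (practical only for tiny inputs), and the `$` sentinel is assumed to be absent from the text.

```python
from collections import Counter


def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (tiny inputs only)."""
    text += "$"  # sentinel: assumed absent from text, sorts before A/C/G/T
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rot[-1] for rot in rotations)


def build_fm(bwt_str):
    """Return (C, occ).

    C[c]      = number of characters in bwt_str strictly smaller than c
    occ[c][i] = number of occurrences of c in bwt_str[:i]
    """
    counts = Counter(bwt_str)
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    occ = {c: [0] for c in counts}
    for ch in bwt_str:
        for c in occ:
            occ[c].append(occ[c][-1] + (ch == c))
    return C, occ


def count_occurrences(text, pattern):
    """Count exact occurrences of pattern in text using backward search."""
    b = bwt(text)
    C, occ = build_fm(b)
    lo, hi = 0, len(b)  # half-open range of matching suffixes
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo
```

Each pattern character narrows the range of matching suffixes with two table lookups, so the search cost depends on the pattern length rather than on the size of the reference. A production index would store sampled occurrence counts instead of the full `occ` table used here.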


Allows users to interact with high-throughput sequencing data. SAMtools manipulates alignments in the SAM, BAM and CRAM formats: reading, writing, editing, indexing, viewing and converting between them. It limits the mapping quality of reads with excessive mismatches and applies base alignment quality to fix alignment errors. This tool can sort and merge alignments, remove polymerase chain reaction (PCR) duplicates or generate per-position information.


A software suite for the comparison, manipulation and annotation of genomic features in the browser extensible data (BED) and general feature format (GFF) formats. BEDTools also supports the comparison of sequence alignments in BAM format to both BED and GFF features. The tools are efficient and allow the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks. The individual BEDTools utilities can be combined with one another as well as with standard UNIX commands, which facilitates routine genomics tasks and pipelines that quickly answer intricate questions about large genomic datasets.
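
The kind of feature comparison described above can be sketched in a few lines of Python. This is an illustration of interval overlap only, not BEDTools code: the `intersect` function is hypothetical, it uses a naive all-against-all comparison per chromosome, and it assumes BED-style coordinates (0-based start, exclusive end).

```python
def intersect(a, b):
    """Yield the overlapping part of every pair of intervals from a and b.

    Intervals are (chrom, start, end) tuples with 0-based, half-open
    coordinates, as in BED. Naive O(len(a) * len(b)) per chromosome.
    """
    by_chrom = {}
    for chrom, start, end in b:
        by_chrom.setdefault(chrom, []).append((start, end))
    for chrom, start, end in a:
        for b_start, b_end in by_chrom.get(chrom, []):
            lo, hi = max(start, b_start), min(end, b_end)
            if lo < hi:  # touching intervals share no bases
                yield (chrom, lo, hi)
```

For example, `list(intersect([("chr1", 100, 200)], [("chr1", 150, 250)]))` yields the shared segment from 150 to 200 on chr1. Large datasets call for sorted sweeps or interval trees rather than the nested loop shown here.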


Assists users in manipulating high-throughput sequencing (HTS) data and formats. Picard is a Java toolkit that provides a set of command-line tools. It comprises Java-based utilities that manipulate SAM files, and a Java API for creating new programs that read and write SAM files. Both the SAM text format and the binary (BAM) format are supported. It also works with next generation sequencing (NGS) data.


Examines epigenomic and transcriptomic next generation sequencing (NGS) data. Octopus-toolkit can be used for antibody- or enzyme-mediated experiments and studies for the quantification of gene expression. It can accelerate the data mining of public epigenomic and transcriptomic NGS data for basic biomedical research. This tool provides a private and a public mode: one to process the user’s own data, and the other to analyze public NGS data by retrieving raw files from the GEO database.


A high-performance, robust tool and library for working with SAM, BAM and CRAM sequence alignment files, the most common file formats for aligned next generation sequencing (NGS) data. Sambamba is a faster alternative to samtools that exploits multi-core processing to reduce processing time. Sambamba is being adopted at sequencing centers, not only because of its speed, but also because of additional functionality, including coverage analysis and powerful filtering capability.


A program that can chop a BAM index (BAI) file into small pieces. The program outputs a list of BAI files, each indexing a specified genomic interval. The output files are much smaller but remain compatible with existing software tools. The authors show that preprocessing BAI files with chopBAI can reduce I/O by more than 95% during the analysis of 10 kbp genomic regions, eventually enabling the joint analysis of more than 10,000 individuals. As sequencing becomes more common, chopBAI will be equally useful for analyzing large sequencing cohorts of other species where the BAI indexing scheme allows for fast access to small subsets of reads.


Reads local files as long as they have been indexed with tabix or tribble. JfxNgs is a computational package with a Java-based user interface. It can also access remote files if the hosting server supports ’ByteRange’ requests. The software is available in the jvarkit package. The main window provides tools for indexing BAM and VCF files. The VCF and BAM windows share common functionality: filtering the data with JavaScript, displaying the data in a very simple genome browser, and viewing the selected item in a web browser for a common database such as ExAC.


A tool for constructing the FM-index for a collection of DNA sequences. ropeBWT2 works by incrementally inserting one or multiple sequences into an existing pseudo-BWT position by position, starting from the end of the sequences. This algorithm can largely be considered a mixture of BCR and the dynamic FM-index. Nonetheless, ropeBWT2 is unique in that it may implicitly sort the input into reverse lexicographical order (RLO) or reverse-complement lexicographical order (RCLO) while building the index.


Generates an index for FASTA files. Fastahack is an application for indexing and extracting sequences and subsequences from FASTA files. The included library provides a FASTA reader and indexer that can be embedded into applications that would benefit from directly reading subsequences from FASTA files. This resource also uses the C function fseek64 to extract sequences and subsequences. This permits the fastest possible extraction and makes fastahack a useful method for bioinformaticians who need to quickly extract many subsequences from a reference FASTA sequence.
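
The index-then-seek approach can be sketched in Python as follows. This is not fastahack's code or its index format: `build_index` and `fetch` are hypothetical names, the per-sequence fields (length, byte offset, bases per line, bytes per line) are an assumed layout, and every sequence line of a record except possibly the last is assumed to have the same width.

```python
def build_index(path):
    """Return {name: (length, offset, line_bases, line_width)} for a FASTA file.

    offset is the byte position of the first base of the record.
    """
    index = {}
    name = None
    pos = 0
    with open(path, "rb") as fh:  # binary mode so positions are byte offsets
        for line in fh:
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                index[name] = [0, pos + len(line), 0, 0]
            elif name is not None and line.strip():
                entry = index[name]
                bases = len(line.rstrip(b"\r\n"))
                if entry[2] == 0:  # first sequence line fixes the layout
                    entry[2], entry[3] = bases, len(line)
                entry[0] += bases
            pos += len(line)
    return {k: tuple(v) for k, v in index.items()}


def fetch(path, index, name, start, end):
    """Return bases [start, end) of sequence `name` (0-based, half-open)."""
    length, offset, line_bases, line_width = index[name]
    end = min(end, length)
    if start >= end:
        return ""
    # Translate base coordinates into byte positions, skipping line breaks.
    first = offset + (start // line_bases) * line_width + start % line_bases
    last = offset + ((end - 1) // line_bases) * line_width + (end - 1) % line_bases
    with open(path, "rb") as fh:
        fh.seek(first)  # jump directly to the first requested base
        chunk = fh.read(last - first + 1)
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()
```

Because `fetch` seeks straight to the requested bytes, the cost of an extraction depends on the length of the subsequence rather than on the size of the file.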


A multi-threaded sort/merge tool for BAM files. NovoSort reduces run times through multi-threading and by combining sort and merge in one step. It uses a stable sort/merge algorithm that does not change the order of alignments with the same sort key, and it can optionally create a BAM index file. The sort/merge has two phases: the first phase sorts as many reads as possible in memory and then writes segments of sorted records to temporary disk files. The second phase merges the sorted fragments to produce the final sorted file.
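
The two-phase scheme described above can be sketched in Python on plain text records. This is not NovoSort's implementation: `external_sort` is a hypothetical name, it is single-threaded, the records are assumed to be newline-free strings rather than BAM alignments, and the merged result is returned as a list instead of being streamed to an output file.

```python
import heapq
import os
import tempfile


def external_sort(records, key, chunk_size):
    """Stable two-phase sort of an iterable of newline-free text records."""
    paths = []
    chunk = []

    def flush():
        chunk.sort(key=key)  # list.sort is stable
        fd, path = tempfile.mkstemp(suffix=".chunk")
        with os.fdopen(fd, "w") as fh:
            fh.writelines(rec + "\n" for rec in chunk)
        paths.append(path)
        chunk.clear()

    # Phase 1: sort as many records as fit in memory, spill each run to disk.
    for rec in records:
        chunk.append(rec)
        if len(chunk) >= chunk_size:
            flush()
    if chunk:
        flush()

    # Phase 2: merge the sorted runs. Runs are passed in creation order, so
    # records with equal keys keep their original relative order.
    files = [open(p) for p in paths]
    try:
        streams = [(line.rstrip("\n") for line in fh) for fh in files]
        return list(heapq.merge(*streams, key=key))
    finally:
        for fh in files:
            fh.close()
        for p in paths:
            os.remove(p)
```

A call such as `external_sort(lines, key=lambda r: int(r.split("\t")[1]), chunk_size=2)` sorts tab-separated records by their second field while keeping records with equal keys in input order, which is the stability property described above.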