One of the core issues of Bioinformatics is dealing with a profusion of (often poorly defined or ambiguous) file formats. Some ad hoc simple human readable formats have over time attained the status of de facto standards. Source text: Cock et al., 2010.

PileLine

Implements a flexible command-line toolkit providing specific support to the management, filtering, comparison and annotation of genomic position (GP) files produced by next generation sequencing…

BAN (Best Alignment Normalization)

Applies all the variations in a variant call format (VCF) file to the reference genome to create a sample genome, and then recalls the variants by aligning this sample genome back with the reference…
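The first step described above, applying variants to a reference to obtain a sample sequence, can be sketched roughly as follows. This is not BAN's implementation: the function name `apply_variants` and the `(position, ref, alt)` tuple layout are hypothetical, positions are assumed to be 1-based as in VCF, and variants are assumed not to overlap.

```python
from typing import List, Tuple


def apply_variants(reference: str, variants: List[Tuple[int, str, str]]) -> str:
    """Return a sample sequence with each (pos, ref, alt) applied to the reference.

    Hypothetical helper: pos is 1-based, and variants must not overlap.
    """
    sequence = reference
    # Apply from the rightmost variant to the leftmost so that insertions and
    # deletions do not shift the coordinates of variants still to be applied.
    for pos, ref, alt in sorted(variants, key=lambda v: v[0], reverse=True):
        start = pos - 1  # convert to a 0-based string index
        if sequence[start:start + len(ref)] != ref:
            raise ValueError(f"REF allele {ref!r} does not match reference at {pos}")
        sequence = sequence[:start] + alt + sequence[start + len(ref):]
    return sequence


# One substitution, one deletion and one insertion on a toy reference.
sample = apply_variants(
    "ACGTACGT",
    [(2, "C", "T"), (4, "TA", "T"), (7, "G", "GAA")],
)
print(sample)
```

Applying variants in descending position order is one simple way to avoid tracking coordinate offsets; a real tool would also need to handle overlapping and multi-allelic records.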

SMD (Single-molecule dataset)

Adoption of a common, standard data file format for sharing raw single-molecule data and analysis outcomes is a critical step for the emerging and powerful single-molecule field, which will benefit…

FASTQ

Stores sequences and Phred qualities in a single file. FASTQ format is concise and compact. It has emerged as a common file format for sharing sequencing read data combining both the sequence and an…
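As an illustration of how sequence and quality sit side by side in a FASTQ record, here is a minimal sketch of a reader. It assumes the common four-line record layout and the Phred+33 (Sanger) quality encoding; older files may use other offsets. The function name `parse_fastq` is hypothetical.

```python
from typing import Iterator, List, Tuple


def parse_fastq(lines: List[str]) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read_id, sequence, phred_scores) from four-line FASTQ records.

    Assumes exactly four lines per record and Phred+33 quality encoding.
    """
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = (line.rstrip("\n") for line in lines[i:i + 4])
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        # Each quality character encodes one Phred score as ASCII code minus 33.
        scores = [ord(ch) - 33 for ch in qual]
        yield header[1:], seq, scores


example = ["@read1", "GATTACA", "+", "IIIIHH#"]
for read_id, seq, scores in parse_fastq(example):
    print(read_id, seq, scores)
```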

bedGraph format

Allows display of continuous-valued data in track format. This display type is useful for probability scores and transcriptome data.
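A bedGraph data line carries four fields: chromosome, start, end and a value, with zero-based, half-open coordinates. The sketch below reads such lines and computes a length-weighted mean of the values; the function names are hypothetical, and `track`, `browser` and comment lines are simply skipped.

```python
from typing import Iterable, Iterator, List, Tuple

Interval = Tuple[str, int, int, float]


def parse_bedgraph(lines: Iterable[str]) -> Iterator[Interval]:
    """Yield (chrom, start, end, value) for each bedGraph data line."""
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments and track/browser declarations.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        yield chrom, int(start), int(end), float(value)


def weighted_mean(intervals: List[Interval]) -> float:
    """Mean value per base, weighting each interval by its length (end - start)."""
    covered = sum(end - start for _, start, end, _ in intervals)
    if covered == 0:
        raise ValueError("no bases covered")
    total = sum((end - start) * value for _, start, end, value in intervals)
    return total / covered


example = [
    "track type=bedGraph name=example",
    "chr1 0 100 0.5",
    "chr1 100 300 2.0",
]
intervals = list(parse_bedgraph(example))
print(weighted_mean(intervals))
```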

bigBed format

Stores annotation items that can either be simple, or a linked collection of exons, much as BED files do.

FASTA format

Used to specify the reference sequence for an imported genome. Each sequence in the FASTA file represents the sequence for a chromosome.
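To illustrate the one-sequence-per-chromosome layout, the following sketch loads FASTA text into a dictionary keyed by sequence name. It assumes the usual convention that a header line starts with `>` and that the name is the first whitespace-delimited token; the function name `read_fasta` is hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {sequence_name: sequence} from FASTA-formatted lines."""
    sequences: Dict[str, str] = {}
    name: Optional[str] = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                sequences[name] = "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        else:
            if name is None:
                raise ValueError("sequence data found before the first header")
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


genome = read_fasta([">chr1 example", "ACGT", "ACGT", ">chr2", "GG"])
for chrom, seq in genome.items():
    print(chrom, len(seq))
```

For real genomes, holding every chromosome in memory may be impractical; indexed access is the usual alternative.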

HDF5

A data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex…

SRF (Sequence Read Format)

A generic format for DNA sequence data. The primary motivation for creating SRF has been to enable a single format capable of storing data generated by any DNA sequencing technology.

SFF (Standard Flowgram Format)

Used to store the information on one or many 454 Sequencing reads and their trace data.

BAM format

The compressed binary version of the Sequence Alignment/Map (SAM) format, a compact and index-able representation of nucleotide sequence alignments.

bigWig format

Used for dense, continuous data that is displayed in the Genome Browser as a graph.

BED format (Browser Extensible Data format)

Provides a flexible way to define the data lines that are displayed in an annotation track.
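The flexibility comes from having three required fields (chromosome, start, end) followed by optional ones such as name, score and strand. A minimal sketch of a reader for the first six fields is shown below; it assumes tab-separated lines and zero-based, half-open coordinates, and the names `BedFeature` and `parse_bed_line` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BedFeature:
    chrom: str
    start: int  # zero-based, inclusive
    end: int    # zero-based, exclusive
    name: Optional[str] = None
    score: Optional[int] = None
    strand: Optional[str] = None

    @property
    def length(self) -> int:
        # Half-open coordinates make the length a plain difference.
        return self.end - self.start


def parse_bed_line(line: str) -> BedFeature:
    """Parse the first six fields of a tab-separated BED data line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chrom, start and end")
    feature = BedFeature(fields[0], int(fields[1]), int(fields[2]))
    if len(fields) > 3:
        feature.name = fields[3]
    if len(fields) > 4 and fields[4] != ".":
        feature.score = int(fields[4])
    if len(fields) > 5:
        feature.strand = fields[5]
    return feature


feature = parse_bed_line("chr7\t127471196\t127472363\tPos1\t0\t+")
print(feature.chrom, feature.length, feature.strand)
```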

WIG format (Wiggle format)

An older format for display of dense, continuous data such as GC percent, probability scores, and transcriptome data.

GFF (Generic Feature Format)

A standard for describing genome annotation data.
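As a rough illustration of what an annotation record looks like, the sketch below parses one line under the assumption that the file follows the GFF3 layout: nine tab-separated columns with `key=value` attributes separated by semicolons, and 1-based inclusive coordinates. The function name `parse_gff3_line` is hypothetical, and percent-encoded characters in attributes are left undecoded.

```python
from typing import Any, Dict


def parse_gff3_line(line: str) -> Dict[str, Any]:
    """Parse one GFF3 feature line into a dictionary (assumes nine columns)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    attributes: Dict[str, str] = {}
    for item in attrs.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attributes[key] = value
    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),  # 1-based, inclusive
        "end": int(end),      # 1-based, inclusive
        "score": None if score == "." else float(score),
        "strand": strand,
        "phase": None if phase == "." else int(phase),
        "attributes": attributes,
    }


gene = parse_gff3_line("chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=gene1;Name=demo")
# Inclusive coordinates mean the length is end - start + 1.
print(gene["type"], gene["end"] - gene["start"] + 1, gene["attributes"]["ID"])
```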

GVF (Genome Variation Format)

An extension of Generic Feature Format version 3 (GFF3): a simple tab-delimited format for DNA variant files that uses the Sequence Ontology to describe genome variation data.

GLF

A format for storing marginal likelihoods for next-generation sequence data, conditional on a set of possible genotypes.

SAM format (Sequence Alignment/Map format)

A generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by different sequencing platforms.
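To show how an alignment is encoded, here is a minimal sketch that reads the mandatory fields of one SAM alignment line, decodes two flag bits, and uses the CIGAR string to find where the alignment ends on the reference. The function names are hypothetical; header lines (starting with `@`) are assumed to have been skipped by the caller, and optional tag fields are ignored.

```python
import re
from typing import Any, Dict

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that advance the position on the reference sequence.
REF_CONSUMING = set("MDN=X")


def reference_span(cigar: str) -> int:
    """Number of reference bases covered by an alignment, from its CIGAR."""
    if cigar == "*":
        return 0
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


def parse_sam_line(line: str) -> Dict[str, Any]:
    """Parse the mandatory fields of one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has 11 mandatory fields")
    flag = int(fields[1])
    pos = int(fields[3])  # 1-based leftmost mapping position
    span = reference_span(fields[5])
    return {
        "qname": fields[0],
        "rname": fields[2],
        "pos": pos,
        "end": pos + span - 1 if span else pos,  # 1-based, inclusive
        "mapq": int(fields[4]),
        "unmapped": bool(flag & 0x4),
        "reverse": bool(flag & 0x10),
    }


aln = parse_sam_line(
    "read1\t16\tchr1\t100\t60\t5M2I3M1D4M\t*\t0\t0\tACGTACGTACGTAC\tIIIIIIIIIIIIII"
)
print(aln["rname"], aln["pos"], aln["end"], aln["reverse"])
```

For BAM files, or for SAM files of any size, an established library is the more practical route than hand-written parsing.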

VCF (Variant Call Format)

A generic format for storing DNA polymorphism data such as SNPs, insertions, deletions and structural variants, together with rich annotations.
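The sketch below illustrates how one VCF data line carries both the variant and its annotations. It reads the eight fixed columns, splits the INFO field into a dictionary, and labels each alternate allele by comparing allele lengths. The function names and the classification labels are hypothetical simplifications; header lines starting with `#` are assumed to have been skipped, and per-sample genotype columns are ignored.

```python
from typing import Any, Dict


def classify(ref: str, alt: str) -> str:
    """Rough variant type from allele lengths (a simplification)."""
    if alt.startswith("<") or alt == "*":
        return "symbolic"  # e.g. structural variant placeholders
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) < len(alt):
        return "insertion"
    if len(ref) > len(alt):
        return "deletion"
    return "other"


def parse_vcf_line(line: str) -> Dict[str, Any]:
    """Parse the eight fixed columns of one VCF data line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line has at least 8 columns")
    chrom, pos, vid, ref, alt, qual, filt, info = fields[:8]
    info_dict: Dict[str, Any] = {}
    if info != ".":
        for entry in info.split(";"):
            key, _, value = entry.partition("=")
            # Keys without a value are flags and are stored as True.
            info_dict[key] = value if value else True
    alts = alt.split(",")
    return {
        "chrom": chrom,
        "pos": int(pos),  # 1-based
        "id": vid,
        "ref": ref,
        "alts": alts,
        "types": [classify(ref, a) for a in alts],
        "filter": filt,
        "info": info_dict,
    }


variant = parse_vcf_line("20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;DB")
print(variant["pos"], variant["types"], variant["info"])
```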
