FASTA/FASTQ file manipulation bioinformatics software tools

Many analysis pipelines involve initial data manipulation (e.g. reformatting, viewing or overview statistics) before downstream processing (e.g. quality control, adapter removal and alignment). Seemingly simple tasks like viewing the first few reads…
NGSUtils
Desktop

A suite of software tools for manipulating data common to next-generation sequencing experiments, such as FASTQ, BED and BAM format files. With modules that operate from FASTQ pre-processing through…

TriageTools
Desktop

A collection of tools for partitioning raw data (fastq reads) from high-throughput sequencing projects.

fqtools
Desktop

A fast and reliable FASTQ file manipulation suite that can process the full set of valid FASTQ files, including those with multi-line sequences, whilst identifying invalid files. fqtools is faster…

NGS-Bits
Desktop

Permits quality control of Next-Generation-Sequencing (NGS) tumor-normal experiments. NGS-Bits is separated into four steps: (1) gather information from raw reads, (2) map reads, (3) extract variant…

fastQ_brew
Desktop

Performs quality control, reformatting, filtering, and trimming of FASTQ formatted sequence datasets. fastQ_brew does not rely on any modules that are not currently contained within the Perl Core. It…

FASTdoop
Desktop

Allows users to manage FASTA and FASTQ files. FASTdoop is based on a wide range of experiments. It supports FASTA files containing one or more short sequences or a single very large sequence of arbitrary…

FAST (Fast Analysis of Sequences Toolbox)
Desktop

Allows users to analyze, filter, annotate or transform biological sequence data. FAST is able to perform automated sampling, permutations and bootstrapping of sequences and sites and compute a…

Barracoda
Web

Allows users to conduct DNA barcode sequencing data analysis. Barracoda provides a feature to orient sequences. The tool can aid in detecting experimental errors and biases, such as spill-over between wells,…

Fasta-O-Matic
Desktop

A quality control script that makes FASTA format files compatible for a variety of downstream bioinformatics tools. Fasta-O-Matic automates handling of common but minor format issues that otherwise…

V-REVCOMP
Desktop

Determines the orientation of the conserved loci of rRNA genes. V-REVCOMP is standalone software, based on hidden Markov models (HMM), that is able to detect and reorient reverse complementary…

pybio
Desktop

Permits users to perform basic genomics operations. pybio offers motif sequence search functions. It can classify alternative polyadenylation site pairs.

Fastahack
Desktop

Generates a FASTA index for FASTA files. Fastahack is an application for indexing and extracting sequences and subsequences from FASTA files. The included library provides a FASTA reader and indexer…

Fastaq
Desktop

Allows users to manipulate files in several formats, such as FASTA and FASTQ. Fastaq is able to recognize the format of the files uploaded. This tool manipulates sequences and quality scores if present, and…

Cdbfasta
Desktop

Permits indexing and retrieval of FASTA information in flat file databases. Cdbfasta can be used on compressed files. It is able to create an index file for a multi-FASTA file.

pasteseq
Desktop

Inserts one sequence into another at a specified position and writes the new, combined sequence to an output file. pasteseq is part of the EMBOSS package. This tool can be used as a simple sequence editor. It…

dbxfasta
Desktop

Indexes a flat file database of one or more files, and builds EMBOSS B+tree format index files. By default, dbxfasta will index the ID name and the accession number (if present). If they are present…

Needletail
Desktop

Provides a fast and well-tested set of functions that more specialized bioinformatics programs can use. Needletail is a minimal-copying FASTA/FASTQ parser and k-mer processing library for Rust. The…

descseq
Desktop

Alters the name or description of a sequence. descseq reads a sequence and writes it to file but with a different name and/or description. All other records including the sequence itself are left…

degapseq
Desktop

Removes non-alphabetic (e.g. gap) characters from sequences. degapseq reads one or more sequences and writes them out again but stripped of any non-alphabetic characters. Its main purpose is to remove…

revseq
Desktop

Reverses and complements a nucleotide sequence. revseq reads one or more nucleotide sequences and writes to file the reverse complement of each sequence. It can return just the reversed sequence or…
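
As an illustration of the operation described above, the following is a minimal Python sketch of reverse-complementing a nucleotide sequence. It is not revseq's actual implementation; the function name `reverse_complement` and the handling of only A, C, G, T, U and N are assumptions made for this example.

```python
# Illustrative sketch only; not the EMBOSS revseq source code.
# Assumption: only A, C, G, T, U and N (either case) need complementing;
# any other character is passed through unchanged.
COMPLEMENT = str.maketrans("ACGTUNacgtun", "TGCAANtgcaan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence."""
    # Complement every base, then reverse the resulting string.
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("GTATCGCTA"))
```

Reversing without complementing is simply `seq[::-1]`, which corresponds to the "just the reversed sequence" case mentioned in the description.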

sizeseq
Desktop

Sorts sequences by size. sizeseq sorts the sequences in ascending order.

prettyseq
Desktop

Writes a nucleotide sequence and its translation to file. prettyseq writes an output file containing in a clean format the sequence with the translation displayed beneath it. The translated nucleic…

nthseqset
Desktop

Reads and writes one set of sequences. nthseqset writes to file a single sequence alignment (set) from an input stream of sequence sets. The sequence set is specified by number, which is the order it…

notseq
Desktop

Writes to file a subset of an input stream of sequences. notseq is an EMBOSS tool. The list of sequence names or accession numbers to exclude from output is provided as a string. Optionally, the…

maskambignuc
Web
Desktop

Masks all ambiguity characters in nucleotide sequences with N. maskambignuc is an EMBOSS tool. Major sequence database sources defined as standard in EMBOSS installations include srs:embl,…

FASconCAT
Desktop

Concatenates different nucleotide, amino acid and structure sequence fragments of the same taxa into one supermatrix file in a format that can be used for phylogenetic purposes. FASconCAT extracts taxon…

splitsource
Desktop

Splits one or more sequences into original source sequences. splitsource processes the "source" features in the feature table. The "source" feature annotates the origin of a…

seqcount
Desktop

Reads and counts sequences. seqcount counts the frequency of occurrence of all possible short sequences up to a user given maximum length in a set of sequence data and then writes this frequency…

nthseq
Desktop

Writes to file a single sequence from an input stream of sequences. nthseq allows the user to specify the sequence by number, which is the order it appears in the input file. The user can specify the output file…

showorf
Desktop

Displays a nucleotide sequence and translation in pretty format. showorf writes an input nucleotide sequence and its protein translation to an output file in a clear format that is suitable for…

refseqget
Desktop

Gets reference sequence. refseqget reads a reference sequence and returns the data in one of the EMBOSS reference sequence formats.

extractseq
Desktop

Reads a sequence and writes sub-sequences from it to file. extractseq extracts regions from a sequence. The set of regions to extract is specified on the command-line or in a file as pairs of start…

diffseq
Desktop

Compares and reports features of two similar sequences. diffseq reads two sequences which typically are very similar or almost identical. It finds regions of overlap between the two sequences and…

textsearch
Desktop

Searches the textual description of sequences. textsearch looks for words (specified as a regular expression) in the description text of one or more input sequences. It writes an output file with…

cutseq
Desktop

Removes a section from a sequence. cutseq is an EMBOSS tool and a simple editing program that allows the user to cut out a region from a sequence by specifying the begin and end positions of the region to…

splitter
Desktop

Splits one or more input sequences into smaller, optionally overlapping, subsequences. splitter is an EMBOSS tool that divides a sequence into sub-sequences of 10,000 bases (the default size) with no…
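
The splitting described above can be sketched in a few lines of Python. This is an illustration, not splitter's actual code: the function name `split_sequence` and its `size` and `overlap` parameters are hypothetical, and the default overlap of 0 is an assumption. Only the 10,000-base default size comes from the description.

```python
# Illustrative sketch only; not the EMBOSS splitter source code.
# Assumptions: `size` defaults to 10,000 bases (as stated in the description)
# and `overlap` defaults to 0; both parameter names are hypothetical.
from typing import List


def split_sequence(seq: str, size: int = 10_000, overlap: int = 0) -> List[str]:
    """Split seq into consecutive chunks of at most `size` characters,
    each sharing `overlap` characters with the previous chunk."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(seq):
        chunks.append(seq[start:start + size])
        # Stop once a chunk reaches the end, so no trailing chunk is
        # fully contained in the previous one.
        if start + size >= len(seq):
            break
        start += step
    return chunks


if __name__ == "__main__":
    print(split_sequence("ABCDEFGHIJ", size=4))
    print(split_sequence("ABCDEFGHIJ", size=4, overlap=2))
```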

geecee
Desktop

Calculates the fraction of G+C bases of the input nucleic acid sequence(s). geecee is an EMBOSS tool that sums the number of G and C bases in the input sequence(s) and writes the result to file as…
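
A minimal Python sketch of the calculation described above follows. It is not geecee's implementation: the function name `gc_fraction` is hypothetical, and dividing by the full sequence length (including any ambiguity codes such as N) is an assumption made for this example.

```python
# Illustrative sketch only; not the EMBOSS geecee source code.
# Assumption: the denominator is the total sequence length, so ambiguity
# characters such as N count toward the length but not toward G+C.
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    upper = seq.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


if __name__ == "__main__":
    print(gc_fraction("GTATCGCTA"))
```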

shuffleseq
Desktop

Shuffles a set of sequences maintaining composition. shuffleseq reads one or more sequences and writes them out again in a random (shuffled) order. The number of shuffles may be specified. Only the…

prophet
Desktop

Scans one or more sequences with the supplied Gribskov or Henikoff profile (produced by prophecy). Prophet is an EMBOSS tool that writes the highest-scoring (gapped) alignment for each sequence to…

biosed
Desktop

Replaces or deletes sequence sections. biosed is a simple sequence editing utility that searches for a target subsequence in one or more input sequences and replaces it with an insert subsequence, or…

seqretsplit
Desktop

Reads sequences and writes them to individual files. seqretsplit is a variant of the standard program for reading and writing sequences, seqret. It performs exactly the same function except that when…

union
Desktop

Concatenates multiple sequences into a single sequence. union reads in several sequences, concatenates them and writes them out as a single sequence. The input is typically a list file containing…

marscan
Desktop

Finds matrix/scaffold recognition (MRS) signatures in DNA sequences. marscan reads a DNA sequence and writes a standard EMBOSS report file with details of the MRS signatures identified. It searches…

seqmatchall
Desktop

Performs an all-against-all word comparison of a sequence set. seqmatchall takes a set of sequences and does an all-against-all pairwise comparison of words of a specified size in the sequences, finding…

infoseq
Desktop

Displays basic information about sequences. infoseq displays on screen basic information about one or more input sequences, including: Uniform Sequence Address (USA), name, accession number, type (nucleic or…

File Chameleon
Web

Assists in reformatting existing genomic flat files. File Chameleon doesn't convert formats or merge files, only modifies existing files already available on the Ensembl FTP site. It tries to…

Fasta File Splitter
Desktop

Reads a protein FASTA file and splits it apart into a number of sections. Fasta File Splitter is a console application to break apart a large protein FASTA file into a series of smaller FASTA files,…

bioawk
Desktop

An extension to Brian Kernighan's awk, adding the support of several common biological data formats, including optionally gzip'ed BED, GFF, SAM, VCF, FASTA/Q and TAB-delimited formats with…

fastq-tools
Desktop

A collection of small and efficient programs for performing some common and uncommon tasks with FASTQ files.

seqmagick
Desktop

A little utility to expose the file format conversion in BioPython in a convenient way. Seqmagick can be used to query information about sequence files, convert between types, and modify sequence…

NGS-Cleaner
Desktop

An application that cleans large FASTQ/A formatted DNA sequence files containing multiple short-read sequences produced by Next Generation Sequencing platforms.

NGS-TOOLBOX
Desktop

This collection of simple Perl scripts is addressed to scientists doing research based on high-throughput genomic/transcriptomic data.

ea-utils
Desktop

Allows users to process biological sequencing data. ea-utils works with Illumina-based pipelines and can also run on other FASTQ files.

FASTX-Toolkit
Desktop

A collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing. Next-Generation sequencing machines usually produce FASTA or FASTQ files, containing multiple short-reads sequences…
