Duplicate read removal software tools | Whole-genome sequencing data analysis

The presence of duplicates introduced by PCR amplification is a major issue in paired short reads from next-generation sequencing platforms. These duplicates might have a serious impact on research applications, such as scaffolding in whole-genome…

Tally

A package to deduplicate sequence fragments. Tally removes redundancy from sequence files by collapsing identical reads to a single entry while recording the number of instances of each. It can also…
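
As a rough illustration of the collapsing idea described above (not Tally's actual implementation or command-line interface), the sketch below groups identical sequences and records how many times each one occurred. The function name `collapse_identical_reads` and the sample reads are hypothetical.

```python
from collections import Counter


def collapse_identical_reads(reads):
    """Collapse identical sequences into (sequence, count) pairs.

    Illustrative only: a hypothetical helper, not Tally's algorithm or API.
    Sequences are returned in order of first appearance.
    """
    # Counter is a dict subclass, so it keeps first-insertion order (Python 3.7+).
    counts = Counter(reads)
    return list(counts.items())


# Made-up reads for demonstration.
reads = ["ACGT", "TTGA", "ACGT", "ACGT", "TTGA", "GGCC"]
collapsed = collapse_identical_reads(reads)
for seq, n in collapsed:
    print(seq, n)
```

Each distinct sequence is kept once, together with its multiplicity, so the abundance information is not lost when the redundant copies are dropped.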

MarDRe

Processes single-end and paired-end reads from FASTQ/FASTA datasets. MarDRe removes near-duplicate reads by using a de novo MapReduce method. This avoids the analysis of unnecessary reads.…

cd-hit-454

A program to identify artificial duplicates from raw 454 sequencing reads, including exact duplicates and near-identical duplicates.

DupRecover

A maximum likelihood estimator for sampling-induced read duplication in deep sequencing experiments.

FastUniq

Removes duplicates in paired short reads. FastUniq compares sequences between read pairs to identify duplicates. The software can be used with flexibility in almost all next-generation sequencing…
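
To make the pair-level comparison concrete, here is a minimal sketch in which a read pair counts as a duplicate only when both mates match a pair already seen. It assumes exact sequence matching and is not FastUniq's implementation; the function name `dedup_read_pairs` and the sample pairs are hypothetical.

```python
def dedup_read_pairs(pairs):
    """Keep the first occurrence of each (read1, read2) sequence pair.

    Hypothetical sketch assuming exact matching of both mates;
    this is not FastUniq's algorithm or interface.
    """
    seen = set()
    unique = []
    for read1, read2 in pairs:
        key = (read1, read2)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Made-up read pairs for demonstration.
pairs = [
    ("ACGT", "TTAA"),
    ("ACGT", "TTAA"),  # exact duplicate pair: dropped
    ("ACGT", "GGCC"),  # same first mate, different second mate: kept
    ("CCGG", "TTAA"),
]
unique_pairs = dedup_read_pairs(pairs)
for read1, read2 in unique_pairs:
    print(read1, read2)
```

Comparing the pair as a unit, rather than each mate on its own, avoids discarding pairs that merely share one mate with another fragment.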

ParDRe

A de novo parallel tool to remove duplicated and near-duplicated reads through the clustering of single-end or paired-end sequences from FASTA or FASTQ files. ParDRe uses a novel bitwise approach to…

Fulcrum

Condenses redundant reads from high-throughput sequencing studies.

JATAC

Identifies duplicate reads based on the flowgram. The distance calculation in JATAC is a more robust way of finding duplicates, as it first identifies read pairs with different homopolymer lengths at…

QUASR (Quality Assessment of Short Read)

A lightweight pipeline written to process and analyse next-generation sequencing (NGS) data from Illumina, 454, and Ion Torrent platforms.

SAMBLASTER

A tool to mark duplicates and extract discordant and split reads from SAM files.

RepeatSoaker

Removes reads overlapping low-complexity (repeat) regions from aligned sequencing data. RepeatSoaker helps to emphasize the biological signals within the data, reflected by more significant p-values…

BIGpre

A quality assessment package for next-generation sequencing data. BIGpre contains all the functions of other quality assessment software, such as the correlation between forward and reverse reads, read…

Super Deduper

Examines and uses only a small portion of each read’s sequence in order to identify and remove PCR and/or optical duplicates. Super Deduper is a reference-independent technique that performs better…
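
The idea of keying on a small portion of each read can be sketched as follows. This is an illustration under stated assumptions, not Super Deduper's implementation: the function name `dedup_by_subsequence` is hypothetical, and the `start` and `length` values are arbitrary choices for the example, not the tool's defaults.

```python
def dedup_by_subsequence(reads, start=2, length=4):
    """Keep the first read seen for each key, where the key is a short
    window of the read rather than the full sequence.

    Hypothetical sketch; `start` and `length` are illustrative values,
    not Super Deduper's parameters.
    """
    seen = set()
    kept = []
    for read in reads:
        key = read[start:start + length]  # only this window is compared
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


# Made-up reads: the first two share the window "GGCC" but differ elsewhere.
reads = ["AAGGCCTTAC", "TTGGCCTTGA", "AACCGGAATT"]
kept_reads = dedup_by_subsequence(reads)
for read in kept_reads:
    print(read)
```

Because only the window is compared, reads that differ outside it are still treated as duplicates, and no reference genome is needed to make the decision.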

G-CNV (GPU-copy number variation)

A graphics processing unit (GPU)-based tool for preparing data to detect copy number variations (CNVs) with read-depth methods. G-CNV can be used to (i) filter low-quality sequences, (ii) mask…

DALIGNER

Finds all significant local alignments between reads. DALIGNER can also be used as a general read mapper and string-to-string comparison tool, as a “read” can now be a DNA sequence that is as…

biobambam

An API for efficient BAM file reading supporting the efficient collation of alignments by read name without performing a complete resorting of the input file.

Picard

A set of tools (in Java) for working with next-generation sequencing data in the BAM format.

PyroTrimmer

Removes the barcodes, linkers, and primers, trims sequence regions with low quality scores, and filters out low-quality sequence reads. Although these functions have previously been implemented in…

Duplicate reads removal

This tool is particularly well suited to handling duplicate reads that arise from PCR amplification errors, which can have a negative effect because a certain sequence is represented in artificially high…

ngscmd

A C program to manipulate next-generation sequence data files. The ngscmd program can work on a single FASTQ input file, as well as mate-pair files. The FASTQ files can be input into the ngscmd…
