Kraken specifications


Unique identifier OMICS_01057
Name Kraken
Software type Toolkit/Suite
Interface Command line interface
Restrictions to use None
Operating system Unix/Linux, Mac OS
Programming languages R
License GNU General Public License version 2.0
Computer skills Advanced
Stability Stable
Source code URL
Maintained Yes


  • Minion
  • Reaper
  • Sequence Imp
  • Tally


  • Anton Enright

PMCID: 5459504
PMID: 28542189
DOI: 10.1371/journal.pntd.0005559

[…] we used previously generated small RNA-seq libraries from schistosomula stages of S. mansoni to assist our description of the sma-mir-277 locus (European Nucleotide Archive study PRJEB3190). Kraken was used to remove adapter contamination from libraries and collapse identical reads into single sequences while maintaining annotated depth information (supplementary ). Reads […]
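The collapsing step described in the excerpt above (identical reads merged into one sequence, with read depth retained) can be illustrated with a minimal Python sketch. This is not Tally's implementation or its output format; the function name `collapse_reads` and the example reads are hypothetical and chosen only for illustration.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse identical sequences, keeping how often each one was seen.

    Hypothetical helper for illustration; not part of Kraken or Tally.
    """
    counts = Counter(reads)
    # Most abundant sequence first; ties are broken alphabetically
    # so that the output order is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up example reads (assumption, not data from any cited study).
reads = ["ACGT", "TTGA", "ACGT", "ACGT", "TTGA", "GGCC"]
collapsed = collapse_reads(reads)

for sequence, depth in collapsed:
    print(sequence, depth)
```

Each unique sequence is reported once together with its depth, so downstream steps can work on fewer records without losing abundance information.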

PMCID: 5474745
PMID: 28608850
DOI: 10.1038/ncomms15691

[…] DNase-treated total RNA using RiboZero and ScriptSeq systems (Epicentre/Illumina) and run on an Illumina 2500 sequencer to obtain 100 base paired-end reads. Low quality reads were filtered out by Kraken. The resulting filtered reads were mapped to the mouse genome version mm10 using TopHat. Mapped reads were counted with htseq-count and read counts were normalized using DESeq2 (ref. ). […]

PMCID: 4886428
PMID: 27245778
DOI: 10.1186/s13073-016-0315-y

[…] were sequenced in one lane of a sequencing flow cell; 16–18 million reads per sample was obtained. Sequencing results from small RNA libraries were 3′ adapter trimmed and de-duplicated using the Reaper and Tally command-line tools from Kraken. The processed small RNA-seq reads were mapped to all available Mus musculus mature miRNA sequences (miRBase v21). Read mapping was performed […]

PMCID: 4940957
PMID: 27401977
DOI: 10.1186/s12864-016-2776-1

[…] using standard Illumina TruSeq small RNA library preparation kits, and sequenced using the Illumina HiSeq 2500 platform to obtain 37 bp single-end reads. Reads were quality-trimmed using the Kraken set of tools, and those reads with a length between 20 and 25 nt were selected for subsequent analyses. Mapping of the sRNA reads was performed using the Bowtie 1.1.2 version to a mixed […]

PMCID: 5011829
PMID: 27599549
DOI: 10.1186/s12866-016-0818-0

[…] at a specific location, and not on the number of reads we obtained at a specific location, which is likely to be highly biased due to PCR artefacts. We thus first deduplicated the reads using Tally, and then used Bowtie2 to align the reads to the Shigella flexneri 2a 2457T genome and the Shigella flexneri 2a str. 301 plasmid pCP301. The sequence of the S. flexneri 2457T plasmid […]

PMCID: 5876398
PMID: 29599503
DOI: 10.1038/s41467-018-03668-0

[…] and duplication rates with FastQC (Andrews S. 2010, FastQC: a quality control tool for high throughput sequence data. Available online at []). Reaper version 13–100 was employed to trim reads after a quality drop below a mean of Q20 in a window of 10 nucleotides. Only reads between 30 and 150 nucleotides were cleared for further analyses. […]
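The trimming rule quoted in the excerpt above (cut a read once the mean quality in a 10-nucleotide window drops below Q20, then keep reads of 30 to 150 nucleotides) can be sketched roughly as follows. This is an illustration under stated assumptions, not Reaper's actual algorithm: cutting at the start of the first failing window is an assumption, and the names `trim_at_quality_drop` and `passes_length_filter` are hypothetical.

```python
def trim_at_quality_drop(seq, quals, window=10, min_mean=20):
    """Trim a read at the first window whose mean quality is below min_mean.

    Assumption: the read is cut at the start of the first failing window.
    Reads shorter than the window are returned unchanged.
    """
    for start in range(len(quals) - window + 1):
        window_mean = sum(quals[start:start + window]) / window
        if window_mean < min_mean:
            return seq[:start]
    return seq


def passes_length_filter(seq, min_len=30, max_len=150):
    """Keep only reads within the length range quoted in the excerpt."""
    return min_len <= len(seq) <= max_len


# Made-up read: 25 high-quality bases followed by 15 low-quality bases.
seq = "A" * 40
quals = [30] * 25 + [5] * 15

trimmed = trim_at_quality_drop(seq, quals)
print(len(trimmed), passes_length_filter(trimmed))
```

Other cut positions (for example, the first low-quality base inside the failing window) are equally plausible readings of the quoted description; the sketch picks one for concreteness.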

PMCID: 5874924
PMID: 29559567
DOI: 10.1128/mBio.00024-18

[…] were removed with fastq_screen v. 0.9.3 (Babraham Bioinformatics); remaining reads were further filtered for low complexity with Reaper v. 15-065. Reads were aligned against the Spiroplasma poulsonii MSRO (v2) genome using STAR v. 2.5.2b. The number of read counts per gene locus was summarized with htseq-count v. 0.6.1 […]

PMCID: 5738828
PMID: 29262846
DOI: 10.1186/s13104-017-3061-3

[…] 6 bp length. Sequencing reads were trimmed using the software cutadapt (version 1.4.2), requiring a minimum 4 bp overlap for adapter trimming. Duplicates were removed from the trimmed reads using Tally (version 14-020), and 1,500,000 reads subsampled in order to estimate the fragment length distribution of the total DNA (both endogenous and contaminant) recovered using each treatment (addit […]

PMCID: 5670347
PMID: 29163195
DOI: 10.3389/fphys.2017.00849

[…] resulting in minimum of 28 M reads per library with 1 × 75 bp single end setup. The resulting raw reads were assessed for quality, adapter content and duplication rates with FastQC (Andrews, ). Reaper version 13–100 was employed to trim reads after a quality drop below a mean of Q20 in a window of 10 nucleotides (Davis et al., ). Only reads between 30 and 150 nucleotides were cleared […]

PMCID: 5532819
PMID: 28751381
DOI: 10.1128/genomeA.00562-17

[…] 4,152 bp, representing ~25× coverage. A paired-end 101-bp Illumina library was constructed and sequenced at Macrogen (Rockville, MD) from 1 µg of DNA. Illumina sequence data were deduplicated using Tally, adapter and quality trimmed using Trim Galore version 0.4.3, and errors corrected using SPAdes version 3.10.0, resulting […]

EMBL – European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK; National Laboratory of Genomics for Biodiversity (Langebio), Cinvestav, Irapuato, Guanajuato, Mexico
Supported by the EU FP7 (SIROCCO, LSHG-CT2006-037900) and BBSRC UK (BB/01589X/1).

