GATK statistics

Popular tool citations

De novo mutation detection, SNP detection, SNV detection, File format conversion, Somatic CNA detection, Genotype calling, File merging, File filtering, Indel detection, File sampling, CNV detection, Variant detection, Base quality recalibration, Indel realignment, Variant recalibration, Depth of coverage, File splitting


GATK specifications


Unique identifier: OMICS_19453
Alternative names: Genome Analysis ToolKit, GenomeAnalysisTK
Software type: Application/Script, Toolkit/Suite
Interface: Command line interface
Restrictions to use: Academic or non-commercial use
Output format: SAM, BAM, VCF, BED
Operating system: Unix/Linux, Mac OS, Windows
Programming languages: Java
License: MIT License
Computer skills: Advanced
Stability: Stable
High performance computing: Yes
Registration required: Yes
Maintained: Yes


  • AlleleBalance
  • AlleleBalanceBySample
  • AlleleCountBySample
  • AnalyzeCovariates
  • ApplyRecalibration
  • AS_FisherStrand
  • AS_InbreedingCoeff
  • AS_InsertSizeRankSum
  • AS_MappingQualityRankSumTest
  • AS_MQMateRankSumTest
  • AS_QualByDepth
  • AS_ReadPosRankSumTest
  • AS_RMSMappingQuality
  • AS_StrandOddsRatio
  • ASEReadCounter
  • BadCigarFilter
  • BadMateFilter
  • BaseCounts
  • BaseCountsBySample
  • BaseQualityRankSumTest
  • BaseQualitySumPerAlleleBySample
  • BaseRecalibrator
  • BeagleCodec
  • BedTableCodec
  • CalculateGenotypePosteriors
  • CallableLoci
  • CatVariants
  • ChromosomeCounts
  • ClippingRankSumTest
  • ClipReads
  • ClusteredReadPosition
  • CombineGVCFs
  • CombineVariants
  • CommandLineGATK
  • CompareCallableLoci
  • ContEst
  • CountBases
  • CountingFilteringIterator.CountingReadFilter
  • CountIntervals
  • CountLoci
  • CountMales
  • CountReadEvents
  • CountReads
  • CountRODs
  • CountRODsByRef
  • CountTerminusEvent
  • Coverage
  • DepthOfCoverage
  • DepthPerAlleleBySample
  • DepthPerSampleHC
  • DiagnoseTargets
  • DiffObjects
  • DuplicateReadFilter
  • ErrorRatePerCycle
  • ExcessHet
  • FailsVendorQualityCheckFilter
  • FastaAlternateReferenceMaker
  • FastaReferenceMaker
  • FastaStats
  • FindCoveredIntervals
  • FisherStrand
  • FlagStat
  • FractionInformativeReads
  • GatherBqsrReports
  • GATKPaperGenotyper
  • GCContent
  • GCContentByInterval
  • GenotypeConcordance
  • GenotypeGVCFs
  • GenotypeSummaries
  • HaplotypeCaller
  • HaplotypeResolver
  • HaplotypeScore
  • HardyWeinberg
  • HCMappingQualityFilter
  • HomopolymerRun
  • InbreedingCoeff
  • IndelRealigner
  • LeftAlignAndTrimVariants
  • LeftAlignIndels
  • LibraryReadFilter
  • LikelihoodRankSumTest
  • LowMQ
  • MalformedReadFilter
  • MappingQualityFilter
  • MappingQualityRankSumTest
  • MappingQualityUnavailableFilter
  • MappingQualityZero
  • MappingQualityZeroBySample
  • MappingQualityZeroFilter
  • MateSameStrandFilter
  • MaxInsertSizeFilter
  • MissingReadGroupFilter
  • MuTect2
  • Mutect2
  • MVLikelihoodRatio
  • NBaseCount
  • NoOriginalQualityScoresFilter
  • NotPrimaryAlignmentFilter
  • OverclippedReadFilter
  • OxoGReadCounts
  • PathSeq
  • PhaseByTransmission
  • Pileup
  • Platform454Filter
  • PlatformFilter
  • PlatformUnitFilter
  • PossibleDeNovo
  • PrintReads
  • PrintRODs
  • QCRef
  • QualByDepth
  • QualifyMissingIntervals
  • RandomlySplitVariants
  • RawHapMapCodec
  • ReadBackedPhasing
  • ReadClippingStats
  • ReadGroupBlackListFilter
  • ReadGroupProperties
  • ReadLengthDistribution
  • ReadLengthFilter
  • ReadNameFilter
  • ReadPosRankSumTest
  • ReadStrandFilter
  • RealignerTargetCreator
  • ReassignMappingQualityFilter
  • ReassignOneMappingQualityFilter
  • ReassignOriginalMQAfterIndelRealignmentFilter
  • ReCapSeg
  • RefSeqCodec
  • RegenotypeVariants
  • RMSMappingQuality
  • SAMPileupCodec
  • SampleFilter
  • SampleList
  • SAMReadCodec
  • SelectHeaders
  • SelectVariants
  • SimulateReadsForVariants
  • SingleReadGroupFilter
  • SpanningDeletions
  • splitNcigar
  • SplitNCigarReads
  • SplitSamFile
  • StrandAlleleCountsBySample
  • StrandBiasBySample
  • StrandOddsRatio
  • TableCodec
  • TableRecalibration
  • TandemRepeatAnnotator
  • TransmissionDisequilibriumTest
  • UnifiedGenotyper
  • UnmappedReadFilter
  • ValidateVariants
  • ValidationSiteSelector
  • VariantAnnotator
  • VariantEval
  • VariantFiltration
  • VariantRecalibrator
  • VariantsToAllelicPrimitives
  • VariantsToBinaryPed
  • VariantsToTable
  • VariantsToVCF
  • VariantType






Additional information

Publications for Genome Analysis ToolKit

GATK in pipelines

PMCID: 5754367
PMID: 29302025
DOI: 10.1038/s41467-017-02306-5

[…] version 0.7.7) was used to map all reads to ucsc hg19. pcr duplicates were removed from alignments using picard version 1.96. indels were realigned using the genome analysis toolkit (gatk). snvs and short indels were called using gatk haplotype caller (version 3.4-46). all variants were annotated with annovar and in-house scripts, and most likely protein […]

PMCID: 5756397
PMID: 29304727
DOI: 10.1186/s12864-017-4416-9

[…] were aligned to the chicken reference genome (galgal4) using burrows-wheeler alignment algorithm implemented in bwa and sorted using samtools. picard tools were used to mark duplicates and gatk was used for calling the snps. for more details on the preparation pipeline see reimer et al. the initial array data set contained 918 animals and 580,588 snps. snps misplaced […]

PMCID: 5760544
PMID: 29317692
DOI: 10.1038/s41598-017-18358-y

[…] tools were used to remove pcr-duplicated and multiple aligned reads. after filtering out low-quality reads with mapping quality < 13 and base quality < 13, snp calling was performed using gatk software. variants meeting the following three criteria were used for further analysis: (i) depth greater than 50x, (ii) quality value greater than 30, and (iii) allele frequency larger than 1% […]
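The three variant criteria quoted in the excerpt above (depth greater than 50x, quality greater than 30, allele frequency larger than 1%) can be illustrated with a minimal Python sketch. This is not the authors' code and does not use GATK itself: the record layout and the field names `depth`, `qual`, and `af` are hypothetical, chosen only for illustration.

```python
from typing import Dict, List

# Thresholds taken from the excerpt; everything else is an assumption.
MIN_DEPTH = 50    # (i) depth greater than 50x
MIN_QUAL = 30.0   # (ii) quality value greater than 30
MIN_AF = 0.01     # (iii) allele frequency larger than 1%


def passes_filters(variant: Dict[str, float]) -> bool:
    """Return True only if the variant meets all three criteria (strict >)."""
    return (
        variant["depth"] > MIN_DEPTH
        and variant["qual"] > MIN_QUAL
        and variant["af"] > MIN_AF
    )


def filter_variants(variants: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Keep the variants that pass every criterion, preserving input order."""
    return [v for v in variants if passes_filters(v)]


if __name__ == "__main__":
    # Hypothetical records, not real data.
    calls = [
        {"depth": 120, "qual": 45.0, "af": 0.05},
        {"depth": 40, "qual": 60.0, "af": 0.20},
    ]
    print(filter_variants(calls))
```

The comparisons are strict because the excerpt says "greater than" and "larger than"; a variant sitting exactly on a threshold is dropped under that reading.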

PMCID: 5766596
PMID: 29330436
DOI: 10.1038/s41598-017-18534-0

[…] with any of the functional databases were predicted by estscan v3.0.2. we used misa v1.0 to detect simple sequence repeats (ssrs; also known as microsatellite sequences) in our unigenes, and used gatk v3.4-0 to detect single nucleotide polymorphism (snp) variants among the individual fmd. we mapped clean reads to unigenes with bowtie2 v2.1.0. we calculated the gene expression level of reads […]

PMCID: 5768770
PMID: 29335443
DOI: 10.1038/s41467-017-02584-z

[…] using default parameters. duplicate reads were marked using picard tools after which realignment around known indels and base quality recalibration was performed at an individual sample level using gatk 2.7 version. somatic mutation calling was performed using mutect allowing up to 5 reads supporting the variant allele in the normal sample up to a maximum of 0.05 allele frequency. the passed […]


GATK in publications

PMCID: 5959919
PMID: 29777105
DOI: 10.1038/s41467-018-04256-y

[…] as the reference sequences for mapping and variant detection. the sequenced reads of taqed strain genome were mapped using a burrows–wheeler aligner (bwa), and small variants were called using the genome analysis toolkit (gatk). for arabidopsis, the genome sequence uploaded onto the tair10 database was used as the reference sequence for mapping and variant […]

PMCID: 5955905
PMID: 29769526
DOI: 10.1038/s41467-018-04332-3

[…] and aligned to the reference genome (hg19) with the bwa algorithm and processed with picard. polymorphic snp and indel sites and genotypes were called with the haplotypecaller from gatk v3.1–. the haplotypecaller algorithm is an assembly-based method that determines genotype likelihoods independently in each sample and then jointly considers data […]

PMCID: 5955954
PMID: 29769555
DOI: 10.1038/s41598-018-25654-8

[…] 870,407,901 trimmed reads. the remaining reads were mapped against the bos taurus umd3.1 genome using the “mem” algorithm of bwa version 0.7.9a-r786 with default settings. variants were called using gatk indelrealigner version 2.3-9-gdcdccbb, resulting in a variant call format (vcf) file. we then applied an iterative string search using the reference bovine genome build umd3.1 to compute […]

PMCID: 5955945
PMID: 29769567
DOI: 10.1038/s41598-018-25800-2

[…] sequence reads were mapped to the human genome (hg19) using burrows-wheeler alignment tool. duplicate read removal was performed using picard and samtools, and local alignment was optimized by the genome analysis toolkit. variant calling was only performed in targeted regions of cancerscan. somatic variant calling of each tumor was based on the results of cancerscan of tumor tissue and rna […]

PMCID: 5954055
PMID: 29765017
DOI: 10.1038/s41467-018-04011-3

[…] windows of 300 genes. esnp-karyotyping was performed as previously described. briefly, rna-seq reads were aligned to the genome (assembly version grch38) using tophat2 and snps were called using gatk haplotypecaller. called snps were filtered by read number, with snps expressed in <20 transcripts discarded, and minor allele frequency and allelic ratio (major to minor) was calculated […]


GATK institution(s)
National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, IL, USA; Mayo Clinic, Department of Research Services, Rochester, MN, USA; Mayo Clinic, Department of IT Executive Administration, Rochester, MN, USA; Mayo Clinic, Department of Health Sciences Research, Rochester, MN, USA; Institute for Genomic Biology, University of Illinois at Urbana-Champaign, IL, USA; Department of Crop Sciences, University of Illinois at Urbana-Champaign, IL, USA; Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, IL, USA; Mayo Clinic, Department of Biochemistry and Molecular Biology, Rochester, MN, USA; Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, IL, USA
GATK funding source(s)
Supported by the Mayo Clinic Center for Individualized Medicine and the Todd and Karen Wanek Program for Hypoplastic Left Heart Syndrome.

GATK reviews


Sangram keshari sahu

A well-maintained, actively developed tool for genomics that is also expanding to other NGS domains, such as transcriptomics. One of its best points is its availability across platforms: it runs from the command line, through the UI-based Galaxy interface, and more recently on all major cloud platforms, with that integration coming from the development team itself.

Weisheng Wu

A very versatile package for most analyses involving genomic variants. Documentation is very informative and updated quickly.