GATK statistics

GATK specifications


Unique identifier: OMICS_19453
Alternative names: Genome Analysis ToolKit, GenomeAnalysisTK
Software type: Application/Script, Toolkit/Suite
Interface: Command line interface
Restrictions to use: Academic or non-commercial use
Output format: SAM, BAM, VCF, BED
Operating system: Unix/Linux, Mac OS, Windows
Programming languages: Java
License: MIT License
Computer skills: Advanced
Stability: Stable
High performance computing: Yes
Registration required: Yes
Maintained: Yes


  • AlleleBalance
  • AlleleBalanceBySample
  • AlleleCountBySample
  • AnalyzeCovariates
  • ApplyRecalibration
  • AS_FisherStrand
  • AS_InbreedingCoeff
  • AS_InsertSizeRankSum
  • AS_MappingQualityRankSumTest
  • AS_MQMateRankSumTest
  • AS_QualByDepth
  • AS_ReadPosRankSumTest
  • AS_RMSMappingQuality
  • AS_StrandOddsRatio
  • ASEReadCounter
  • BadCigarFilter
  • BadMateFilter
  • BaseCounts
  • BaseCountsBySample
  • BaseQualityRankSumTest
  • BaseQualitySumPerAlleleBySample
  • BaseRecalibrator
  • BeagleCodec
  • BedTableCodec
  • CalculateGenotypePosteriors
  • CallableLoci
  • CatVariants
  • ChromosomeCounts
  • ClippingRankSumTest
  • ClipReads
  • ClusteredReadPosition
  • CombineGVCFs
  • CombineVariants
  • CommandLineGATK
  • CompareCallableLoci
  • ContEst
  • CountBases
  • CountingFilteringIterator.CountingReadFilter
  • CountIntervals
  • CountLoci
  • CountMales
  • CountReadEvents
  • CountReads
  • CountRODs
  • CountRODsByRef
  • CountTerminusEvent
  • Coverage
  • DepthOfCoverage
  • DepthPerAlleleBySample
  • DepthPerSampleHC
  • DiagnoseTargets
  • DiffObjects
  • DuplicateReadFilter
  • ErrorRatePerCycle
  • ExcessHet
  • FailsVendorQualityCheckFilter
  • FastaAlternateReferenceMaker
  • FastaReferenceMaker
  • FastaStats
  • FindCoveredIntervals
  • FisherStrand
  • FlagStat
  • FractionInformativeReads
  • GatherBqsrReports
  • GATKPaperGenotyper
  • GCContent
  • GCContentByInterval
  • GenotypeConcordance
  • GenotypeGVCFs
  • GenotypeSummaries
  • HaplotypeCaller
  • HaplotypeResolver
  • HaplotypeScore
  • HardyWeinberg
  • HCMappingQualityFilter
  • HomopolymerRun
  • InbreedingCoeff
  • IndelRealigner
  • LeftAlignAndTrimVariants
  • LeftAlignIndels
  • LibraryReadFilter
  • LikelihoodRankSumTest
  • LowMQ
  • MalformedReadFilter
  • MappingQualityFilter
  • MappingQualityRankSumTest
  • MappingQualityUnavailableFilter
  • MappingQualityZero
  • MappingQualityZeroBySample
  • MappingQualityZeroFilter
  • MateSameStrandFilter
  • MaxInsertSizeFilter
  • MissingReadGroupFilter
  • MuTect2
  • Mutect2
  • MVLikelihoodRatio
  • NBaseCount
  • NoOriginalQualityScoresFilter
  • NotPrimaryAlignmentFilter
  • OverclippedReadFilter
  • OxoGReadCounts
  • PathSeq
  • PhaseByTransmission
  • Pileup
  • Platform454Filter
  • PlatformFilter
  • PlatformUnitFilter
  • PossibleDeNovo
  • PrintReads
  • PrintRODs
  • QCRef
  • QualByDepth
  • QualifyMissingIntervals
  • RandomlySplitVariants
  • RawHapMapCodec
  • ReadBackedPhasing
  • ReadClippingStats
  • ReadGroupBlackListFilter
  • ReadGroupProperties
  • ReadLengthDistribution
  • ReadLengthFilter
  • ReadNameFilter
  • ReadPosRankSumTest
  • ReadStrandFilter
  • RealignerTargetCreator
  • ReassignMappingQualityFilter
  • ReassignOneMappingQualityFilter
  • ReassignOriginalMQAfterIndelRealignmentFilter
  • ReCapSeg
  • RefSeqCodec
  • RegenotypeVariants
  • RMSMappingQuality
  • SAMPileupCodec
  • SampleFilter
  • SampleList
  • SAMReadCodec
  • SelectHeaders
  • SelectVariants
  • SimulateReadsForVariants
  • SingleReadGroupFilter
  • SpanningDeletions
  • splitNcigar
  • SplitNCigarReads
  • SplitSamFile
  • StrandAlleleCountsBySample
  • StrandBiasBySample
  • StrandOddsRatio
  • TableCodec
  • TableRecalibration
  • TandemRepeatAnnotator
  • TransmissionDisequilibriumTest
  • UnifiedGenotyper
  • UnmappedReadFilter
  • ValidateVariants
  • ValidationSiteSelector
  • VariantAnnotator
  • VariantEval
  • VariantFiltration
  • VariantRecalibrator
  • VariantsToAllelicPrimitives
  • VariantsToBinaryPed
  • VariantsToTable
  • VariantsToVCF
  • VariantType




No version available



  • Geraldine Van der Auwera
  • Liudmila Mainzer

Additional information

Publications for Genome Analysis ToolKit

GATK citations


Phenotypic diversification by enhanced genome restructuring after induction of multiple DNA double strand breaks

Nat Commun
PMCID: 5959919
PMID: 29777105
DOI: 10.1038/s41467-018-04256-y

[…] ome were annotated as SGCs. Because homologous TLs, BIRs, and SGCs represent homologous rearrangements, approximate rearrangement positions were notated. Small variants were detected using Picard and GATK. SNVs and InDels reported in AT-rich regions, rDNA regions, telomeres, and regions with too low-coverage (<50% of the average coverage) and that are often seen especially around the large chromos […]


Mutations in six nephrosis genes delineate a pathogenic pathway amenable to treatment

Nat Commun
PMCID: 5958119
PMID: 29773874
DOI: 10.1038/s41467-018-04193-w

[…] ed reads were aligned to the hg19 human reference using Novoalign V2.08.05, and single nucleotide variants (SNVs) and insertions and/or deletions (indels) were called using the Genome Analysis Toolkit (GATK) v1.6-13. […]


A whole genome sequence study identifies genetic risk factors for neuromyelitis optica

Nat Commun
PMCID: 5955905
PMID: 29769526
DOI: 10.1038/s41467-018-04332-3

[…] o the reference genome (hg19) with the BWA algorithm and processed with Picard. Polymorphic SNP and indel sites and genotypes were called with the HaplotypeCaller from GATK v3.1–. The HaplotypeCaller algorithm is an assembly-based method that determines genotype likelihoods independently in each sample and then jointly considers data from all samples in the cohort t […]


Genomic alterations of ground glass nodular lung adenocarcinoma

Sci Rep
PMCID: 5955945
PMID: 29769567
DOI: 10.1038/s41598-018-25800-2

[…] Sequence reads were mapped to the human genome (hg19) using Burrows-Wheeler Alignment tool. Duplicate read removal was performed using Picard and Samtools, and local alignment was optimized by The Genome Analysis Toolkit. Variant calling was only performed in targeted regions of CancerSCAN. Somatic variant calling of each tumor was based on the results of CancerSCAN of tumor tissue and RNA sequ […]


Assessment of established techniques to determine developmental and malignant potential of human pluripotent stem cells

Nat Commun
PMCID: 5954055
PMID: 29765017
DOI: 10.1038/s41467-018-04011-3

[…] eSNP-karyotyping was performed as previously described. Briefly, RNA-seq reads were aligned to the genome (assembly version GRCh38) using Tophat2 and SNPs were called using GATK HaplotypeCaller. Called SNPs were filtered by read number, with SNPs expressed in <20 transcripts discarded, and minor allele frequency and allelic ratio (major to minor) was calculated for the w […]
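The filtering step described in this excerpt can be illustrated with a minimal Python sketch. It is not the cited study's code: the `SnpCounts` record, the `allelic_stats` function, and the reading of "expressed in <20 transcripts" as "fewer than 20 supporting reads in total" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SnpCounts:
    """Hypothetical per-SNP read counts, e.g. taken from a caller's allele depths."""
    chrom: str
    pos: int
    ref_reads: int
    alt_reads: int


def allelic_stats(snps, min_reads=20):
    """Drop SNPs with fewer than `min_reads` total reads, then report
    (chrom, pos, minor allele frequency, major-to-minor ratio) for the rest."""
    results = []
    for snp in snps:
        total = snp.ref_reads + snp.alt_reads
        if total < min_reads:
            continue  # too few reads to estimate an allelic ratio
        major = max(snp.ref_reads, snp.alt_reads)
        minor = min(snp.ref_reads, snp.alt_reads)
        maf = minor / total
        # A site with no minor-allele reads has an unbounded ratio.
        ratio = major / minor if minor > 0 else float("inf")
        results.append((snp.chrom, snp.pos, maf, ratio))
    return results
```

In this sketch a site with only one allele observed is kept and reported with an infinite ratio; a real pipeline might prefer to cap or flag such sites.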


Case Report: Identification of an HNF1B p.Arg527Gln mutation in a Maltese patient with atypical early onset diabetes and diabetic nephropathy

PMCID: 5952643
PMID: 29764441
DOI: 10.1186/s12902-018-0257-z

[…] the Burrows-Wheeler transformation algorithm, and duplicated reads were removed using Picard. FastQC was used to check the quality of sequence data. Calling of SNPs and InDels was done using GATK Unified Genotyper, which uses a Bayesian genotype likelihood model to report alleles and Phred-scaled confidence values. Variants (SNVs and indels) were called with SAMTools, with reference to […]
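As a rough illustration of what "a Bayesian genotype likelihood model" with "Phred-scaled confidence values" means, here is a simplified textbook-style sketch in Python. It is not the Unified Genotyper implementation: the biallelic model, the single fixed `base_error` rate, the flat default priors, and the function names are all assumptions chosen for brevity.

```python
import math


def phred(p_error):
    """Convert an error probability to a Phred-scaled value: -10 * log10(p)."""
    return -10.0 * math.log10(max(p_error, 1e-300))  # floor avoids log10(0)


def genotype_posteriors(n_ref, n_alt, base_error=0.01, priors=None):
    """Posterior probability of each diploid genotype at a biallelic site,
    given counts of reads supporting the reference and alternate alleles."""
    genotypes = ("0/0", "0/1", "1/1")
    # Probability that a single read shows the reference allele, per genotype.
    p_ref = (1.0 - base_error, 0.5, base_error)
    if priors is None:
        priors = (1.0 / 3.0,) * 3  # flat prior (assumption)
    # Work in log space so deep coverage does not underflow.
    log_joint = [
        math.log(prior) + n_ref * math.log(p) + n_alt * math.log(1.0 - p)
        for prior, p in zip(priors, p_ref)
    ]
    peak = max(log_joint)
    weights = [math.exp(value - peak) for value in log_joint]
    total = sum(weights)
    return {g: w / total for g, w in zip(genotypes, weights)}


def call_genotype(n_ref, n_alt, base_error=0.01, priors=None):
    """Return the most probable genotype and a Phred-scaled confidence,
    defined here as the Phred of the probability that the call is wrong."""
    post = genotype_posteriors(n_ref, n_alt, base_error, priors)
    best = max(post, key=post.get)
    p_wrong = sum(p for g, p in post.items() if g != best)
    return best, phred(p_wrong)
```

For example, `call_genotype(10, 10)` favours the heterozygous genotype and `call_genotype(20, 0)` favours homozygous reference. Production callers use per-base qualities and population-aware priors rather than the fixed values assumed here.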



GATK institution(s)
National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, IL, USA; Mayo Clinic, Department of Research Services, Rochester, MN, USA; Mayo Clinic, Department of IT Executive Administration, Rochester, MN, USA; Mayo Clinic, Department of Health Sciences Research, Rochester, MN, USA; Institute for Genomic Biology, University of Illinois at Urbana-Champaign, IL, USA; Department of Crop Sciences, University of Illinois at Urbana-Champaign, IL, USA; Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, IL, USA; Mayo Clinic, Department of Biochemistry and Molecular Biology, Rochester, MN, USA; Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, IL, USA
GATK funding source(s)
Supported by the Mayo Clinic Center for Individualized Medicine and the Todd and Karen Wanek Program for Hypoplastic Left Heart Syndrome.

GATK reviews

Sangram keshari sahu

A well-maintained, actively developed tool in genomics that is also expanding to other NGS domains (such as transcriptomics). One of its best points is its availability across platforms: from the command line, to the UI-based Galaxy interface, to recent integration with all major cloud platforms by the development team itself.
Anonymous user #43219

A very versatile package for most analyses involving genomic variants. Documentation is very informative and updated quickly.