GATK pipeline

GATK specifications

Information


Unique identifier: OMICS_19453
Name: GATK
Alternative name: Genome Analysis ToolKit
Software type: Package/Module, Toolkit/Suite
Interface: Command line interface
Restrictions to use: Academic or non-commercial use
Input format: SAM, BAM, VCF, BED, DICT, FASTA, FAI, BAI
Output format: SAM, BAM, VCF, BED
Operating system: Unix/Linux, Mac OS, Windows
Programming languages: Java
License: MIT License
Computer skills: Advanced
Version: 4.0
Stability: Stable
High performance computing: Yes
Registration required: Yes
Maintained: Yes

Subtools


  • CommandLineGATK
  • ASEReadCounter
  • AnalyzeCovariates
  • CallableLoci
  • CompareCallableLoci
  • ContEst
  • CountBases
  • CountIntervals
  • CountLoci
  • CountMales
  • CountRODs
  • CountRODsByRef
  • CountReadEvents
  • CountReads
  • CountTerminusEvent
  • DepthOfCoverage
  • DiagnoseTargets
  • DiffObjects
  • ErrorRatePerCycle
  • FastaStats
  • FlagStat
  • FindCoveredIntervals
  • GCContentByInterval
  • GatherBqsrReports
  • Pileup
  • PrintRODs
  • QualifyMissingIntervals
  • ReadClippingStats
  • ReadGroupProperties
  • ReadLengthDistribution
  • SimulateReadsForVariants
  • BaseRecalibrator
  • ClipReads
  • IndelRealigner
  • LeftAlignIndels
  • PrintReads
  • RealignerTargetCreator
  • SplitNCigarReads
  • SplitSamFile
  • ApplyRecalibration
  • CalculateGenotypePosteriors
  • GATKPaperGenotyper
  • GenotypeGVCFs
  • HaplotypeCaller
  • MuTect2
  • RegenotypeVariants
  • UnifiedGenotyper
  • VariantRecalibrator
  • GenotypeConcordance
  • ValidateVariants
  • VariantEval
  • VariantFiltration
  • CatVariants
  • CombineGVCFs
  • CombineVariants
  • HaplotypeResolver
  • LeftAlignAndTrimVariants
  • PhaseByTransmission
  • RandomlySplitVariants
  • ReadBackedPhasing
  • SelectHeaders
  • SelectVariants
  • ValidationSiteSelector
  • VariantAnnotator
  • VariantsToAllelicPrimitives
  • VariantsToBinaryPed
  • VariantsToTable
  • VariantsToVCF
  • AS_FisherStrand
  • AS_InbreedingCoeff
  • AS_InsertSizeRankSum
  • AS_MQMateRankSumTest
  • AS_MappingQualityRankSumTest
  • AS_QualByDepth
  • AS_RMSMappingQuality
  • AS_ReadPosRankSumTest
  • AS_StrandOddsRatio
  • AlleleBalance
  • AlleleBalanceBySample
  • AlleleCountBySample
  • BaseCounts
  • BaseCountsBySample
  • BaseQualityRankSumTest
  • BaseQualitySumPerAlleleBySample
  • ChromosomeCounts
  • ClippingRankSumTest
  • ClusteredReadPosition
  • Coverage
  • DepthPerAlleleBySample
  • DepthPerSampleHC
  • ExcessHet
  • FisherStrand
  • FractionInformativeReads
  • GCContent
  • GenotypeSummaries
  • HaplotypeScore
  • HardyWeinberg
  • HomopolymerRun
  • InbreedingCoeff
  • LikelihoodRankSumTest
  • LowMQ
  • MVLikelihoodRatio
  • MappingQualityRankSumTest
  • MappingQualityZero
  • MappingQualityZeroBySample
  • NBaseCount
  • OxoGReadCounts
  • PossibleDeNovo
  • QualByDepth
  • RMSMappingQuality
  • ReadPosRankSumTest
  • SampleList
  • SpanningDeletions
  • StrandAlleleCountsBySample
  • StrandBiasBySample
  • StrandOddsRatio
  • TandemRepeatAnnotator
  • TransmissionDisequilibriumTest
  • VariantType
  • BadCigarFilter
  • BadMateFilter
  • CountingFilteringIterator.CountingReadFilter
  • DuplicateReadFilter
  • FailsVendorQualityCheckFilter
  • HCMappingQualityFilter
  • LibraryReadFilter
  • MalformedReadFilter
  • MappingQualityFilter
  • MappingQualityUnavailableFilter
  • MappingQualityZeroFilter
  • MateSameStrandFilter
  • MaxInsertSizeFilter
  • MissingReadGroupFilter
  • NoOriginalQualityScoresFilter
  • NotPrimaryAlignmentFilter
  • OverclippedReadFilter
  • Platform454Filter
  • PlatformFilter
  • PlatformUnitFilter
  • ReadGroupBlackListFilter
  • ReadLengthFilter
  • ReadNameFilter
  • ReadStrandFilter
  • ReassignMappingQualityFilter
  • ReassignOneMappingQualityFilter
  • ReassignOriginalMQAfterIndelRealignmentFilter
  • SampleFilter
  • SingleReadGroupFilter
  • UnmappedReadFilter
  • BeagleCodec
  • BedTableCodec
  • RawHapMapCodec
  • RefSeqCodec
  • SAMPileupCodec
  • SAMReadCodec
  • TableCodec
  • FastaAlternateReferenceMaker
  • FastaReferenceMaker
  • QCRef
  • splitNcigar
  • ReCapSeg
  • Mutect2

Additional information


https://software.broadinstitute.org/gatk/documentation/quickstart.php

Publications for Genome Analysis ToolKit

GATK citations (201)
2018
PMCID: 5876349

[…] coverage of 27.5× per sample. The generated reads were aligned to the GRCh37/hg19 reference genome. Base quality score recalibration and SNP and indel discovery were performed using GATK v.3.6-0 VariantRecalibrator and ApplyRecalibration tools with the default settings. The used ts_filter_level setting was 99.0 [42–44]. The data were filtered using the hard cutoffs of the total depth […]
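
As an illustration of the ApplyRecalibration step mentioned in the excerpt above, the following is a minimal Python sketch that assembles a GATK 3.x command line. It assumes the ts_filter_level value in the excerpt reads as 99.0, that the GATK 3 jar is named GenomeAnalysisTK.jar, and that all file names are hypothetical placeholders; it is not the cited authors' actual command.

```python
from typing import List


def apply_recalibration_cmd(
    reference: str,
    input_vcf: str,
    recal_file: str,
    tranches_file: str,
    output_vcf: str,
    mode: str = "SNP",
    ts_filter_level: float = 99.0,          # assumed reading of the excerpt's value
    gatk_jar: str = "GenomeAnalysisTK.jar",  # assumed jar name/location
) -> List[str]:
    """Build (but do not run) a GATK 3.x ApplyRecalibration argument list."""
    if mode not in ("SNP", "INDEL", "BOTH"):
        raise ValueError(f"unsupported mode: {mode}")
    if not 0.0 < ts_filter_level <= 100.0:
        raise ValueError("ts_filter_level must be in (0, 100]")
    return [
        "java", "-jar", gatk_jar,
        "-T", "ApplyRecalibration",
        "-R", reference,
        "-input", input_vcf,
        "-recalFile", recal_file,
        "-tranchesFile", tranches_file,
        "--ts_filter_level", str(ts_filter_level),
        "-mode", mode,
        "-o", output_vcf,
    ]


if __name__ == "__main__":
    # All paths below are hypothetical placeholders.
    cmd = apply_recalibration_cmd(
        "ref.fasta", "raw.vcf", "snp.recal", "snp.tranches", "recalibrated.vcf"
    )
    print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run once the recalibration and tranches files from a prior VariantRecalibrator run exist.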

2018
PMCID: 5869741

[…] after sequencing and alignment (16% duplicates vs. 8% for SUDEP cohort; both normal level of duplicates for FFPE DNA in WES). For SNP identification, the coverage was normalized by HaplotypeCaller (GATK) when calculating the quality scores of the variants and the variants were called independently for each sample with an adequate coverage of 65. In general, most variants can be called with 30× […]

Case report

2018
PMCID: 5881966

[…] and low quality reads were filtered. Burrows–Wheeler alignment (0.7.12) methods were adopted to map the clean reads to reference genome (UCSC hg19). Then, Picard (http://picard.sourceforge.net/) and Genome Analysis Toolkit (GATK) methods were used for duplicate removal, local realignment, and base quality recalibration. GATK UnifiedGenotyper was used for variants calling. ANNOVAR (2015-03-22) […]
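
The order of steps in the excerpt above can be sketched as a small Python plan. The mapping of each step to specific tools (Picard MarkDuplicates, RealignerTargetCreator with IndelRealigner, BaseRecalibrator with PrintReads) is an assumption based on the usual GATK 3 workflow and the subtool list on this page; the excerpt itself names only BWA, Picard, GATK and UnifiedGenotyper. File names and suffixes are hypothetical.

```python
from typing import List, NamedTuple


class Stage(NamedTuple):
    step: str    # what the excerpt says is done
    tool: str    # assumed tool(s) for that step
    suffix: str  # hypothetical output file suffix


PIPELINE: List[Stage] = [
    Stage("map reads to UCSC hg19", "BWA 0.7.12", ".bam"),
    Stage("duplicate removal", "Picard MarkDuplicates", ".dedup.bam"),
    Stage("local realignment", "GATK RealignerTargetCreator + IndelRealigner", ".realn.bam"),
    Stage("base quality recalibration", "GATK BaseRecalibrator + PrintReads", ".recal.bam"),
    Stage("variant calling", "GATK UnifiedGenotyper", ".vcf"),
]


def plan(sample: str) -> List[str]:
    """Return one human-readable line per stage, chaining each output to the next input."""
    lines = []
    current = f"{sample}.fastq"
    for number, stage in enumerate(PIPELINE, start=1):
        output = f"{sample}{stage.suffix}"
        lines.append(f"{number}. {stage.step}: {stage.tool} ({current} -> {output})")
        current = output
    return lines


if __name__ == "__main__":
    for line in plan("sample1"):
        print(line)
```

The sketch only records ordering and file hand-offs; it deliberately omits command-line flags, which differ between GATK versions.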

2018
PMCID: 5885019

[…] for 3 bp or more is trimmed. The reads are further trimmed using Sickle version 1.2 [22] with a minimum window quality score of 20. Reads shorter than 10 bp after trimming were removed. The Genome Analysis Toolkit (GATK) IndelRealigner module was used to realign raw reads around indels [23]. Single nucleotide polymorphism, insertion and deletion discovery was performed with GATK's […]

2018
PMCID: 5857600

[…] 1000 Genomes, http://www.1000genomes.org, Burrows-Wheeler Aligner, http://bio-bwa.sourceforge.net/, Ensembl Genome Browser, http://www.ensembl.org/index.html, GATK, http://www.broadinstitute.org/gatk/, GenBank, http://www.ncbi.nlm.nih.gov/genbank/, NHLBI Exome Sequencing Project (ESP) Exome Variant Server, ESP6500, http://evs.gs.washington.edu/evs/, Picard, […]

GATK institution(s)
National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, IL, USA; Mayo Clinic, Department of Research Services, Rochester, MN, USA; Mayo Clinic, Department of IT Executive Administration, Rochester, MN, USA; Mayo Clinic, Department of Health Sciences Research, Rochester, MN, USA; Institute for Genomic Biology, University of Illinois at Urbana-Champaign, IL, USA; Department of Crop Sciences, University of Illinois at Urbana-Champaign, IL, USA; Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, IL, USA; Mayo Clinic, Department of Biochemistry and Molecular Biology, Rochester, MN, USA; Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, IL, USA
GATK funding source(s)
Supported by the Mayo Clinic Center for Individualized Medicine and the Todd and Karen Wanek Program for Hypoplastic Left Heart Syndrome.

GATK review

Arup Ghosh

Desktop
One of the best command-line tools for variant calling from aligned files, with a variety of options for analysis.