CheckM protocols

CheckM specifications

Information


Unique identifier: OMICS_08837
Name: CheckM
Software type: Package/Module
Interface: Command line interface
Restrictions to use: None
Operating system: Unix/Linux
License: GNU General Public License version 2.0
Computer skills: Advanced
Version: 0.9.7
Stability: Stable
Maintained: Yes

Maintainer


  • Donovan H. Parks

Publication for CheckM

CheckM in pipelines (65)
2018
PMCID: 5754496
PMID: 29301887
DOI: 10.1128/genomeA.01416-17

[…] and reassembled using SPAdes with the “-careful” flag enabled. A final scaffolding step was performed using SSPACE. Final draft genome completeness and contamination were assessed using CheckM. The final draft genome was annotated using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) and reviewed using RAST v2.0. This resulted in a 3.47-Mbp draft genome estimated […]

2018
PMCID: 5754498
PMID: 29301889
DOI: 10.1128/genomeA.01419-17

[…] NGen DNA assembly software (DNASTAR, Inc., Madison, WI, USA) at the site. The assembly produced 31 contigs for uv4 and 34 contigs for uv4/95 with an average coverage of 40×, and an assessment with CheckM characterized the draft genomes as 100% complete. The assemblies were annotated online using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP). The draft genomes of both isolates […]

2018
PMCID: 5756336
PMID: 29304850
DOI: 10.1186/s40168-017-0392-1

[…]. De-contamination of retrieved genome bins (GBs) was carried out in ProDeGe. Bona fide GBs were uploaded to RAST for annotation. All GBs were checked for completeness and contamination using CheckM v1.0.5. PhyloPhlAn v0.99 was used to reconstruct the phylogenetic tree of all GBs based on the protein prediction results from Prodigal v2.6. Additionally, we used BLAST […]

2018
PMCID: 5769542
PMID: 29337314
DOI: 10.1038/sdata.2017.203

[…] set (2,009 draft genomes), the alternative marker set (95 draft genomes), or the 16S rRNA gene tree (35 draft genomes). The remaining 492 draft genomes were provided a putative phylogeny based on CheckM (supplementaltable4.xlsx, Data Citation 2). Several of the size fractions used to reconstruct bacterial and archaeal draft genomes were specifically designed to target different biological […]

2018
PMCID: 5794932
PMID: 29437085
DOI: 10.1128/genomeA.00009-18

[…] Flavobacterium, and Sediminibacterium genera were binned using the manually supervised anvi’o protocol based on the abundance and tetranucleotide frequency distributions and quality controlled by CheckM (100%, 99%, and 99% predicted completeness, respectively, with an indication of contamination only for Pseudomonas at 0.27% and no strain heterogeneity). The Pseudomonas sp. strain […]



CheckM in publications (238)
PMCID: 5953950
PMID: 29765033
DOI: 10.1038/s41426-018-0089-y

[…] flag to reduce potential misassembly events. MaxBin v2.2.1 was used to cluster contigs according to abundance and GC content, thereby separating chlamydial and non-chlamydial contigs. Finally, CheckM v1.0.6 was used to assess the quality of the clustering process. For the remaining samples, following Illumina gDNA shotgun library preparation with bead size selection, whole-genome […]

PMCID: 5946052
PMID: 29748410
DOI: 10.1128/genomeA.00406-18

[…] bins. The most abundant bin contained 168 contigs (length, 1,527 to 114,900 bp; mean, 19,210 bp) and had a total sequence length of 3.27 Mb, with an average GC content of 42.3%. Analysis with CheckM showed this genome to have a high completeness (98.8%) and low contamination rate (1.7%), and it was annotated to the marker set lineage UID2565, which is selective for “Candidatus […]

PMCID: 5930379
PMID: 29523545
DOI: 10.1128/AEM.00179-18

[…] for alignment, visualization, and manual refinement, including the removal of contigs containing eukaryotic rRNAs and plasmids. Refined MAGs were analyzed for completeness and contamination by using CheckM. Binning results are presented in Table S5, and bin read recruitment percentages for each sample are presented in Table S6. Open reading frames were identified by using Prodigal […]

PMCID: 5928099
PMID: 29712903
DOI: 10.1038/s41467-018-04041-x

[…] all contigs linked to the contig that contained the full-length 16S rRNA gene of the SUP05 organism as reconstructed by phyloFlash. The genome completeness for all SUP05 bins was calculated using CheckM version 1.07 and the gammaproteobacterial marker gene set using the taxonomy workflow. Annotation was performed using Prokka. Genes related to nitrate respiration (nirS, narG, norB, and nosZ) […]

PMCID: 5928048
PMID: 29712968
DOI: 10.1038/s41598-018-25146-9

[…] contigs, which were distributed among 21 genome bins. More than 80% of metagenomics sequences corresponded to only three bins. Analysis of the presence of conservative single-copy marker genes with CheckM showed that two bins meet the recently proposed criteria for the high quality metagenome-assembled genomes (>90% completeness with <5% contamination) and one bin was slightly […]
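
The high-quality criteria quoted in the excerpt above (>90% completeness with <5% contamination) can be applied to a table of bin estimates with a few lines of Python. This is only a minimal sketch: the function name and the bin names are hypothetical, the second and third value pairs are made up for illustration, and nothing here reads CheckM output directly.

```python
def is_high_quality(completeness: float, contamination: float) -> bool:
    """Apply the quoted criteria: >90% completeness and <5% contamination."""
    return completeness > 90.0 and contamination < 5.0


# Hypothetical bins as (completeness %, contamination %).
# bin_a reuses the 98.8% / 1.7% values quoted in an earlier excerpt;
# bin_b and bin_c are invented to show the two ways a bin can fail.
bins = {
    "bin_a": (98.8, 1.7),
    "bin_b": (72.4, 3.1),   # fails on completeness
    "bin_c": (95.0, 8.6),   # fails on contamination
}

high_quality = sorted(
    name for name, (comp, cont) in bins.items() if is_high_quality(comp, cont)
)
print(high_quality)
```

Both thresholds are strict inequalities here, matching the ">" and "<" signs in the excerpt; a bin at exactly 90% completeness would not pass.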



CheckM institution(s)
Australian Centre for Ecogenomics, School of Chemistry & Molecular Biosciences, The University of Queensland, St. Lucia, Queensland, Australia; Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Queensland, Australia; Advanced Water Management Centre, The University of Queensland, St. Lucia, Queensland, Australia

CheckM review


Sarah Turner

Desktop
I have found this tool extremely useful for evaluating bins created by various binning tools (MetaBAT, MaxBin, and MyCC are the ones I've used the most). It's very good for getting a preliminary overview of how "good" the bins are, and sometimes for assigning bins to specific taxa. (I'm working with fairly complex environmental samples, so often my bins will be fairly incomplete/contaminated, and cannot be assigned at the family/genus/species levels.)

Installation can be somewhat challenging, because the tool has a fair number of dependencies that need to be installed as well. Ultimately, I chose to install it on a virtual machine running a 64-bit version of Ubuntu; I could perform most of the basic functions there, but didn't have enough RAM for some of the more computationally intensive functions (for example, tetra, which calculates the tetranucleotide signatures).
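
For readers unfamiliar with tetranucleotide signatures, the general idea can be sketched in plain Python: slide a 4-base window along a sequence and record how often each 4-mer occurs. This is an illustrative sketch only, not CheckM's implementation; the function name is hypothetical, and skipping windows that contain ambiguous bases is an assumption made here for simplicity.

```python
from collections import Counter


def tetranucleotide_frequencies(sequence: str) -> dict:
    """Return the relative frequency of each 4-mer found in a DNA sequence."""
    seq = sequence.upper()
    counts = Counter()
    # Slide a window of width 4 over every position in the sequence.
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        # Assumption: windows containing anything other than A, C, G, T
        # (for example N) are skipped rather than counted.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


# Toy sequence with five overlapping windows: ACGT, CGTA, GTAC, TACG, ACGT.
freqs = tetranucleotide_frequencies("ACGTACGT")
print(freqs["ACGT"])
```

With at most 256 possible 4-mers the signature itself is small; the cost comes from scanning every position of every contig, which grows with the size of the assembly.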

I find the "bin_qa_plot" very useful to assess bins at a glance, and there are a variety of other useful graphical outputs to look more closely at the coverage and composition of individual bins. The tool also offers a number of ways to modify and refine bins manually, and the combination has the potential to be very powerful in recovering more complete draft genomes from metagenomes.

The tool can also be used to evaluate the results of single-cell genomics or genomes recovered from isolates, so it has broader applications than just metagenomics.