High-throughput sequencing enables the sequencing of novel genomes on a daily basis. Nevertheless, even the most basic characteristics of these genomes, such as their size or heterozygosity rate, may be initially unknown, making it difficult to select appropriate analysis methods, e.g., a read mapper, de novo assembler, or SNP caller (Smolka et al., 2015). Determining these characteristics in advance can reveal whether an analysis is failing to capture the full complexity of the genome, for example by underreporting the number of variants or by failing to assemble a significant fraction of the genome. Source text: Vurture et al., 2016.

