SGA specifications


Unique identifier: OMICS_00028
Name: SGA
Alternative name: String Graph Assembler
Software type: Package/Module
Interface: Command line interface
Restrictions to use: None
Operating system: Unix/Linux
License: GNU General Public License version 3.0
Computer skills: Advanced
Version: 0.10.15
Stability: Stable
Maintained: Yes


  • SGA-preqc




  • Jared Simpson

SGA in pipelines

PMCID: 5793845
PMID: 29385567
DOI: 10.1093/gbe/evy003

[…] platform. Sequencing generated 43 billion nucleotides (nt) of genomic data, which corresponded to 144 million paired-end reads. The basic genome characteristics were determined using the SGA preqc software tool, showing an estimated genome size of 313 Mb (∼140x coverage). Adapters were trimmed and the reads were quality filtered before de novo genome assembly, which created […]
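
The coverage figure in this excerpt can be checked with simple arithmetic: average depth is the total number of sequenced bases divided by the estimated genome size. The short Python sketch below does that with the two numbers quoted above. The function and variable names are illustrative assumptions and are not part of SGA or of the cited study.

```python
def mean_coverage(total_bases: float, genome_size: float) -> float:
    """Average sequencing depth: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Figures quoted in the excerpt (names here are hypothetical).
total_bases = 43e9   # 43 billion nucleotides of genomic data
genome_size = 313e6  # 313 Mb genome size estimated with SGA preqc

coverage = mean_coverage(total_bases, genome_size)
print(f"{coverage:.0f}x")  # about 137x, in line with the reported ~140x
```

The result rounds to the reported value, so the quoted data volume, genome size, and coverage are mutually consistent.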

PMCID: 5827749
PMID: 29483539
DOI: 10.1038/s41598-018-21919-4

[…] with 2 × 300 bp pair-end library. A total of 794,028 reads (55× coverage of the genome) were generated, cleaned and quality filtered using Trimmomatic. Reads were then corrected for errors through String Graph Assembler which utilizes a k-mer centric algorithm. De novo assembly was attempted through IDBA-UD algorithm centred on de Bruijn graph approach. Genome annotation was carried out using […]

PMCID: 5334587
PMID: 28254980
DOI: 10.1128/genomeA.01712-16

[…] sequencing was performed on an Illumina HiSeq using a single library of 90-nucleotide paired-end reads with a 170-bp inset size. Quality-filtered reads were assembled into contigs using the String Graph Assembler (SGA). Contigs were then subject to scaffolding using SSPACE and the full set of reads using the settings -k, 10; -a, 0.7; -n, 50; and -o, 20. Scaffolds were subject […]

PMCID: 4154752
PMID: 25188499
DOI: 10.1371/journal.pone.0106689

[…] of the highest-covered fragments is 40. The next step uses overlap-based error correction to generate higher quality consensus sequences for each short read. The main assembly steps implement the String Graph Assembler (SGA) which generates contigs using an overlap approach, then scaffolds contigs from the same fragment using paired-end information. Gap filling is then conducted to fill […]

PMCID: 4202335
PMID: 25364804
DOI: 10.1093/gbe/evu199

[…] S1, Supplementary Material online). Single nucleotide polymorphism (SNP) and indel detection was performed with the Genome Analysis Toolkit (v2.7). De novo genome assembly was performed using String Graph Assembler (v0.9.19), and contigs were aligned with MUMmer (v3.0; supplementary table S2, Supplementary Material online). Unique sequence not present in S288c was identified using […]


SGA in publications

PMCID: 5918799
PMID: 29694397
DOI: 10.1371/journal.pone.0195481

[…] ns from both the 5’ and 3’ end as well as discarding reads shorter than 30 nt. Sequence data was merged to library level and an additional filtering of low-quality reads was performed using the SGA (String Graph Assembler) preprocess command, option dust-threshold set to 3. Exact match read duplicates were removed, by first indexing the reads using sga index and then filtering using sga filter […]

PMCID: 5902842
PMID: 29661190
DOI: 10.1186/s12864-018-4656-3

[…] to ‘chr30’ were submitted as individual (non-concatenated) scaffolds according to NCBI GenBank submission policy. Estimates of genome size based on k-mer analysis indicated a genome size of 705 Mb (preqc) or 742 Mb (Jellyfish). These align with estimates from flow cytometry that report the genome to be 758 Mb in size. The assembly therefore represents ~73% of the estimated genome size […]

PMCID: 5894186
PMID: 29636006
DOI: 10.1186/s12864-018-4616-y

[…] the Celera Assembler, resulting in a contig assembly (see Methods). All Illumina reads were mapped to the contig assembly with the Burrows-Wheeler Aligner (BWA), and the scaffold module from String Graph Assembler (SGA) was used to scaffold the contigs. To reduce gaps and to improve the accuracy of the consensus sequence, all Illumina reads were mapped to the scaffold assembly, […]

PMCID: 5895191
PMID: 29732264
DOI: 10.1002/aps3.1034

[…] ) builds an ordered k‐mer hash table in Jellyfish 1.x (Marçais and Kingsford). The same preprocessing steps and postanalysis filtering were performed in all analyses. The genome assembly program String Graph Assembler (Simpson and Durbin; Simpson) was used to clean raw read data to remove duplicate, repetitive, and low‐quality reads (for details, see: […]

PMCID: 5882950
PMID: 29615780
DOI: 10.1038/s41598-018-23749-w

[…] roperties of the B. leachii genome; provided with statistics from the human, fish (Maylandia zebra), bird (Melopsittacus undulatus) and oyster (Crassostrea gigas) genomes for comparison. Firstly, the SGA-preqc package was used to estimate the genome size to be 194 Mb (194,153,277 bp). This size is similar to that of the solitary C. robusta, C. savigny and M. occidentalis, M. oculata (160 Mb, 190 M […]


SGA institution(s)
Ontario Institute for Cancer Research, Toronto, ON, Canada
SGA funding source(s)
Supported by the Wellcome Trust Sanger Institute (Wellcome Trust grant 098051) and by the Ontario Institute for Cancer Research through funding provided by the Government of Ontario.
