MaSuRCA specifications

Information


Unique identifier: OMICS_00020
Name: MaSuRCA
Alternative name: Maryland Super-Read Celera Assembler
Software type: Package/Module
Interface: Command line interface
Restrictions to use: None
Operating system: Unix/Linux
Computer skills: Advanced
Stability: Stable
Maintained: Yes

Maintainers


  • Steven Salzberg
  • Aleksey Zimin

Publication for Maryland Super-Read Celera Assembler

The MaSuRCA genome assembler.

2013 Bioinformatics
PMCID: 3799473
PMID: 23990416
DOI: 10.1093/bioinformatics/btt476

MaSuRCA in pipelines (13)

2018
PMCID: 5759298
PMID: 29310588
DOI: 10.1186/s12864-017-4403-1

[…] from the HiSeq2500 in FASTQ format were used for the assembly. Prior to assembly, FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) was run to verify the quality of the reads. MaSuRCA assembler (version 2.3.2) was used to assemble the raw data into 98,162 scaffolds and has been deposited at DDBJ/ENA/GenBank under the accession PEQF00000000. To obtain a more reasonable […]

2017
PMCID: 5454193
PMID: 28572310
DOI: 10.1128/genomeA.00405-17

[…] data for high-quality vector- and adaptor-free reads for genome assembly (cutoff read length for high quality, 80%; cutoff quality score, 20). High-quality vector-filtered reads were assembled with MaSuRCA, under the default parameter predicted by the assembler for each genome, and annotated with the help of the Rapid Annotations using Subsystems Technology (RAST). […] summarizes assembly […]

2017
PMCID: 5571403
PMID: 28839017
DOI: 10.1128/genomeA.00776-17

[…] reads was approximately 7 billion short-read sequences in pairs of ~300 bp, the number of bases (Mb) was 1,447.5, and there was 35.11% G+C content. De novo contig assembly was performed using MaSuRCA, and further downstream processing was performed. Coding sequences (CDSs) were predicted from the contigs using Glimmer, and 2,792 predicted CDSs were found. The predicted CDSs […]

2017
PMCID: 5677167
PMID: 28963165
DOI: 10.1534/g3.117.300107

[…] S1 and Figure S2 in File S1 show comparisons of preliminary assemblies using Illumina data and nanopore data separately, and using different assembly programs [MEGAHIT, SPAdes, SSPACE, and MaSuRCA]. The figures demonstrate that the Illumina-only assembly is more complete although far more fragmented than the nanopore-only assembly. The hybrid assembly combines the benefits […]

2017
PMCID: 5679800
PMID: 29122867
DOI: 10.1128/genomeA.01193-17

[…] paired-end reads was approximately 7 billion short-read sequences in pairs of ~300 bp, the number of bases was 650.26 Mb, and the G+C content was 35.28%. De novo contig assembly was performed using MaSuRCA, and further downstream processing was performed. Coding sequences (CDSs) were predicted from the contigs using Glimmer, and 2,693 predicted CDSs were found. The predicted CDSs […]

MaSuRCA in publications (106)

PMCID: 5951914
PMID: 29760453
DOI: 10.1038/s41467-018-04374-7

[…] species Saturnispora dispora (strain NRRL Y-1447, SPAdes assembly), Ambrosiozyma philentoma (NRRL Y-7523, SPAdes), Candida boidinii (NRRL Y-2332, DISCOVAR), and Citeromyces matritensis (NRRL Y-2407, MaSuRCA); the Ala clade species Nakazawaea wickerhamii (NRRL Y-2563, DISCOVAR) and Peterozyma xylosa (NRRL Y-12939, DISCOVAR); and the Ser2 clade species Saccharomycopsis capsularis (NRRL Y-17639, […]

PMCID: 5941490
PMID: 29739321
DOI: 10.1186/s12864-018-4711-0

[…] kmer frequencies using SOAPec v2.01 with default settings. We then built de novo assemblies from the edited reads using SOAPdenovo2 v2.04 and ABySS v1.9.0. We also assembled the genome with MaSuRCA v2.3.2, which uses its own raw data quality control tools. For computational feasibility, the three assemblies used kmer values of 63, 63, and 35 respectively, and we merged scaffolds […]

PMCID: 5945886
PMID: 29780372
DOI: 10.3389/fmicb.2018.00861

[…] on the recommendation of the supplier. Fosmids were then sequenced using the MiSeq system (Illumina) at the GeT-PlaGe genomics platform (Auzeville, France). Read assembly was performed using MaSuRCA. The contigs were cleaned from the pCC1FOS vector sequence using crossmatch. The three annotated contig sequences were deposited in the DDBJ/ENA/GenBank nucleotide sequence database […]

PMCID: 5941149
PMID: 29722814
DOI: 10.1093/gigascience/giy044

[…] turkey, and zebra finch. The size of the longest assembled sequence was 50.28 Mb, and 928 scaffolds were longer than 10 kb. The basic statistics of both the contigs and scaffolds assembled using MaSuRCA are shown in Table […]. The cumulative length plots and the Nx plot for the scaffolds showed that most of the draft genome consisted of large scaffolds; though many short scaffolds […]

PMCID: 5907361
PMID: 29669603
DOI: 10.1186/s12915-018-0508-5

[…] was used to produce quality metrics and BWA aln (version 0.6.1-r104) to search Escherichia coli-, yeast-, and phage-contaminated reads. The reads of A. astaci and A. stellatus were assembled using MaSuRCA, version 2.0, and the assembly metrics were calculated using the assemblathon_stats.pl script. The assembled data of Aphanomyces were annotated with AUGUSTUS v2.757 trained […]
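
Several of the excerpts above report contiguity statistics for MaSuRCA assemblies (scaffold counts, Nx plots, assembly metrics). As a rough illustration only, the sketch below shows how an Nx value such as N50 is commonly computed from a list of scaffold lengths. The function name `nx` and the example lengths are hypothetical and are not taken from MaSuRCA or from any of the cited papers.

```python
def nx(lengths, x=50):
    """Return the Nx length for a list of sequence lengths.

    Nx is the length L of the shortest sequence in the smallest set of
    longest sequences that together cover at least x% of the total length.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    threshold = sum(lengths) * x / 100
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


# Hypothetical scaffold lengths (in bp), for illustration only.
example_lengths = [100, 80, 60, 40, 20]
print("N50:", nx(example_lengths, 50))
print("N90:", nx(example_lengths, 90))
```

Evaluating `nx` over a range of x values (for example 10, 20, ..., 90) gives the points typically drawn in an Nx plot.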

MaSuRCA institution(s)
Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA; Institute for Physical Sciences and Technology, University of Maryland, College Park, MD, USA; Department of Plant Sciences, University of California, Davis, CA, USA; National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA; Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA; Departments of Mathematics and Physics, University of Maryland, College Park, MD, USA; Departments of Biomedical Engineering, Computer Science, and Biostatistics, Johns Hopkins University, Baltimore, MD, USA
MaSuRCA funding source(s)
Supported in part by National Science Foundation (NSF) grant IOS-1238231, by National Institutes of Health (NIH) grant R01-HG006677, and by the Intramural Research Program of the National Human Genome Research Institute, NIH.
