Computational protocol: The Whole Genome and Transcriptome of the Manila Clam (Ruditapes philippinarum)

Protocol publication

[…] Prior to assembling the R. philippinarum genome, we estimated its size by K-mer analysis with a K-mer size of 17 bp using SOAPec v2.01. The genome size was calculated with the following formula: Genome size = Total number of K-mers / K-mer depth. The R. philippinarum genome size was estimated at around 1.37 Gb (see table S1, online).
Using only the qualified reads from the paired-end (PE) and mate-pair (MP) libraries, the R. philippinarum genome was assembled in three stages: contig construction, scaffolding, and gap closure. For contig construction, the short-insert library (500 bp) data were used to build a de Bruijn graph with SOAPdenovo v2.04 under default parameters. We then discarded erroneous graph structures such as tips, bubbles, and low-coverage connections. All qualified reads were realigned to the contig sequences. We then calculated the PE relationships between each pair of contigs and built the scaffolds step by step, from short-insert to long-distance PEs.
The synthetic long-read data were also used to support long-range scaffolding of the de novo assembly. Sequence reads generated from the synthetic long-read libraries were assembled into TSLR contigs using the TruSeq Long-Read Assembly tool v1.1 with default parameters. The scaffolds were then reconstructed from the underlying scaffolds and the TSLR contigs with SSPACE-LongRead v1.1 under default parameters. Finally, the gaps between the scaffolds, mainly derived from repeats, were closed with the high-quality PE information using GapCloser v1.12 with default options.
The significant heterozygosity of the R. philippinarum genome was likely to introduce assembly errors. We therefore also tested an alternative assembly method, HaploMerger2 v20151124 with default parameters, to counteract the problems caused by this high level of heterozygosity. [...]
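The genome size formula quoted above (total number of K-mers divided by K-mer depth) can be illustrated with a minimal Python sketch. It is not the SOAPec implementation: the function names, the `min_depth` cutoff used to skip low-depth (likely erroneous) K-mers when locating the depth peak, and the toy histogram are all assumptions made for illustration.

```python
def peak_depth(histogram, min_depth=3):
    """Return the depth with the most distinct k-mers, ignoring depths below min_depth.

    histogram maps k-mer depth (how often a k-mer occurs in the reads) to the
    number of distinct k-mers seen at that depth. min_depth is a hypothetical
    cutoff that keeps the low-depth error peak from being chosen.
    """
    candidates = {d: c for d, c in histogram.items() if d >= min_depth}
    if not candidates:
        raise ValueError("no histogram entries at or above min_depth")
    return max(candidates, key=candidates.get)


def estimate_genome_size(histogram, min_depth=3):
    """Genome size = total number of k-mers / k-mer depth (peak depth)."""
    # Each distinct k-mer at depth d contributes d occurrences to the total.
    total_kmers = sum(depth * count for depth, count in histogram.items())
    return total_kmers / peak_depth(histogram, min_depth)


# Toy histogram (made-up numbers, not data from the study).
toy_histogram = {1: 500, 2: 100, 9: 200, 10: 1000, 11: 200}
toy_estimate = estimate_genome_size(toy_histogram)
```

With real data the histogram would come from a K-mer counting step on the sequencing reads, and the choice of depth peak matters for a highly heterozygous genome, where a second peak at about half the main depth is often present.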
All assembled genome data were aligned against the NCBI nr nucleotide database with BLAST v2.2.29, using the megablast algorithm with an E-value cutoff of 1E–5. Next, taxonomy was assigned to the aligned contigs and scaffolds using the empirical taxonomy information from the NCBI database. The taxonomic profile was visualized with the Krona tool v2.5. […]
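As a rough sketch of the hit-filtering step implied above, the following Python function keeps the best hit per contig or scaffold that passes the E-value cutoff. It assumes the megablast search was written in BLAST+ tabular format (for example `blastn -task megablast -evalue 1e-5 -outfmt 6`); the source does not state the output format, and the function name and the example records are hypothetical.

```python
def best_hits(lines, max_evalue=1e-5):
    """Return {query: (subject, evalue, bitscore)} for the top-scoring hit per query.

    Expects BLAST+ tabular output (-outfmt 6) with the 12 default columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed records
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit does not pass the E-value cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Made-up records for illustration only.
example = [
    "scaffold1\tsubjA\t98.5\t500\t5\t0\t1\t500\t100\t599\t1e-100\t900",
    "scaffold1\tsubjB\t90.0\t300\t30\t0\t1\t300\t1\t300\t1e-50\t400",
    "scaffold2\tsubjC\t80.0\t50\t10\t0\t1\t50\t1\t50\t0.01\t40",
]
hits = best_hits(example)
```

The subject identifiers retained this way would then be mapped to NCBI taxonomy identifiers before being summarized in a tool such as Krona; that mapping is outside the scope of this sketch.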

Pipeline specifications

Software tools: SOAPec, SOAPdenovo, SSPACE-LongRead, HM, BLASTN, Krona
Organisms: Ruditapes philippinarum