Computational protocol: The highest-copy repeats are methylated in the small genome of the early divergent vascular plant Selaginella moellendorffii

Protocol publication

[…] Total DNA was purified using DNeasy kits (Qiagen, CA) from green tissues of S. moellendorffii plants kept in a growth chamber. The DNA was mechanically sheared using a Hydroshear device (Genomic Solutions, MI), and fragments ranging from 3 to 4 kb were eluted from an agarose gel after electrophoresis, end-repaired, and ligated into a cloning vector. Ligation reactions were transformed into E. coli DH5α (mcrBC+) to construct the MF library. The WGS library was constructed by introducing the same ligation reaction into E. coli GC10 (mcrBC-). Recombinant clones were sequenced using Big Dye Terminator chemistry and ABI 3730xl sequencers (Applied Biosystems, CA), and vector and low-quality sequences were electronically trimmed.

Chloroplast sequences were identified by BLASTN alignment to the S. uncinata chloroplast genome (GenBank accession AB197035) at high stringency (E value smaller than 10^-56) and were excluded from any further sequence analyses.

Protein sequence alignments against the NIAA database were done using BLAT. Alignments with at least 70% similarity and a length of at least 40 amino acids were recorded as matches.

Alignments to assembled EST sequences were done using BLASTN at high stringency; matches with an E value smaller than 10^-56 were recorded.

De novo repeats were identified by aligning MF and WGS reads to the JGI-DOE S. moellendorffii genome assembly using BLASTN; matches covering at least 50% of the read with at least 95% identity were recorded.

Alignments to the curated database of known genes were done as previously reported [], using BLASTX and recording matches with an E value better than 10^-7.

Known repeats were identified using a nucleotide database and a protein database of known repetitive elements described earlier []. These databases do not contain simple sequence repeats. Repetitive element proteins were identified using the protein database of repeats.
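The E-value and coverage cutoffs described above can be applied to alignment output with a few lines of code. The following is a minimal sketch, not the authors' pipeline: it assumes the BLASTN hits were written in the tab-separated format with the 12 default columns, that read lengths are known separately, and that the 50% and 95% cutoffs are lower bounds. The function names and the two example hits are hypothetical.

```python
import csv
import io

# Default column order of BLAST tab-separated output (12 fields).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_hits(handle):
    """Yield one dict per hit line, converting the fields used for filtering."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        hit["pident"] = float(hit["pident"])
        hit["evalue"] = float(hit["evalue"])
        hit["qstart"] = int(hit["qstart"])
        hit["qend"] = int(hit["qend"])
        yield hit


def passes_evalue(hit, max_evalue=1e-56):
    """High-stringency filter: keep hits with an E value below the cutoff."""
    return hit["evalue"] < max_evalue


def passes_repeat_filter(hit, read_length, min_cov=0.5, min_ident=95.0):
    """De novo repeat filter: aligned fraction of the read and percent identity."""
    aligned = abs(hit["qend"] - hit["qstart"]) + 1
    return aligned / read_length >= min_cov and hit["pident"] >= min_ident


# Two made-up hits for illustration only (both reads assumed to be 800 bp).
example = (
    "read1\tcp\t99.0\t700\t7\t0\t1\t700\t100\t799\t0.0\t1250\n"
    "read2\tcp\t85.0\t120\t18\t0\t1\t120\t5\t124\t1e-20\t150\n"
)
hits = list(parse_hits(io.StringIO(example)))
chloroplast_reads = {h["qseqid"] for h in hits if passes_evalue(h)}
repeat_reads = {h["qseqid"] for h in hits if passes_repeat_filter(h, 800)}
print(chloroplast_reads, repeat_reads)
```

The same `passes_evalue` helper covers the other thresholds in the protocol by changing `max_evalue`, for example `1e-7` for the BLASTX search against known genes.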
Protein matches to repeats were recorded using the same criteria as for known genes, while repetitive nucleotide sequences were identified using BLASTN with an E value smaller than 10^-10.

DNA digestion with HpaII was performed following the manufacturer's recommendations. PCR assays were carried out using 50 ng of HpaII-digested or undigested genomic DNA as template, with an initial denaturation of 3 minutes at 94°C followed by 25 amplification cycles of 30 seconds at 94°C, 30 seconds at 59°C, and 60 seconds at 72°C. A final elongation of 10 minutes at 72°C followed amplification. Target and primer sequences are shown in Additional file . […]
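HpaII cuts unmethylated CCGG sites and is blocked by cytosine methylation, so in an assay of this kind a product amplified from digested template is usually read as evidence that the sites within the target are methylated. That reading only holds if the target contains at least one CCGG site. The sketch below is an illustrative check, not part of the published protocol: the function names and example sequences are hypothetical, and the real targets are those listed in the additional file.

```python
HPAII_SITE = "CCGG"  # HpaII recognition sequence (palindromic, so one strand suffices)


def hpaii_sites(amplicon):
    """Return the 0-based start position of every CCGG in the sequence."""
    seq = amplicon.upper()
    positions = []
    start = seq.find(HPAII_SITE)
    while start != -1:
        positions.append(start)
        start = seq.find(HPAII_SITE, start + 1)
    return positions


def assay_is_informative(amplicon):
    """A target without any CCGG site amplifies whether or not it is methylated."""
    return len(hpaii_sites(amplicon)) > 0


# Made-up sequences for illustration only.
with_sites = "ATCCGGTTACCGGA"
without_sites = "ATGCATGCATGC"
print(hpaii_sites(with_sites), assay_is_informative(without_sites))
```

Running such a check on each target before ordering primers helps avoid amplicons that cannot distinguish methylated from unmethylated DNA.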

Pipeline specifications

Software tools: BLASTN, BLAT, BLASTX
Application: WGS analysis
Organisms: Selaginella moellendorffii