Computational protocol: Cultivation and characterization of Candidatus Nitrosocosmicus exaquare, an ammonia-oxidizing archaeon from a municipal wastewater treatment system

[…] Ca. N. exaquare amoA gene sequences were compared with those of cultivated AOA representatives and with environmental sequences obtained from GenBank. Sequences were globally aligned using MUSCLE. Evolutionary histories were inferred using the maximum likelihood method based on the general time reversible model of sequence evolution, with a Gamma distribution used to model evolutionary rate differences among sites. Bootstrap testing was conducted with 500 replicates. All alignments and phylogenetic analyses were conducted in MEGA6. […]

Genomic DNA for sequencing was extracted from Ca. N. exaquare using the PowerSoil DNA Isolation Kit (MO BIO Laboratories). Enrichment cultures containing either no organic carbon or 0.5 mM taurine were extracted separately to generate metagenomes suitable for differential abundance binning. Genomic DNA was prepared for sequencing using the TruSeq PCR-free kit (Illumina, San Diego, CA, USA) with alternative nebulizer fragmentation, gel-free size selection and a 550 bp target insert size. A mate-pair library was prepared with the Nextera Mate Pair Sample Preparation Kit (Illumina) and sequenced (2 × 301 bases) using MiSeq Reagent Kit v3 (Illumina).

Paired-end FASTQ reads were imported into CLC Genomics Workbench version 7.0 (CLC Bio, Qiagen, Hilden, Germany) and trimmed using a minimum Phred score of 20 and a minimum length of 50 bases. Paired-end reads were assembled using the CLC de novo assembly algorithm with a kmer length of 63 and a minimum scaffold length of 1 kb. Mate-pair reads were trimmed using NextClip, and only reads in class A were used for mapping. Metagenome binning and data generation were conducted as described previously using the mmgenome R package and scripts […]. The genome was manually scaffolded using paired-end and mate-pair connections, aided by visualization in Circos.
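As an illustration of the read-trimming thresholds stated above (minimum Phred score of 20, minimum length of 50 bases), the following is a minimal Python sketch. It is not the CLC Genomics Workbench trimming algorithm, which is not described in this protocol; it applies a simpler rule that removes low-quality bases from both ends of a read. The function names and the Phred+33 quality encoding are assumptions made for the example.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Phred+33 encoding is assumed here; the protocol does not state the encoding.
    """
    return [ord(ch) - offset for ch in quality_line]


def trim_read(seq, qual, min_phred=20, min_length=50):
    """Trim bases below min_phred from both ends of a read.

    Returns (trimmed_seq, trimmed_qual), or None if the trimmed read is
    shorter than min_length. This is a simple end-trimming rule used for
    illustration, not the algorithm implemented in CLC Genomics Workbench.
    """
    scores = phred_scores(qual)
    start, end = 0, len(scores)
    # Advance past low-quality bases at the 5' end.
    while start < end and scores[start] < min_phred:
        start += 1
    # Step back past low-quality bases at the 3' end.
    while end > start and scores[end - 1] < min_phred:
        end -= 1
    if end - start < min_length:
        return None
    return seq[start:end], qual[start:end]


# Hypothetical read: 5 poor bases, 52 good bases, 3 poor bases.
example_seq = "A" * 60
example_qual = "#" * 5 + "I" * 52 + "#" * 3   # '#' = Q2, 'I' = Q40
trimmed = trim_read(example_seq, example_qual)
```

With these defaults, the hypothetical read keeps its 52 high-quality bases, while any read left with fewer than 50 bases after trimming is discarded.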
Gaps were closed using GapFiller and manually through inspection of read alignments in CLC Genomics Workbench.

The assembled genome was annotated using both Integrated Microbial Genomes Expert Review (IMG ER) and the MicroScope platform for microbial genome annotation (MaGe). Locus tags are based on MaGe annotations. Comparative analyses of MetaCyc degradation, utilization and assimilation pathways were generated automatically in MaGe and updated manually to remove incorrect automatic assignments. The full genome sequence of Ca. N. exaquare G61 has been deposited in GenBank (accession CP017922), and associated annotations are publicly available in both IMG ER (ID 2603880166) and MicroScope (#U7DNPY). Summary data and genome accession numbers for associated enrichment culture bacteria are provided in […]. […]
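The differential abundance binning mentioned earlier in this protocol relies on scaffolds from the same genome showing similar read coverage within each of the two metagenomes (no organic carbon versus taurine-supplemented). The Python sketch below illustrates that idea only. The mmgenome R package used in the study is not reproduced here, the rectangular coverage window is a simplification, and all names and numbers are hypothetical values chosen for the example, not data from the study.

```python
from dataclasses import dataclass


@dataclass
class Scaffold:
    name: str
    length: int            # scaffold length in bp
    cov_no_carbon: float   # mean coverage, culture without organic carbon
    cov_taurine: float     # mean coverage, taurine-supplemented culture


def select_bin(scaffolds, cov_a_range, cov_b_range, min_length=1000):
    """Return scaffolds whose coverage falls inside both coverage windows.

    cov_a_range and cov_b_range are (low, high) bounds for the two samples.
    min_length mirrors the 1 kb minimum scaffold length used for assembly.
    """
    lo_a, hi_a = cov_a_range
    lo_b, hi_b = cov_b_range
    return [
        s for s in scaffolds
        if s.length >= min_length
        and lo_a <= s.cov_no_carbon <= hi_a
        and lo_b <= s.cov_taurine <= hi_b
    ]


# Hypothetical scaffolds for illustration only; not data from the study.
scaffolds = [
    Scaffold("scaffold_1", 250_000, 120.0, 35.0),
    Scaffold("scaffold_2", 90_000, 115.0, 38.0),
    Scaffold("scaffold_3", 40_000, 20.0, 150.0),
    Scaffold("scaffold_4", 800, 118.0, 36.0),
]

target_bin = select_bin(scaffolds, (100.0, 140.0), (25.0, 45.0))
print([s.name for s in target_bin])  # prints ['scaffold_1', 'scaffold_2']
```

In practice the selection region is usually drawn interactively on a plot of coverage in one sample against coverage in the other, and additional signals such as GC content or marker genes help confirm the bin.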

Pipeline specifications

Software tools: MUSCLE, MEGA, CLC Genomics Workbench, NextClip, Circos, GapFiller
Applications: Phylogenetics, De novo sequencing analysis, Nucleotide sequence alignment, Genome data visualization
Chemicals: Ammonia, Ammonium Chloride, Calcium, Carbon, Nitrites, Sodium Nitrite, Urea, Succinic Acid