Computational protocol: Complete Genome Sequence of the Largest Known Flavi-Like Virus, Diaphorina citri flavi-like virus, a Novel Virus of the Asian Citrus Psyllid, Diaphorina citri

Protocol publication

[…] The flavi-like viruses are a group of viruses that share genome organization and encoded proteins with viruses of the family Flaviviridae. The family Flaviviridae contains a diverse group of RNA viruses, including numerous important pathogens of a wide range of vertebrate and invertebrate hosts. The typical Flavivirus genome is a positive-sense, single-stranded, unsegmented RNA of 9.6 to 12.3 kb containing a single open reading frame (ORF). The ORF is translated into a single polyprotein, which is subsequently cleaved into structural (CP and envelope) and nonstructural (replicase complex) proteins located at the N- and C-termini, respectively. In contrast, the genomes of flavi-like viruses have been reported to be between 19 and 26 kb. Although the phylogenetic status of flavi-like viruses is not clear, they fall among the flavivirus-jingmenvirus, pestivirus, and hepacipegivirus clades based on the amino acid sequences of the NS3 (helicase) and NS5 (RdRp) proteins.
De novo assembly using Trinity 2.1.1 generated a contig of 27,542 nucleotides from a transcriptome library derived from Diaphorina citri collected in Florida. The deduced amino acid sequence showed low similarity (<40%) to the replicase proteins of flavi-like viruses by BLASTx. Read mapping with BWA gave an average coverage of 912× across the full length of the contig. The presence of this viral sequence in field-collected D. citri samples from Florida was confirmed by reverse transcription (RT)-PCR. The nucleotide sequences of both ends of the genomic RNA were determined by rapid amplification of cDNA ends (RACE) using the SMARTer 5′/3′ RACE system according to the manufacturer’s instructions (Clontech, Mountain View, CA). The complete genome sequence of this new putative virus, tentatively named Diaphorina citri flavi-like virus (DcFLV), is nonpolyadenylated and 27,724 nucleotides (nt) in length.
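The excerpt does not say how the average coverage figure was derived from the BWA alignments. As a minimal sketch, assuming per-base depths are available as three-column, tab-separated lines (contig name, position, depth), which is the layout written by `samtools depth`, the mean depth over a contig could be computed as follows. The function names and the toy input are hypothetical and are not taken from the source.

```python
def parse_depth_lines(lines, contig):
    """Yield per-base depths for one contig from tab-separated lines.

    Assumed input layout (not stated in the source): name<TAB>position<TAB>depth.
    """
    for line in lines:
        name, _position, depth = line.rstrip("\n").split("\t")
        if name == contig:
            yield int(depth)


def mean_coverage(depths, contig_length):
    """Mean depth across the full contig length.

    Positions missing from `depths` are treated as zero coverage,
    because the sum is divided by the contig length rather than by
    the number of reported positions.
    """
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    return sum(depths) / contig_length


# Toy data for illustration only; these are not values from the study.
toy_lines = [
    "contig_A\t1\t10\n",
    "contig_A\t2\t20\n",
    "contig_B\t1\t5\n",
]
toy_mean = mean_coverage(parse_depth_lines(toy_lines, "contig_A"), contig_length=4)
print(toy_mean)  # 7.5
```

Dividing by the contig length instead of the number of covered positions keeps uncovered bases from inflating the average, which matters when a figure is reported "across the full length" of a contig.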
The 5′ and 3′ untranslated regions (UTRs) are 598 and 243 nt, respectively, flanking a predicted ORF that encodes a putative polyprotein of 8,960 amino acids. The putative polyprotein contains several conserved domains, including DUF612 with an unknown function (nt 1,878 to 2,088), TonB periplasmic domain (nt 1,989 to 2,089), ATP-dependent RNA helicase (nt 2,664 to 3,041), DEAD-like helicase superfamily (nt 2,738 to 2,820), Helicase superfamily C-terminal domain (nt 2,874 to 2,997), and Fts-like methyltransferase (nt 6,811 to 7,035). The putative polyprotein of DcFLV was compared against the GenBank nonredundant protein database using BLASTp with an E-value threshold of 10⁻³; the highest identity was with Gentian Kobu-sho-associated virus (GKaV) (query coverage 21%; identity 39%) (GenBank accession no. BAM78287), a flavi-like virus discovered in gentian plants showing kobu-sho syndrome. A phylogenetic tree based on the amino acid sequences of flavivirus RdRp proteins placed DcFLV in a clade with GKaV and Hermitage virus (GenBank accession no. KU754512), a flavi-like virus identified in Drosophila immigrans, whereas a tree based on flavivirus helicase proteins placed DcFLV in a clade close to the Wuhan centipede virus (GenBank accession no. KR902737), identified in centipedes (Otostigmus scaber and Scolopocryptops sp.). To our knowledge, this is the largest genome for a flavi-like virus identified to date. […]
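The reported genome length, UTR lengths, and polyprotein length can be checked against one another with simple arithmetic. The sketch below assumes that the ORF occupies every nucleotide between the two UTRs and that its final codon is a stop codon; neither assumption is stated explicitly in the excerpt, and the function name is hypothetical.

```python
def orf_summary(genome_len, utr5_len, utr3_len):
    """Return (ORF length in nt, polyprotein length in aa).

    Assumes the ORF fills the whole region between the UTRs and that
    its last codon is a stop codon, which encodes no amino acid.
    """
    orf_len = genome_len - utr5_len - utr3_len
    if orf_len <= 0 or orf_len % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_len // 3
    return orf_len, codons - 1  # drop the terminal stop codon


# Values reported in the text: 27,724 nt genome, 598 nt 5' UTR, 243 nt 3' UTR.
orf_nt, polyprotein_aa = orf_summary(27724, 598, 243)
print(orf_nt, polyprotein_aa)  # 26883 8960
```

Under these assumptions the ORF is 26,883 nt, or 8,961 codons, which agrees with the stated polyprotein length of 8,960 amino acids once the stop codon is excluded.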

Pipeline specifications

Software tools: Trinity, BLASTX, BWA, BLASTP
Application: Phylogenetics
Organisms: Diaphorina citri
Diseases: HIV Infections
Chemicals: Nucleotides