Pipeline publication

[…] tegument protein genes, BPLF1, BOLF1 and BGLF1; other genes such as BVLF1, BFLF2 and BFRF3; and within the intergenic regions.

Single nucleotide variations (SNVs) in the consensus HKNPC1 genome compared to the reported strains were extracted with cross_match. After masking the repeat regions, a high density of SNVs was observed in the previously reported polymorphic EBNA2, EBNA3 and LMP1 loci. The region where BMRF1 resides had low SNV density, with no SNVs present from positions 67,659 to 69,061 (HKNPC1 coordinates). Pairwise alignment of HKNPC1 with the four reported strains was performed and visualized with mVISTA. Multiple whole-sequence alignment of HKNPC1 and the other four EBV subtypes was performed using MAFFT, with Gblocks used to mask poorly aligned positions and divergent regions of the aligned sequences. Overall sequence similarities between HKNPC1 and the four other strains were high, reaching 98.6% (B95-8), 98.5% (GD1), 95% (GD2) and 96.6% (AG876), respectively. As expected, low-similarity regions coincided with regions of high SNV density, exemplified by the polymorphic regions spanning EBNA2 and EBNA3A, -3B and -3C. The three type 1 EBV genomes, B95-8, GD1 and GD2, have higher sequence similarity with HKNPC1, particularly in the regions spanning EBNA2 and EBNA3, indicating that HKNPC1 is a type 1 virus. Neighbour-joining trees constructed with MEGA5 showed that GD1, GD2 and HKNPC1 are more closely related. Gene trees generated from alignments of the translated amino acid sequences of BZLF1, LMP1 and EBNA1 gave the same result. Sequences of these genes were subsequently validated by dideoxy-based sequencing.

HKNPC1 contains 1,589 SNVs and 132 indels when compared to the reference EBV genome (Accession no. NC_007605). Of the 1,589 SNVs, 1,167 had a read depth of 100 or above. The remaining 422 SNVs, with a read depth of less than 100, were verified by dideoxy-DNA sequencing. While 1,043 of the SNVs were found in coding sequence, none were located within the consensus TATA box or polyadenylation motifs […]
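The two SNV summaries in the excerpt, density along the genome and the split at a read depth of 100, can be illustrated with a short Python sketch. This is not the authors' code: the function names, the 1,000 bp window size and the toy data are assumptions made for illustration, and SNVs are assumed to be available as (position, read depth) pairs with 1-based positions.

```python
def snv_density(positions, genome_length, window=1000):
    """Count SNVs in consecutive, non-overlapping windows.

    positions are 1-based genome coordinates. The window size is an
    illustrative choice, not a value taken from the excerpt.
    """
    n_windows = (genome_length + window - 1) // window  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        if not 1 <= pos <= genome_length:
            raise ValueError(f"position {pos} lies outside the genome")
        counts[(pos - 1) // window] += 1
    return counts


def split_by_depth(snvs, min_depth=100):
    """Partition (position, depth) pairs at a read-depth threshold.

    Returns (well_covered, low_coverage). In the excerpt, the
    low-coverage SNVs are the ones re-checked by dideoxy sequencing.
    """
    well_covered = [s for s in snvs if s[1] >= min_depth]
    low_coverage = [s for s in snvs if s[1] < min_depth]
    return well_covered, low_coverage


if __name__ == "__main__":
    # Hypothetical toy data, not values from the HKNPC1 genome.
    toy_snvs = [(10, 150), (20, 99), (1500, 100), (2999, 40)]
    density = snv_density([p for p, _ in toy_snvs], genome_length=3000)
    high, low = split_by_depth(toy_snvs)
    print(density, len(high), len(low))
```

A window with a count of zero corresponds to a stretch such as the BMRF1 region described above, where no SNVs were found.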

Pipeline specifications

Software tools: mVISTA, MAFFT, Gblocks, MEGA
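A percent-similarity figure like those quoted in the publication excerpt can be computed from two rows of a multiple alignment once unreliable columns are excluded. The Python sketch below is a simplified stand-in for that step, not the pipeline's actual procedure: `percent_identity` and its `mask` argument are hypothetical names, gap columns are skipped, and the optional mask only imitates the effect of a column-filtering tool such as Gblocks.

```python
def percent_identity(seq_a, seq_b, mask=None):
    """Percent identity between two aligned sequences of equal length.

    Columns with a gap ('-') in either sequence are skipped. If mask is
    given, it holds one boolean per column and only columns marked True
    are compared (an assumed stand-in for alignment filtering).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    if mask is not None and len(mask) != len(seq_a):
        raise ValueError("mask must have one entry per alignment column")

    matches = compared = 0
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if mask is not None and not mask[i]:
            continue  # column excluded by the mask
        if a == "-" or b == "-":
            continue  # gap column, not counted
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned sequences, purely illustrative.
    print(percent_identity("ACGT-A", "ACCTTA"))
```

Skipping gap columns is one of several possible conventions; counting gaps as mismatches would give lower values, so the convention should be stated alongside any reported percentage.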