Computational protocol: Characterization of Equine Infectious Anemia Virus Integration in the Horse Genome

Protocol publication

[…] As a reference for EIAV integration in the horse genome, 562 human genome sequences flanking the retroviral 3′ LTR were downloaded from GenBank and processed using the analysis pipeline of this study. They comprised 146 integration sites for an HIV-1 transfection vector (DQ498202 to DQ498347) and 416 sites for an EIAV vector (DQ498348 to DQ498763). The vectors were produced using a three-plasmid lentiviral vector system, which included an EIAV-based transduction vector and vectors encoding the EIAV gag/pol and the VSV-G envelope; these plasmids were co-transfected into 293T cells as described in a previous report. Fewer sequences were available in GenBank than were reported by Hacker et al., who had analyzed and mapped their sequences to the human genome (University of California, Santa Cruz (UCSC), assembled in May 2004). In this study, the sequences were analyzed and mapped against the more recently updated human genome (UCSC, assembled in December 2013).
The BLAST-like Alignment Tool (BLAT) program (available online: http://genome.ucsc.edu) was used to align and map the sequences to the horse genome (UCSC, assembled in September 2007) (available online: http://genome.ucsc.edu/). A site was deemed to be an EIAV integration site if the sequence met the following criteria: (1) it was located between the adaptor sequence and the EIAV LTR; (2) it had at least 95% homology to the horse genome sequence and mapped to a single horse genetic locus; (3) the junction consisted of a horse genomic sequence and 371 bp of the 5′ terminal of the EIAV LTR, within which “TGTGGG” was used as the initial sequence based on the sequence of EIAVFDDV13; and (4) it had a minimum length of 20 bp, which was the lowest limit recognized by the BLAT program.
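
The four criteria above can be read as a two-stage filter: first isolate the genomic flank between the adaptor and the LTR, then keep only flanks with exactly one acceptable alignment. The following Python sketch illustrates that reading only; it is not the authors' pipeline. The `Hit` record, the function names, and the `adaptor` and `ltr_prefix` arguments are hypothetical, the read is assumed to be oriented as adaptor, genomic flank, LTR, and "single locus" is interpreted here as exactly one hit passing the identity and length thresholds.

```python
from dataclasses import dataclass
from typing import List, Optional

LTR_INITIAL = "TGTGGG"  # initial LTR sequence named in criterion (3)
MIN_IDENTITY = 0.95     # criterion (2): at least 95% match to the genome
MIN_LENGTH = 20         # criterion (4): shortest flank accepted


@dataclass
class Hit:
    """Simplified alignment record (hypothetical; not the BLAT output format)."""
    chrom: str
    start: int
    matches: int  # number of matching bases
    aligned: int  # number of aligned query bases


def extract_flank(read: str, adaptor: str, ltr_prefix: str) -> Optional[str]:
    """Return the sequence lying between the adaptor and the LTR (criterion 1).

    Returns None if either landmark is missing or the flank is too short.
    """
    if not ltr_prefix.startswith(LTR_INITIAL):
        raise ValueError("LTR prefix must begin with " + LTR_INITIAL)
    adaptor_pos = read.find(adaptor)
    if adaptor_pos < 0:
        return None
    flank_start = adaptor_pos + len(adaptor)
    ltr_pos = read.find(ltr_prefix, flank_start)
    if ltr_pos < 0:
        return None
    flank = read[flank_start:ltr_pos]
    return flank if len(flank) >= MIN_LENGTH else None


def unique_locus(hits: List[Hit]) -> Optional[Hit]:
    """Return the hit if exactly one passes the length and identity thresholds."""
    good = [
        h for h in hits
        if h.aligned >= MIN_LENGTH and h.matches / h.aligned >= MIN_IDENTITY
    ]
    return good[0] if len(good) == 1 else None
```

In practice the LTR prefix passed to `extract_flank` would be longer than the six-base motif, since a hexamer alone can also occur by chance inside the genomic flank.
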
The reference sequences were aligned and mapped to the recently updated human genome (UCSC, assembled in December 2013) (available online: http://genome.ucsc.edu/).
The BioMart program (Ensembl Genes 79, available online: http://asia.ensembl.org) was used to determine whether the integration sites were located in coding genes of the September 2007 horse genome draft. Information on the transcription initiation and termination sites of the coding genes was also obtained from BioMart. The reference mRNA sequences (RefSeq mRNA) of the September 2007 horse genome draft were downloaded from the UCSC genome annotation database (available online: http://www.genome.ucsc.edu). Base frequencies around the integration sites were determined using the WebLogo program (available online: http://weblogo.berkeley.edu/). Repetitive elements around the integration sites were identified by RepeatMasker analysis of the September 2007 horse genome draft (available online: http://www.repeatmasker.org/). […]
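
Two of the analyses described above reduce to simple computations: testing whether a site falls between a gene's transcription start and end, and tabulating base frequencies per position across equal-length windows centred on the sites (the kind of table a sequence logo summarises). The Python sketch below illustrates both under stated assumptions. It does not call BioMart or WebLogo; the `Gene` record, the function names, and the 1-based inclusive coordinates are hypothetical choices made for the example.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class Gene(NamedTuple):
    """Minimal gene annotation (hypothetical; coordinates 1-based, inclusive)."""
    name: str
    chrom: str
    tx_start: int  # transcription initiation site
    tx_end: int    # transcription termination site


def genes_at_site(chrom: str, pos: int, genes: List[Gene]) -> List[str]:
    """Names of genes whose transcribed region contains the integration site."""
    return [
        g.name for g in genes
        if g.chrom == chrom and g.tx_start <= pos <= g.tx_end
    ]


def base_frequencies(windows: List[str]) -> List[Dict[str, float]]:
    """Per-position A/C/G/T frequencies across equal-length sequence windows."""
    if not windows:
        return []
    width = len(windows[0])
    if any(len(w) != width for w in windows):
        raise ValueError("all windows must have the same length")
    table = []
    for column in zip(*windows):
        counts = Counter(base.upper() for base in column)
        # Bases other than A/C/G/T (e.g. N) count toward the denominator only.
        table.append({b: counts[b] / len(windows) for b in "ACGT"})
    return table
```

A strand-aware version would reverse-complement windows from minus-strand integrations before counting; that step is omitted here.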

Pipeline specifications

Software tools: BLAT, WebLogo, RepeatMasker
Application: Genome data visualization
Organisms: Human immunodeficiency virus 1, Equus caballus, Homo sapiens, Mus musculus
Diseases: Anemia, Ataxia Telangiectasia, Equine Infectious Anemia, Fetal Diseases, HIV Infections, Sleep-Wake Transition Disorders