Computational protocol: Gene Capture by Helitron Transposons Reshuffles the Transcriptome of Maize

Protocol publication

[…] The conserved 5′ and 3′ terminal ends of the experimentally determined Hel1 family of Helitrons were isolated and subjected to multiple sequence alignment. The strict consensus pattern of nucleotides (shown in the original publication) was used as a template to search the entire database of Zea mays BAC sequences (B73 inbred) downloaded from the Plant Genome Database (www.plantgdb.org/). A script was written in the Python programming language, using modules from the Biopython project, to identify putative Helitrons. This program, called HelRaizer (secs.oakland.edu/helraizer), batch processes the input maize genome sequence and searches for sequences matching the terminal ends of the Helitrons. Correctly oriented 5′ and 3′ termini separated by 100–25,000 bp were identified, and the intervening genomic sequence was labeled a putative Helitron.

Helitron-captured gene fragments were identified by BLASTX search against the nr protein database of the National Center for Biotechnology Information (NCBI). Batch alignment was performed, and each alignment matching a gene fragment of >50 bp with at least 85% similarity was recorded as an instance of gene capture.

Evidence for movement of each putative Helitron from the screen above was sought by searching the B73 genome for a paralogous locus lacking the Helitron. This was determined by processing the 1000-bp sequence flanking each end of the element (minus the Helitron sequence) through BLAST alignment against the Z. mays BAC sequences. In addition, the B73 genome was searched for sequences exhibiting significant internal sequence identity to the putative Helitron. Putative Helitrons from each of these two screens were monitored for expression.
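The terminus-pairing and flank-extraction steps described above can be sketched as follows. This is a minimal illustration, not the HelRaizer code: the function names are hypothetical, the terminal motifs are passed in as regular expressions because the actual Hel1 consensus patterns are not reproduced in this text, and only the forward strand is scanned. The 100–25,000 bp separation is assumed to be measured from the end of the 5′ match to the start of the 3′ match, and joining the two flanks into a single query is one possible reading of "minus the Helitron sequence". The sketch uses only the standard `re` module rather than Biopython.

```python
import re


def find_putative_helitrons(seq, five_pat, three_pat, min_gap=100, max_gap=25000):
    """Pair 5' and 3' terminus matches on the forward strand.

    five_pat and three_pat are regular expressions supplied by the caller
    (placeholders here, not the published Hel1 consensus).
    Returns a list of (start, end) 0-based, end-exclusive coordinates that
    span from the first base of the 5' terminus to the last base of the
    3' terminus.
    """
    seq = seq.upper()
    five_hits = list(re.finditer(five_pat, seq))
    three_hits = list(re.finditer(three_pat, seq))
    elements = []
    for five in five_hits:
        for three in three_hits:
            # Assumption: separation is the sequence between the two termini.
            gap = three.start() - five.end()
            if min_gap <= gap <= max_gap:
                elements.append((five.start(), three.end()))
    return elements


def flanks_without_element(seq, start, end, flank=1000):
    """Join the sequence on either side of an element, dropping the element.

    The result can serve as a query when looking for a paralogous locus
    that lacks the insertion. Flanks are truncated at the sequence ends.
    """
    left = seq[max(0, start - flank):start]
    right = seq[end:end + flank]
    return left + right
```

To cover elements on the opposite strand, the same function could be run on the reverse complement of each BAC sequence. Note that `re.finditer` reports non-overlapping matches only, which is adequate for short, well-separated termini but is a simplification.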
Putative duplicate elements that also shared sequence identity in their flanking BAC sequences were deemed redundant and were removed from the collection.

Expressed candidate Helitrons were identified by batch processing the putative Helitron sequences through BLAST (Basic Local Alignment Search Tool) analysis at the National Center for Biotechnology Information (NCBI; www.ncbi.nlm.nih.gov) against the Expressed Sequence Tag (EST) database of Z. mays. Helitrons whose sequences aligned along the entire length of an EST with at least 99% identity were designated candidates for expression of captured genes, manually annotated, and pursued further for experimental analysis. A figure in the original publication outlines the strategy used to discover Helitrons that display EST expression of captured host genes.

Annotation and structure analysis of captured gene pieces was done by manual examination of the spliced alignment of the Helitrons with their cognate ESTs and their putative protein products, using the software GeneSeqer (deepc2.psi.iastate.edu/cgi-bin/gs.cgi) and SplicePredictor (deepc2.psi.iastate.edu/cgi-bin/sp.cgi), respectively. […]
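The EST filter described above (full-length EST alignment at 99% identity or better) can be sketched as a post-processing step over tabular BLAST output. Several things here are assumptions rather than details from the protocol: the input is taken to be BLAST+ tabular output with the custom column order `qseqid sseqid pident length qstart qend sstart send slen`, Helitrons are the queries and ESTs the subjects, and the function names are hypothetical. Because a spliced transcript typically aligns to genomic sequence in several pieces, the sketch merges all qualifying alignment segments per Helitron/EST pair before testing coverage; the original analysis may have applied the criterion differently.

```python
import csv
from collections import defaultdict

# Assumed column order of the tab-separated BLAST output.
COLUMNS = ["qseqid", "sseqid", "pident", "length",
           "qstart", "qend", "sstart", "send", "slen"]


def read_blast_table(path):
    """Yield rows (lists of strings) from a tab-separated BLAST table."""
    with open(path, newline="") as handle:
        yield from csv.reader(handle, delimiter="\t")


def expressed_candidates(rows, min_identity=99.0, min_coverage=1.0):
    """Return sorted (helitron, est) pairs whose high-identity alignment
    segments together cover at least min_coverage of the EST length."""
    spans_by_pair = defaultdict(list)
    est_length = {}
    for row in rows:
        rec = dict(zip(COLUMNS, row))
        if float(rec["pident"]) < min_identity:
            continue
        # Subject coordinates are reversed for minus-strand hits.
        lo, hi = sorted((int(rec["sstart"]), int(rec["send"])))
        spans_by_pair[(rec["qseqid"], rec["sseqid"])].append((lo, hi))
        est_length[rec["sseqid"]] = int(rec["slen"])

    candidates = []
    for (helitron, est), spans in spans_by_pair.items():
        covered = 0
        cur_lo = cur_hi = None
        # Merge overlapping or adjacent intervals on the EST.
        for lo, hi in sorted(spans):
            if cur_hi is None or lo > cur_hi + 1:
                if cur_hi is not None:
                    covered += cur_hi - cur_lo + 1
                cur_lo, cur_hi = lo, hi
            else:
                cur_hi = max(cur_hi, hi)
        covered += cur_hi - cur_lo + 1
        if covered / est_length[est] >= min_coverage:
            candidates.append((helitron, est))
    return sorted(candidates)
```

A call such as `expressed_candidates(read_blast_table("helitron_vs_est.tsv"))` would then list the pairs to carry forward to manual annotation; the file name is illustrative. Lowering `min_coverage` slightly would tolerate ESTs with unaligned poly(A) tails or vector remnants, at the cost of a looser criterion than the one stated in the text.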

Pipeline specifications

Software tools: Biopython, BLASTX, BLASTN, SplicePredictor
Databases: PlantGDB
Application: WGS analysis
Organisms: Zea mays