Protocol publication

[…] We sampled amino acid transporters from several aphid genera representing three subfamilies and multiple tribes across the Aphididae. Aphid species included A. pisum, Myzus persicae, Aphis nerii, Tamalia coweni, and Pemphigus obesinymphae. We consider these taxa representative of aphids because they include tribes at a range of positions across the aphid phylogeny, including members of Pemphigini, which are usually supported as sister to the rest of aphids. In addition, we sampled amino acid transporters from an outgroup of aphids, the grape phylloxera D. vitifoliae, and two Aphidomorpha outgroups (Pediculus humanus and Drosophila melanogaster).

Myzus persicae and A. pisum data were generated from established isofemale laboratory lines. M. persicae RNAseq and differential expression data, generated in this study, came from laboratory clones G006, G002, and BTI Red (also known as USDA). Data for A. pisum, generated in previously published studies using RNAseq and qRT-PCR, came from laboratory clones LSR1, 9-2-1, 5A, and CWR09/18. A. nerii were collected from Asclepias spp. in Miami, FL (Miami-Dade County), Atlanta, GA (DeKalb County), and Minnesota. The Atlanta and Miami populations are maintained as isofemale clones in the laboratory of Patrick Abbot at Vanderbilt University. A. nerii were identified based on host plant and distinctive morphology. T. coweni were collected from various sites and host plants in Arizona, Nevada, and California, as previously reported. P. obesinymphae were collected in the vicinity of Nashville, TN (Davidson County) on Populus deltoides subsp. deltoides and were identified by Patrick Abbot based on distinctive gall morphology. D. vitifoliae were collected at the vineyards of Château Couhins in Bordeaux, France, on Vitis vinifera cv. Cabernet franc and were identified by aphid biologists at the French National Institute for Agricultural Research (INRA). An isofemale clone of D.
vitifoliae (INRA-Pcf7) is maintained at INRA. Voucher specimens for T. coweni are deposited in the Smithsonian Institution Department of Entomology, the Canadian National Collection of Insects, and collections at Washington State University and California State University, Chico. In addition, we annotated transcripts corresponding to cytochrome c oxidase subunit 1 (CO1) for A. nerii, T. coweni, P. obesinymphae, and D. vitifoliae to serve as identity vouchers. In brief, we used a protein sequence for A. pisum CO1 (GenBank ID YP_002323931.1) as a query in local TBLASTN searches against transcriptomes for the other aphids and D. vitifoliae. Top hits (all with e-value = 0.0) were used as queries in reciprocal BLASTX searches against the NCBI RefSeq protein database to confirm homology with A. pisum CO1. Reciprocal BLAST searches returned one CO1 sequence for each species except A. nerii, which had two CO1 sequences. Sequences are provided in supplementary file 1, Supplementary Material online.

[...] Transcriptomes were sequenced for A. nerii and M. persicae. Total RNA was extracted from whole bodies of adult asexual female A. nerii and from a combination of whole bodies, bacteriocytes, and guts of adult asexual female M. persicae. Total RNA was sent to the Hussman Institute for Human Genomics (University of Miami Miller School of Medicine) for library preparation and paired-end sequencing on the Illumina HiSeq platform. Raw RNAseq reads for A. nerii and M. persicae were deposited in the NCBI Sequence Read Archive (SRA) under BioProject PRJNA296778. Raw reads for T. coweni (a combination of paired-end and single-end reads) and P. obesinymphae (single-end reads) were provided by Patrick Abbot and are available in the NCBI SRA (T. coweni accession numbers: SRX1305377, SRX1305445, SRX1305282, SRX1304838; P. obesinymphae reads are stored under BioProject PRJNA301746). Reference transcriptomes were assembled for T. coweni, P. obesinymphae, A. nerii, and the M.
persicae G006 clone. Reads for these taxa were filtered to a minimum quality score of 30 over 95% of the read, resulting in a combination of paired-end and single-end reads for A. nerii and G006. All reads retained after filtering were assembled into a single reference transcriptome for each species in Trinity (7/17/14 release) using the Blacklight system at the Pittsburgh Supercomputing Center. A fully assembled transcriptome for D. vitifoliae was provided by colleagues at INRA; the raw sequencing reads are accessible via NCBI (BioProject PRJNA294954). The other three taxa in our dataset (A. pisum, P. humanus, and D. melanogaster) have fully sequenced genomes, from which we used amino acid transporter sequences that were annotated in a previous study.

[...] Using assembled transcriptomes, we annotated amino acid transporters in the Amino Acid-Polyamine-Organocation (APC) (TC # 2.A.3) and Amino Acid-Auxin-Permease (AAAP) (TC # 2.A.18) families from T. coweni, P. obesinymphae, M. persicae, A. nerii, and D. vitifoliae as previously described. In brief, we used the stand-alone Perl script underlying the ORF prediction tool available at http://bioinformatics.ysu.edu/tools/OrfPredictor.html (last accessed February 29, 2016) to translate transcripts in all six reading frames. The translated transcripts were searched in HMMER v3.0 for functional domains that significantly matched (e ≤ 0.001) the known APC and AAAP families. HMMER hits were verified through BLAST searches against the NCBI RefSeq protein database. Transcripts with significant similarity (e ≤ 0.001) to APC or AAAP sequences from D. melanogaster or A. pisum were selected for further computational processing.

Transcriptomes generated many unique but similar transcripts identified as APC or AAAP members through HMMER and BLAST.
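The six-frame translation step mentioned above can be illustrated with a minimal Python sketch. This is not the Perl script used in the study; it assumes the standard genetic code, and the function names (`reverse_complement`, `translate`, `six_frame_translations`) are hypothetical names chosen for this illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# Assumption: the standard code is appropriate for nuclear transcripts.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(seq):
    """Translate a transcript in three forward and three reverse frames."""
    seq = seq.upper().replace("U", "T")
    rev = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rev[offset:])
    return frames


frames = six_frame_translations("ATGGCC")
print(frames["+1"])  # MA
```

Each of the six protein strings would then be passed to the domain search; stop codons appear as `*` so that open reading frames can be delimited downstream.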
We collapsed amino acid transporter transcripts into conservative sets of representative loci for each taxon using methods we previously developed. For M. persicae, which has a draft genome sequence, we collapsed all transcripts that mapped to the same location in the genome. For the remaining taxa, we followed a series of steps. First, we collapsed all transcripts with the same Trinity component number into the longest representative transcript. Next, we followed the methods we previously described. In brief, we estimated the pairwise rate of synonymous substitutions (Ks) among transcripts that clustered together in preliminary phylogenetic analyses using the Goldman and Yang method in KaKs_Calculator v1.2. We collapsed all transcripts with a pairwise Ks value < 0.25, keeping the longest sequence to represent the locus. If two similar transcripts met the cutoff Ks of ≥0.25 but overlapped by <50 bp, we collapsed the shorter transcript into the longer transcript to keep the estimate of locus number conservative. We chose the threshold Ks value (0.25) because we previously found that it slightly underestimates the number of true amino acid transporter paralogs in A. pisum, collapsing only three very recently duplicated APC paralogs. Thus, this threshold is appropriate for conservative estimation of amino acid transporter paralogs in related species.

[...] The differential expression of M. persicae amino acid transporters between bacteriocyte and whole body tissues was quantified using the transcriptome data from this study. M. persicae clones G006, G002, and BTI Red were treated as replicates. Differential expression analysis was conducted with the RSEM package (v.1.2.22) and edgeR (v.3.10.2) from the Bioconductor project. In brief, the processed forward RNAseq reads for all three M.
persicae clones were mapped to reference transcripts in a strand-specific manner using bowtie2 (v.2.2.4), and mapped reads were counted using the Perl script rsem-calculate-expression.pl from the RSEM package. The counts from RSEM were scaled to the whole transcriptomes and normalized by relative log expression. Amino acid transporters that were significantly differentially expressed between bacteriocyte and whole insect were identified using edgeR negative binomial models. Four amino acid transporter sequences comprised several truncated, partial transcripts that supported full-length gene models in the M. persicae genome (Mper-APC09, Mper-APC12, Mper-AAAP06, and Mper-AAAP20; the M. persicae draft genome assembly is available at www.aphidbase.com, last accessed February 29, 2016). In these four cases, the differential expression analysis was performed by mapping raw RNAseq reads to the gene models instead of the transcripts.

[...] Transcript sequences were translated to protein using Seaview (v.4.5.2), and protein sequences were aligned in MAFFT (v.7.158b) with default parameters. Alignments were trimmed in trimAl (v.1.4) using a gap threshold of 25%. ProtTest (v.3.4) determined the best-fit model of protein evolution to be either LG + G (APC family) or LG + I + G (AAAP) based on the Akaike Information Criterion. Maximum likelihood (ML) phylogenies were inferred for the APC and AAAP families in RAxML (Randomized Axelerated Maximum Likelihood; v.8.0.26) using the best-fit model of protein evolution and the fast bootstrap option. Bootstrap replicate number was chosen by the bootstrap convergence criterion “autoFC”.

We further inferred Bayesian phylogenies in MrBayes (v.3.2) using WAG + G (APC family) or WAG + I + G (AAAP), as MrBayes does not implement the LG amino acid substitution model.
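Model selection by the Akaike Information Criterion, used above to choose between candidate substitution models, reduces to computing AIC = 2k − 2 ln L for each fitted model and taking the minimum. The sketch below illustrates that arithmetic only. The log-likelihoods and parameter counts are invented placeholder values, not results from this study, and the function names are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model_by_aic(fits):
    """Pick the model with the lowest AIC.

    fits maps a model name to (log_likelihood, number_of_free_parameters).
    Returns the best model name and the full table of AIC scores.
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits for illustration only (NOT values from the study).
example_fits = {
    "LG+G": (-10250.0, 41),
    "LG+I+G": (-10249.5, 42),
    "WAG+G": (-10310.0, 41),
}
best, scores = best_model_by_aic(example_fits)
print(best, scores[best])  # LG+G 20582.0
```

In this made-up example the extra invariant-sites parameter in LG+I+G improves the likelihood by less than one log unit, so the penalty term outweighs the gain and the simpler model is preferred. How free parameters are counted (for example, whether branch lengths are included) depends on the model-testing software's settings.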
Two independent runs, each with four chains, were run for one to five million generations, until the standard deviation of split frequencies between runs fell below 0.01. Appropriate parameter sampling and convergence were determined by visually inspecting trace files in Tracer (v.1.6). Tracer was also used to determine burn-in values for each dataset (10% of generations), which we discarded when constructing Bayesian consensus trees. In the figures presented here, we mapped ML bootstrap support onto Bayesian consensus trees using SumTrees (v.3.0) from the DendroPy package.

Although we annotated all amino acid transporters in the AAAP family, because of large divergence in that family, we inferred the relationships among a reduced set of sequences corresponding to the “arthropod expanded clade”. The arthropod expanded clade consists of arthropod orthologs to the mammalian SLC36 family of proton-coupled amino acid transporters. Two human SLC36 sequences (SLC36A1 and SLC36A2), previously shown to belong to the sister clade of the arthropod expanded clade, were used as outgroups. Outgroup sequences for the APC family are members of the sister clade of Na-K-Cl transporters (ACYPI001649, ACYPI007138). Untrimmed transcript sequences translated into protein, as well as trimmed Bayesian and ML alignments of the APC family and reduced AAAP family (“arthropod expanded clade”), are provided as supplementary files 2–5, Supplementary Material online, and are also available by request from RPD. […]
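The Ks-based collapsing rule described earlier in this section (merge transcripts with pairwise Ks < 0.25 and keep the longest as the locus representative) can be sketched as a simple greedy procedure. This is an illustrative simplification and not the authors' implementation. It assumes pairwise Ks values have already been computed, it treats pairs without a Ks estimate as distinct loci, and it omits the additional rule for transcripts overlapping by <50 bp. The function name and transcript identifiers are hypothetical.

```python
def collapse_by_ks(lengths, pairwise_ks, ks_cutoff=0.25):
    """Greedily collapse transcripts into representative loci.

    lengths: dict mapping transcript id -> sequence length.
    pairwise_ks: dict mapping frozenset({id_a, id_b}) -> Ks estimate.
    Transcripts are visited longest first, so every representative is the
    longest member of its group. Pairs with no Ks value are assumed to be
    distinct loci (an assumption of this sketch).
    """
    loci = {}  # representative id -> list of member ids
    for tid in sorted(lengths, key=lambda t: (-lengths[t], t)):
        for rep in loci:
            ks = pairwise_ks.get(frozenset((tid, rep)))
            if ks is not None and ks < ks_cutoff:
                loci[rep].append(tid)  # collapse into the longer transcript
                break
        else:
            loci[tid] = [tid]  # no close relative found: new locus
    return loci


# Hypothetical toy input for illustration only.
toy_lengths = {"t1": 1500, "t2": 1200, "t3": 900}
toy_ks = {
    frozenset(("t1", "t2")): 0.05,
    frozenset(("t1", "t3")): 0.80,
    frozenset(("t2", "t3")): 0.75,
}
print(collapse_by_ks(toy_lengths, toy_ks))
# {'t1': ['t1', 't2'], 't3': ['t3']}
```

Visiting transcripts from longest to shortest is one way to guarantee that the retained sequence is the longest in its group; other clustering strategies (for example, single-linkage over all pairs) could give different groupings when Ks values are not transitive.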

Pipeline specifications