Protocol publication

[…] Sequences for protein alignment were either identified by Basic Local Alignment Search Tool (BLAST) search or predicted from genomic and cDNA sequencing data (Arabis alpina, Ae. arabicum, and C. papaya); accession numbers can be found in supplementary table S2, Supplementary Material online. For the phylogenetic analysis of the CO promoter, the intergenic region between CO (including its start codon) and the upstream gene COL1 was used (except for ThCOL, where no COL1 is present). The Arabis alpina CO promoter was assembled from short reads identified by BLAST search of the preliminary genome assembly (MPIPZ, Cologne). Sequences were assembled using SeqMan Pro (DNASTAR Lasergene Version 8.0.2). CO promoters from Capsella rubella, Nasturtium officinale, and Sisymbrium officinale were amplified from genomic DNA (kindly provided by Markus Berns, MPIPZ, Cologne) using degenerate primers that anchor in COL1 and CO, respectively (COL1 Fw: TGACACMGGATATGGAATTG; CO Rv: TGGCAGAGTGRACTTGAGCA). The intergenic region between COL1 and CO orthologs was subsequently sequenced by primer walking. The CO promoter of Ae. arabicum was identified by a BAC (bacterial artificial chromosome) screen using a CO probe (BAC probe Fw: ACTGGTGGTGGATCAAGAGG; Rv: TCTTGGGTGTGAAGCTGTTG) and sequenced by primer walking from two independent BACs. Accession numbers for CO promoters used in the alignment can be found in supplementary table S3, Supplementary Material online. For the comparison of cis-element composition, 5′-sequences (4,000 bp including the start codon) were identified in publicly available genomes. Accession numbers are given in supplementary table S4, Supplementary Material online. The accession numbers for the comparison of whole loci can be found in supplementary table S5, Supplementary Material online. To obtain Ae. arabicum whole-locus sequences, paired-end next-generation sequencing reads of Ae. arabicum genomic sequence (last accessed May 19, 2015) were de novo assembled using CLC Genomics Workbench (CLC Bio) with standard parameters. Desired genes (AeCO, AeCOL2) were subsequently identified in the Ae. arabicum genome by running BLAST on a local database.

[...] The phylogenetic tree for the relationship of subgroup Ia COL protein sequences was inferred using the Neighbor-Joining method in MEGA5 with bootstrap values from 10,000 replicates. Substitution rates were calculated from a codon alignment of the respective CDS and corrected by the Jukes–Cantor method using SNAP v1.1.1 (Chapter 4, p. 55–72). CO promoter and COL locus alignments were performed using the LAGAN algorithm on mVISTA. Output is displayed as percent conservation relative to the corresponding reference sequence in 100-bp sliding windows. Alignments are displayed using ClustalX 2.0.11. For motif identification in CO promoters, data sets were submitted to the MEME motif identification tool (Version 4.6) and searched for 6–10 bp motifs allowed to occur any number of times. WebLogo (Version 2.8.2) was used to present conserved motifs. For statistical comparison of cis-element number, the −4 kb sequence beginning with the start codon was used from the indicated genes. Cis-elements were counted using a customized Perl script (supplementary material, Supplementary Material online; Geo Velikkakam James, MPIPZ, Cologne). Shuffle control data sets were generated with the algorithm “shuffle” (100 times, Version 1.02, p. 281). To test for statistical differences in cis-element analysis and flowering time, one-way analysis of variance and Tukey test were performed using SigmaStat 3.5 (Systat Software). […]
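The two degenerate primers quoted above each contain one IUPAC ambiguity code (M in COL1 Fw, R in CO Rv). As an illustration only, the following Python sketch lists the unambiguous sequences each primer represents. The function name `expand_degenerate` is hypothetical and not part of the published protocol; the ambiguity table is the standard IUPAC nucleotide code.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every unambiguous sequence encoded by a degenerate primer.

    Hypothetical helper for illustration; not taken from the source.
    """
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Primer sequences as quoted in the text.
COL1_FW = "TGACACMGGATATGGAATTG"
CO_RV = "TGGCAGAGTGRACTTGAGCA"

if __name__ == "__main__":
    for name, primer in (("COL1 Fw", COL1_FW), ("CO Rv", CO_RV)):
        print(name, expand_degenerate(primer))
```

Each primer carries a single two-fold ambiguity, so each expands to two concrete oligonucleotides.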
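The Jukes–Cantor correction mentioned above converts an observed proportion of differing sites p into an estimated number of substitutions per site, d = −(3/4) ln(1 − (4/3) p). The sketch below shows only this generic formula on a gap-free comparison of two aligned sequences. How SNAP v1.1.1 partitions sites and applies the correction internally is not described in the text, and the helper names `p_distance` and `jukes_cantor` are assumptions made for illustration.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    Hypothetical helper; not SNAP's implementation.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable (gap-free) columns")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


if __name__ == "__main__":
    p = p_distance("ACGTACGT", "ACGTACGA")
    print(p, jukes_cantor(p))
```

The corrected distance is always at least as large as p, because the correction accounts for multiple substitutions at the same site.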
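The conservation display described above (percent conservation in 100-bp sliding windows) can be approximated as percent identity per window over a pairwise alignment. This is a minimal sketch under stated assumptions: windows are taken over alignment columns, gap columns count as mismatches, and the function name `sliding_identity` is hypothetical. mVISTA's own calculation may differ in these details.

```python
def sliding_identity(ref_aln, qry_aln, window=100, step=1):
    """Percent identity per sliding window over alignment columns.

    Returns a list of (window_start, percent_identity) tuples.
    Assumption: a column counts as conserved only if both sequences
    carry the same non-gap character.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    matches = [
        int(a == b and a != "-")
        for a, b in zip(ref_aln.upper(), qry_aln.upper())
    ]
    result = []
    for start in range(0, len(matches) - window + 1, step):
        identical = sum(matches[start:start + window])
        result.append((start, 100.0 * identical / window))
    return result


if __name__ == "__main__":
    # Toy alignment with a small window so the output stays readable.
    for start, pct in sliding_identity("ACGTACGT", "ACGTACGA", window=4, step=4):
        print(start, pct)
```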
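The cis-element counting and shuffle-control steps above can be illustrated in Python. This is not the authors' Perl script and not the “shuffle” program named in the text. It is a sketch that assumes motifs are counted on both strands with overlaps allowed, and that the control is a simple mononucleotide shuffle repeated 100 times. The motif and sequence in the usage lines are placeholders.

```python
import random

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_motif(seq, motif):
    """Count overlapping occurrences of a motif on both strands.

    A position is counted once even if the motif is palindromic.
    """
    seq, motif = seq.upper(), motif.upper()
    targets = {motif, revcomp(motif)}
    k = len(motif)
    return sum(seq[i:i + k] in targets for i in range(len(seq) - k + 1))


def shuffled_counts(seq, motif, n=100, seed=0):
    """Motif counts in n mononucleotide-shuffled copies of seq.

    Assumption: shuffling preserves base composition only; the tool
    used in the source may preserve higher-order composition.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        letters = list(seq.upper())
        rng.shuffle(letters)
        counts.append(count_motif("".join(letters), motif))
    return counts


if __name__ == "__main__":
    promoter = "CCAATGGATTGG"   # placeholder sequence, not real data
    motif = "CCAAT"             # placeholder motif
    observed = count_motif(promoter, motif)
    control = shuffled_counts(promoter, motif, n=100)
    print(observed, sum(control) / len(control))
```

Comparing the observed count with the distribution of shuffled counts indicates whether a motif occurs more often than expected from base composition alone.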

Pipeline specifications