Computational protocol: Evolution of complexity in the zebrafish synapse proteome

[…] For the functional classification with high-level categories, Ensembl mouse identifiers for mouse proteins, or orthologous mouse identifiers for zebrafish proteins, were integrated with functional annotation from the Ingenuity knowledgebase using IPA (QIAGEN Redwood City). Information on predicted cellular localization (cytoplasm, extracellular space, nucleus, plasma membrane and other) and IPA protein types (cytokine, enzyme, G-protein coupled receptor (GPCR), ion channel, kinase, peptidase, phosphatase, transcription regulator, translation regulator, transporter and other) was obtained. Counts, comparisons and plots of proteins within each species and category were produced using R. To reproducibly call orthologous sequences between species for a large data set, the Ensembl BioMart database was queried using the Bioconductor package biomaRt. All orthologues were obtained and counted in each species to determine orthology type: 1:1, 1:many, many:1, many:many or unique to a species (no orthologue known). For the analysis of all protein families, we used Ensembl identifiers of mouse, zebrafish and mouse orthologues of zebrafish proteins to retrieve Ensembl Protein Families from Ensembl database version 81, containing the last update of the zebrafish genome. Families with an unknown function were not considered. [...]

Four whole brains were removed and placed in RNAlater before RNA was isolated using the Qiagen RNeasy Plus Mini Kit, and 150 bp paired-end Illumina sequencing was conducted at Barts and the London Genome Centre. Adapters were removed from the raw reads using Cutadapt. TopHat2 was used as a wrapper for the alignment programme Bowtie2 to map sequence reads to the reference genome (Danio_rerio.GRCz10.86 obtained via Ensembl); reads were then converted into counts using HTSeq, and the counts were converted to transcripts per million (TPM). For each gene, the average expression (mean TPM from four whole-brain biological replicates) was determined. Where multiple transcripts for a given gene were known, these were combined to give a single mean TPM per gene. […]
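
Two of the computational steps described above can be illustrated with a minimal Python sketch. It is not the authors' code: the protocol used R and biomaRt, and its exact rules are not given here. The first function assigns an orthology type to each mouse gene from a list of (mouse, zebrafish) orthologue pairs, assuming the type can be inferred simply by counting partners on each side and reading the ratio as mouse:zebrafish. The remaining functions convert read counts to TPM using the standard length-normalised formula and average the result over replicates. The gene lengths are an assumption, since the source does not state how lengths were derived, and all function names and gene identifiers are hypothetical.

```python
from collections import defaultdict


def classify_orthology(pairs, mouse_genes):
    """Assign an orthology type to each mouse gene.

    pairs: iterable of (mouse_gene, zebrafish_gene) orthologue pairs.
    Assumption: the type is inferred by counting partners on each side,
    written as mouse:zebrafish (e.g. "1:many" = one mouse, many zebrafish).
    """
    fish_by_mouse = defaultdict(set)
    mouse_by_fish = defaultdict(set)
    for mouse, fish in pairs:
        fish_by_mouse[mouse].add(fish)
        mouse_by_fish[fish].add(mouse)

    types = {}
    for gene in mouse_genes:
        partners = fish_by_mouse.get(gene, set())
        if not partners:
            types[gene] = "unique"  # no orthologue known
            continue
        many_fish = len(partners) > 1
        many_mouse = any(len(mouse_by_fish[f]) > 1 for f in partners)
        if many_fish and many_mouse:
            types[gene] = "many:many"
        elif many_fish:
            types[gene] = "1:many"
        elif many_mouse:
            types[gene] = "many:1"
        else:
            types[gene] = "1:1"
    return types


def counts_to_tpm(counts, lengths_bp):
    """Convert per-gene read counts to TPM for one sample.

    lengths_bp: gene lengths in base pairs (an assumed input).
    """
    # Reads per kilobase of gene length.
    rates = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("all counts are zero; TPM is undefined")
    # Scale so that the values in one sample sum to one million.
    return {g: rate / total * 1e6 for g, rate in rates.items()}


def mean_tpm(replicates):
    """Average TPM per gene across replicates (a list of dicts with the same genes)."""
    genes = replicates[0].keys()
    return {g: sum(rep[g] for rep in replicates) / len(replicates) for g in genes}
```

For example, calling `counts_to_tpm` once per whole-brain replicate and passing the four resulting dictionaries to `mean_tpm` would give one mean TPM value per gene, mirroring the averaging step described above.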

Pipeline specifications

Software tools IPA, BioMart
Organisms Homo sapiens, Danio rerio
Diseases Disease, Genomic Instability