Computational protocol: FIGL1 and its novel partner FLIP form a conserved complex that regulates homologous recombination

Protocol publication

[…] Putative orthologs of FLIP, FIGL1, DMC1 and RAD51 were identified with different strategies, chosen according to sequence divergence and the existence of paralogs. Because the FLIP sequence diverged significantly during evolution and has no detectable paralog, 3 iterations of HHblits against the uniclust30_2017_04 database were sufficient to retrieve 139 sequences from plant and metazoan species. To obtain the NCBI entries of those proteins, a PSSM generated from the recovered alignment was used as input for a jump-start PSI-BLAST search against the eukaryotic refseq_protein database.

For DMC1 and RAD51, reciprocal best hits of BLAST searches were used to identify the most likely ortholog in every species. First, the H. sapiens and S. cerevisiae DMC1 sequences were searched against the refseq_protein database to gather a set of DMC1 candidates. Each candidate was then reciprocally searched against the protein sequences of six fully sequenced genomes in which the DMC1 and RAD51 genes could be unambiguously identified and which were chosen to be spread over the phylogenetic tree (H. sapiens, S. cerevisiae, C. reinhardtii, T. gondii, P. falciparum, T. cruzi). A DMC1 ortholog was considered correctly detected when one of the 6 DMC1 genes was the best hit with an alignment score at least 10% higher than that of the second-best hit, supporting a significantly higher similarity to DMC1 than to RAD51. The same strategy was followed to assign RAD51 orthologs.

In the case of FIGL1, the large number of paralogs, such as spastin, fidgetin, katanin or sap1-like proteins, makes the global analysis more complex. A phylogenetic tree was first built from the AAA ATPase domain of 600 protein sequences belonging to the fidgetin, spastin, katanin, sap1 and VPS4 families. The sequences were aligned with the MAFFT E-INS-i algorithm and the tree was built with PhyML using the LG model of amino acid substitution and 4 categories in the discrete gamma model.
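
The alignment and tree-building step might be driven from a script along the following lines. This is only a sketch of plausible command lines, not the invocation actually used in the protocol: the file names are hypothetical, the helper functions are introduced here for illustration, and PhyML expects a PHYLIP-format alignment, so a format conversion (not shown) is assumed between the two steps.

```python
from typing import List


def mafft_einsi_command(fasta_in: str) -> List[str]:
    """Build a MAFFT command line for the E-INS-i strategy.

    E-INS-i corresponds to --genafpair with --maxiterate 1000.
    MAFFT writes the alignment to standard output.
    """
    return ["mafft", "--genafpair", "--maxiterate", "1000", fasta_in]


def phyml_command(phylip_in: str) -> List[str]:
    """Build a PhyML command line for amino acid data, the LG model
    and 4 discrete gamma rate categories.

    Other settings (tree search, branch support) are left at PhyML
    defaults here; the protocol text does not state them.
    """
    return ["phyml", "-i", phylip_in, "-d", "aa", "-m", "LG", "-c", "4"]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(mafft_einsi_command("aaa_domains.fasta")))
    print(" ".join(phyml_command("aaa_domains.phy")))
```

The commands are returned as argument lists so that they can be passed directly to `subprocess.run` without shell quoting issues.
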
This prior analysis helped to delineate which homologs could be considered orthologs of the H. sapiens and A. thaliana FIDGETIN-like proteins. For the 373 fully sequenced species presented in , reciprocal best hit BLAST searches were then performed to retrieve the Fidgetin-like ortholog when present. FIGL1 ortholog candidates were retrieved by a BLAST search of the H. sapiens and A. thaliana FIGL1 sequences against the refseq_protein database and were assessed by reciprocal best hit searches using these candidates as queries against the genomes of H. sapiens and A. thaliana. FIGL1 orthology was accepted when the best hit was the FIGL1 sequence, with an alignment score at least 10% higher than that of the second-best hit. For a limited number of species, orthologs were suspected but not identified in any of the NCBI databases. Targeted BLAST searches were then performed on their genomes using the Joint Genome Institute (JGI) server to further probe the existence of these orthologs, which could be detected in 7 cases. All the NCBI and JGI gene entries are listed in and can be easily retrieved from the interactive tree (http://itol.embl.de/tree/132166555992271498216301) by hovering the mouse over the species names. […]
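
The acceptance rule applied to the reciprocal searches for DMC1, RAD51 and FIGL1 (best hit is the expected protein, with a score at least 10% above the second-best hit) can be sketched as a small function. The function name, the hit representation and the example identifiers and scores below are assumptions made for illustration. The text does not say how hits from several reference genomes were pooled or how a single-hit result was treated, so this sketch works on one list of hits and accepts a lone hit.

```python
from typing import List, Set, Tuple

# One reciprocal BLAST hit: (subject identifier, alignment score).
Hit = Tuple[str, float]


def is_confident_ortholog(
    hits: List[Hit], reference_ids: Set[str], margin: float = 0.10
) -> bool:
    """Return True when the best-scoring hit is one of the reference
    proteins and its score is at least (1 + margin) times the score
    of the second-best hit."""
    if not hits:
        return False
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    best_id, best_score = ranked[0]
    if best_id not in reference_ids:
        return False
    if len(ranked) == 1:
        # Assumption: a single hit has no competitor and is accepted.
        return True
    second_score = ranked[1][1]
    return best_score >= (1.0 + margin) * second_score


if __name__ == "__main__":
    # Made-up identifiers and scores, for illustration only.
    example_hits = [("RAD51_HUMAN", 100.0), ("DMC1_HUMAN", 120.0)]
    print(is_confident_ortholog(example_hits, {"DMC1_HUMAN"}))
```

Exposing the margin as a parameter makes it easy to check how sensitive the ortholog assignments are to the 10% threshold.
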

Pipeline specifications

Software tools: HHblits, BLASTP, MAFFT, PhyML, iTOL
Databases: Uniclust
Applications: Phylogenetics, Amino acid sequence alignment
Organisms: Arabidopsis thaliana, Homo sapiens