Computational protocol: The DNA of coral reef biodiversity: predicting and protecting genetic diversity of reef assemblages

Protocol publication

[…] Moran's I calculated with R packages ape and ncf [] revealed no large-scale or small-scale positive spatial autocorrelation of AR across all sites, or within the NWHI (p = 0.31) and MHI (p = 0.99) subsets of sites. The same was true for diversity metrics of fishes (p = 0.16) and corals (p = 0.91). After assembly of all seascape data (electronic supplementary material, dataset S4), two islands (Necker and Gardner) were excluded from all analyses because they were missing data for several key seascape factors (electronic supplementary material, ‘Methods’). When using composite AR, Lisianski, Kaho‘olawe and Lana‘i were also excluded due to low sample size (n < 12 species), leaving 13 islands in the analysis.
Redundancy analysis (RDA) with the R package vegan [] was used to visualize variation in species' correlations of AR to the seascape predictors and to assess the influence of species traits on this variation. A set of species traits was published previously [] (electronic supplementary material, dataset S5). Pearson's r values describing each species' correlation of AR to each seascape factor were the dependent variables (electronic supplementary material, dataset S6). A partial RDA was also performed, using the sample size (i.e. number of islands), marker type and total marker diversity for each species as covariates, but these covariates lacked influence (electronic supplementary material, dataset S7). RDA was complemented by AICc model selection of linear models built with species and sampling traits to determine which traits most parsimoniously explained which species showed high or low correlations of AR to seascape factors.
Congruence of spatial patterns of AR across species was gauged by Pearson's correlation coefficient for each species' AR spatial pattern regressed against the composite AR pattern of all species, for species sampled at more than five islands (n = 34). 
A one-sided t-test indicated whether species tended to positively correlate with composite AR, to assess a community-level trend towards congruence as in []. Sensitivity of congruence to sampling was assessed (electronic supplementary material, ‘Methods’). To assess the roles of life-history, sampling and genetic traits on congruence, these Pearson's r values were regressed against species traits in linear AICc-based model selection with JMP Pro v. 11.
The same model selection procedure was also used to assess relationships of composite AR to physical and ecological seascape factors. For comparison, a similar model selection procedure was implemented for the individual AR values for all species-by-marker-by-island combinations available (n = 421) using a linear mixed model which designated the species-by-marker label as a random effect (electronic supplementary material, dataset S8). The latter tests the aggregated response of individual species, which does not have to be the same as the response of the aggregated data (composite AR). Variation was high in AR values across species, and in which species were sampled across islands; the two modelling approaches address these issues differently but otherwise test the same hypothesis with the same response variable. The composite mean uses rarefaction to help standardize sampling variance across islands. The mixed modelling approach uses the species-level data to incorporate the variance across species and also the possibility that each species is drawn from its own distribution. Because of the large increase in the number of parameters that need to be estimated, power is lower, but assumptions are fewer.
For both model selection procedures, data gaps for coral species richness and M. capitata genetic diversity estimates required omitting these predictors from model selection. 
Fish and coral species richness and wave disturbance were omitted due to collinearity with other factors (electronic supplementary material, ‘Methods’; electronic supplementary material, table S1). Models were limited to one to three terms for model comparison to reduce model number given the small sample size of composite AR (n = 13 islands). Top models were defined as ΔAICc < 2.0, where ΔAICc is the difference in AICc score from the model with the minimum observed AICc score. Model selection was repeated for regional subsets (i.e. seven islands in the NWHI and six in the MHI), motivated by the many differences between these regions that might produce distinct population genetic and ecological dynamics. […]
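The model comparison rule described above (all models of one to three terms, ranked by AICc, with top models at ΔAICc < 2.0) can be illustrated with a minimal Python sketch. The authors ran model selection in JMP Pro; this sketch is only a stand-in. It assumes ordinary least squares with an intercept and the common least-squares form AICc = n ln(RSS/n) + 2k + 2k(k+1)/(n − k − 1), with k counting slopes, intercept and residual variance. The predictor names and all values are hypothetical.

```python
import itertools
import math


def ols_rss(X, y):
    """Residual sum of squares from least squares of y on the columns of X plus an intercept."""
    n = len(y)
    A = [[1.0] + list(row) for row in X]  # design matrix with intercept column
    p = len(A[0])
    # Augmented normal equations: (A^T A | A^T y)
    M = [[sum(A[i][r] * A[i][c] for i in range(n)) for c in range(p)]
         + [sum(A[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting (assumes predictors are not collinear)
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    beta = [M[r][p] / M[r][r] for r in range(p)]
    return sum((y[i] - sum(b * a for b, a in zip(beta, A[i]))) ** 2 for i in range(n))


def aicc(rss, n, n_terms):
    """Small-sample AIC for a least-squares fit; requires n > k + 1 and rss > 0."""
    k = n_terms + 2  # slopes + intercept + residual variance
    return n * math.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)


def top_models(predictors, y, max_terms=3, cutoff=2.0):
    """Return (terms, delta_aicc) for every model within `cutoff` of the minimum AICc."""
    n = len(y)
    scored = []
    for size in range(1, max_terms + 1):
        for combo in itertools.combinations(sorted(predictors), size):
            X = [[predictors[name][i] for name in combo] for i in range(n)]
            scored.append((aicc(ols_rss(X, y), n, size), combo))
    best = min(score for score, _ in scored)
    ranked = sorted((score - best, combo) for score, combo in scored)
    return [(combo, delta) for delta, combo in ranked if delta < cutoff]


# Hypothetical data for 13 islands (not values from the study).
predictors = {
    "predictor_a": [float(v) for v in range(1, 14)],
    "predictor_b": [1.0, 0.0] * 6 + [1.0],
    "predictor_c": [0.0, 0.0, 1.0, 1.0] * 3 + [0.0],
}
noise = [0.1, -0.2, 0.15, -0.05, 0.2, -0.1, 0.05, -0.15, 0.1, 0.0, -0.2, 0.15, -0.05]
response = [2.0 * a + e for a, e in zip(predictors["predictor_a"], noise)]

top = top_models(predictors, response)
```

With three candidate predictors this compares seven models. Capping models at three terms keeps n − k − 1 positive for n = 13, which the small-sample correction requires.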

Pipeline specifications

Software tools: APE, JMP Pro
Applications: Miscellaneous, Phylogenetics, Population genetic analysis
Diseases: Pulmonary Fibrosis, Genetic Diseases, Inborn