## Protocol publication

[…] A database of British grasshopper and related species traits covering habitat and resource use, life history, dispersal ability, and distribution was compiled to address the hypotheses of factors affecting range change outlined in the introduction (Tables and ).

To avoid potential problems with collinearity between explanatory variables, correlations between traits were investigated using a method employed by []: Pearson's correlation tests were calculated between continuous variables, Kendall's correlation tests between categorical variables, and Kruskal-Wallis tests between continuous and categorical variables. A sequential Bonferroni correction was applied in order to account for the large number of tests conducted (55) []. No significant correlations were found.

To investigate the relationships between distribution changes and species traits we fitted Generalised Linear Models (GLMs) with Gaussian errors, using first “uncorrected range change” as dependent variable and then repeating analyses with “corrected range change” values. In order to understand the relative importance of different traits in driving distribution changes we took a multimodel inference approach, fitting all possible combinations of trait variables, selecting a set of top models by Akaike information criterion (AIC), and averaging the coefficients and standard errors of trait variables across these [, ]: We fitted GLMs for all 2,047 combinations of the 11 explanatory trait variables and calculated AIC values and differences to the best model with the lowest AIC (ΔAIC). Models with ΔAIC < 4 were selected as the top set for which there was considerable statistical support []. The percentages of top models in which each trait occurred were then calculated. In order to measure the relative importance of each trait, AIC values were transformed to “Akaike weights” [, ], and using these weights, means of trait coefficients across top models were calculated with the “weighted.mean” function in R.
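The model-selection steps above can be sketched in a few lines. This is a minimal illustration in Python, not the authors' R code: the trait names and AIC values are placeholders, the helper names are hypothetical, and it assumes Akaike weights are computed within the top set (the text does not say whether they were computed over the top set or over all fitted models).

```python
import math
from itertools import combinations


def all_trait_subsets(traits):
    """Yield every non-empty combination of trait names."""
    for k in range(1, len(traits) + 1):
        yield from combinations(traits, k)


def top_set(aic_by_model, max_delta=4.0):
    """Keep models whose delta-AIC to the best (lowest-AIC) model is below max_delta."""
    best = min(aic_by_model.values())
    return {m: a for m, a in aic_by_model.items() if a - best < max_delta}


def akaike_weights(aic_values):
    """Convert AIC values to Akaike weights that sum to 1."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


def weighted_mean(values, weights):
    """Weighted mean, normalising by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# 11 placeholder trait names give 2^11 - 1 = 2,047 candidate models.
traits = [f"trait_{i}" for i in range(1, 12)]
n_models = sum(1 for _ in all_trait_subsets(traits))

# Hypothetical AIC values for five fitted models (illustration only).
aic = {"m1": 100.0, "m2": 101.5, "m3": 103.9, "m4": 104.0, "m5": 110.0}
top = top_set(aic)                       # m4 is excluded: its delta-AIC is exactly 4
weights = akaike_weights(list(top.values()))

# Hypothetical coefficients of one trait in the three top models.
coefs = [0.50, 0.70, 0.60]
b_all = weighted_mean(coefs, weights)
```

Because `weighted_mean` divides by the sum of the weights, it gives the same result whether or not the weights have been rescaled to sum to 1 beforehand.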
Weighted mean standard errors of coefficients were calculated using the following formula adapted from []:
$$\mathrm{SE}(b_{\mathrm{all}}) = \sum_{i=1}^{n} w_i \sqrt{[\mathrm{SE}(b_i)]^2 + [b_i - b_{\mathrm{all}}]^2}$$
where $n$ is the number of models, $w_i$ is the Akaike weight of model $i$, $\mathrm{SE}(b_i)$ is the standard error of coefficient $b$ in model $i$, and $b_{\mathrm{all}}$ is the weighted mean of all coefficients $b$. Akaike weights were scaled so that their sum equalled 1 for each predictive variable, i.e. $w_i$ values were divided by the sum of Akaike weights of all models which included the variable whose mean standard error was to be calculated.

Confidence intervals (CI) across top models were then calculated by multiplying the weighted mean standard errors with factors of 1.96 (95% CI), 2.58 (99% CI) and 3.29 (99.9% CI) and adding / subtracting them from the weighted means of coefficients. Significance levels were assigned accordingly where the values did not span zero (* for 95% CI, ** for 99% CI, *** for 99.9% CI).

Throughout this part of the analysis, range change values calculated from the largest set of “surveyed squares” (with a minimum of one species recorded in both time periods, i.e. with the minimum adequate level of recording effort) were used as our primary measures, and results were then compared to those obtained with the other three sets of “surveyed squares”, i.e. higher levels of recording effort, in order to assess the robustness of our findings.

All analyses of relationships between distribution changes and species traits were also repeated with the exclusion of two species with particularly large range change values, Conocephalus discolor and Metrioptera roeselii (see below).

To assess the validity of using Gaussian GLMs with our data we plotted normal quantile-quantile plots of residuals of top sets of models and carried out Shapiro-Wilk tests for normality [, ].

To assess the overall goodness-of-fit of top models the amount of deviance accounted for by each model was calculated:
$$D^2 = \frac{\text{null deviance} - \text{residual deviance}}{\text{null deviance}}$$

This was adjusted to take into account the number of observations, i.e. species ($s$) and the number of predictors, i.e. traits ($t$) [, ]:

$$\text{adjusted } D^2 = 1 - \frac{s - 1}{s - t}\,\left(1 - D^2\right)$$
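The averaging and goodness-of-fit formulas in this section can be sketched together as follows. This is an illustrative Python sketch under stated assumptions, not the authors' implementation: it assumes the weighted standard error takes the square-root form (each term is the square root of the squared standard error plus the squared deviation from the weighted mean), and all function names and input values are hypothetical.

```python
import math


def weighted_se(coefs, ses, weights):
    """Weighted mean coefficient and weighted mean standard error.

    Only models that include the trait should be passed in; the weights
    are rescaled here so that they sum to 1 across those models.
    """
    total = sum(weights)
    w = [x / total for x in weights]
    b_all = sum(wi * bi for wi, bi in zip(w, coefs))
    se_all = sum(
        wi * math.sqrt(se ** 2 + (bi - b_all) ** 2)
        for wi, bi, se in zip(w, coefs, ses)
    )
    return b_all, se_all


def significance(b_all, se_all):
    """Return the stars for the widest confidence interval that excludes zero."""
    stars = ""
    for z, label in ((1.96, "*"), (2.58, "**"), (3.29, "***")):
        lower, upper = b_all - z * se_all, b_all + z * se_all
        if lower > 0 or upper < 0:
            stars = label
    return stars


def adjusted_d2(null_deviance, residual_deviance, s, t):
    """Adjusted D^2 for s observations (species) and t predictors (traits)."""
    d2 = (null_deviance - residual_deviance) / null_deviance
    return 1 - ((s - 1) / (s - t)) * (1 - d2)


# Hypothetical values: one trait that appears in two top models.
b_all, se_all = weighted_se(coefs=[0.5, 0.7], ses=[0.1, 0.1], weights=[0.6, 0.2])
stars = significance(b_all, se_all)

# Hypothetical deviances, with s = 25 species and t = 3 traits.
adj = adjusted_d2(null_deviance=100.0, residual_deviance=40.0, s=25, t=3)
```

The rescaling inside `weighted_se` mirrors the step described in the text, where each weight is divided by the sum of the weights of the models that contain the variable.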
To give an overall fit of the top models, adjusted $D^2$ values were averaged, weighted by AIC weights as with the model coefficients before.

Fitted values of range change were extracted for the top models, and means weighted by model Akaike weights were calculated.

We investigated the potential influence of phylogenetic autocorrelation, i.e. non-independence of trait values due to relatedness between species, based on a method employed by []. A “working phylogeny” [] of the study species was drawn based on the taxonomy of the Orthoptera Species File [] in the programme “**Treemaker**” [] with all branch segment lengths assumed to be equal (). A phylogeny may be approximated in this way based on taxonomic divisions where the true phylogeny is not (fully) known; assuming equal branch lengths and allowing more than two daughters per node reflects the lack of comprehensive detailed knowledge about the order of splitting [].

The “working phylogeny” was exported in “nexus” format and imported into R. The expected covariance between species was calculated using the “vcv” function in the R package “**ape**” and Moran’s I autocorrelation indices were calculated on the residuals of each of the top models using the “Moran.I” function. Moran’s I can take values from −1 (perfect negative autocorrelation) to +1 (perfect positive autocorrelation), with values around zero indicating independence of residuals between related species [–].

Where Moran’s I indices were significant or near-significant, phylogenetically corrected models were fitted using the “pgls” function in the R package “caper” [, ]; as with GLMs before, models were initially fitted to all possible combinations of trait variables and results were then averaged across a set of top models with ΔAIC < 4. […]
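To make the autocorrelation index concrete, the sketch below computes the textbook Moran's I statistic for a vector of residuals and a matrix of pairwise weights. It is a simplified Python illustration, not a reimplementation of the “Moran.I” function in ape: it applies no weight standardisation, omits the significance test, and uses a hypothetical 0/1 relatedness matrix in place of the phylogenetic covariance used in the study.

```python
def morans_i(values, weights):
    """Observed Moran's I for values given a square matrix of pairwise weights.

    Diagonal entries of the weight matrix are ignored.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    s0 = sum(weights[i][j] for i, j in pairs)              # total weight
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical residuals for four species.
residuals = [1.0, 1.2, -1.0, -1.1]

# Species 0 and 1 are treated as close relatives, as are species 2 and 3.
related = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
i_similar = morans_i(residuals, related)      # relatives have similar residuals

# Pairing species with opposite residuals instead gives a negative index.
mismatched = [
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
i_opposite = morans_i(residuals, mismatched)
```

In the first case relatives share the sign and size of their residuals, so the index is close to +1; in the second it is close to −1.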

## Pipeline specifications

| Field | Value |
|---|---|
| Software tools | Treemaker, APE |
| Application | Phylogenetics |
| Diseases | Tooth Discoloration |