Computational protocol: The interplay of various sources of noise on reliability of species distribution models hinges on ecological specialisation

Protocol publication

[…] Model evaluation is a crucial step in model selection and in assessing the accuracy of the predictions []. In general, model accuracy is measured mainly through evaluation and agreement metrics [,]. Evaluation metrics are widely used to measure model performance by assessing the ability of a model to correctly distinguish between presence and absence locations [,]. Agreement metrics, in contrast, measure prediction reliability by assessing the spatial agreement between the “true” and predicted ranges, taking into account the probability values of the pixels. In other words, reliability indicates how far the predicted ranges are from the truth or “reality” [,]. Using several evaluation metrics is strongly preferred when true absence data are unavailable, and also when the goal is to model potential rather than realised distribution ranges [].

Therefore, we calculated the area under the curve (AUC) of the receiver operating characteristic (ROC), as well as the True Skill Statistic (TSS), to evaluate the predictive performance of the models. The AUC (a threshold-independent evaluation metric) ranges from 0 to 1, with a value of 0.5 indicating performance no better than random and a value of 1 indicating perfect performance []. The TSS (a threshold-dependent evaluation metric) varies from -1 to 1, where a value of 1 indicates perfect model performance and a value of zero or lower indicates performance no better than random []. In this study, we considered models with either a median AUC ≥ 0.7 or a median TSS ≥ 0.4 as good models with useful predictive ability (i.e., able to discriminate suitable from unsuitable areas) [,–]. We used the “biomod2” R package [] to calculate the evaluation metrics (AUC and TSS) for each SDM internally, as is usually done in empirical studies (henceforth referred to as ‘standard AUC’ and ‘standard TSS’).
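The two metrics can be stated in a short, self-contained sketch. This is an illustrative re-implementation of the textbook formulas, not the biomod2 code: AUC is computed via the rank-based (Mann-Whitney) formulation, and TSS as sensitivity + specificity - 1 at a chosen threshold. The scores and the 0.5 threshold below are invented example values.

```python
def auc(presence_scores, absence_scores):
    """AUC as the probability that a randomly chosen presence
    outranks a randomly chosen absence (ties count 0.5)."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))

def tss(presence_scores, absence_scores, threshold):
    """True Skill Statistic = sensitivity + specificity - 1,
    from the confusion matrix at the given probability threshold."""
    tp = sum(s >= threshold for s in presence_scores)  # presences predicted present
    fn = len(presence_scores) - tp                     # presences predicted absent
    tn = sum(s < threshold for s in absence_scores)    # absences predicted absent
    fp = len(absence_scores) - tn                      # absences predicted present
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1

pres = [0.9, 0.8, 0.75, 0.6]   # invented predicted suitabilities at presences
abse = [0.4, 0.3, 0.65, 0.2]   # invented predicted suitabilities at absences
print(auc(pres, abse))         # 0.9375
print(tss(pres, abse, 0.5))    # 0.75
```

A perfectly separating model would score AUC = 1 and, at a suitable threshold, TSS = 1; the ≥ 0.7 and ≥ 0.4 cut-offs above mark models that are usefully, though not perfectly, discriminating.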
Additionally, we evaluated the SDMs by calculating AUC and TSS using independent data (presences and “true” absences) sampled from the true ranges (henceforth referred to as “independent AUC” and “independent TSS”). We calculated these independent metrics using the “accuracy” function in the “SDMTools” R package []. We compared the differences between the independent evaluation metrics and the 25 model evaluation metrics using a one-sample Wilcoxon test in the “stats” R package []. To test whether the grid resolution of the environmental predictors influenced model performance, we assessed the differences between the standard evaluation metric values (standard AUC and standard TSS) at the high and low grid resolutions for all models using a two-sample non-parametric Wilcoxon test.

We assessed the effects of spatial resolution, SDM algorithm, positional accuracy, sample size, and species specialisation, and their interactions, on SDM performance using generalised linear models. We fitted two models: in the first, the exponentially transformed AUC was modelled as a function of spatial resolution, SDM algorithm, positional accuracy, sample size, and species specialisation; in the second, we additionally included the two-way interactions of these factors. We used the Akaike Information Criterion (AIC) to select the most parsimonious model, favouring the lower AIC value []. [...]

We measured the relative agreement between the “true” and modelled ranges by calculating their geographical niche overlap. We calculated Schoener's D index [] using the “nicheOverlap” function in the “dismo” R package []. The niche overlap value varies between 0 and 1, where a value of 0 indicates no overlap and a value of 1 indicates complete overlap [,]. Additionally, we measured the absolute agreement between the “true” and modelled ranges through a pixel-wise comparison using the Overall Concordance Correlation Coefficient (OCCC), a measure of agreement between two continuous datasets generated using two different approaches [].
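Schoener's D itself is simple to state: each suitability surface is rescaled so that its pixel values sum to 1, and D = 1 - 0.5 * sum of the absolute per-pixel differences of the rescaled surfaces. A minimal sketch (an illustrative re-implementation, not the dismo “nicheOverlap” code; the pixel values are invented):

```python
def schoeners_d(suit1, suit2):
    """Schoener's D between two suitability surfaces given as flat
    lists of pixel values; 0 = no overlap, 1 = identical surfaces."""
    # Rescale each surface so its pixel values sum to 1.
    s1, s2 = sum(suit1), sum(suit2)
    p1 = [v / s1 for v in suit1]
    p2 = [v / s2 for v in suit2]
    # D = 1 - 0.5 * sum of absolute differences of the rescaled values.
    return 1 - 0.5 * sum(abs(a - b) for a, b in zip(p1, p2))

true_range      = [0.9, 0.7, 0.1, 0.0]   # invented pixel suitabilities
predicted_range = [0.8, 0.6, 0.2, 0.1]
print(schoeners_d(true_range, predicted_range))  # ≈ 0.882, high overlap
```

Because both surfaces are rescaled before comparison, D captures relative agreement in the spatial pattern of suitability rather than agreement in absolute pixel values, which is why the pixel-wise OCCC is used as a complementary, absolute measure.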
We computed the OCCC using the “epiR” R package []. The OCCC value varies between 0 and 1, with 0 representing 100% disagreement and 1 representing 100% agreement between the true and predicted ranges (See for details). […]
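For the two-surface case (one “true” and one predicted range), the OCCC reduces to Lin's concordance correlation coefficient, which penalises both poor correlation and systematic shifts in mean or variance between the two datasets. A minimal sketch of that two-surface case (illustrative only, not the epiR implementation; note that Lin's coefficient can be negative in general, while positively associated suitability surfaces fall in the 0 to 1 range discussed above):

```python
def ccc(x, y):
    """Lin's concordance correlation coefficient between two equally
    long lists of pixel values (the two-surface case of the OCCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Population (1/n) variances and covariance.
    vx = sum((v - mx) ** 2 for v in x) / n
    vy = sum((v - my) ** 2 for v in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Correlation penalised by differences in mean and variance.
    return 2 * cov / (vx + vy + (mx - my) ** 2)

print(ccc([0.1, 0.4, 0.8], [0.1, 0.4, 0.8]))  # 1.0: identical surfaces
print(ccc([0.1, 0.4, 0.8], [0.2, 0.5, 0.9]))  # < 1: same shape, shifted mean
```

The mean-shift penalty in the denominator is what distinguishes this absolute-agreement measure from ordinary correlation: two surfaces with identical spatial pattern but offset suitability values score below 1.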

Pipeline specifications

Software tools: SDMTools, dismo
Applications: Miscellaneous, Phylogenetics