Computational protocol: Geographic distribution of vestibular schwannomas in West Scotland between 2000-2015

Protocol publication

[…] Before analyzing spatial dependency in the VS period prevalence (PP), we needed to determine whether male and female VS case locations could reasonably be pooled into a single VS dataset. We tested for spatial interaction between male and female VS case locations using bivariate point pattern analyses based on Ripley’s K. The bivariate point pattern analyses used the unit postcode VS case coordinates within a Cross-L function to test for spatial independence between the locations of male and female VS cases. This was followed by a Difference-K (Kmale − Kfemale) test to detect any sex-based conditional clustering or dispersion in West Scotland. Both the Cross-L and Difference-K analyses used the R language with the spatstat package v1.47-0. For the Cross-L, our interest was in determining whether VS cases for males and females, taken separately, exhibit significant attraction, and so a one-tailed pseudo-p value was produced. For the Difference-K analysis, our interest was whether either of the patterns of male and female VS cases exhibited conditional independence, and so a two-sided pseudo-p value was calculated. We used 199 Monte-Carlo simulations for significance testing of each measure. Because the Monte-Carlo simulations were conditional on the same geospatial polygon layer and number of points, and there was no need to compare results with other regions, corrections for edge effects in the point pattern analyses were deemed unnecessary. Results were represented graphically as pointwise simulation envelopes to illustrate the possible outcomes of our hypothesis tests for spatial interaction at any given prespecified distance. Statistical significance of the Cross-L and Difference-K functions across the distance interval of function evaluation (0 to 80 km at 160 m intervals) was determined via a Diggle-Cressie-Loosmore-Ford (DCLF) test. All K-function based analyses used Euclidean distance in a UTM Zone 30 N coordinate system. [...]
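As a rough illustration of the Monte-Carlo logic described above, the following Python sketch computes an uncorrected cross-type L function and a DCLF-style pseudo-p value from 199 simulations. It is not the spatstat implementation used in the study: the random-labelling null model, the two-sided squared-deviation statistic, the function names, and the toy coordinates are all assumptions made for illustration.

```python
import math
import random

def cross_l(pts_a, pts_b, radii, area):
    """Cross-type L function without edge correction.

    K(r) = area * (number of a-b pairs within r) / (n_a * n_b),
    L(r) = sqrt(K(r) / pi).
    """
    n_a, n_b = len(pts_a), len(pts_b)
    dists = [math.dist(p, q) for p in pts_a for q in pts_b]
    out = []
    for r in radii:
        k = area * sum(d <= r for d in dists) / (n_a * n_b)
        out.append(math.sqrt(k / math.pi))
    return out

def dclf_random_labelling(pts_a, pts_b, radii, area, nsim=199, seed=0):
    """DCLF-style pseudo-p value for the cross-type L function.

    Assumption: the null model is random labelling (sex labels are
    shuffled over the fixed pooled locations). Each curve is compared
    with the mean of all other curves, and the squared deviations are
    integrated over the evenly spaced radii.
    """
    rng = random.Random(seed)
    pooled = list(pts_a) + list(pts_b)
    n_a = len(pts_a)
    curves = [cross_l(pts_a, pts_b, radii, area)]  # index 0 = observed
    for _ in range(nsim):
        rng.shuffle(pooled)
        curves.append(cross_l(pooled[:n_a], pooled[n_a:], radii, area))
    step = radii[1] - radii[0]
    totals = [sum(c[j] for c in curves) for j in range(len(radii))]
    stats = []
    for c in curves:
        # Leave-one-out reference: mean of the other nsim curves.
        ref = [(totals[j] - c[j]) / nsim for j in range(len(radii))]
        stats.append(step * sum((a - b) ** 2 for a, b in zip(c, ref)))
    rank = sum(s >= stats[0] for s in stats[1:])
    return (1 + rank) / (nsim + 1)

# Synthetic coordinates in a 10 x 10 window (not study data).
rng = random.Random(42)
males = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(30)]
females = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(30)]
radii = [0.5 * i for i in range(1, 9)]
p_value = dclf_random_labelling(males, females, radii, area=100.0)
print(f"DCLF pseudo-p: {p_value:.3f}")
```

With 199 simulations the smallest attainable pseudo-p value is 1/200 = 0.005, which is why simulation counts of the form 199, 999, or 9999 are convenient.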
Our model of spatial dependence for the districts and zones was based on spatial interaction. We defined spatial interaction as ferry connections between islands and the mainland or between parts of the mainland, in addition to first-order Queen’s case contiguity for mainland and/or island spatial units. Our definition of interaction is a reasonable basis for exploring spatial autocorrelation in VS PP. However, given the current lack of knowledge regarding the spatial processes governing VS in West Scotland, and recognizing the sensitivity of spatial autocorrelation measures to definitions of spatial dependency, we also computed all spatial autocorrelation measures using an alternate model of spatial dependence based on a k = 4 nearest neighbor adjacency matrix. Both spatial dependence schemes were row-standardized for use in spatial autocorrelation tests. As the processes governing the spatial distribution of VS in Scotland become better understood, more appropriate models of spatial dependency that account for spatial interaction along different dimensions could be formulated, and these may produce different results.
Global Moran’s I was used to test for spatial autocorrelation in PP at both levels of spatial aggregation (zone and district). The statistical significance of global Moran’s I was computed using 999 permutations of PP values across all spatial units. Our primary interest was detecting whether global positive spatial autocorrelation exists in PP at each level of spatial aggregation, and so we calculated one-sided pseudo p-values. All calculations of global Moran’s I were undertaken using the spdep 0.6-8 package in R.
Functions were written in R to calculate univariate and bivariate local Moran’s I. Univariate local Moran’s I functions were validated against output from the PySAL Python library, and the bivariate measure was validated against the output from GeoDa 1.4.6.
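The global Moran’s I test described above (row-standardized weights, 999 permutations, one-sided pseudo-p for positive autocorrelation) can be sketched in plain Python as follows. This is only an illustrative sketch and not the spdep code used in the study; the six-unit geography, the single ferry link, the PP values, and all function names are hypothetical.

```python
import random

def row_standardize(neighbors):
    """Turn {unit: [neighbors]} into {unit: {neighbor: weight}} with rows summing to 1."""
    return {i: {j: 1.0 / len(js) for j in js} for i, js in neighbors.items()}

def morans_i(values, weights):
    """Global Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values.values()) / n
    z = {i: v - mean for i, v in values.items()}
    s0 = sum(w for row in weights.values() for w in row.values())
    num = sum(w * z[i] * z[j] for i, row in weights.items() for j, w in row.items())
    den = sum(d * d for d in z.values())
    return (n / s0) * num / den

def moran_permutation_test(values, weights, nperm=999, seed=0):
    """One-sided pseudo-p for positive autocorrelation by permuting values over units."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    keys, vals = list(values), list(values.values())
    as_large = 0
    for _ in range(nperm):
        rng.shuffle(vals)
        if morans_i(dict(zip(keys, vals)), weights) >= observed:
            as_large += 1
    return observed, (as_large + 1) / (nperm + 1)

# Hypothetical geography: six units in a chain of shared borders,
# plus one ferry connection treated as an additional neighbor link.
contiguity = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")]
ferries = [("A", "C")]
neighbors = {u: [] for u in "ABCDEF"}
for a, b in contiguity + ferries:
    neighbors[a].append(b)
    neighbors[b].append(a)
weights = row_standardize(neighbors)

# Made-up PP values with a smooth gradient along the chain.
pp = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0, "E": 5.0, "F": 6.0}
observed, p_value = moran_permutation_test(pp, weights)
print(f"Moran's I = {observed:.3f}, one-sided pseudo-p = {p_value:.3f}")
```

Swapping the `neighbors` dictionary for a k = 4 nearest neighbor list would give the alternate dependence scheme without changing the rest of the sketch.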
We used the R language for these calculations because of the flexibility it offered for simulations and modification of spatial neighbor matrices. For these local measures, our primary interest was to identify spatial units that exhibit unusual differences of PP values from their neighbors. A spatial unit with significant differences from neighboring values is called a cluster center. A cluster center exhibits positive local spatial autocorrelation when PP values surrounding the spatial unit are similar, for example, either a high value of PP surrounded by high values of PP (high-high) or a low value surrounded by low values (low-low). A cluster center exhibits negative local spatial autocorrelation when PP values surrounding the spatial unit are dissimilar, for example, a low PP value surrounded by high PP values (low-high) or, conversely, a high value surrounded by low values (high-low). When there is no systematic relation between the PP value within a district and those of its neighbors, there is no local spatial autocorrelation present. In the univariate case, the value of local Moran’s I at the cluster center represents the correlation between PP and itself within the surrounding spatial units at the same spatial scale: either district PP with district PP, or zone PP with zone PP. In the bivariate case, the value of local Moran’s I at the cluster center represents the correlation between PP at the district scale and the PP values assigned to the surrounding districts from the zone level (district-zone). As such, the bivariate measure indicates whether a given district exhibits stability in PP across spatial scales. The statistical significance of the local measures was based on 9999 conditional permutations to derive the local pseudo-p values.
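To make the univariate local measure and its conditional permutation test concrete, here is a minimal Python sketch. It is not the authors’ R code: the scaling of the statistic (deviation times spatial lag, divided by the mean squared deviation), the folded one-sided pseudo-p convention, the function name, and the toy values are assumptions chosen for illustration.

```python
import random

def local_moran(values, weights, nperm=9999, seed=0):
    """Univariate local Moran's I with conditional permutation.

    I_i = z_i * sum_j w_ij z_j / m2, with m2 = sum_i z_i^2 / n.
    For each unit i, its own value is held fixed and the values of
    the remaining units are randomly reassigned to its neighbors.
    Returns {unit: (I_i, pseudo_p, quadrant)}.
    """
    rng = random.Random(seed)
    ids = list(values)
    n = len(ids)
    mean = sum(values.values()) / n
    z = {i: values[i] - mean for i in ids}
    m2 = sum(d * d for d in z.values()) / n
    results = {}
    for i in ids:
        row = weights[i]
        lag = sum(w * z[j] for j, w in row.items())
        stat = z[i] * lag / m2
        others = [z[j] for j in ids if j != i]
        w_list = list(row.values())
        larger = 0
        for _ in range(nperm):
            draw = rng.sample(others, len(w_list))
            sim = z[i] * sum(w * d for w, d in zip(w_list, draw)) / m2
            if sim >= stat:
                larger += 1
        # Folded one-sided pseudo-p: use the smaller tail count.
        larger = min(larger, nperm - larger)
        p = (larger + 1) / (nperm + 1)
        quadrant = ("high" if z[i] > 0 else "low") + "-" + ("high" if lag > 0 else "low")
        results[i] = (stat, p, quadrant)
    return results

# Hypothetical chain of six units with row-standardized weights.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
         "D": ["C", "E"], "E": ["D", "F"], "F": ["E"]}
weights = {i: {j: 1.0 / len(js) for j in js} for i, js in chain.items()}
pp = {"A": 10.0, "B": 1.0, "C": 9.0, "D": 2.0, "E": 1.0, "F": 2.0}  # made-up values

results = local_moran(pp, weights, nperm=999)
for unit, (stat, p, quadrant) in results.items():
    print(unit, round(stat, 3), round(p, 3), quadrant)
```

In this toy example unit B has a low value between two high neighbors, so it is labelled low-high and has a negative statistic, while unit E sits among low values and is labelled low-low.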
We present both univariate and bivariate local Moran’s I results at the standard Type I error rate of α = 0.05 based on the pseudo-p values derived by conditional permutation; however, we did not correct for multiple testing. Therefore, a locally significant result reported herein should be considered an unusual occurrence but not necessarily a statistically significant result. Hence, local Moran’s I analyses are used in an exploratory manner, the aim of which is to identify potential districts with outlying values of PP either at the district level or between the district and zone. In contrast, the univariate and bivariate global Moran’s I measures are statistically valid tests.
The ten zones represented one of many possible geographic zonations of the district postal geography for West Scotland. These zones are defined by the National Health Service (NHS) in Scotland. Because of the small sample size of VS cases, zone-level PP will have the most stable estimates, and these zones provide meaningful boundaries from an administrative perspective. Conversely, as with most administrative boundaries, the zone boundaries have no a priori relation to the occurrence of VS. As such, inferences based on the results of using the zones may suffer from analytical biases induced by the modifiable areal unit problem (MAUP). Specifically, the statistical significance, or lack thereof, of spatial dependency in VS PP calculated using global Moran’s I could simply be due to the location of the boundaries and the heterogeneous area distribution among the ten zones. These are collectively known as the scale and zoning effects of the MAUP. To assess the influence of the combined scale and zoning effect of the MAUP on the results of spatial autocorrelation in PP at the zone scale, 200 random aggregations of the districts were created.
A random aggregation was created by 1) randomly selecting ten postcode district polygon centroids; 2) creating a Voronoi tessellation using those ten centroids; 3) assigning the ten Voronoi polygon identifiers to the 312 district polygons; and 4) aggregating the district polygons by the Voronoi polygon identifiers to create ten new randomly aggregated zones while summing the population and number of VS cases in the random zones and calculating PP. For each random zonation, global Moran’s I was calculated and pseudo-p values were derived from 999 Monte-Carlo simulations. By comparing our observed pseudo-p value with the reference distribution created through this process of random zonation, we can better understand how the MAUP influences the likelihood that our observed spatial autocorrelation result and interpretation would have occurred due to the choice of using the NHS Health Boards geography rather than some other ten-unit zonation.
To inform our results and interpretation of PP at the zone level, we examined where district level PP deviated from the surrounding zone level PP. Accordingly, we used bivariate local Moran’s I to assess the degree of scale invariance between the PP at the district level and the PP at the zone level, following the methods of Nelson and Brewer (2017). Using a spatial join within ArcGIS, each district PP value was paired with its corresponding PP value from within the containing zone. This process resulted in a bivariate dataset at the district scale, whereby each district’s PP value also had the PP value for that district extracted from the zone level. The significance of bivariate local Moran’s I was based on 9999 conditional permutations of the zone VS PP values while holding the district values constant.
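The four-step random aggregation procedure described above can be approximated without GIS software, because assigning a district to the Voronoi cell that contains its centroid is the same as assigning it to the nearest seed centroid. The sketch below takes that shortcut, which is an assumption (the study may have assigned polygons to Voronoi cells differently). The synthetic districts, the per-100,000 scaling of PP, and the function name are also hypothetical.

```python
import math
import random

def random_zonation(centroids, population, cases, n_zones=10, rng=None):
    """Aggregate districts into n_zones random zones.

    Steps 1-2: pick n_zones district centroids as Voronoi seeds.
    Step 3: assign every district to its nearest seed (equivalent to
            locating its centroid in the seeds' Voronoi tessellation).
    Step 4: sum population and cases per zone and compute PP.
    """
    rng = rng or random.Random()
    seeds = rng.sample(sorted(centroids), n_zones)
    zones = {s: {"districts": [], "population": 0, "cases": 0} for s in seeds}
    for district, xy in centroids.items():
        nearest = min(seeds, key=lambda s: math.dist(xy, centroids[s]))
        zone = zones[nearest]
        zone["districts"].append(district)
        zone["population"] += population[district]
        zone["cases"] += cases[district]
    for zone in zones.values():
        # Assumed scaling: period prevalence per 100,000 population.
        zone["pp"] = 1e5 * zone["cases"] / zone["population"]
    return zones

# Synthetic stand-in for the 312 postcode districts (not study data).
rng = random.Random(1)
ids = [f"D{i:03d}" for i in range(312)]
centroids = {d: (rng.uniform(0, 200), rng.uniform(0, 300)) for d in ids}
population = {d: rng.randint(500, 20000) for d in ids}
cases = {d: rng.randint(0, 3) for d in ids}

# 200 random ten-zone aggregations, as in the MAUP assessment.
zonations = [random_zonation(centroids, population, cases, rng=rng) for _ in range(200)]
print(len(zonations), "zonations;", len(zonations[0]), "zones in the first")
```

Each zonation would then need its own zone adjacency structure before global Moran’s I could be computed; that step depends on the polygon geometry and is not covered here.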
In our case, bivariate Moran’s I calculated between two spatial scales allows for the identification of districts where PP is stable (scale invariant / stationary) or unstable (non-stationary) when aggregated from the district to the zone level. Consequently, identifying those districts that are unusually different from their surrounding analysis zone PP values indicates where zone level reporting and interpretations may be unduly influenced by the MAUP as a consequence of spatial aggregation. […]
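A minimal sketch of the district-zone bivariate local Moran’s I is given below, assuming standardized variables and a two-sided pseudo-p based on absolute values. It pairs each district with the PP of its containing zone through a simple lookup (standing in for the spatial join), then permutes the zone-derived values while holding each district value fixed. The data, names, and the two-sided convention are illustrative assumptions, not the validated R functions from the study.

```python
import random
import statistics

def standardize(values):
    """Return z-scores (population standard deviation) keyed like the input."""
    vals = list(values.values())
    mean, sd = statistics.fmean(vals), statistics.pstdev(vals)
    return {k: (v - mean) / sd for k, v in values.items()}

def bivariate_local_moran(x, y, weights, nperm=9999, seed=0):
    """I_i = zx_i * sum_j w_ij zy_j, with conditional permutation of y.

    x: district-level PP; y: zone-level PP assigned to each district.
    Returns {district: (I_i, pseudo_p)}.
    """
    rng = random.Random(seed)
    zx, zy = standardize(x), standardize(y)
    out = {}
    for i, row in weights.items():
        w_list = list(row.values())
        stat = zx[i] * sum(w * zy[j] for j, w in row.items())
        others = [zy[j] for j in zy if j != i]
        extreme = 0
        for _ in range(nperm):
            draw = rng.sample(others, len(w_list))
            sim = zx[i] * sum(w * d for w, d in zip(w_list, draw))
            if abs(sim) >= abs(stat):
                extreme += 1
        out[i] = (stat, (extreme + 1) / (nperm + 1))
    return out

# Hypothetical data: six districts in a chain, split between two zones.
district_pp = {"d1": 12.0, "d2": 2.0, "d3": 11.0, "d4": 3.0, "d5": 2.5, "d6": 3.5}
zone_of = {"d1": "north", "d2": "north", "d3": "north",
           "d4": "south", "d5": "south", "d6": "south"}
zone_pp = {"north": 8.0, "south": 3.0}
paired_zone_pp = {d: zone_pp[zone_of[d]] for d in district_pp}  # lookup in place of a spatial join

chain = {"d1": ["d2"], "d2": ["d1", "d3"], "d3": ["d2", "d4"],
         "d4": ["d3", "d5"], "d5": ["d4", "d6"], "d6": ["d5"]}
weights = {i: {j: 1.0 / len(js) for j in js} for i, js in chain.items()}

out = bivariate_local_moran(district_pp, paired_zone_pp, weights, nperm=999)
for d, (stat, p) in out.items():
    print(d, round(stat, 3), round(p, 3))
```

Here district d2 has a low PP but is surrounded by districts carrying a high zone-level PP, so its statistic is negative, which is the kind of scale instability the bivariate measure is meant to flag.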

Pipeline specifications

Software tools: spatstat, spdep
Applications: Miscellaneous, Conventional fluorescence microscopy
Organisms: Homo sapiens
Diseases: Blood Platelet Disorders; Neuroma, Acoustic