Computational protocol: Detecting cancer clusters in a regional population with local cluster tests and Bayesian smoothing methods: a simulation study

Protocol publication

[…] For this simulation study, lung cancer (ICD-10: C34) cases occurring in men and women aged 40 to 79 years were chosen as sample data. Spatial cancer risk surfaces were constructed by arbitrarily defining two artificial cluster areas at the level of census tracts. Within these cluster areas, two magnitudes of risk elevation were applied, such that the lung cancer risk was computationally set to either two-fold (RR1 = 2.0) or four-fold (RR2 = 4.0) the observed risk. The two risk areas were nested within larger communities. The northern cancer cluster (encompassing 6 of the 50 census tracts in that community) had more rural characteristics, that is, a larger area and lower population density. The second cancer cluster was generated in the south (encompassing 37 of the 99 census tracts composing the entire community) and had more urban characteristics, that is, a smaller area and units with higher population density (Figure ).
The expected numbers of cancer cases (E_i) per census tract were estimated using the age-standardized incidence rate for lung cancer obtained from the database of the epidemiological cancer registry of North Rhine-Westphalia []. The observed cases (O_i) were sampled as geocodes from the four constructed risk surfaces (urban and rural cluster, each with either RR1 = 2.0 or RR2 = 4.0) using an inhomogeneous Poisson point process (Figure ). For each cluster and RR magnitude, 1000 realisations of the process were generated using the function rpoispp from the R package spatstat. These realisations (O_i) were aggregated within census tracts and communities, respectively, and used for the subsequent local cluster tests and Bayesian smoothing methods (Figure ).
[...] Local cluster tests aim to provide information about the spatial location of suspected clusters.
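The simulation design described above can be illustrated with a minimal Python sketch. It is not the authors' R code: instead of generating geocoded points with rpoispp and aggregating them, it draws tract-level counts directly, relying on the fact that the counts of a Poisson point process in disjoint regions are independent Poisson variables. The assumption that the mean count in a tract equals the relative risk times the expected count, along with the function names and the toy expected counts, is hypothetical and only for illustration.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method.

    Adequate for the small means typical of tract-level case counts.
    """
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(expected, in_cluster, relative_risk, n_realisations, seed=0):
    """Simulate observed counts per tract for several realisations.

    Tracts flagged in `in_cluster` get their expected count multiplied by
    `relative_risk`; all other tracts keep the baseline risk (RR = 1).
    """
    rng = random.Random(seed)
    means = [e * (relative_risk if c else 1.0)
             for e, c in zip(expected, in_cluster)]
    return [[sample_poisson(m, rng) for m in means]
            for _ in range(n_realisations)]


# Hypothetical toy region: five tracts, two of them inside the cluster.
expected = [2.0, 3.5, 1.2, 4.0, 2.8]
in_cluster = [False, True, True, False, False]

realisations = simulate_counts(expected, in_cluster,
                               relative_risk=2.0, n_realisations=1000)

# Average simulated count per tract across realisations.
mean_counts = [sum(r[i] for r in realisations) / len(realisations)
               for i in range(len(expected))]
```

Averaged over many realisations, the counts in cluster tracts should approach the relative risk times the expected count, while the remaining tracts should stay close to their expected counts.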
The statistical concept behind the local cluster tests rests on the assumption that disease risk is constant across the study population (constant risk hypothesis or null hypothesis, implying identical risk for each individual). The standardized incidence ratio (SIR), defined as the ratio of observed to expected cases, is commonly used as a measure of relative disease risk. A constant risk implies that SIR = 1.0. An SIR value that is significantly larger than 1 indicates a disease cluster. Two types of local cluster tests were applied: the first is based on local measurements of spatial autocorrelation (local Moran’s I) and the second is based on variously defined windows that scan the study region for elevated disease risk (Kulldorff spatial scan statistic; Besag & Newell) []. We applied the methods provided in the R packages DCluster (version 0.2-2) [] and spdep []. For local Moran’s I, the Kulldorff spatial scan statistic, and the method of Besag & Newell [], all computations were performed with R version 2.13.1 [].
[...] In hierarchical Bayes methods, the parameters describing the distribution of theta_i are not estimated from data but are further specified through hyperpriors. The hyperpriors describe the distribution of the priors and are estimated by means of MCMC simulations. These are used to derive the posterior distribution of theta_i. The BYM model [] splits the variation of theta_i into two components: a correlated random term (u_i) that depends on values from the neighbourhood (correlated heterogeneity), and an uncorrelated random term (v_i) that describes unstructured variation in the study area (uncorrelated heterogeneity). The BYM model was implemented in the WinBUGS software using MCMC methods, in particular Gibbs sampling []. A burn-in of 20 000 iterations was performed and the posterior distribution was obtained using a sample of 10 000 iterations.
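For orientation, the BYM decomposition described above is commonly written as follows. The excerpt does not give the exact parameterisation the authors used in WinBUGS, so the log link, the intercept \alpha, the neighbour set \partial_i with n_i members, and the variance parameters \sigma_u^2 and \sigma_v^2 are assumptions taken from the standard textbook formulation rather than from the source.

```latex
\begin{aligned}
O_i \mid \theta_i &\sim \mathrm{Poisson}(E_i\,\theta_i), \\
\log \theta_i &= \alpha + u_i + v_i, \\
u_i \mid u_{j \neq i} &\sim \mathcal{N}\!\left(\frac{1}{n_i}\sum_{j \in \partial_i} u_j,\; \frac{\sigma_u^2}{n_i}\right), \\
v_i &\sim \mathcal{N}\!\left(0,\; \sigma_v^2\right).
\end{aligned}
```

In this form the correlated term of a tract is pulled towards the average of its neighbours, which produces the spatial smoothing, while the uncorrelated term absorbs tract-specific variation that has no spatial structure.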
The point estimates of theta from the four Bayesian models were used in the subsequent (cluster) evaluations. […]
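As a point of comparison for the smoothed estimates, the unsmoothed SIR and a one-sided Poisson tail probability under the constant risk hypothesis can be computed per tract. The Python sketch below is a simplified stand-in and not the DCluster or spdep implementation: it tests each tract in isolation, without scanning windows or neighbourhood weights, and the function names and the toy counts are hypothetical.

```python
import math


def sir(observed, expected):
    """Standardized incidence ratio: observed divided by expected cases."""
    return observed / expected


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected).

    Under the constant risk hypothesis (SIR = 1), a small value suggests
    more cases than chance alone would explain.
    """
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)   # P(X = 0)
    cdf = term
    for k in range(1, observed):
        term *= expected / k     # P(X = k) from P(X = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


# Hypothetical tract with 8 observed cases where 4.0 were expected.
tract_sir = sir(8, 4.0)
tract_p = poisson_upper_tail(8, 4.0)
```

Because every tract is tested separately here, applying the sketch to many tracts would require a correction for multiple testing, which the dedicated cluster tests named in the protocol handle in their own ways.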

Pipeline specifications

Software tools: DCluster, spdep, WinBUGS
Applications: Drug design, Miscellaneous
Organisms: Homo sapiens
Diseases: Blood Platelet Disorders, Lung Neoplasms, Neoplasms