Scientific Papers

Fine-scale maps of malaria incidence to inform risk stratification in Laos | Malaria Journal


Routine surveillance data

The primary dataset comprised of annual malaria case counts from 2017 to 2021 for 1243 health facilities in Laos. As 83 health facilities did not have spatial coordinates, the analysis was performed on 1160 geolocated health facility points (health facilities with longitude and latitude). Figure 1 displays the observed malaria cases at health facilities throughout Laos from 2017 to 2021. The number of reported malaria cases had considerably declined in Laos over the five years, from approximately 9300 cases in 2017 to less than 4000 cases in both 2020 and 2021, with a further reduction to 2305 cases in 2022 [8]. The data from 2019 to 2021 included a breakdown by P. falciparum and P. vivax across health facilities in Laos [see Figure 1 in Supplementary Information (SI)].

Fig. 1
figure 1

Observed malaria cases (P. falciparum and P. vivax combined) per year at health facilities across Laos. The observed malaria cases have considerably declined in Laos over the five years

Treatment-seeking propensity

The continuous treatment-seeking propensity was estimated using a distance-decay-based model [18, 19] to any health facility. Here distance was defined as travel time using a road and terrain-informed friction surface from the Malaria Atlas Project [20]. The decay model was compared against survey data from the Lao PDR Social Indicator Survey II 2017 [21] (see Figure 2 in SI). According to the survey, the national treatment-seeking rate was 58%, with province-specific rates ranging from 28 to 77.8%. The minimum and maximum values of the decay curve were then calibrated to better match the national average.

High resolution environmental covariates and model selection

The incidence model utilized a range of environmental covariates with a spatial resolution of 1 km \(\times\) 1 km, that are known to influence and impact malaria outcomes. The selection of covariates was informed by a comprehensive review that identified robust associations with malaria (see [22]). The covariates are described in detail in Table 1. A suite of static and dynamic covariates were used.

Table 1 List of covariates

To evaluate the performance of the modelling approach, an extensive cross-validation analysis was conducted. This involved testing different model combinations, including a full model comprising both covariates and spatial random effects, a model with covariates only, and a model with spatial random effects only.

Additionally, the performance of two different parasite models were assessed using the data from 2019 to 2021 which included a breakdown by P. falciparum and P. vivax. The first model simultaneously fitted P. falciparum and P. vivax, while the second model fitted these two parasites separately. By comparing the performance of these different models, the most effective approach for predicting and understanding malaria transmission dynamics was determined.

Overall, the modelling approach used here provided valuable insights into the environmental factors that contributed to malaria transmission and helped identify high-risk areas for malaria. Furthermore, the comparison of different model combinations and parasite models aided in improving the accuracy and reliability of predictions and ultimately informed more effective malaria control strategies.

Estimating catchment population

To calculate the appropriate incidence rates for the cases observed at health facilities, the specific catchment population for each health facility was estimated. Catchment population in this case was defined as the number of people likely to seek treatment at each facility. The population data were derived from High Resolution Settlement Layer (HRSL) representing population in Laos in 2018. District-level population estimates for \(2019-2021\) were used to adjust for HRSL population estimates using raking. In this analysis, a modified gravity style model was used to estimate these catchment populations based on travel time to the health facilities, using the latest global maps of travel time to healthcare facilities [20].

For the catchment model, the probability that an individual in the i-th pixel seeked treatment at the j-th health facility, \(\bar{p}(\text {pixel}_i \rightarrow \text {hf}_j)\), was modelled as the relative attractiveness of that facility, \(p(\text {pixel}_i\rightarrow \text {hf}_j)\), normalized in relation to the relative attractiveness of all accessed facilities, \(\sigma _i\), i.e.,

$$\begin{aligned} \bar{p}(\text {pixel}_i \rightarrow \text {hf}_j) = \frac{ p(\text {pixel}_i\rightarrow \text {hf}_j)}{\sum _{k \in \sigma _i} p(\text {pixel}_i\rightarrow \text {hf}_k)}. \end{aligned}$$

This relative attractiveness was modelled as the combination of an inherent decline proportional to the square of the travel time to that health facility, \(tt(\text {pixel}_i \rightarrow \text {hf}_j)\), and a catchment weighting, \(W_j\), intended to capture the variation in services available at each facility that would likely attract an individual seeking care, i.e.,

$$\begin{aligned} p(\text {pixel}_i \rightarrow \text {hf}_j) = \frac{W_j}{tt(\text {pixel}_i \rightarrow \text {hf}_j)^2}. \end{aligned}$$

The catchment weightings were treated as a model parameter to be inferred jointly with the clinical incidence rate during fitting, and a Bayesian prior of form \(\text {log }W_j \sim \text {Normal}(0, 0.25^2)\) was used to regularise their variation. Combining this attractiveness model with the treatment-seeking and population surfaces—denoted here as \(\text {TS}_i\) and \(\text {population}_i\), respectively—allowed the catchment population of the j-th health facility to be defined as

$$\begin{aligned} \text {CP}_{j} = \sum \limits _{i=1}^{N_{\text {pixel}}} \text {TS}_i \times \text {population}_i \times \bar{p}(\text {pixel}_i \rightarrow \text {hf}_j). \end{aligned}$$

In the i-th pixel, residents’ access to facilities was limited to those within a 3-hour (180 min) travel time. It was assumed that individuals generally would not bypass numerous nearby health facilities to visit a more distant one. This assumption was reflected in the model, particularly in areas with sparse health facility distribution, where the accessible set, \(\sigma _i\), included only the five nearest facilities. In contrast, in areas where several health facilities were within a 20-minute travel distance, \(\sigma _i\) consisted of these facilities plus the next four nearest ones. Catchments were modelled as static and did not vary on a monthly nor yearly basis, mainly owing to lack of information on dynamic activities of health facilities and treatment-seeking rates.

Incidence model

A Bayesian geostatistical framework was employed to represent the underlying incidence rate at each pixel, i, for each year, y, with the above catchment model connecting this latent risk surface, \(I_{i,y}\), to the available case data.

For each year, the matrix of covariates, \(\textbf{X}_y\), was formed by joining the collection of static covariates with the time-varying covariates matching that year. The linear predictor for the i-th pixel in year y was then formed by multiplication of the i-th row in \(\textrm{X}_y\) against an annual slope vector, \(\beta _y\). A fixed annual intercept term, \(c_y\) and an annual spatial offset term, \(f_{\textrm{offset},i,y}\), completed the formula for \(I_{i,y}\) as

$$\begin{aligned} \text {log }I_{i,y} = c_y + (\textbf{X}_y)^\prime _i\beta _y + f_{\textrm{offset},i,y}. \end{aligned}$$

The annual spatial offset term was drawn from a spatial Gaussian random field as

$$\begin{aligned} f_{\textrm{offset},\cdot ,y} \sim \textrm{GP}(\text {range}_{\text {offset}}, \text {scale}_{\text {offset}}) \end{aligned}$$

with the hyper-parameters assigned hyper-priors, \(\textrm{range}_{\text {offset}} \sim \textrm{Normal}(1,0.5^2)\) and \(\textrm{scale}_{\text {offset}} \sim \textrm{Normal}(-1,0.5^2)\). A modest shrinkage penalty was placed on the annual slope vector via the prior choice, \(\beta _y \sim \textrm{Normal}(0,1^2)\).

The expected cases at the j-th facility in year y, were computed by summation over all pixels of the product of the latent incidence surface with the catchment model; namely,

$$\begin{aligned} \mathrm {expected\ cases}_{j,y} = \sum _i I_{i,y} \times \text {TS}_i \times \text {population}_i \times \bar{p}(\text {pixel}_i \rightarrow \text {hf}_j). \end{aligned}$$

To allow for over-dispersion relative to the nominal Poisson sampling distribution for case data, the annual case totals observed at each facility were drawn from a Negative Binomial distribution parameterized as

$$\begin{aligned} \text {cases}_{i,y} \sim \text {NegBinom}\left( \text {mean} = \mathrm {expected\ cases}_{i,y}, \text {variance} = \text {mean}\times (1+\sigma ) \right) \end{aligned}$$

with \(\sigma\) assigned a hyper-prior, \(\log \sigma \sim \text {Normal}(-1, 1^2)\).

Model fitting was performed in the Template Model Builder (TMB) and Integrated Nested Laplace Approximation (INLA) packages [32, 33] for R using a Laplace approximation over the random field components and the (logarithm of) catchment attractiveness weights, with a Multivariate Normal approximation in the remaining hyper-parameters centred on the empirical Bayes estimator. Model uncertainty was derived from 300 samples drawn from a Laplace approximation of the posterior and quantified using the standard deviation of the posterior distribution, as well as threshold exceedance and non-exceedance probabilities.

Data for the years 2019 to 2021 provided information on species breakdown to P. falciparum and P. vivax cases for all ages. The methodology described above was applied independently to each species, generating incidence maps tailored to individual species and averaged by year. To explore overlapping risk profiles, these species-specific incidence maps were superimposed, enabling a comprehensive assessment of risk both for individual species and their collective impact. The model validation outcomes, tables of goodness-of-fit measures, and diagnostic figures were included in the Supplementary material. These resources were intended to aid readers in evaluating the effectiveness of the geospatial model.



Source link