A Bayesian framework for distributed lag non-linear models: the {bdlnm} R package

Pau
Satorra

Biostatistics Support and Research Unit, IGTP

Marcos Quijal-Zamorano

Institute of Social and Preventive Medicine (ISPM), University of Bern

Joan Ballester

Institut de Salut Global de Barcelona (ISGlobal)

Miguel Ángel Martínez-Beneito

Departament d’Estadística i Investigació Operativa, Universitat de València

Marc Marí-Dell’Olmo

Agència de Salut Pública de Barcelona (ASPB)

May 26, 2026

Background

Motivation

  • Climate change is intensifying exposure to extreme environmental conditions: heatwaves, air pollution, and cold spells (among others) are becoming more frequent and severe

  • Environmental exposures rank among the strongest drivers of population health, often exerting a larger influence than genetic variation (Argentieri et al. 2025)

  • Understanding how environmental exposures effects healths is therefore essential for public health research

Distributed Lag Non-Linear Models (DLNMs)

  • The relationship between environment and health is complex: effects are typically non-linear and delayed, making them hard to capture with standard regression approaches

  • Environmental exposures rarely act instantaneously, their health impact unfolds over time:

    • A heatwave today may elevate mortality risk some days later

    • Air pollution accumulates biological damage across days or weeks

    • Cold temperatures can trigger cardiovascular events several days after exposure

Distributed Lag Non-Linear Models (DLNMs) (A. Gasparrini, Armstrong, and Kenward 2010) are the standard framework for capturing these dynamics. They simultaneously model two dimensions of risk:

  1. How risk changes across exposure levels (exposure-response)
  2. How risk evolves over time after exposure (lag-response)

DLNMs can be applied to any setting involving two time series and a delayed effect: environmental epidemiology, econometrics, pharmacology, and beyond

DLNMs

  1. Exposure-response: non-linear association between exposure value and health risk


  1. Lag-response: how the effect of a single exposure event evolves across subsequent lags

Usually, both dimensions are modelled simultaneously using splines (natural cubic or B-splines), producing a smooth bivariate surface over exposure and lag

DLNMs: Model formulation

For time-series count data (A. Gasparrini, Armstrong, and Kenward 2010):

\[Y_t \sim \text{Poisson}(\mu_t)\] \[\log(\mu_t) = \alpha + \underbrace{cb(x_t, \ldots, x_{t-L}) \cdot \beta}_{\text{cross-basis}} + \sum_{k} \gamma_k u_{kt}\]

Term Description
\(cb(\cdot)\) Cross-basis: tensor product of exposure and lag spline bases
\(\beta\) Coefficients of the cross-basis
\(u_{kt}\) (Time-varying) confounders (e.g. seasonality, trend)
\(\gamma_k\) Coefficients of the confounders
\(\alpha\) Intercept

The cross-basis encodes the full bivariate exposure-lag-response surface in a single model term

Cross-basis construction

The cross-basis combines the exposure-response basis and the lag-response basis through a tensor product:


Limitations of classical DLNMs

  • Datasets are becoming larger and more complex, with studies increasingly featuring granular spatial data across many locations

  • Classical frequentist DLNMs rely on a two-stage approach:

    1. Independent DLNMs are fitted at each location separately (first stage)
    2. Location-specific estimates are pooled via multivariable meta-analysis (second stage)

Key limitations

  • Ignores spatial structure: borrows no strength across neighbouring locations, discarding information about spatial variation in risk

  • Collapses the lag dimension: first-stage models typically reduce to a single lag or cumulative effect, discarding the full lag-response surface

  • Unstable in sparse data: independent models fitted in small areas with few events yield unreliable or unidentifiable estimates

Spatial Bayesian DLNMs (SB-DLNMs)

A Bayesian extension, Spatial Bayesian Distributed Lag Non-Linear Models (SB-DLNMs) (Quijal-Zamorano et al. 2024), was proposed to overcome these limitations, allowing reliable estimation of small-area lagged non-linear associations

\[Y_{it} \sim \text{Poisson}(\mu_{it}) \qquad \log(\mu_{it}) = \alpha_i + \underbrace{cb(x_{it}, \ldots, x_{it-L}) \cdot \beta_i}_{\text{spatial cross-basis}} + \sum_{k} \gamma_k u_{kit}\]

\[\beta_i = \mu_\beta + \phi_i + \epsilon_i, \qquad \phi_i \mid \phi_{-i} \sim \mathcal{N}\!\left(\bar{\phi}_{\delta_i},\, \frac{\sigma^2_\phi}{|\delta_i|}\right)\]

Term Description
\(\beta_i\) Location-specific cross-basis coefficients
\(\mu_\beta\) Overall pooled exposure-lag-response association
\(\phi_i\) Spatially structured random effect (ICAR prior)
\(\epsilon_i \sim \mathcal{N}(0, \sigma^2_\epsilon)\) Unstructured random effect
\(\delta_i\) Set of spatial neighbours of area \(i\)

The {bdlnm} R package

{bdlnm}


Published in CRAN since March 2026:

        

Overview of the package


Purpose Function Based on
Model fitting bdlnm() New
Prediction bcrosspred() dlnm::crosspred()
Attribution attributable() attrdl() (Antonio Gasparrini and Leone 2014)
Optimal exposure optimal_exposure() New
Visualisation plot.bcrosspred() dlnm::plot.crosspred()
plot.optimal_exposure() New

Installation

You can install the latest stable version in CRAN:

install.packages("bdlnm")
library(bdlnm)

At least the stable version of INLA 23.4.24 (or newest) must be installed beforehand. You can install the newest stable INLA version by:

install.packages("INLA", repos=c(getOption("repos"), INLA="https://inla.r-inla-download.org/R/stable"), 
dep=TRUE)

Hands-on {bdlnm}: An application

Temperature-mortality study

  • We assess the immediate and delayed non-linear effect of daily mean temperature on mortality (age 75+) in London, 2000–2011

  • We will use the {bdlnm} built-in london dataset: one observation per day, containing mortality counts, mean temperature, and calendar variables:

date year dow tmean mort_00_74 mort_75plus mort
2000-01-01 2000 Sat 7.642016 101 226 327
2000-01-02 2000 Sun 8.351041 100 191 291
2000-01-03 2000 Mon 8.188446 107 205 312
2000-01-04 2000 Tue 6.320066 97 191 288
2000-01-05 2000 Wed 6.550981 92 213 305
2000-01-06 2000 Thu 8.728774 82 213 295

(Vicedo-Cabrera, Sera, and Gasparrini 2019)

Temperature and mortality time series

The two time series we aim to model:

Model specification

\[Y_t \sim \text{Poisson}(\mu_t), \qquad \log(\mu_t) = \alpha + cb(x_t, \ldots, x_{t-L}) \cdot \beta + \sum_{k} \gamma_k u_{kt}\]

Following standard choices in temperature-mortality literature (A. Gasparrini, Armstrong, and Kenward 2010):

Cross-basis (\(cb\))

  • Exposure-response: natural spline, knots at the 10th, 75th, and 90th percentiles of temperature

  • Lag-response: natural spline, 3 knots equally spaced on the log scale, max lag of 21 days

Confounders (\(u_{kt}\))

  • Seasonality and long-term trend: natural spline with 8 df/year

  • Day of the week: factor variable

Building the cross-basis

Using the dlnm package to construct the cross-basis matrix cb:

# Exposure-response and lag-response spline parameters
dlnm_var <- list(
  var_prc = c(10, 75, 90),
  var_fun = "ns",
  lag_fun = "ns",
  max_lag = 21,
  lagnk = 3
)

# Build cross-basis 
argvar <- list(
  fun = dlnm_var$var_fun,
  knots = quantile(london$tmean, dlnm_var$var_prc / 100, na.rm = TRUE),
  Bound = range(london$tmean, na.rm = TRUE)
)
arglag <- list(
  fun = dlnm_var$lag_fun,
  knots = logknots(dlnm_var$max_lag, nk = dlnm_var$lagnk)
)

cb <- crossbasis(london$tmean, lag = dlnm_var$max_lag, argvar, arglag)

Building the cross-basis

Seasonality term and temperature prediction grid:

seas <- ns(london$date, df = round(8 * length(london$date) / 365.25))

Finally, we have to define the temperature grid in which predictions will be made:

temp <- round(seq(min(london$tmean), max(london$tmean), by = 0.1), 1)

Fitting the Bayesian DLNM

mod <- bdlnm(
  mort_75plus ~ cb + factor(dow) + seas,
  data = london,
  family = "poisson",
  sample.arg = list(n = 1000, seed = 5243)
)

bdlnm() handles two things internally:

  1. Fits the model using R-INLA
  2. Returns posterior samples and summaries for all parameters
List of 4
 $ model               :List of 49
  ..- attr(*, "class")= chr "inla"
 $ basis               :List of 1
 $ coefficients        : num [1:123, 1:1000] 5.17713 -0.13926 -0.00779 -0.05735 0.0489 ...
  ..- attr(*, "dimnames")=List of 2
 $ coefficients.summary: num [1:123, 1:6] 5.20498 -0.13495 -0.00609 -0.04971 0.04957 ...
  ..- attr(*, "dimnames")=List of 2
 - attr(*, "n_sim")= num 1000
 - attr(*, "class")= chr "bdlnm"

Fitting the Bayesian DLNM

Now we have a matrix with the estimated posterior samples of the model coefficients, together with their summary across samples:

Posterior samples mod$coefficients

                 sample1      sample2      sample3      sample4      sample5
(Intercept)  5.177133367  5.078826441  5.087690366  5.240474799  5.259952868
cbv1.l1     -0.139255261 -0.154092904 -0.112435527 -0.110086149 -0.153346245
cbv1.l2     -0.007789014  0.008053975  0.001242166 -0.003960913  0.006230646
cbv1.l3     -0.057353780 -0.056028646 -0.045410399 -0.044022672 -0.055960322
cbv1.l4      0.048901707  0.057240589  0.025426990  0.028533243  0.049957529

Posterior summary mod$coefficients.summary

                    mean         sd  0.025quant     0.5quant  0.975quant
(Intercept)  5.204978726 0.10244169  5.00329692  5.205434414  5.39782425
cbv1.l1     -0.134951888 0.02148864 -0.17689705 -0.134050285 -0.09601900
cbv1.l2     -0.006088059 0.01031606 -0.02557074 -0.006097467  0.01455306
cbv1.l3     -0.049713404 0.01141099 -0.07279945 -0.049397667 -0.02790430
cbv1.l4      0.049565146 0.01806032  0.01484090  0.050350330  0.08412486

* Visualizing only the first 5 coefficients and the first 5 posterior samples

Minimum mortality temperature (MMT)

The minimum mortality temperature (MMT) is the temperature at which the overall cumulative mortality risk is minimised. It is used as the reference value to center relative risk estimates, so that all effects are interpreted relative to the temperature associated with lowest mortality

mmt <- optimal_exposure(mod, exp_at = temp)
List of 2
 $ est    : Named num [1:1000] 19 19 18.8 18.7 19.1 19.1 18.7 18.9 18.8 19.1 ...
  ..- attr(*, "names")= chr [1:1000] "sample1" "sample2" "sample3" "sample4" ...
 $ summary: Named num [1:6] 18.899 0.144 18.6 18.9 19.2 ...
  ..- attr(*, "names")= chr [1:6] "mean" "sd" "0.025quant" "0.5quant" ...
 - attr(*, "exp_at")= num [1:326] -3.4 -3.3 -3.2 -3.1 -3 -2.9 -2.8 -2.7 -2.6 -2.5 ...
 - attr(*, "lag_at")= num [1:22] 0 1 2 3 4 5 6 7 8 9 ...
 - attr(*, "which")= chr "min"
 - attr(*, "class")= chr "optimal_exposure"

MMT posterior distribution

The posterior is concentrated around 18.9°C and unimodal: the median is a stable centering value for the relative risk estimates

Predicting exposure-lag-response effects

From the fitted model, bcrosspred() computes the full bivariate exposure-lag-response association. Relative risks must be centered at the MMT:

cen <- mmt$summary[["0.5quant"]]
cpred <- bcrosspred(mod, exp_at = temp, cen = cen)
List of 9
 $ exp_at              : num [1:326] -3.4 -3.3 -3.2 -3.1 -3 -2.9 -2.8 -2.7 -2.6 -2.5 ...
 $ lag_at              : num [1:22] 0 1 2 3 4 5 6 7 8 9 ...
 $ cen                 : num 18.9
 $ coefficients        : num [1:20, 1:1000] -0.13926 -0.00779 -0.05735 0.0489 -0.04849 ...
  ..- attr(*, "dimnames")=List of 2
 $ matRRfit            : num [1:326, 1:22, 1:1000] 0.849 0.85 0.85 0.851 0.852 ...
  ..- attr(*, "dimnames")=List of 3
 $ allRRfit            : num [1:326, 1:1000] 2.04 2.03 2.01 2 1.99 ...
  ..- attr(*, "dimnames")=List of 2
 $ coefficients.summary: num [1:20, 1:6] -0.13495 -0.00609 -0.04971 0.04957 -0.04897 ...
  ..- attr(*, "dimnames")=List of 2
 $ matRRfit.summary    : num [1:326, 1:22, 1:6] 0.853 0.853 0.854 0.855 0.855 ...
  ..- attr(*, "dimnames")=List of 3
 $ allRRfit.summary    : num [1:326, 1:6] 1.84 1.83 1.83 1.82 1.81 ...
  ..- attr(*, "dimnames")=List of 2

Predicting exposure-lag-response effects

Now we have an array of estimated posterior RRs for each temperature–lag combination, together with their summary across samples:

Posterior samples cpred$matRRfit

, , sample1

          lag0     lag1     lag2     lag3
-3.4 0.8489690 1.078462 1.144318 1.100586
-3.3 0.8496295 1.077663 1.143098 1.099761
-3.2 0.8502905 1.076864 1.141878 1.098937
-3.1 0.8509518 1.076066 1.140661 1.098113

*Visualizing only first 4 temperatures and lags and 1 sample

Posterior summary cpred$matRRfit.summary

, , mean

          lag0     lag1     lag2     lag3
-3.4 0.8526720 1.071552 1.134935 1.094421
-3.3 0.8533691 1.070942 1.133927 1.093737
-3.2 0.8540667 1.070333 1.132921 1.093054
-3.1 0.8547648 1.069724 1.131916 1.092371

*Visualizing only first 4 temperatures and lags and 1 summary statistic

Exposure-lag-response surface

The full bivariate association can be represented as a 3-D surface over temperature and lag, giving the RR at each specific temperature-lag combination:

Overall cumulative effect

Summing all lag contributions gives the overall cumulative exposure-response curve which is the most commonly reported summary in temperature-mortality studies:

\[RR_{x,\,\text{overall}} = \exp\!\Bigl(\sum_{l=0}^{L} \beta_{x,l}\Bigr)\]

This collapses the 3-D surface into a single curve, showing how the total accumulated risk summing up all the lags varies with temperature

Overall cumulative effect

plot.bcrosspred() built-in visualisations

par(mfrow = c(2, 2), mar = c(4, 4, 2, 1))
plot(cpred, ptype = "3d", main = "Exposure-lag-response surface")
plot(cpred, ptype = "overall", log = "y", main = "Overall cumulative effect")
plot(cpred, ptype = "slices", lag_at = 0, log = "y", main = "Exposure-response (lag 0)")
plot(cpred, ptype = "slices", exp_at = round(quantile(london$tmean, 0.99), 1), log = "y", 
main = "Lag-response (99th pct temperature)")

Attributable burden

  • The attributable() function estimates the mortality burden attributable to non-optimal temperature exposures:

    • Attributable fraction (AF): proportion of all mortality events attributable to temperature
    • Attributable number (AN): absolute number of deaths attributable to temperature

The idea is to compare observed mortality under real temperature conditions against a counterfactual scenario where the population was always exposed to an optimal reference temperature (e.g. the MMT) over the same period

  • Therefore, if we have estimated \(\beta_x\) associated with a certain exposure (centered to an optimal reference one) we can calculate the attribution to non-optimal temperature exposure as:

\[\text{AF}_{x,t} = 1 - \exp(-\beta_x) \qquad \text{AN}_{x,t} = n_t \cdot \text{AF}_{x,t}\]

  • In a DLNM setting, the overall cumulative effect \(\beta_{x,\,\text{overall}}\) can be used as \(\beta_x\) (forward)

Computing attributable burden

attributable() returns daily and aggregated AF and AN using either the forward or backward perspective:

attr <- attributable(mod, london, name_date = "date", name_exposure = "tmean", name_cases = "mort_75plus", cen = cen, dir = "forw")
List of 8
 $ af             : num [1:4383, 1:1000] 0.234 0.226 0.228 0.25 0.247 ...
  ..- attr(*, "dimnames")=List of 2
 $ an             : num [1:4383, 1:1000] 39.6 37.1 36.7 39.3 38.1 ...
  ..- attr(*, "dimnames")=List of 2
 $ aftotal        : Named num [1:1000] 0.185 0.207 0.168 0.145 0.16 ...
  ..- attr(*, "names")= chr [1:1000] "sample1" "sample2" "sample3" "sample4" ...
 $ antotal        : Named num [1:1000] 73212 82089 66584 57592 63258 ...
  ..- attr(*, "names")= chr [1:1000] "sample1" "sample2" "sample3" "sample4" ...
 $ af.summary     : num [1:4383, 1:6] 0.229 0.22 0.222 0.246 0.243 ...
  ..- attr(*, "dimnames")=List of 2
 $ an.summary     : num [1:4383, 1:6] 38.8 36.1 35.8 38.7 37.5 ...
  ..- attr(*, "dimnames")=List of 2
 $ aftotal.summary: Named num [1:6] 0.174 0.018 0.139 0.175 0.207 ...
  ..- attr(*, "names")= chr [1:6] "mean" "sd" "0.025quant" "0.5quant" ...
 $ antotal.summary: Named num [1:6] 68858 7132 55071 69178 81995 ...
  ..- attr(*, "names")= chr [1:6] "mean" "sd" "0.025quant" "0.5quant" ...

Daily attributable fraction

Time series of daily attributable fractions (AFs), together with the temperature exposition in that given day:

Total attributable burden

Total mortality burden attributable to non-optimal temperatures in the London 75+ population across 2000–2011:

mean sd 0.025quant 0.5quant 0.975quant mode
Attributable fraction 0.174 0.018 0.139 0.175 0.207 0.176
Attributable number 68857.597 7131.526 55071.066 69178.391 81995.458 69842.155

Conclusions

Conclusions

  • The {bdlnm} package provides a complete and accessible implementation of Bayesian Distributed Lag Non-Linear Models in R

  • Combining the flexibility of DLNMs with full Bayesian inference enables researchers to fit more realistic and complex models while fully propagating uncertainty

  • The posterior distribution of all estimates allows direct uncertainty quantification of derived measures (e.g., MMT, attributable fractions, attributable numbers) without additional approximations

  • It constitutes a valuable tool for studying the health impacts of climate change and other environmental risks in increasingly data-rich settings

  • The framework extends beyond environmental epidemiology to any setting with time-varying exposures and delayed effects making it a general tool for time series analysis

Present and future work

Currently in development:

  • mcmc_to_bdlnm(): convert any object with posterior samples of DLNM coefficients from an MCMC algorithm (e.g. NIMBLE, Stan) into a bdlnm object, enabling broader model compatibility beyond INLA

Coming soon:

  • Multi-location analysis: pool independent exposure-lag-response curves across cities or regions within a single model

  • Spatial B-DLNMs (SB-DLNM): explicitly model spatial heterogeneity in exposure-lag-response associations across regions

  • Parallelization: faster inference via {futurize} for large datasets

Further resources



References

Argentieri, M. Austin, Najaf Amin, Alejo J. Nevado-Holgado, William Sproviero, Jennifer A. Collister, Sarai M. Keestra, Midas M. Kuilman, et al. 2025. “Integrating the Environmental and Genetic Architectures of Aging and Mortality.” Nature Medicine 31: 1016–25. https://doi.org/10.1038/s41591-024-03483-9.
Gasparrini, A., B. Armstrong, and M. G. Kenward. 2010. “Distributed Lag Non-Linear Models.” Statistics in Medicine 29 (21): 2224–34. https://doi.org/https://doi.org/10.1002/sim.3940.
Gasparrini, Antonio, Ben Armstrong, and Fabian Scheipl. 2011. “Distributed Lag Linear and Non-Linear Models in R: The Package dlnm.” Journal of Statistical Software 43 (8): 1–20. https://doi.org/10.18637/jss.v043.i08.
Gasparrini, Antonio, and Michela Leone. 2014. “Attributable Risk from Distributed Lag Models.” BMC Medical Research Methodology 14 (April): 55. https://doi.org/10.1186/1471-2288-14-55.
Quijal-Zamorano, Marcos, Miguel A Martinez-Beneito, Joan Ballester, and Marc Marí-Dell’Olmo. 2024. “Spatial Bayesian Distributed Lag Non-Linear Models (SB-DLNM) for Small-Area Exposure-Lag-Response Epidemiological Modelling.” International Journal of Epidemiology 53 (3): dyae061. https://doi.org/10.1093/ije/dyae061.
Rue, Håvard, Sara Martino, and Nicholas Chopin. 2009. “Approximate Bayesian Inference for Latent Gaussian Models Using Integrated Nested Laplace Approximations (with Discussion).” Journal of the Royal Statistical Society B 71: 319–92.
Vicedo-Cabrera, Ana M., Francesco Sera, and Antonio Gasparrini. 2019. “A Hands-on Tutorial on a Modeling Framework for Projections of Climate Change Impacts on Health.” Epidemiology 30 (3): 321–29. https://doi.org/10.1097/ede.0000000000000982.