A Bayesian framework for distributed lag non-linear models: the {bdlnm} R package

Pau
Satorra

Biostatistics Support and Research Unit, IGTP

Marcos Quijal-Zamorano

Institute of Social and Preventive Medicine (ISPM), University of Bern

Joan Ballester

Institut de Salut Global de Barcelona (ISGlobal)

Miguel Ángel Martínez-Beneito

Departament d’Estadística i Investigació Operativa, Universitat de València

Marc Marí-Dell’Olmo

Agència de Salut Pública de Barcelona (ASPB)

May 26, 2026

Background

Motivation

Climate change is intensifying exposure to extreme environmental conditions: heatwaves, air pollution, and cold spells (among others) are becoming more frequent and severe
Environmental exposures rank among the strongest drivers of population health, often exerting a larger influence than genetic variation (Argentieri et al. 2025)
Understanding how environmental exposures effects healths is therefore essential for public health research

Distributed Lag Non-Linear Models (DLNMs)

The relationship between environment and health is complex: effects are typically non-linear and delayed, making them hard to capture with standard regression approaches
Environmental exposures rarely act instantaneously, their health impact unfolds over time:
- A heatwave today may elevate mortality risk some days later
- Air pollution accumulates biological damage across days or weeks
- Cold temperatures can trigger cardiovascular events several days after exposure

Distributed Lag Non-Linear Models (DLNMs) (A. Gasparrini, Armstrong, and Kenward 2010) are the standard framework for capturing these dynamics. They simultaneously model two dimensions of risk:

How risk changes across exposure levels (exposure-response)
How risk evolves over time after exposure (lag-response)

DLNMs can be applied to any setting involving two time series and a delayed effect: environmental epidemiology, econometrics, pharmacology, and beyond

DLNMs

Exposure-response: non-linear association between exposure value and health risk

Lag-response: how the effect of a single exposure event evolves across subsequent lags

Usually, both dimensions are modelled simultaneously using splines (natural cubic or B-splines), producing a smooth bivariate surface over exposure and lag

DLNMs: Model formulation

For time-series count data (A. Gasparrini, Armstrong, and Kenward 2010):

\[Y_t \sim \text{Poisson}(\mu_t)\] \[\log(\mu_t) = \alpha + \underbrace{cb(x_t, \ldots, x_{t-L}) \cdot \beta}_{\text{cross-basis}} + \sum_{k} \gamma_k u_{kt}\]

Term	Description
$cb(\cdot)$	Cross-basis: tensor product of exposure and lag spline bases
$\beta$	Coefficients of the cross-basis
$u_{kt}$	(Time-varying) confounders (e.g. seasonality, trend)
$\gamma_k$	Coefficients of the confounders
$\alpha$	Intercept

The cross-basis encodes the full bivariate exposure-lag-response surface in a single model term

Cross-basis construction

The cross-basis combines the exposure-response basis and the lag-response basis through a tensor product:

Limitations of classical DLNMs

Datasets are becoming larger and more complex, with studies increasingly featuring granular spatial data across many locations
Classical frequentist DLNMs rely on a two-stage approach:
1. Independent DLNMs are fitted at each location separately (first stage)
2. Location-specific estimates are pooled via multivariable meta-analysis (second stage)

Key limitations

Ignores spatial structure: borrows no strength across neighbouring locations, discarding information about spatial variation in risk
Collapses the lag dimension: first-stage models typically reduce to a single lag or cumulative effect, discarding the full lag-response surface
Unstable in sparse data: independent models fitted in small areas with few events yield unreliable or unidentifiable estimates

Spatial Bayesian DLNMs (SB-DLNMs)

A Bayesian extension, Spatial Bayesian Distributed Lag Non-Linear Models (SB-DLNMs) (Quijal-Zamorano et al. 2024), was proposed to overcome these limitations, allowing reliable estimation of small-area lagged non-linear associations

\[Y_{it} \sim \text{Poisson}(\mu_{it}) \qquad \log(\mu_{it}) = \alpha_i + \underbrace{cb(x_{it}, \ldots, x_{it-L}) \cdot \beta_i}_{\text{spatial cross-basis}} + \sum_{k} \gamma_k u_{kit}\]

\[\beta_i = \mu_\beta + \phi_i + \epsilon_i, \qquad \phi_i \mid \phi_{-i} \sim \mathcal{N}\!\left(\bar{\phi}_{\delta_i},\, \frac{\sigma^2_\phi}{|\delta_i|}\right)\]

Term	Description
$\beta_i$	Location-specific cross-basis coefficients
$\mu_\beta$	Overall pooled exposure-lag-response association
$\phi_i$	Spatially structured random effect (ICAR prior)
$\epsilon_i \sim \mathcal{N}(0, \sigma^2_\epsilon)$	Unstructured random effect
$\delta_i$	Set of spatial neighbours of area $i$

The {bdlnm} R package

{bdlnm}

The goal of the {bdlnm} R package is to provide a software for extending frequentist DLNMs to all kinds of Bayesian DLNMs (B-DLNMs)
It’s based on the package {dlnm}, which is the reference package used for DLNMs in a frequentist perspective (Antonio Gasparrini, Armstrong, and Scheipl 2011)
It uses R-INLA (Rue, Martino, and Chopin 2009) for Bayesian inference

Published in CRAN since March 2026:

Overview of the package

Purpose	Function	Based on
Model fitting	`bdlnm()`	New
Prediction	`bcrosspred()`	`dlnm::crosspred()`
Attribution	`attributable()`	`attrdl()` (Antonio Gasparrini and Leone 2014)
Optimal exposure	`optimal_exposure()`	New
Visualisation	`plot.bcrosspred()`	`dlnm::plot.crosspred()`
	`plot.optimal_exposure()`	New

Installation

You can install the latest stable version in CRAN:

install.packages("bdlnm")
library(bdlnm)

At least the stable version of INLA 23.4.24 (or newest) must be installed beforehand. You can install the newest stable INLA version by:

install.packages("INLA", repos=c(getOption("repos"), INLA="https://inla.r-inla-download.org/R/stable"), 
dep=TRUE)

Hands-on {bdlnm}: An application

Temperature-mortality study

We assess the immediate and delayed non-linear effect of daily mean temperature on mortality (age 75+) in London, 2000–2011
We will use the {bdlnm} built-in london dataset: one observation per day, containing mortality counts, mean temperature, and calendar variables:

date	year	dow	tmean	mort_00_74	mort_75plus	mort
2000-01-01	2000	Sat	7.642016	101	226	327
2000-01-02	2000	Sun	8.351041	100	191	291
2000-01-03	2000	Mon	8.188446	107	205	312
2000-01-04	2000	Tue	6.320066	97	191	288
2000-01-05	2000	Wed	6.550981	92	213	305
2000-01-06	2000	Thu	8.728774	82	213	295

(Vicedo-Cabrera, Sera, and Gasparrini 2019)

Temperature and mortality time series

The two time series we aim to model:

Model specification

\[Y_t \sim \text{Poisson}(\mu_t), \qquad \log(\mu_t) = \alpha + cb(x_t, \ldots, x_{t-L}) \cdot \beta + \sum_{k} \gamma_k u_{kt}\]

Following standard choices in temperature-mortality literature (A. Gasparrini, Armstrong, and Kenward 2010):

Cross-basis ($cb$)

Exposure-response: natural spline, knots at the 10th, 75th, and 90th percentiles of temperature
Lag-response: natural spline, 3 knots equally spaced on the log scale, max lag of 21 days

Confounders ($u_{kt}$)

Seasonality and long-term trend: natural spline with 8 df/year
Day of the week: factor variable

Building the cross-basis

Using the dlnm package to construct the cross-basis matrix cb:

# Exposure-response and lag-response spline parameters
dlnm_var <- list(
  var_prc = c(10, 75, 90),
  var_fun = "ns",
  lag_fun = "ns",
  max_lag = 21,
  lagnk = 3
)

# Build cross-basis 
argvar <- list(
  fun = dlnm_var$var_fun,
  knots = quantile(london$tmean, dlnm_var$var_prc / 100, na.rm = TRUE),
  Bound = range(london$tmean, na.rm = TRUE)
)
arglag <- list(
  fun = dlnm_var$lag_fun,
  knots = logknots(dlnm_var$max_lag, nk = dlnm_var$lagnk)
)

cb <- crossbasis(london$tmean, lag = dlnm_var$max_lag, argvar, arglag)

Building the cross-basis

Seasonality term and temperature prediction grid:

seas <- ns(london$date, df = round(8 * length(london$date) / 365.25))

Finally, we have to define the temperature grid in which predictions will be made:

temp <- round(seq(min(london$tmean), max(london$tmean), by = 0.1), 1)

Fitting the Bayesian DLNM

mod <- bdlnm(
  mort_75plus ~ cb + factor(dow) + seas,
  data = london,
  family = "poisson",
  sample.arg = list(n = 1000, seed = 5243)
)

bdlnm() handles two things internally:

Fits the model using R-INLA
Returns posterior samples and summaries for all parameters

List of 4
 $ model               :List of 49
  ..- attr(*, "class")= chr "inla"
 $ basis               :List of 1
 $ coefficients        : num [1:123, 1:1000] 5.17713 -0.13926 -0.00779 -0.05735 0.0489 ...
  ..- attr(*, "dimnames")=List of 2
 $ coefficients.summary: num [1:123, 1:6] 5.20498 -0.13495 -0.00609 -0.04971 0.04957 ...
  ..- attr(*, "dimnames")=List of 2
 - attr(*, "n_sim")= num 1000
 - attr(*, "class")= chr "bdlnm"

Fitting the Bayesian DLNM

Now we have a matrix with the estimated posterior samples of the model coefficients, together with their summary across samples:

Posterior samples mod$coefficients

                 sample1      sample2      sample3      sample4      sample5
(Intercept)  5.177133367  5.078826441  5.087690366  5.240474799  5.259952868
cbv1.l1     -0.139255261 -0.154092904 -0.112435527 -0.110086149 -0.153346245
cbv1.l2     -0.007789014  0.008053975  0.001242166 -0.003960913  0.006230646
cbv1.l3     -0.057353780 -0.056028646 -0.045410399 -0.044022672 -0.055960322
cbv1.l4      0.048901707  0.057240589  0.025426990  0.028533243  0.049957529

Posterior summary mod$coefficients.summary

                    mean         sd  0.025quant     0.5quant  0.975quant
(Intercept)  5.204978726 0.10244169  5.00329692  5.205434414  5.39782425
cbv1.l1     -0.134951888 0.02148864 -0.17689705 -0.134050285 -0.09601900
cbv1.l2     -0.006088059 0.01031606 -0.02557074 -0.006097467  0.01455306
cbv1.l3     -0.049713404 0.01141099 -0.07279945 -0.049397667 -0.02790430
cbv1.l4      0.049565146 0.01806032  0.01484090  0.050350330  0.08412486

* Visualizing only the first 5 coefficients and the first 5 posterior samples

Minimum mortality temperature (MMT)

The minimum mortality temperature (MMT) is the temperature at which the overall cumulative mortality risk is minimised. It is used as the reference value to center relative risk estimates, so that all effects are interpreted relative to the temperature associated with lowest mortality

mmt <- optimal_exposure(mod, exp_at = temp)

List of 2
 $ est    : Named num [1:1000] 19 19 18.8 18.7 19.1 19.1 18.7 18.9 18.8 19.1 ...
  ..- attr(*, "names")= chr [1:1000] "sample1" "sample2" "sample3" "sample4" ...
 $ summary: Named num [1:6] 18.899 0.144 18.6 18.9 19.2 ...
  ..- attr(*, "names")= chr [1:6] "mean" "sd" "0.025quant" "0.5quant" ...
 - attr(*, "exp_at")= num [1:326] -3.4 -3.3 -3.2 -3.1 -3 -2.9 -2.8 -2.7 -2.6 -2.5 ...
 - attr(*, "lag_at")= num [1:22] 0 1 2 3 4 5 6 7 8 9 ...
 - attr(*, "which")= chr "min"
 - attr(*, "class")= chr "optimal_exposure"

MMT posterior distribution

The posterior is concentrated around 18.9°C and unimodal: the median is a stable centering value for the relative risk estimates

Predicting exposure-lag-response effects

From the fitted model, bcrosspred() computes the full bivariate exposure-lag-response association. Relative risks must be centered at the MMT:

cen <- mmt$summary[["0.5quant"]]
cpred <- bcrosspred(mod, exp_at = temp, cen = cen)

List of 9
 $ exp_at              : num [1:326] -3.4 -3.3 -3.2 -3.1 -3 -2.9 -2.8 -2.7 -2.6 -2.5 ...
 $ lag_at              : num [1:22] 0 1 2 3 4 5 6 7 8 9 ...
 $ cen                 : num 18.9
 $ coefficients        : num [1:20, 1:1000] -0.13926 -0.00779 -0.05735 0.0489 -0.04849 ...
  ..- attr(*, "dimnames")=List of 2
 $ matRRfit            : num [1:326, 1:22, 1:1000] 0.849 0.85 0.85 0.851 0.852 ...
  ..- attr(*, "dimnames")=List of 3
 $ allRRfit            : num [1:326, 1:1000] 2.04 2.03 2.01 2 1.99 ...
  ..- attr(*, "dimnames")=List of 2
 $ coefficients.summary: num [1:20, 1:6] -0.13495 -0.00609 -0.04971 0.04957 -0.04897 ...
  ..- attr(*, "dimnames")=List of 2
 $ matRRfit.summary    : num [1:326, 1:22, 1:6] 0.853 0.853 0.854 0.855 0.855 ...
  ..- attr(*, "dimnames")=List of 3
 $ allRRfit.summary    : num [1:326, 1:6] 1.84 1.83 1.83 1.82 1.81 ...
  ..- attr(*, "dimnames")=List of 2

Predicting exposure-lag-response effects

Now we have an array of estimated posterior RRs for each temperature–lag combination, together with their summary across samples:

Posterior samples cpred$matRRfit

, , sample1

          lag0     lag1     lag2     lag3
-3.4 0.8489690 1.078462 1.144318 1.100586
-3.3 0.8496295 1.077663 1.143098 1.099761
-3.2 0.8502905 1.076864 1.141878 1.098937
-3.1 0.8509518 1.076066 1.140661 1.098113

*Visualizing only first 4 temperatures and lags and 1 sample

Posterior summary cpred$matRRfit.summary

, , mean

          lag0     lag1     lag2     lag3
-3.4 0.8526720 1.071552 1.134935 1.094421
-3.3 0.8533691 1.070942 1.133927 1.093737
-3.2 0.8540667 1.070333 1.132921 1.093054
-3.1 0.8547648 1.069724 1.131916 1.092371

*Visualizing only first 4 temperatures and lags and 1 summary statistic

Exposure-lag-response surface

The full bivariate association can be represented as a 3-D surface over temperature and lag, giving the RR at each specific temperature-lag combination:

Overall cumulative effect

Summing all lag contributions gives the overall cumulative exposure-response curve which is the most commonly reported summary in temperature-mortality studies:

\[RR_{x,\,\text{overall}} = \exp\!\Bigl(\sum_{l=0}^{L} \beta_{x,l}\Bigr)\]

This collapses the 3-D surface into a single curve, showing how the total accumulated risk summing up all the lags varies with temperature

Overall cumulative effect

`plot.bcrosspred()` built-in visualisations

par(mfrow = c(2, 2), mar = c(4, 4, 2, 1))
plot(cpred, ptype = "3d", main = "Exposure-lag-response surface")
plot(cpred, ptype = "overall", log = "y", main = "Overall cumulative effect")
plot(cpred, ptype = "slices", lag_at = 0, log = "y", main = "Exposure-response (lag 0)")
plot(cpred, ptype = "slices", exp_at = round(quantile(london$tmean, 0.99), 1), log = "y", 
main = "Lag-response (99th pct temperature)")

Attributable burden

The attributable() function estimates the mortality burden attributable to non-optimal temperature exposures:
- Attributable fraction (AF): proportion of all mortality events attributable to temperature
- Attributable number (AN): absolute number of deaths attributable to temperature

The idea is to compare observed mortality under real temperature conditions against a counterfactual scenario where the population was always exposed to an optimal reference temperature (e.g. the MMT) over the same period

Therefore, if we have estimated $\beta_x$ associated with a certain exposure (centered to an optimal reference one) we can calculate the attribution to non-optimal temperature exposure as:

\[\text{AF}_{x,t} = 1 - \exp(-\beta_x) \qquad \text{AN}_{x,t} = n_t \cdot \text{AF}_{x,t}\]

In a DLNM setting, the overall cumulative effect $\beta_{x,\,\text{overall}}$ can be used as $\beta_x$ (forward)

Computing attributable burden

attributable() returns daily and aggregated AF and AN using either the forward or backward perspective:

attr <- attributable(mod, london, name_date = "date", name_exposure = "tmean", name_cases = "mort_75plus", cen = cen, dir = "forw")

List of 8
 $ af             : num [1:4383, 1:1000] 0.234 0.226 0.228 0.25 0.247 ...
  ..- attr(*, "dimnames")=List of 2
 $ an             : num [1:4383, 1:1000] 39.6 37.1 36.7 39.3 38.1 ...
  ..- attr(*, "dimnames")=List of 2
 $ aftotal        : Named num [1:1000] 0.185 0.207 0.168 0.145 0.16 ...
  ..- attr(*, "names")= chr [1:1000] "sample1" "sample2" "sample3" "sample4" ...
 $ antotal        : Named num [1:1000] 73212 82089 66584 57592 63258 ...
  ..- attr(*, "names")= chr [1:1000] "sample1" "sample2" "sample3" "sample4" ...
 $ af.summary     : num [1:4383, 1:6] 0.229 0.22 0.222 0.246 0.243 ...
  ..- attr(*, "dimnames")=List of 2
 $ an.summary     : num [1:4383, 1:6] 38.8 36.1 35.8 38.7 37.5 ...
  ..- attr(*, "dimnames")=List of 2
 $ aftotal.summary: Named num [1:6] 0.174 0.018 0.139 0.175 0.207 ...
  ..- attr(*, "names")= chr [1:6] "mean" "sd" "0.025quant" "0.5quant" ...
 $ antotal.summary: Named num [1:6] 68858 7132 55071 69178 81995 ...
  ..- attr(*, "names")= chr [1:6] "mean" "sd" "0.025quant" "0.5quant" ...

Daily attributable fraction

Time series of daily attributable fractions (AFs), together with the temperature exposition in that given day:

Total attributable burden

Total mortality burden attributable to non-optimal temperatures in the London 75+ population across 2000–2011:

	mean	sd	0.025quant	0.5quant	0.975quant	mode
Attributable fraction	0.174	0.018	0.139	0.175	0.207	0.176
Attributable number	68857.597	7131.526	55071.066	69178.391	81995.458	69842.155

Conclusions

The {bdlnm} package provides a complete and accessible implementation of Bayesian Distributed Lag Non-Linear Models in R
Combining the flexibility of DLNMs with full Bayesian inference enables researchers to fit more realistic and complex models while fully propagating uncertainty
The posterior distribution of all estimates allows direct uncertainty quantification of derived measures (e.g., MMT, attributable fractions, attributable numbers) without additional approximations
It constitutes a valuable tool for studying the health impacts of climate change and other environmental risks in increasingly data-rich settings
The framework extends beyond environmental epidemiology to any setting with time-varying exposures and delayed effects making it a general tool for time series analysis

Present and future work

Currently in development:

mcmc_to_bdlnm(): convert any object with posterior samples of DLNM coefficients from an MCMC algorithm (e.g. NIMBLE, Stan) into a bdlnm object, enabling broader model compatibility beyond INLA

Coming soon:

Multi-location analysis: pool independent exposure-lag-response curves across cities or regions within a single model
Spatial B-DLNMs (SB-DLNM): explicitly model spatial heterogeneity in exposure-lag-response associations across regions
Parallelization: faster inference via {futurize} for large datasets

Further resources

🐛 Bug reports and contributions welcome via the GitHub repository
https://github.com/pasahe/bdlnm

📦 Full documented vignettes available on the package webpage
https://pasahe.github.io/bdlnm/

References

Argentieri, M. Austin, Najaf Amin, Alejo J. Nevado-Holgado, William Sproviero, Jennifer A. Collister, Sarai M. Keestra, Midas M. Kuilman, et al. 2025. “Integrating the Environmental and Genetic Architectures of Aging and Mortality.” Nature Medicine 31: 1016–25. https://doi.org/10.1038/s41591-024-03483-9.

Gasparrini, A., B. Armstrong, and M. G. Kenward. 2010. “Distributed Lag Non-Linear Models.” Statistics in Medicine 29 (21): 2224–34. https://doi.org/https://doi.org/10.1002/sim.3940.

Gasparrini, Antonio, Ben Armstrong, and Fabian Scheipl. 2011. “Distributed Lag Linear and Non-Linear Models in R: The Package dlnm.” Journal of Statistical Software 43 (8): 1–20. https://doi.org/10.18637/jss.v043.i08.

Gasparrini, Antonio, and Michela Leone. 2014. “Attributable Risk from Distributed Lag Models.” BMC Medical Research Methodology 14 (April): 55. https://doi.org/10.1186/1471-2288-14-55.

Quijal-Zamorano, Marcos, Miguel A Martinez-Beneito, Joan Ballester, and Marc Marí-Dell’Olmo. 2024. “Spatial Bayesian Distributed Lag Non-Linear Models (SB-DLNM) for Small-Area Exposure-Lag-Response Epidemiological Modelling.” International Journal of Epidemiology 53 (3): dyae061. https://doi.org/10.1093/ije/dyae061.

Rue, Håvard, Sara Martino, and Nicholas Chopin. 2009. “Approximate Bayesian Inference for Latent Gaussian Models Using Integrated Nested Laplace Approximations (with Discussion).” Journal of the Royal Statistical Society B 71: 319–92.

Vicedo-Cabrera, Ana M., Francesco Sera, and Antonio Gasparrini. 2019. “A Hands-on Tutorial on a Modeling Framework for Projections of Climate Change Impacts on Health.” Epidemiology 30 (3): 321–29. https://doi.org/10.1097/ede.0000000000000982.

Term	Description
\(cb(\cdot)\)	Cross-basis: tensor product of exposure and lag spline bases
\(\beta\)	Coefficients of the cross-basis
\(u_{kt}\)	(Time-varying) confounders (e.g. seasonality, trend)
\(\gamma_k\)	Coefficients of the confounders
\(\alpha\)	Intercept

Term	Description
\(\beta_i\)	Location-specific cross-basis coefficients
\(\mu_\beta\)	Overall pooled exposure-lag-response association
\(\phi_i\)	Spatially structured random effect (ICAR prior)
\(\epsilon_i \sim \mathcal{N}(0, \sigma^2_\epsilon)\)	Unstructured random effect
\(\delta_i\)	Set of spatial neighbours of area \(i\)

A Bayesian framework for distributed lag non-linear models: the {bdlnm} R package

Background

Motivation

Distributed Lag Non-Linear Models (DLNMs)

DLNMs

DLNMs: Model formulation

Cross-basis construction

Limitations of classical DLNMs

Spatial Bayesian DLNMs (SB-DLNMs)

The {bdlnm} R package

{bdlnm}

Overview of the package

Installation

Hands-on {bdlnm}: An application

Temperature-mortality study

Temperature and mortality time series

Model specification

Building the cross-basis

Building the cross-basis

Fitting the Bayesian DLNM

Fitting the Bayesian DLNM

Minimum mortality temperature (MMT)

MMT posterior distribution

Predicting exposure-lag-response effects

Predicting exposure-lag-response effects

Exposure-lag-response surface

Overall cumulative effect

Overall cumulative effect

plot.bcrosspred() built-in visualisations

Attributable burden

Computing attributable burden

Daily attributable fraction

Total attributable burden

Conclusions

Conclusions

Present and future work

Further resources

References

`plot.bcrosspred()` built-in visualisations