## Abstract

Background: Using modelled air pollutant predictions as exposure variables in epidemiological analyses can produce bias in health effect estimation. We used statistical simulation to estimate these biases and compare different air pollution models for London.

Methods: Our simulations were based on a sample of 1,000 small geographical areas within London, UK. “True” pollutant data (daily mean nitrogen dioxide (NO2) and ozone (O3)) were simulated to include spatio-temporal variation and spatial covariance. All-cause mortality and cardiovascular hospital admissions were simulated from “true” pollution data using prespecified effect parameters for short and long-term exposure within a multi-level Poisson model. We compared: Land Use Regression (LUR) models, dispersion models, LUR models including dispersion output as a spline (hybrid1) and generalised additive models combining splines in LUR and dispersion outputs (hybrid2). Validation datasets (model versus fixed-site monitor) were used to define simulation scenarios.

Results: For the LUR models, bias estimates ranged from -56% to +7% for short-term exposure and from -98% to -68% for long-term exposure; for the dispersion models, the corresponding ranges were -33% to -15% and -52% to +0.5%. Hybrid1 provided little if any additional benefit, but hybrid2 appeared optimal in terms of bias estimates for short-term (-17% to +11%) and long-term (-28% to +11%) exposure and in preserving coverage probability and statistical power.

Conclusion: Although exposure error can produce substantial negative bias (i.e. towards the null), combining outputs from different air pollution modelling approaches may reduce bias in health effect estimation, leading to improved impact evaluation of abatement policies.
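The bias towards the null described above can be illustrated with a much-reduced simulation. The sketch below is not the authors' method: it assumes a single-level Poisson model, purely classical additive exposure error, and hypothetical parameter values (baseline rate 5, log-rate slope 0.02, exposure standard deviation 10). The function names `rpois`, `fit_poisson_slope` and `percent_bias` are invented for illustration. It simulates counts from a "true" exposure across 1,000 areas, refits the model using an error-prone exposure, and reports the mean percent bias in the estimated effect.

```python
import math
import random


def rpois(rng, lam):
    """Draw one Poisson variate by Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def fit_poisson_slope(x, y, iterations=10):
    """Newton-Raphson fit of log E[y] = a + b * x; returns the slope b."""
    mean_x = sum(x) / len(x)
    xc = [xi - mean_x for xi in x]  # centring leaves the slope unchanged
    a, b = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        mu = [math.exp(a + b * xi) for xi in xc]
        # Score vector (gradient of the log-likelihood)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(xc, y, mu))
        # Fisher information matrix entries
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(xc, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(xc, mu))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b


def percent_bias(error_sd, n_areas=1000, n_reps=30, beta=0.02, seed=1):
    """Mean percent bias in the slope when exposure carries additive error.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        true_x = [rng.gauss(0.0, 10.0) for _ in range(n_areas)]
        counts = [rpois(rng, 5.0 * math.exp(beta * xi)) for xi in true_x]
        modelled_x = [xi + rng.gauss(0.0, error_sd) for xi in true_x]
        estimates.append(fit_poisson_slope(modelled_x, counts))
    return 100.0 * (sum(estimates) / n_reps - beta) / beta


if __name__ == "__main__":
    print(f"no exposure error:    {percent_bias(0.0):+.1f}%")
    print(f"error SD = signal SD: {percent_bias(10.0):+.1f}%")
```

Under these assumptions, an error variance equal to the exposure variance should attenuate the estimated effect by roughly half. Real model error, as in the scenarios derived from the validation datasets, is unlikely to be purely classical, so the sketch shows only the direction of the bias, not its size.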

Original language | English |
---|---|
Number of pages | 22 |
Journal | Environmental Epidemiology |
Publication status | Accepted/In press - 26 Mar 2020 |