Prediction of PM2.5 concentrations at the locations of monitoring sites measuring PM10 and NOx using generalized additive models and machine learning methods: A case study in London

Antonis Analitis, Ben Barratt, David Green, Andrew Beddows, Evangelia Samoli, Joel Schwartz, Klea Katsouyanni

Research output: Contribution to journal › Article › peer-review

32 Citations (Scopus)
262 Downloads (Pure)

Abstract

The adverse health effects of air pollutants, especially those of PM2.5, are well documented. However, a lack of adequate monitoring and weaknesses in modelling approaches do not allow a good assessment of health effects in many areas of the world. Advances in computational methods and the availability of new data sets, e.g. satellite remote observations, have enlarged the possibilities of modelling for application in large-scale health effects studies. However, PM2.5 monitoring is very recent in most of the world and more limited compared to other pollutants, and understanding how to use PM10 monitors to estimate PM2.5 exposure is therefore important. Since interest in these methods is relatively recent, there is a need for testing their performance against ambient measurements, but long-term PM2.5 datasets are less readily available than PM10 datasets in many regions. In the present study we report the methodology and results of using regression modelling and a machine learning method (Random Forest, RF), as well as a combination of the two, to enhance a PM2.5 measurement database in London using PM10 and NOx measurements as well as other predictors, and we compare the relative performance of each method. We found that the combination of predictions by the regression model and the RF performs best: we obtain a cross-validation R² of 99.29% and 98.22% for the 5-year periods 2004–2008 and 2009–2013, respectively, and a Mean Square Error near 1. Our enhanced database for PM2.5 is available for use by other researchers.
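The abstract reports a cross-validation R² and a Mean Square Error for the combined predictions but does not say how the regression and RF predictions are combined or exactly how R² is defined. The following is a minimal sketch under two stated assumptions: the combination is a simple weighted average of held-out predictions, and R² is the coefficient of determination. The function names and the toy numbers are hypothetical and are not taken from the study.

```python
def mean_square_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def combine(pred_a, pred_b, weight_a=0.5):
    """Weighted average of two prediction series (assumed combination rule)."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Hypothetical held-out PM2.5 values and predictions (illustrative only).
observed = [10.0, 12.0, 14.0, 16.0]
regression_pred = [11.0, 11.0, 15.0, 15.0]
rf_pred = [9.0, 13.0, 13.0, 17.0]

combined_pred = combine(regression_pred, rf_pred)

for name, pred in [("regression", regression_pred),
                   ("random forest", rf_pred),
                   ("combined", combined_pred)]:
    print(name, mean_square_error(observed, pred), r_squared(observed, pred))
```

The toy values are chosen so that the errors of the two models cancel exactly when averaged, which illustrates why a combination can outperform either model alone; real held-out errors would only partly cancel.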

Original language: English
Article number: 117757
Journal: ATMOSPHERIC ENVIRONMENT
Volume: 240
Early online date: 22 Jul 2020
Publication status: Published - 1 Nov 2020

Keywords

  • Ensemble methods
  • Environmental exposure
  • London case study
  • PM prediction
  • Random forest
