King's College London

Research portal

Prediction of PM2.5 concentrations at the locations of monitoring sites measuring PM10 and NOx using generalized additive models and machine learning methods: A case study in London

Research output: Contribution to journal › Article › peer-review

Antonis Analitis, Ben Barratt, David Green, Andrew Beddows, Evangelia Samoli, Joel Schwartz, Klea Katsouyanni

Original language: English
Article number: 117757
Early online date: 22 Jul 2020
Accepted/In press: 3 Jul 2020
E-pub ahead of print: 22 Jul 2020
Published: 1 Nov 2020




The adverse health effects of air pollutants, especially those of PM2.5, are well documented. However, a lack of adequate monitoring and weaknesses in modelling approaches prevent a good assessment of health effects in many areas of the world. Advances in computational methods and the availability of new data sets, e.g. satellite remote observations, have enlarged the possibilities of modelling for application in large-scale health effects studies. However, PM2.5 monitoring is very recent in most of the world and more limited than that of other pollutants, so understanding how to use PM10 monitors to estimate PM2.5 exposure is important. Since interest in these methods is relatively recent, their performance needs to be tested against ambient measurements, but long-term PM2.5 datasets are less readily available than PM10 datasets in many regions. In the present study we report the methodology and results of using regression modelling and a machine learning method (Random Forest, RF), as well as a combination of the two, to enhance a PM2.5 measurement database in London using PM10 and NOx measurements as well as other predictors, and we compare the relative performance of each method. We found that the combination of predictions by the regression model and the RF performs best: we obtain a cross-validation R² of 99.29% and 98.22% for the 5-year periods 2004–2008 and 2009–2013, respectively, and a Mean Square Error near 1. Our enhanced database for PM2.5 is available for use by other researchers.
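As a rough illustration of the evaluation described in the abstract, the sketch below scores two sets of held-out predictions and their combination using R² and Mean Square Error. The abstract does not say how the regression and RF predictions were combined, so the equal-weight average used here is an assumption, and the function names and all numbers are hypothetical values made up for the example, not data or code from the study.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_square_error(observed, predicted):
    """Average squared difference between observations and predictions."""
    return mean((o - p) ** 2 for o, p in zip(observed, predicted))


def combine(pred_a, pred_b, weight_a=0.5):
    """Weighted average of two prediction lists.

    Assumption: the study's actual combination rule is not given in the
    abstract; an equal-weight average is used purely for illustration.
    """
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Made-up held-out PM2.5 values (illustrative only, not study data).
observed = [10.0, 12.0, 15.0, 20.0, 25.0]
regression_pred = [11.0, 13.0, 14.0, 21.0, 24.0]  # hypothetical regression output
forest_pred = [9.5, 12.5, 15.5, 19.0, 25.5]       # hypothetical Random Forest output

combined_pred = combine(regression_pred, forest_pred)

for name, pred in [
    ("regression", regression_pred),
    ("random forest", forest_pred),
    ("combined", combined_pred),
]:
    print(f"{name}: R2 = {100 * r_squared(observed, pred):.2f}%, "
          f"MSE = {mean_square_error(observed, pred):.3f}")
```

In this toy case the two models err in partly opposite directions, so averaging them lowers the error. That is one plausible reason a combined predictor can outperform either model alone, though it is not guaranteed in general. R² is printed as a percentage to match the way the abstract reports it.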

