Machine Learning Estimation of Daily Surface Concentrations of PM2.5, NO2, and MDA8 Ozone at High Spatiotemporal Resolutions






High concentrations of atmospheric pollutants endanger public health and negatively affect other domains. Although surface measurements of pollutants at ground stations are reliable, they suffer from low spatial coverage because the number of ground stations is limited. This limitation calls for accurate approaches to estimating surface concentrations of pollutants, particularly in regions with no monitoring stations. This thesis proposes machine learning (ML) and deep learning (DL) techniques to estimate daily surface concentrations of PM2.5, NO2, and daily maximum 8-h average (MDA8) ozone. The first task uses random forest (RF) to estimate daily surface concentrations of PM2.5 at 1-km spatial resolution over Texas for the 2014-2018 period, obtaining a correlation coefficient (R) of 0.83-0.90 and a mean absolute bias (MAB) of 1.47-1.77 µg/m3. Our results also show the high capability of RF compared to commonly used models for estimating PM2.5 concentrations. The second task develops the PCNN-DNN, a novel two-step DL model, to estimate daily surface NO2 concentrations over the contiguous United States (CONUS) from 2005 to 2019. To the best of our knowledge, the PCNN-DNN is the most accurate model worldwide for estimating surface NO2 levels, with an R of 0.975 to 0.978 and an MAB of 0.99 ppb to 1.38 ppb. Moreover, the PCNN-DNN model generates estimated NO2 grids without any missing values, improving the quality of various applications such as public health studies. The third task develops a DL model, the Deep-CNN, to estimate surface MDA8 ozone and to examine the spatial contribution of several factors to ozone levels over the CONUS in 2019. The model obtains an R of 0.95 and an MAB of 2.79 ppb, highlighting the promising performance of the Deep-CNN at estimating surface MDA8 ozone.
We also use Shapley additive explanations (SHAP) to generate, for the first time, a spatial feature contribution map (SFCM) for ozone, the results of which confirm an advanced ability of Deep-CNN to accurately capture the relationships between ozone and most predictor variables.
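
The two scores quoted in the abstract can be illustrated with a minimal sketch. It assumes that R is the Pearson correlation coefficient and that MAB is the mean of the absolute differences between estimated and observed concentrations; the thesis may define them differently. The function names and sample values are hypothetical and are not taken from the thesis.

```python
import math


def pearson_r(observed, estimated):
    """Pearson correlation coefficient between paired samples (assumed meaning of R)."""
    n = len(observed)
    if n != len(estimated) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_obs = sum(observed) / n
    mean_est = sum(estimated) / n
    # Sum of cross-deviations and sums of squared deviations about the means.
    cov = sum((o - mean_obs) * (e - mean_est) for o, e in zip(observed, estimated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_est = sum((e - mean_est) ** 2 for e in estimated)
    return cov / math.sqrt(var_obs * var_est)


def mean_absolute_bias(observed, estimated):
    """Mean of |estimated - observed| (assumed meaning of MAB), in the input units."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("need two equally long, non-empty sequences")
    return sum(abs(e - o) for o, e in zip(observed, estimated)) / len(observed)


# Hypothetical station observations and model estimates (e.g. ppb).
station_obs = [1.0, 2.0, 3.0, 4.0]
model_est = [2.0, 4.0, 6.0, 8.0]

r_value = pearson_r(station_obs, model_est)
mab_value = mean_absolute_bias(station_obs, model_est)
```

In this toy case the estimates are perfectly correlated with the observations yet biased high, which is why the two scores are reported together: R alone does not reveal a systematic offset.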



Machine learning, Satellite remote sensing, Surface concentrations of pollutants, PM2.5, NO2, Ozone


Portions of this document appear in: Ghahremanloo, Masoud, Yunsoo Choi, Alqamah Sayeed, Ahmed Khan Salman, Shuai Pan, and Meisam Amani. "Estimating daily high-resolution PM2.5 concentrations over Texas: Machine Learning approach." Atmospheric Environment 247 (2021): 118209; and in: Ghahremanloo, Masoud, Yannic Lops, Yunsoo Choi, Seyedali Mousavinezhad, and Jia Jung. "A Coupled Deep Learning Model for Estimating Surface NO2 Levels From Remote Sensing Data: 15‐Year Study Over the Contiguous United States." Journal of Geophysical Research: Atmospheres 128, no. 2 (2023): e2022JD037010.