Felipe Cifuentes1, Angel Gálvez1, Carlos M. González1, Mauricio Orozco-Alzate2, Beatriz H. Aristizábal1

1 Hydraulic Engineering and Environmental Research Group, Universidad Nacional de Colombia Sede Manizales, Cra 27 64-60 Bloque H Palogrande. Manizales, Colombia
2 Department of Informatics and Computing, Universidad Nacional de Colombia Sede Manizales, Km 7 vía al Magdalena Bloque S La Nubia. Manizales, Colombia

Received: July 30, 2020
Revised: April 4, 2021
Accepted: June 2, 2021

 Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.

DOI: https://doi.org/10.4209/aaqr.200471

Cite this article:

Cifuentes, F., Gálvez, A., González, C.M., Orozco-Alzate, M., Aristizábal, B.H. (2021). Hourly Ozone and PM2.5 Prediction Using Meteorological Data – Alternatives for Cities with Limited Pollutant Information. Aerosol Air Qual. Res. https://doi.org/10.4209/aaqr.200471


  • O3 forecasting was accurately represented by meteorological predictors.
  • SO2 and CO were the most influential predictors for PM2.5 forecasting.
  • PCA and Spearman analysis were useful to define subsets of predictor variables.
  • The developed models are useful for forecasting and filling missing data.


Using statistical models, the average hourly ozone (O3) concentration was predicted from seven meteorological variables (Pearson correlation coefficient, R = 0.87–0.90), with solar radiation and temperature being the most important predictors. This approach can be used to predict O3 in cities that have real-time meteorological data but no pollutant-sensing capability. Incorporating other pollutants (PM2.5, SO2, and CO) into the models did not significantly improve O3 prediction (R = 0.91–0.94). Predictions were also made for PM2.5, but the models could not reproduce the peaks and outliers caused by local sources. Here we compare three statistical prediction models, (1) Multiple Linear Regression (MLR), (2) Support Vector Regression (SVR), and (3) Artificial Neuronal Networks (ANNs), for forecasting hourly O3 and PM2.5 concentrations in a mid-sized Andean city (Manizales, Colombia). The study also analyzes the effect of using different sets of predictor variables: (1) variables whose Spearman coefficients exceed 0.3 in absolute value, (2) variables whose loadings in a principal component analysis (PCA) exceed 0.3 in absolute value, (3) only meteorological variables, and (4) all available variables. For the O3 forecast, the best model was obtained using ANNs with all available variables as predictors. The methodology could help other researchers implement statistical forecasting models in regions with limited pollutant information.
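The first predictor subset described above (Spearman coefficients beyond ± 0.3) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the coefficient is computed between each candidate predictor and the target pollutant, and the function names, the variable names, and the synthetic hourly data are all hypothetical and serve only to make the example self-contained.

```python
import math
import random


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def select_predictors(candidates, target, threshold=0.3):
    """Keep candidates whose |Spearman rho| with the target exceeds threshold."""
    selected = {}
    for name, series in candidates.items():
        rho = spearman(series, target)
        if abs(rho) > threshold:
            selected[name] = rho
    return selected


def make_synthetic_hours(n_days=20, seed=0):
    """Build hypothetical hourly series; NOT data from the study."""
    rng = random.Random(seed)
    solar, temp, noise, ozone = [], [], [], []
    for _ in range(n_days):
        for hour in range(24):
            # Simple daylight curve: zero at night, peak at midday.
            daylight = max(0.0, math.sin(math.pi * (hour - 6) / 12))
            s = 800.0 * daylight
            t = 15.0 + 8.0 * daylight + rng.gauss(0, 1)
            solar.append(s)
            temp.append(t)
            noise.append(rng.gauss(0, 1))  # unrelated to the target
            ozone.append(10.0 + 0.03 * s + 1.0 * t + rng.gauss(0, 3))
    candidates = {
        "solar_radiation": solar,
        "temperature": temp,
        "unrelated_noise": noise,
    }
    return candidates, ozone


if __name__ == "__main__":
    candidates, ozone = make_synthetic_hours()
    for name, rho in select_predictors(candidates, ozone).items():
        print(f"{name}: rho = {rho:.2f}")
```

The retained variables would then be passed to whichever regression model is being compared (MLR, SVR, or ANN). The same `pearson` helper could also be applied to observed and predicted concentrations to obtain an R value of the kind reported above.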

Keywords: Tropospheric ozone, Particulate matter, Hourly concentrations, Andean city, Support Vector Regression, Artificial Neuronal Network

