# Air Pollution Forecasting Using Artificial and Wavelet Neural Networks with Meteorological Conditions

Qingchun Guo1,2, Zhenfang He1,3, Shanshan Li1, Xinzhou Li2,4, Jingjing Meng1, Zhanfang Hou1, Jiazhen Liu1, Yongjin Chen1

1 School of Environment and Planning, Liaocheng University, Liaocheng 252000, China
2 State Key Laboratory of Loess and Quaternary Geology, Institute of Earth Environment, Chinese Academy of Sciences, Xi’an 710061, China
3 State Key Laboratory of Urban and Regional Ecology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
4 CAS Center for Excellence in Tibetan Plateau Earth Sciences, Beijing 100101, China

Revised: May 5, 2020
Accepted: May 5, 2020

Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.

Guo, Q., He, Z., Li, S., Li, X., Meng, J., Hou, Z., Liu, J. and Chen, Y. (2020). Air Pollution Forecasting Using Artificial and Wavelet Neural Networks with Meteorological Conditions. Aerosol Air Qual. Res. 20: 1429–1439. https://doi.org/10.4209/aaqr.2020.03.0097

## HIGHLIGHTS

• Two models for predicting air pollution index (API) were developed.
• Different combinations of meteorological variables were selected as input variables.
• Performance of WANN model is better than that of the ANN model.
• 16 meteorological factors and past 3 days’ API were the best input variables.
• Only including past 3 days’ API as parameters in input datasets gives precise results.

## ABSTRACT

Air quality forecasting is a significant method of protecting public health because it provides early warning of harmful air pollutants. In this study, we used correlation analysis and artificial neural networks (ANNs; including wavelet ANNs [WANNs]) to identify the linear and nonlinear associations, respectively, between the air pollution index (API) and meteorological variables in Xi’an and Lanzhou. Evaluating twelve algorithms and nineteen network topologies for the ANN and WANN models, we discovered that the optimal input variables for an API forecasting model were the APIs from the 3 preceding days and sixteen selected meteorological factors. Additionally, the API could be accurately predicted based solely on the value recorded 3 days earlier. Based on the correlation coefficients between the air pollution index of the targeted day and the tested variables, the API displayed the closest relationship with the API 1 day earlier as well as stronger correlations with the average temperature, average water vapor pressure, minimum temperature, maximum temperature, API 2 days earlier, and API 3 days earlier. When Bayesian regularization was applied as a training algorithm, the WANN and ANN models accurately reproduced the APIs in both Xi’an and Lanzhou, although the WANN model (R = 0.8846 for Xi’an and R = 0.8906 for Lanzhou) performed better than the ANN (R = 0.8037 for Xi’an and R = 0.7742 for Lanzhou) during the forecasting stage. These results demonstrate that WANNs are effective in short-term API forecasting because they can recognize historic patterns and thereby identify nonlinear relationships between the input and output variables. Thus, our study may provide a theoretical basis for environmental management policies.

Keywords: Air pollution; Wavelet artificial neural network; Meteorological factor; Forecast.

## INTRODUCTION

Air pollution is a problem of global importance, with well-documented damaging impacts on human health and ecosystems (Nguyen et al., 2015). It also has detrimental effects on visibility, climate, and sustainable development (Lelieveld et al., 2015). Poor air quality is one of the five major health risks in the world; for example, long-term exposure to polluted air is related to respiratory infections, heart attack, stroke, and lung cancer (Kessler, 2014; Watson, 2014; Lelieveld and Pöschl, 2017). Air pollution also adversely affects people’s life span and their willingness to communicate socially (Huang et al., 2018).

Due to large-scale industrialization and urbanization, China has suffered from acute air pollution for many years (Liu and Diamond, 2005). The number of haze days per year has also risen markedly in China, which has seriously hindered the sustainable development of society and caused widespread public concern (Jiang and Bai, 2018). In 2013, China suffered extremely serious haze pollution, influencing 800 million people, and daily average PM2.5 concentrations at a site in Xi’an were more than twice those of Beijing, Shanghai, and Guangzhou (Huang et al., 2014).

Air pollution forecasting is also crucial for public health interventions and air pollution control policymaking. However, air quality forecasting is quite complex (Li et al., 2017a; Park et al., 2018). Apart from rapid economic growth, air pollution is affected by unfavorable meteorological conditions (Al-Saadi et al., 2005; Gong and Ordieres-Meré, 2016; Li et al., 2017b).

Artificial neural networks (ANNs) have been used to predict ground motion (Wiszniowski, 2016) and groundwater depth (He et al., 2014), and they have been shown to be effective for more complex tasks. ANN models have also been successfully applied to forecast air pollution (Li et al., 2017b). However, in some cases the data are too complex for the modeling tools to process directly, so the input information must be preprocessed (Simons et al., 1995). Traditionally, this has been done by using principal component analysis (Xia et al., 2015) or the Fourier transform (Artursson et al., 2002). Whatever technique is used, it must address two objectives: to preserve the relevant information and to reduce the complexity of the input signal (Zhao et al., 2018). Here, the wavelet transformation was employed to extract the important information from the past air pollution index (API) and meteorological factors. Our use of the wavelet artificial neural network (WANN) as the predictive model emphasizes two aspects: (a) the effects of diverse network parameters and (b) the capability of the WANN model to forecast next-day air quality (Bai et al., 2016), which offers important guidance to the public.

## MATERIAL AND METHODS

### Study Area and Data Introduction

The two study stations are Xi’an and Lanzhou, both located in China (Fig. 1). API data were obtained from the Environmental Protection Agency, and meteorological data from the Meteorological Bureau.

Fig. 1. Location of monitoring stations (Xi’an and Lanzhou).

Fig. 2 shows the API from January 2010 to December 2012. The API varies periodically at both sites: it is higher in winter and spring and lower in summer and autumn. Because PM10 is the primary air pollutant in both cities, the API may be taken to represent PM10.

The data series were divided into a training group (January 2010–December 2011), a calibration group (January–June 2012) and a testing group (July–December 2012). The generalization ability of the WANN and ANN models was tested by cross-validation.
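The chronological split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `assign_group` and `split_series` are hypothetical, each record is assumed to be a `(date, value)` pair, and the placeholder values are not data from the study.

```python
from datetime import date, timedelta

def assign_group(day):
    """Return the data group for one daily record, using the periods given in the text."""
    if date(2010, 1, 1) <= day <= date(2011, 12, 31):
        return "training"
    if date(2012, 1, 1) <= day <= date(2012, 6, 30):
        return "calibration"
    if date(2012, 7, 1) <= day <= date(2012, 12, 31):
        return "testing"
    raise ValueError(f"{day} lies outside the study period")

def split_series(records):
    """Split (date, value) records into the three groups, preserving time order."""
    groups = {"training": [], "calibration": [], "testing": []}
    for day, value in records:
        groups[assign_group(day)].append((day, value))
    return groups

# Placeholder series: one record per day of the study period (values are dummies).
days = [date(2010, 1, 1) + timedelta(days=i) for i in range(1096)]
groups = split_series([(d, 0.0) for d in days])
print({name: len(items) for name, items in groups.items()})
```

Splitting by date rather than at random keeps the testing group strictly later than the training group, which matches a forecasting setting.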

Fig. 2. Air pollution index (API) from January 2010 to December 2012.

### Artificial Neural Network

Fig. 3 shows the architecture of the artificial neural network employed in the study, with nineteen nodes in the input layer and one node (API) in the output layer. The nineteen input nodes are precipitation (P), extreme wind speed (EWS), extreme wind speed direction (EWSD), average atmospheric pressure (AAP), average wind speed (AWS), average temperature (AT), average water vapor pressure (AWVP), average relative humidity (ARH), sunshine duration (SD), minimum atmospheric pressure (MAP), minimum temperature (MINT), maximum atmospheric pressure (MAP), maximum temperature (MAXT), maximum wind speed (MWS), maximum wind speed direction (MWSD), minimum relative humidity (MRH), air pollution index [API(t)], API(t – 1), and API(t – 2).

Fig. 3. Schematic of the artificial neural network used to forecast air pollution index in this study.

Backpropagation is a general approach for training ANNs by minimizing the goal function (the global error function) (Nunnari et al., 2004). The global error function (F) is computed using Formula (1):

where F is the global error function, Bi is the expected output, and Di is the output predicted by the network. Gradient descent is used to adjust the weights so as to minimize F, according to Formula (2):

where ΔCji is the weight update and η is the learning rate.
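For orientation, a common textbook form of the two backpropagation expressions, written with the symbols defined above, is given here. It is a sketch under assumptions rather than a transcription of Formulas (1) and (2): the factor 1/2 and the absence of any averaging over samples are assumed, as is the reading of Cji as the weight connecting node i to node j.

$$
F = \frac{1}{2} \sum_{i} \left( B_i - D_i \right)^2
$$

$$
\Delta C_{ji} = -\eta \, \frac{\partial F}{\partial C_{ji}}
$$

With this convention, each weight moves against the gradient of F, and η controls the step size.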

### Wavelet Transformation

The discrete wavelet transform (DWT) coefficients are calculated with the Mallat pyramidal algorithm (Mallat, 1989a). The DWT was employed to analyze the API and meteorological data; it also provides a multi-resolution decomposition scheme for input signals (Mallat, 1989b). The DWT of a data sequence f(q) is defined by Formula (3):

where ψ(q) indicates the base wavelet of active length q; u indicates the scale or dilation factor; v indicates the translation in time. For a discrete signal f(q), f(q) ∈ S2 (R), the DWT is defined by multi-resolution decomposition, which can be calculated by the Mallat decomposition algorithm and Mallat pyramidal reconstruction algorithm (Li et al., 1997):

where t and r are the impulse responses to high-pass filter T and low-pass filter R, respectively;  and  are the wavelet series and dimension of the 2i dimension, respectively; and S is the maximum probable dimension of the discrete data f[m]. The Mallat pyramidal reconstruction formula is:

where  and  are the impulse responses to R* and T*, respectively, that is,

$R^* = \overline{R}^T,\ T^* = \overline{T}^T$.
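As a reading aid, the standard Mallat recursions for a real orthonormal wavelet can be written with the filters r (low-pass) and t (high-pass) named above. The coefficient symbols $c^i$ (approximation at level i) and $d^i$ (detail at level i) are assumptions introduced here for illustration, with $c^0[m] = f[m]$; they are not necessarily the notation of the original formulas.

$$
c^{i+1}[k] = \sum_{m} r[m - 2k]\, c^{i}[m], \qquad d^{i+1}[k] = \sum_{m} t[m - 2k]\, c^{i}[m]
$$

$$
c^{i}[m] = \sum_{k} r[m - 2k]\, c^{i+1}[k] + \sum_{k} t[m - 2k]\, d^{i+1}[k]
$$

In operator form, the first pair reads $c^{i+1} = R\,c^{i}$ and $d^{i+1} = T\,c^{i}$, and the second reads $c^{i} = R^{*} c^{i+1} + T^{*} d^{i+1}$.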

The main aim of using the DWT is to reduce the complexity of the input signal and the amount of redundant information shared among the decomposed components (the detailed components CD1 and CD2 and the approximate component CA2). Applying the DWT to the approximate component yields lower-resolution components and thus a multi-resolution analysis. The correlation coefficients among CD1, CD2 and CA2 are less than 0.0037, so the decomposition served our purpose well.

### Wavelet Artificial Neural Network

We use the WANN model architecture (Fig. 4) to decompose the original time series (API) into three sets of data: the detailed components CD2 and CD1 and the approximate component CA2. These data are then used as the input elements of the ANN. In Fig. 4, An denotes the input variables and APIn + t denotes the next-day API.

Fig. 4. Schematic of a wavelet artificial neural network (WANN) used to forecast air pollution index.
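A two-level decomposition into CA2, CD2 and CD1 can be illustrated with a short dependency-free Python sketch. Note the substitution: the example uses the Haar wavelet, not the db4 wavelet selected in the study, because Haar needs no filter table. The function names and the sample values are hypothetical, and the series length is assumed to be divisible by 4.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(x):
    """One Haar DWT level: return (approximation, detail) at half the length."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert one Haar level, doubling the length."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def decompose_two_level(x):
    """Return full-length CA2, CD2 and CD1 component series of x."""
    ca1, cd1 = haar_step(x)
    ca2, cd2 = haar_step(ca1)
    zeros1 = [0.0] * len(cd1)
    zeros2 = [0.0] * len(cd2)
    # Reconstruct each component on its own by zeroing the other coefficients.
    comp_a2 = haar_inverse_step(haar_inverse_step(ca2, zeros2), zeros1)
    comp_d2 = haar_inverse_step(haar_inverse_step(zeros2, cd2), zeros1)
    comp_d1 = haar_inverse_step([0.0] * len(ca1), cd1)
    return comp_a2, comp_d2, comp_d1

# Hypothetical daily API values, for illustration only.
api = [80.0, 92.0, 110.0, 105.0, 98.0, 120.0, 130.0, 101.0]
a2, d2, d1 = decompose_two_level(api)
print(a2)
```

Because each input series becomes three full-length component series, nineteen original inputs yield 57 network inputs, which is consistent with the 57-node WANN input layer reported later in the paper.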

### Evaluation Criteria

Four performance criteria are employed to assess the validity of WANNs and ANNs adopted in the research. These are root mean square error (RMSE), mean error (ME), percentage error of peak (EOp), and correlation coefficient (R), which are as follows (He et al., 2014):

where Bj is the measured API for the jth datum, Dj is the fitted API for the jth datum, $\overline{B}$ is the mean of the measured API, $\overline{D}$ is the mean of the fitted API, L is the number of measurements, Dp is the peak of the fitted API, Bp is the peak of the measured API, and EOp is the relative error of the peak API.
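The four criteria can be sketched in Python from the symbol definitions above. The exact forms are assumptions based on common usage: in particular, the sign convention of ME (fitted minus measured) and the expression of EOp as a percentage of the measured peak are not confirmed by the text, and the function names and sample values are hypothetical.

```python
import math

def rmse(measured, fitted):
    """Root mean square error between measured (B) and fitted (D) values."""
    n = len(measured)
    return math.sqrt(sum((d - b) ** 2 for b, d in zip(measured, fitted)) / n)

def mean_error(measured, fitted):
    """Mean error; the sign convention (fitted minus measured) is an assumption."""
    return sum(d - b for b, d in zip(measured, fitted)) / len(measured)

def peak_error_percent(measured, fitted):
    """Relative error of the peak, (Dp - Bp) / Bp, expressed in percent."""
    return (max(fitted) - max(measured)) / max(measured) * 100.0

def correlation(measured, fitted):
    """Pearson correlation coefficient R between measured and fitted values."""
    n = len(measured)
    mean_b = sum(measured) / n
    mean_d = sum(fitted) / n
    cov = sum((b - mean_b) * (d - mean_d) for b, d in zip(measured, fitted))
    var_b = sum((b - mean_b) ** 2 for b in measured)
    var_d = sum((d - mean_d) ** 2 for d in fitted)
    return cov / math.sqrt(var_b * var_d)

# Hypothetical values, for illustration only.
measured = [100.0, 150.0, 200.0]
fitted = [110.0, 140.0, 190.0]
print(rmse(measured, fitted), mean_error(measured, fitted),
      peak_error_percent(measured, fitted), correlation(measured, fitted))
```

RMSE and R summarize overall agreement, while EOp isolates how well the largest value is captured, which matters for pollution episodes.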

## RESULTS AND DISCUSSION

### Correlation Analysis

Correlation analysis can determine the linear associations between air pollution and meteorological variables. Its main disadvantage is that it can only detect the linear relationship between two variables. As a result, correlation analysis cannot capture any nonlinear relationship that may exist between the outputs and the inputs, and it may therefore miss important inputs that are related to the output in a nonlinear fashion.

The determination of input parameters is one of the most important steps in the design of WANN models. The correlation coefficients calculated for the variables are shown in Table 1; they are significant at the 95% level. Each variable was evaluated by computing its correlation coefficient (R) with API(t + 1). The analysis showed that API(t) was strongly related to API(t + 1) at the two stations. Furthermore, average temperature, average water vapor pressure, minimum temperature, maximum temperature, API(t – 1), and API(t – 2) correlated with API(t + 1) more strongly than the other variables did at both stations. Together with API(t), these make up the seven significant variables we identified. Different combinations of variables were therefore selected as inputs for modeling the daily API (Table 2). The selection of the variables was based both on comprehensive correlation analysis and on existing knowledge. The horizontal wind is the basic parameter that controls the horizontal dispersion and transport of air pollutants. The effects of solar radiation on the reaction rate constants and, consequently, on the destruction and formation of photochemical species are complicated. The removal of air pollutants from the atmosphere by precipitation is a very effective process that often leads to low air pollution levels. Many pollutants are highly persistent, and it is usually accepted that the likelihood of an air pollution event increases if the previous day’s air pollution was higher than normal.
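The lagged inputs API(t), API(t – 1) and API(t – 2) and their correlation with the target API(t + 1) can be sketched as follows. The function names and the sample series are hypothetical, and the screening shown here is only a minimal illustration of the idea, not the procedure used to produce Table 1.

```python
import math

def make_lagged_samples(api, n_lags=3):
    """Pair [API(t), API(t-1), ..., API(t-n_lags+1)] with the target API(t+1)."""
    samples = []
    for t in range(n_lags - 1, len(api) - 1):
        features = [api[t - k] for k in range(n_lags)]
        samples.append((features, api[t + 1]))
    return samples

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def lag_correlations(api, n_lags=3):
    """Correlation of each lagged input with API(t+1); index k means API(t-k)."""
    samples = make_lagged_samples(api, n_lags)
    targets = [target for _, target in samples]
    return [pearson([features[k] for features, _ in samples], targets)
            for k in range(n_lags)]

# Hypothetical daily API values, for illustration only.
series = [85.0, 90.0, 120.0, 110.0, 95.0, 100.0, 130.0, 125.0, 105.0, 98.0]
print(lag_correlations(series))
```

The same `pearson` helper could be applied to any meteorological series aligned with API(t + 1) to rank candidate inputs by the strength of their linear association.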

### Determination of Network Topologies and Training Algorithms

It should be emphasized that finding the most appropriate model structure is one of the main tasks of the model developer, probably because there are usually many candidate variables whose relative importance is unknown in advance. Moreover, the relationship between inputs and air quality is nonlinear and highly location dependent.

Components of the input variables at different levels of detail and scale are obtained by two-stage wavelet decomposition. After two-stage decomposition and reconstruction, each variable is divided into three portions. The approximate component CA2 indicates the general trend of the original variable, the detailed component CD2 reflects its periodic behavior, and the detailed component CD1 reflects its irregularity and complexity. In other words, CD1 determines the complexity of empirical model predictions.

The variation characteristics of the sequences are the critical factors in selecting a wavelet (Sang, 2013). To decompose the input variables optimally, the mother wavelet is selected by considering the similarity between CD1, CD2 and CA2, measured by their correlation coefficient R. The wavelet with the minimum R best satisfies our purpose of analyzing the variation characteristics of the different components of the input variables. The quantitative calculation shows that the components are independent of each other. Twenty-one wavelet functions were tested for the DWT. Table 3 shows that db4 is the best wavelet function in this study because it has the smallest R. Here we take the average temperature as an example; the other input variables give similar results.

Trial and error was applied to find the optimal model parameters. The number of nodes in the hidden layer was increased from 1 to 19. Fig. 5 shows that, as the number of hidden nodes rises, the RMSE first decreases slightly; beyond 19-3-1 for the ANN and 57-6-1 for the WANN, it increases and fluctuates. Therefore, the best topologies for Xi’an are 19-3-1 for the ANN and 57-6-1 for the WANN. Similarly, the best topologies for Lanzhou are 19-3-1 for the ANN and 57-6-1 for the WANN.

Fig. 5. Optimization of network topologies and training algorithms in Xi’an.

Fig. 5 also shows the performance of the improved training algorithms, revealing that the trainbr (Bayesian regularization) algorithm performs best in predicting API(t + 1) in Xi’an. Trainbr automatically sets optimum values for the parameters of the objective function.

Table 4 shows that the tansig-purelin transfer function combination performs better than the others in Xi’an during the training, cross-validation and testing periods. Similarly, the tansig-purelin combination is also better than the others in Lanzhou.
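To make the selected configuration concrete, the forward pass of a 19-3-1 network with a tansig hidden layer and a purelin output can be sketched in plain Python. The tansig function is mathematically equivalent to the hyperbolic tangent, and purelin is the identity. Everything else here is an assumption for illustration: the function names, the random initial weights and the input values are hypothetical, and no training (such as Bayesian regularization) is shown.

```python
import math
import random

def init_network(n_in, n_hidden, seed=0):
    """Create small random weights for an n_in-n_hidden-1 network (illustrative only)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b_hidden = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_out = rng.uniform(-0.5, 0.5)
    return w_hidden, b_hidden, w_out, b_out

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Hidden layer uses tanh (tansig); output layer is linear (purelin)."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

def count_parameters(n_in, n_hidden):
    """Number of weights and biases in an n_in-n_hidden-1 network."""
    return n_in * n_hidden + n_hidden + n_hidden + 1

# 19-3-1 topology as reported for the ANN; the input vector is a dummy.
w_h, b_h, w_o, b_o = init_network(19, 3)
prediction = forward([0.1] * 19, w_h, b_h, w_o, b_o)
print(prediction, count_parameters(19, 3), count_parameters(57, 6))
```

The parameter count shows why the hidden layer is kept small: the 57-6-1 WANN already has several times as many adjustable parameters as the 19-3-1 ANN.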

### Comparative Analysis of the Models

The training results for the ANN and WANN models are shown in Table 5. For ANNAPI1 and WANNAPI1 in Xi’an, the RMSE values are 22.7233 and 11.5683, respectively; the R values are 0.642 and 0.9783, respectively; the ME values are –0.1114 and 0.7682, respectively; and the EOp values are –0.5337% and –0.016%, respectively. The WANNs are superior to the ANNs during the training period. Similar results for the RMSE, ME, R, and EOp are found in Lanzhou.

The values of the evaluation criteria for the nine models at the two stations during the forecast period are shown in Table 6, which summarizes the test results for every network configuration. The ANNs and WANNs have a flexible mathematical structure and can map highly nonlinear relations. Most WANN models perform well in Xi’an and Lanzhou. However, the WANN models in Xi’an were clearly superior to the WANN models in Lanzhou. WANNAPI1 performs better than ANNAPI1 in both Xi’an and Lanzhou. The advantage of WANNAPI1 was even more pronounced for Lanzhou, where it provided a more accurate API forecast than the ANNAPI1 model. The EOp values in Table 6 show the models’ performance in simulating extreme events. In both Lanzhou and Xi’an, during the forecast period, WANNAPI1 had the smallest RMSE, the largest R, and the smallest EOp of all the WANN models. The lower RMSE values indicate that the WANNAPI1 model produced smaller discrepancies between the forecasted and observed API(t + 1).

Fig. 6 shows the observed API(t + 1) versus the predicted API(t + 1) in Xi’an and Lanzhou. Both the ANN and WANN models were able to replicate the average API; however, they were limited in capturing the minimum and maximum peaks.

Fig. 6. Boxplots show the variation of observed API(t + 1) (Observed-API) and predicted API(t + 1) (such as P-ANNAPI1) in Xi’an and Lanzhou.

Figs. 7 and 8 indicate that the ANN and WANN models predicted the API at an acceptable accuracy level in Xi’an and Lanzhou. However, the performance of the WANN models was clearly superior to that of the ANN models. The WANN models yielded good agreement between the observed and predicted API(t + 1), although the WANNAPI1 model was better than WANNAPI2. The WANNAPI8 model, with 1–3-day lag API, was also better than WANNAPI7, with 1-day lag API; that is, including the three previous days’ API as input parameters gives more precise results. However, it is necessary to point out that the WANN methods have limitations inherent to their structures.

The agreement between the observed API(t + 1) and the predicted API(t + 1) is also very good at both stations when the WANNAPI4 model is used. The main meteorological conditions affecting air pollution in Xi’an and Lanzhou are average temperature (t), average water vapor pressure (t), minimum temperature (t) and maximum temperature (t). A possible reason is that their correlation coefficients with air pollution are larger than those of the other variables.

Fig. 7. Comparison between the observed API and the predicted API in Xi’an.

Fig. 8. Comparison between the observed API and the predicted API in Lanzhou.

### Comparison with Other Models

Many studies have sought to identify and understand the relationships between air quality and meteorological conditions. ANNs, with their capacity for self-adaptation and nonlinear mapping, have proven advantageous and are widely applied in forecasting air quality. For PM2.5 in Beijing, an ANN gave a lower RMSE (24.06 µg m–3) than a multivariate statistical analysis method (26.69 µg m–3) (Ni et al., 2017). In a linear regression analysis, R2 at six subway stations ranged from 0.18 to 0.63, whereas a neural network model with present-time variables achieved a higher R2 of 0.54–0.81 (Park et al., 2018).

PM10, SO2, and CO are 35%, 43%, 28%, respectively (Kurt et al., 2008). But for the ANN model of API forecasting, the correlation coefficients are 0.6993, 0.6056, 0.6300 for SO2, PM10, NO2 (Jiang et al., 2004).

The WANN forecasts SO2, PM10, and NO2 in Chongqing better than the ANN; for example, its RMSE values are lower, at 4.447 µg m–3, 8.233 µg m–3, and 2.785 µg m–3, respectively (Bai et al., 2016). The best next-day forecast of PM2.5 in Dingling was obtained with a hybrid model combining ANN, wavelet transformation, and air mass trajectory, with an RMSE of 15.65 µg m–3. The wavelet transform was also found to improve the PM2.5 forecasting accuracy (Feng et al., 2015).

PM2.5 in Wuhan was simulated and forecast using a support vector machine, and the results showed that the approach yields precise outcomes (He et al., 2018). Neural networks predict better than a linear model, and the maximum prediction error 21 hours ahead is 32% (Perez and Menares, 2018). Long short-term memory (LSTM) can effectively forecast air pollution and achieved the best results (Karimian et al., 2019). The predictions of the 2016 ozone season using generalized additive models are in good agreement with the relevant measurements (R2 = 0.70) (Pernak et al., 2019). The average consistency index between PM2.5 prediction and observation for the four seasons in the Yangtze Delta is between 74% and 77%, using machine learning and WRF (Jia et al., 2019). The best estimation of PM2.5 (R2 = 0.84) is obtained by using an artificial neural network (Bai et al., 2020). Compared with WRF, the correlation coefficients of the machine learning model are higher by 50–100%, which can provide better PM2.5 prediction (Ma et al., 2020).

These studies have improved the artificial neural network and achieved better prediction results, but the prediction accuracy still needs to be enhanced. Therefore, learning from historical data and extracting their characteristics play an important role in guaranteeing prediction accuracy.

We have reported the prediction results at both stations. In Xi’an, the WANN prediction is successful: the model correctly predicts the location and shape of the main peaks but slightly overestimates the general background API, because the main purpose is to simulate the peaks correctly. In Lanzhou, the agreement was similar; the general characteristics of the observed API were also successfully reproduced by the WANN, with the model accurately forecasting the location and magnitude of the main peaks.

## CONCLUSIONS

This study presents an optimum system for nonlinear modeling of the daily API using ANNs and WANNs. The input variables (meteorological elements and APIs) for the models were defined via correlation analysis, and discrete wavelet transform was employed to decompose the time series of the meteorological conditions into different dimensions, whereby a unique mixed aspect was decomposed into multiple unique aspects. This data was then incorporated into the models to simulate the next day’s API. Our results indicated that both the ANN and the WANN models predicted the daily API with acceptable accuracy, but the performance of the latter, which integrated the nonlinear mapping of the ANN as well as the multi-scale analysis of the DWT, was obviously superior.

For future WANNs, we will focus on four aspects. Firstly, the models will address meteorological elements and forecast the API in other locations. Secondly, additional elements will be considered, for instance, the longitude, latitude, land use, topography (simulated with the digital elevation model [DEM]), and population density. Sensitivity analysis will also be conducted in order to select parameters that are more closely related to the API, thus improving the predictive accuracy. Thirdly, the models will be used to predict other complex time series that possess nonlinear and unstable characteristics, such as those for the air quality index (AQI), fine particulate matter (PM2.5), PM10, nitrogen dioxide (NO2), sulfur dioxide (SO2), carbon monoxide (CO), and ozone (O3). Finally, a deep reinforcement learning algorithm based on multi-agent cooperation will be employed for air pollution forecasting, thus providing further insights into the multi-scale spatiotemporal prediction of pollutant concentrations.

## ACKNOWLEDGMENTS

This work was supported by the National Natural Science Foundation of China (41572150, 41472162, 41702373), Shandong Social Sciences Planning Research Program (18CKPJ34), Shandong Province Higher Educational Humanities and Social Science Program (J18RA196), and State Key Laboratory of Loess and Quaternary Geology Foundation (SKLLQG1907). The authors thank the editor and anonymous reviewers for their valuable comments and proposals, which have helped in improving the quality of our article.

## REFERENCES

1. Al-Saadi, J., Szykman, J., Pierce, R.B., Kittaka, C., Neil, D., Chu, D.A., Remer, L., Gumley, L., Prins, E., Weinstock, L., MacDonald, C., Wayland, R., Dimmick, F. and Fishman, J. (2005). Improving national air quality forecasts with satellite aerosol observations. Bull. Am. Meteorol. Soc. 86: 1249–1261.
2. Artursson, T., Spangeus, P. and Holmberg, M. (2002). Variable reduction on electronic tongue data. Anal. Chim. Acta 452: 255–264.
3. Bai, L., Huang, L., Wang, Z., Ying, Q., Zheng, J., Shi, X. and Hu, J. (2020). Long-term field evaluation of low-cost particulate matter sensors in Nanjing. Aerosol Air Qual. Res. 20: 242–253.
4. Bai, Y., Li, Y., Wang, X., Xie, J. and Li, C. (2016). Air pollutants concentrations forecasting using back propagation neural network based on wavelet decomposition with meteorological conditions. Atmos. Pollut. Res. 7: 557–566.
5. Feng, X., Li, Q., Zhu, Y., Hou, J., Jin, L. and Wang, J. (2015). Artificial neural networks forecasting of PM2.5 pollution using air mass trajectory based geographic model and wavelet transformation. Atmos. Environ. 107: 118–128.
6. Gong, B. and Ordieres-Meré, J. (2016). Prediction of daily maximum ozone threshold exceedances by preprocessing and ensemble artificial intelligence techniques: Case study of Hong Kong. Environ. Modell. Software 84: 290–303.
7. He, P., Zheng, B. and Zheng, J. (2018). Urban PM2.5 diffusion analysis based on the improved Gaussian smoke plume model and support vector machine. Aerosol Air Qual. Res. 18: 3177–3186.
8. He, Z., Zhang, Y., Guo, Q. and Zhao, X. (2014). Comparative study of artificial neural networks and wavelet artificial neural networks for groundwater depth data forecasting with various curve fractal dimensions. Water Resour. Manage. 28: 5297–5317.
9. Huang, J., Pan, X., Guo, X. and Li, G. (2018). Impacts of air pollution wave on years of life lost: A crucial way to communicate the health risks of air pollution to the public. Environ. Int. 113: 42–49.
10. Huang, R.J., Zhang, Y., Bozzetti, C., Ho, K.F., Cao, J.J., Han, Y., Daellenbach, K.R., Slowik, J.G., Platt, S.M., Canonaco, F., Zotter, P., Wolf, R., Pieber, S.M., Beuns, E.A., Crippa, M., Ciarelli, G., Piazzalunga, A., Schwikowski, M., Abbaszade, G., … Prévôt, A.S.H. (2014). High secondary aerosol contribution to particulate pollution during haze events in China. Nature 514: 218–222.
11. Jia, M., Cheng, X., Zhao, T., Yin, C., Zhang, X., Wu, X., Wang, L. and Zhang, R. (2019). Regional air quality forecast using a machine learning method and the WRF model over the Yangtze River Delta, east China. Aerosol Air Qual. Res. 19: 1602–1613.
12. Jiang, D., Zhang, Y., Hu, X., Zeng, Y., Tan, J. and Shao, D. (2004). Progress in developing an ANN model for air pollution index forecast. Atmos. Environ. 38: 7055–7064.
13. Jiang, L. and Bai, L. (2018). Spatio-temporal characteristics of urban air pollutions and their causal relationships: Evidence from Beijing and its neighboring cities. Sci. Rep. 8: 1279.
14. Karimian, H., Li, Q., Wu, C., Qi, Y., Mo, Y., Chen, G., Zhang, X. and Sachdeva, S. (2019). Evaluation of different machine learning approaches to forecasting PM2.5 mass concentrations. Aerosol Air Qual. Res. 19: 1400–1410.
15. Kessler, R. (2014). Prevention: Air of danger. Nature 509: S62–S63.
16. Kurt, A., Gulbagci, B., Karaca, F. and Alagha, O. (2008). An online air pollution forecasting system using neural networks. Environ. Int. 34: 592–598.
17. Lelieveld, J., Evans, J.S., Fnais, M., Giannadaki, D. and Pozzer, A. (2015). The contribution of outdoor air pollution sources to premature mortality on a global scale. Nature 525: 367–371.
18. Lelieveld, J. and Pöschl, U. (2017). Chemists can help to solve the air-pollution health crisis. Nature 551: 291.
19. Li, T., Shen, H., Zeng, C., Yuan, Q. and Zhang, L. (2017a). Point-surface fusion of station measurements and satellite observations for mapping PM2.5 distribution in China: Methods and assessment. Atmos. Environ. 152: 477–489.
20. Li, X., Li, H., Wang, F. and Ding, J. (1997). A remark on the Mallat pyramidal algorithm of wavelet analysis. Commun. Nonlinear Sci. Numer. Simul. 2: 240–243.
21. Li, X., Peng, L., Yao, X., Cui, S., Hu, Y., You, C. and Chi, T. (2017b). Long short-term memory neural network for air pollutant concentration predictions: Method development and evaluation. Environ. Pollut. 231: 997–1004.
22. Liu, J. and Diamond, J. (2005). China's environment in a globalizing world. Nature 435: 1179–1186.
23. Ma, J., Yu, Z., Qu, Y., Xu, J. and Cao, Y. (2020). Application of the XGBoost machine learning method in PM2.5 prediction: A case study of Shanghai. Aerosol Air Qual. Res. 20: 128–138.
24. Mallat, S.G. (1989a). Multifrequency channel decompositions of images and wavelet models. IEEE Trans. Acoust. Speech Signal Process. 37: 2091–2110.
25. Mallat, S.G. (1989b). A theory for multiresolution signal decomposition: The wavelet representation. IEEE Trans. Pattern Anal. Mach. Intell. 11: 674–693.
26. Nguyen, T.T.N., Bui, H.Q., Pham, H.V., Luu, H.V., Man, C.D., Pham, H.N., Le, H.T. and Nguyen, T.T. (2015). Particulate matter concentration mapping from MODIS satellite data: A Vietnamese case study. Environ. Res. Lett. 10: 095016.
27. Ni, X.Y., Huang, H. and Du, W.P. (2017). Relevance analysis and short-term prediction of PM2.5 concentrations in Beijing based on multi-source data. Atmos. Environ. 150: 146–161.
28. Nunnari, G., Dorling, S., Schlink, U., Cawley, G., Foxall, R. and Chatterton, T. (2004). Modelling SO2 concentration at a point with statistical approaches. Environ. Modell. Software 19: 887–905.
29. Park, S., Kim, M., Kim, M., Namgung, H.G., Kim, K.T., Cho, K.H. and Kwon, S.B. (2018). Predicting PM10 concentration in Seoul metropolitan subway stations using artificial neural network (ANN). J. Hazard. Mater. 341: 75–82.
30. Perez, P. and Menares, C. (2018). Forecasting of hourly PM2.5 in south-west zone in Santiago de Chile. Aerosol Air Qual. Res. 18: 2666–2679.
31. Pernak, R., Alvarado, M., Lonsdale, C., Mountain, M., Hegarty, J. and Nehrkorn, T. (2019). Forecasting surface O3 in Texas urban areas using random forest and generalized additive models. Aerosol Air Qual. Res. 19: 2815–2826.
32. Sang, Y.F. (2013). A review on the applications of wavelet transform in hydrology time series analysis. Atmos. Res. 122: 8–15.
33. Simons, J., Bos, M. and Van der Linden, W.E. (1995). Data processing for amperometric signals. Analyst 120: 1009–1012.
34. Watson, T. (2014). Environment: Breathing trouble. Nature 513: S14–S15.
35. Wiszniowski, J. (2016). Applying the general regression neural network to ground motion prediction equations of induced events in the Legnica-Głogów Copper District in Poland. Acta Geophys. 64: 2430–2448.
36. Xia, J., Park, J.H. and Zeng, H. (2015). Improved delay-dependent robust stability analysis for neutral-type uncertain neural networks with Markovian jumping parameters and time-varying delays. Neurocomputing 149: 1198–1205.
37. Zhao, Y., Zhou, D. and Yan, H. (2018). An improved retrieval method of atmospheric parameter profiles based on the BP neural network. Atmos. Res. 213: 389–397.
