Impacts of Chemical Initial Conditions in the WRF-CMAQ Model on the Ozone Forecasts in Eastern China

Ozone (O3) has become the major factor for exceeding air pollution standards in many Chinese cities, especially in the more economically developed and densely populated regions such as eastern China. In this study, we applied the Weather Research and Forecasting/Community Multiscale Air Quality (WRF/CMAQ) model to predict air quality, and evaluated the influences of different chemical initial conditions on the O3 forecasts against observations in Tai'an and 13 other cities in eastern China in June 2021. The influences of different chemical initial conditions on the O3 forecasts are presented by using two sets of meteorological data (NCEP Final Operational Global Analysis [FNL] and Global Forecast System [GFS]) as initial conditions (IC) and boundary conditions (BC) to drive the WRF/CMAQ model. It was found that the O3 concentrations forecasted by FNL-GFS, in which the chemical IC were derived from CMAQ simulations driven by the FNL data as IC and BC, were closer to observations in all cities than those of GFS-GFS, in which the chemical IC were derived from CMAQ simulations driven by the GFS data as IC and BC. The normalized mean bias (NMB) values of FNL-GFS for O3 met the benchmark (±15%), while the NMB values of GFS-GFS in Hangzhou and Shijiazhuang did not. The model performance in Tai'an city was similar to that in the 13 cities, with better results for FNL-GFS than GFS-GFS. The comparisons of contributions of source regions to O3 in the receptor Tai'an city indicate that different episodes had different relative contributions of source regions and that the simulations of FNL-GFS were more similar to the retrospective simulations than those of GFS-GFS. The comparisons of contributions of different source sectors to O3 in Tai'an city show that industry emissions are the largest contributor, followed by transportation, power plants and residential emissions.


INTRODUCTION
With the rapid development of industrialization and urbanization in China over the past decades, the emissions of ozone (O3) precursors in the troposphere have been increasing, leading to increasingly serious O3 pollution, especially in eastern China, which is the most economically developed region (Liu et al., 2010; Wang et al., 2017a; Wang et al., 2017b). A report of "Bulletin on [...] the dependence of the average O3 concentrations on different meteorological parameters during different periods in Tai'an city. We discussed characteristics of O3 and its precursors in three episodes in Tai'an. We applied WRF/CMAQ-ISAM to identify the main sources leading to the increases of O3 pollution events.

Observation Datasets
Hourly observed O3 concentrations at monitoring sites in 13 provincial capital cities (Beijing, Fuzhou, Hangzhou, Hefei, Jinan, Nanchang, Nanjing, Shanghai, Shijiazhuang, Taiyuan, Tianjin, Wuhan, Zhengzhou) in eastern China were downloaded from the China National Environmental Monitoring Center (CNEMC, http://www.cnemc.cn/). To evaluate the model performance, the model results at each monitoring site in each city were retrieved, and the mean values of observations and simulations over all monitoring sites in each city were compared following Zhang et al. (2021). The location of each city in eastern China is shown in Fig. 1(a). Hourly measurements of meteorological parameters (temperature [T], relative humidity [RH], wind speed [WS] and wind direction [WD]) used in this study were obtained from the National Climatic Data Center (NCDC) (ftp://ftp.ncdc.noaa.gov/pub/data/noaa/).
Measurements of O3 were taken during the study period at five observation stations and one super observation station in Tai'an city, and the data were provided by the Tai'an Ecological Environment Bureau. Fig. 1(b) shows the locations of the observation sites selected for this study: Jiancezhan (JCZ), Renkou-xuexiao (RK), Dianli-xuexiao (DL), Shandong-diyiyike-daxue (YK), Jiaotong-jixiao (JT), and the super observation station (SUS). The site location information is given in Table S1. The hourly meteorological data at the JCZ site were used for model evaluation, and the concentrations of O3 precursors (VOC and NOx) were taken from SUS.

Description of AQF System
The real-time AQF system contains three major components. The first component is the WRF meteorological model (version 4.0), which simulates meteorological fields using the meteorological initial conditions (IC) and boundary conditions (BC) derived from the FNL (WRF-FNL) and GFS (WRF-GFS) results and provides hourly meteorological fields to drive the CMAQ model. Both the FNL data (with a spatial resolution of 1° × 1° and a temporal resolution of 6 h) and the GFS data (with a spatial resolution of 0.5° × 0.5° and a temporal resolution of 6 h) were obtained from the National Centers for Environmental Prediction (NCEP). The FNL data are retrospective data which assimilate a large number of ground and satellite observations and have high temporal and spatial resolutions. They include 26 standard isobaric layers (10-1000 hPa), the surface boundary layer and many physical quantities at the tropopause (Yan, 2012). The GFS data, on the other hand, are forecast data: 384 hours (16 days) of forecasts are released four times a day, and their temporal and spatial accuracies decrease with forecast time. It is generally believed that the accuracy of the GFS forecast data beyond 7 days is not high, and most official agencies rarely use the GFS forecast data beyond 10 days (Yan, 2012). The second component is the US EPA's CMAQ model, version 5.0.2, which provides spatial and temporal predictions of O3 by simulating physical and chemical processes. The WRF-CMAQ system adopted an offline paradigm. Fig. 1(a) shows the simulation domain, with a horizontal resolution of 36 km × 36 km and 82 × 118 grid cells, covering central and eastern China and a portion of East Asia. The location of Tai'an city is shown in Fig. 1(b); RK, DL and JT are located in the same grid cell.
The third component is an emission processing system which processes anthropogenic emissions from the Multiresolution Emission Inventory for China (MEIC) (http://www.meicmodel.org) developed by Tsinghua University for 2016, with a horizontal resolution of 0.25° × 0.25°. The MEIC includes monthly anthropogenic emissions of SO2, NOx, CO, ammonia (NH3), PM2.5, PMcoarse, black carbon (BC), organic carbon (OC) and non-methane volatile organic compounds (NMVOCs). The Biogenic Emission Inventory System version 3.14 (BEISv3.14) was used to calculate the biogenic emissions from natural sources inline (Wang et al., 2021a; Yu et al., 2014).
ISAM (Integrated Source Apportionment Method) was applied to provide information about O3 source apportionment, mainly to track the contributions of industrial and regional sources to atmospheric O3 concentrations (Kwok et al., 2013). Based on geographical distribution, six regions inside eastern China were set as tracked source regions of O3 concentrations in ISAM (supplementary materials Fig. S1): Beijing-Tianjin-Hebei (BTH), Henan (HN), Jiangsu (JS), Tai'an (TA), Shandong except Tai'an (SD), and the remaining regions of the domain outside the marked areas (OTH).
Since the FNL and GFS data are retrospective and forecast data, respectively, and the official website updates them regularly every day, we downloaded them as soon as the updated data became available. We downloaded the GFS data for today and the next three days at 1 p.m., and the FNL data covering 0:00 yesterday to 0:00 today at 3 p.m. (Fig. S2). In this study, a spin-up period of 7 days was used for the model forecast of the first day (June 1), while for all subsequent forecasts the chemical IC were derived from the CMAQ simulation results of the previous day on a continuous basis. The following two forecast cases were carried out on the basis of the available times of the FNL and GFS data to study the impacts of chemical IC on the O3 forecast: (1) Case 1 (GFS-GFS): the chemical IC were derived from the CMAQ simulation results obtained with the GFS data as initial and boundary conditions for the WRF model, and the forecasts of the next few days also used the GFS data as initial and boundary conditions for the WRF model. This means that the WRF-CMAQ simulations were carried out with the GFS data as initial and boundary conditions at all times. (2) Case 2 (FNL-GFS): the chemical IC were derived from the CMAQ simulation results obtained with the FNL data as initial and boundary conditions for the WRF model, and the forecasts of the next few days used the GFS data as initial and boundary conditions for the WRF model. This means that the WRF-CMAQ forecasts had to start at 0:00 of today because of the availability of the FNL data. (3) In addition, in order to analyze whether Case 2 can improve the forecast results relative to Case 1, we conducted retrospective simulations as a standard case, Case 3 (FNL-FNL): the WRF-CMAQ simulations were carried out with the FNL data as initial and boundary conditions at all times (Fig. 2).
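The three cases differ only in which meteorological dataset drives the run that supplies the chemical IC and which drives the forecast days. A minimal sketch of that bookkeeping is given below; the class and function names are hypothetical and are not part of WRF or CMAQ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForecastCase:
    """Hypothetical container; not a WRF/CMAQ data structure."""
    name: str
    ic_met: str        # meteorology behind the previous-day run that yields the chemical IC
    forecast_met: str  # meteorology driving the forecast days


CASES = [
    ForecastCase("GFS-GFS", ic_met="GFS", forecast_met="GFS"),  # Case 1
    ForecastCase("FNL-GFS", ic_met="FNL", forecast_met="GFS"),  # Case 2
    ForecastCase("FNL-FNL", ic_met="FNL", forecast_met="FNL"),  # Case 3 (retrospective)
]


def met_source(case: ForecastCase, day_offset: int) -> str:
    """Dataset used for a day relative to the forecast start (0:00 today).

    day_offset < 0  -> the previous-day simulation that provides the chemical IC
    day_offset >= 0 -> today and the following forecast days
    """
    return case.ic_met if day_offset < 0 else case.forecast_met


for case in CASES:
    schedule = [met_source(case, d) for d in (-1, 0, 1, 2)]
    print(case.name, schedule)
```

In this sketch Case 2 is the only case that switches datasets at the forecast start, which is the configuration whose benefit the study evaluates.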

Model Evaluation Protocol
We used statistical metrics to evaluate the performance of the WRF/CMAQ models by calculating the relationship between observed and predicted values (Jiang et al., 2010; Wang et al., 2010, 2013; Yu et al., 2006). The statistical metrics, including the averages of observations and predictions (MEAN), mean bias (MB), mean error (ME), root mean square error (RMSE), normalized mean bias (NMB), normalized mean error (NME) and Pearson correlation coefficient (R), were calculated for each city. The definitions of these metrics can be found in Yu et al. (2006). Table 1 compares the simulation performances of the WRF model using the FNL (WRF-FNL) and GFS (WRF-GFS) data as IC and BC for T and RH. The statistical results of the meteorological predictions indicated that WRF-FNL was slightly better than WRF-GFS in the simulations of T and RH. The MB values of T predicted by WRF-GFS in Hefei, Nanchang, Shanghai, Tianjin, Wuhan and Zhengzhou were 1.30, -0.91, -0.56, 0.57, 0.96 and -0.90°C, respectively, which exceeded the benchmark (±0.5°C; Emery et al., 2017; Wang et al., 2021b), whereas those for WRF-FNL were -0.06, -0.36, -0.22, 0.39, -0.18 and -0.17°C, respectively, and were within the benchmark. [...] Emery et al., 2017), while the predictions by WRF-GFS did not meet the benchmark in all cities except in Shanghai (1.51°C). No benchmarks were suggested for the MB and ME values for RH. The deviations of WRF-FNL for RH were significantly lower than those of WRF-GFS in nine cities. For example, as shown in Table 1, the MB values for RH in Beijing, Hangzhou, Hefei, Jinan, Nanjing, Shanghai, Taiyuan, Wuhan and Zhengzhou decreased in magnitude from WRF-GFS to WRF-FNL: from -4.15% to -2.26%, 7.99% to 5.61%, -6.98% to -1.65%, 3.44% to 1.19%, -3.22% to -0.90%, -4.31% to -0.07%, -4.34% to -2.45%, -10.18% to -5.91%, and -5.22% to 0.45%, respectively. The ME and RMSE values of RH in Fuzhou, Hangzhou, Hefei, Nanchang, Nanjing, Shanghai, Taiyuan and Wuhan predicted by WRF-FNL were all lower than those predicted by WRF-GFS.
The higher R values in WRF-FNL also indicate that the simulated values of T and RH by WRF-FNL were more consistent with the observed values than WRF-GFS.
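For readers who want to reproduce the statistics, the sketch below computes the metrics named above using their commonly used definitions (bias and error normalized by the sum of observations for NMB and NME). The function names and the sample values are illustrative assumptions, not data from this study; the O3 thresholds are the ±15% NMB and 30% NME benchmarks quoted in the text.

```python
import math


def evaluate(obs, mod):
    """Return MB, ME, RMSE, NMB (%), NME (%) and Pearson R for paired values."""
    if len(obs) != len(mod) or not obs:
        raise ValueError("obs and mod must be non-empty and of equal length")
    n = len(obs)
    diffs = [m - o for o, m in zip(obs, mod)]
    mean_obs = sum(obs) / n
    mean_mod = sum(mod) / n
    cov = sum((o - mean_obs) * (m - mean_mod) for o, m in zip(obs, mod))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_mod = sum((m - mean_mod) ** 2 for m in mod)
    return {
        "MB": sum(diffs) / n,
        "ME": sum(abs(d) for d in diffs) / n,
        "RMSE": math.sqrt(sum(d * d for d in diffs) / n),
        "NMB": 100.0 * sum(diffs) / sum(obs),
        "NME": 100.0 * sum(abs(d) for d in diffs) / sum(obs),
        "R": cov / math.sqrt(var_obs * var_mod),
    }


def meets_o3_benchmark(stats, nmb_limit=15.0, nme_limit=30.0):
    """Check the O3 benchmarks used in the text: |NMB| <= 15% and NME <= 30%."""
    return abs(stats["NMB"]) <= nmb_limit and stats["NME"] <= nme_limit


# Made-up hourly O3 values (µg m⁻³), for illustration only.
observed = [100.0, 120.0, 80.0]
modelled = [110.0, 110.0, 90.0]
stats = evaluate(observed, modelled)
print(stats, meets_o3_benchmark(stats))
```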

Meteorological prediction evaluations
The WS values were over-predicted by both WRF-FNL and WRF-GFS, and the over-prediction of WRF-GFS was larger than that of WRF-FNL, as indicated by the MB, ME and RMSE values (see Table 2). Some studies have reported that the WRF model over-predicts low WS values, especially when the WS value is less than 3 m s⁻¹ (Angevine et al., 2012; Fast et al., 2014; Hu et al., 2015; Wang et al., 2021b). The MB values of WS in all cities did not meet the benchmark (±0.5 m s⁻¹; Emery et al., 2017), while the ME values satisfied the benchmark (2 m s⁻¹; Emery et al., 2017). The MB and [...] (Emery et al., 2017), while their WNMB values in Fuzhou, Taiyuan, Wuhan and Zhengzhou did not meet the benchmark. The WNME values in all cities except Shanghai exceeded the benchmark (WNME ≤ 30%). In conclusion, the simulated T and RH values of WRF-FNL were closer to the actual observations, and WRF-FNL captured T, RH and WS in most cities better than WRF-GFS.

Ozone prediction evaluations
Different meteorological IC and BC data not only affected the simulation performance of the WRF model for weather conditions, but also strongly affected the CMAQ simulations of O3 formation. Table 3 summarizes the CMAQ model performances for O3 simulations in thirteen cities for Case 1 (GFS-GFS), Case 2 (FNL-GFS) and Case 3 (FNL-FNL). The O3 concentrations predicted by FNL-GFS [...] .9 µg m⁻³, respectively, their corresponding NMB and NME values in both FNL-GFS and GFS-GFS were between 18%-41% and between 36%-48%, respectively. However, in cities with high observed O3 concentrations (Jinan, Nanjing, Shijiazhuang, Taiyuan, Zhengzhou), the models yielded lower NMB and NME values. Fig. 3 shows the time series of predicted and observed daily maximum 8-h average O3 (MDA8 O3) concentrations in thirteen cities, and the results are summarized in Table 4. Both FNL-FNL and FNL-GFS captured the daily variations of observed MDA8 O3 concentrations in all cities except Fuzhou, where the R values were less than 0.5 because of very low O3 concentrations (the observed mean O3 concentration was 84.2 µg m⁻³). The NMB and NME values of both FNL-FNL and FNL-GFS in all cities except Beijing, Fuzhou and Wuhan met the benchmark (±15% for NMB and 30% for NME), whereas the corresponding values of GFS-GFS were higher. The comparisons of the results for the three cases in Table 4 reveal that the simulations of FNL-GFS were closer to observations and more similar to FNL-FNL than those of GFS-GFS for all thirteen cities, as expected. [...] Tai'an city from June 1 to 30, 2021. Compared to WRF-GFS (-1.70°C), WRF-FNL yielded a smaller negative MB value (-1.32°C) for T. For RH, WRF-FNL also yielded slightly better simulated values, with MB and ME values of 8.12% and 12.32%, respectively, while the corresponding values for WRF-GFS were 10.40% and 14.58%, respectively (see Table S4). Generally, the simulated values of T and RH in Tai'an city were in good agreement with the observations, but WS was over-predicted by WRF-FNL.
The correlation coefficients between simulations and observations for T and RH for WRF-FNL were 0.78 and 0.85, respectively, better than those for WRF-GFS, which were 0.72 and 0.80, respectively. The model captured the synoptic features and daily variations of WD, with slight deviations, in both WRF-FNL and WRF-GFS.
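The MDA8 O3 metric used in the city comparisons can be illustrated with a short sketch. It assumes one common convention, namely 8-h running means whose windows lie entirely within the calendar day and a complete set of 24 hourly values; the exact windowing and missing-data rules applied in the study are not stated here, and the function name is hypothetical.

```python
def mda8(hourly):
    """Daily maximum 8-h average from 24 hourly concentrations (hours 0..23).

    Assumption: only the 17 windows starting at hours 0..16 are used, so every
    window stays inside the same day; missing values are not handled.
    """
    if len(hourly) != 24:
        raise ValueError("expected exactly 24 hourly values")
    window_means = [sum(hourly[start:start + 8]) / 8.0 for start in range(17)]
    return max(window_means)


# Illustrative day (µg m⁻³): a low background with an 8-h afternoon plateau.
day = [40.0] * 10 + [160.0] * 8 + [40.0] * 6
print(mda8(day))
```

Applying the same function to the observed and predicted hourly series of each day gives the paired MDA8 values that enter the statistics.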

Effects of meteorology on ozone formation in Tai'an
Temperature is related to the chemical kinetic rates and the formation mechanism of O3. According to the variation analyses for the entire day, daytime (06:00-19:00) and nighttime (20:00-05:00) in Fig. 5(a), the change patterns of O3 were consistent with those of T in Tai'an city. High temperature is often accompanied by strong radiation, which is conducive to the photochemical formation of O3. As shown in Fig. 5(a), the ground-level O3 concentrations were positively correlated with T. The trends of FNL-GFS were more consistent with the observations than those of GFS-GFS (see Fig. 5(a)).
As shown by the dependences of the average O3 concentrations on RH during the entire day, daytime and nighttime in Fig. 5(a), the O3 concentrations decreased with increasing humidity, and these trends were more obvious in the daytime than in the nighttime. As pointed out by Yu (2019), moisture can suppress O3 formation by lowering the air temperature, decreasing the chain length of peroxy radical chemical amplifiers, decreasing the chain length of NO2, and destroying existing O3 photochemically through the catalytic O3 destruction cycle involving water vapor. The model results of the correlational analyses between RH and O3 in FNL-GFS were more consistent with the observations than those in GFS-GFS. The simulated values of FNL-GFS were higher than those of GFS-GFS when the RH was low, and lower than those of GFS-GFS when the RH was high. The results of FNL-GFS were closer to the observations than those of GFS-GFS, as indicated in Fig. 5(a). The correlational analyses show that there were nonlinear relationships between WS and O3 (Fig. 5(a)). The maximum O3 level was found when the WS values were in the range of 2.6 to 3 m s⁻¹, and the O3 level then decreased with increasing WS in the daytime (Fig. 5(a)). A reasonable explanation is that as WS increased, the stability of the boundary layer started to decrease, and more O3 was transported from the upper layer to the surface layer. On the other hand, exogenous transport was strong at relatively low WS, yet very high WS promoted atmospheric mixing, dispersion and transport, which favored O3 dilution. Fig. 5(b) provides the wind frequency rose diagrams of O3 concentrations at different time periods of the day. The observations show that the prevailing surface WDs during the daytime were NE, ENE and ESE, with wind frequencies of 18.3%, 13% and 10.5%, respectively.
For the daytime, both FNL-GFS and GFS-GFS captured the NE and ENE WDs but missed the low wind frequencies of SW, WSW and W, and GFS-GFS over-predicted the O3 concentration from the E directions. The observed prevailing WDs during the nighttime turned to NE, ENE and W, with wind frequencies of 29.8%, 19.1% and 15%, respectively. Both FNL-GFS and GFS-GFS captured the ENE WD but under-predicted the wind frequency of NE and missed the W WD, and GFS-GFS over-predicted the O3 concentration from the ENE direction. In general, FNL-GFS performed better than GFS-GFS in capturing the O3 concentrations for different WD values. [...] corresponding peak concentrations in the FNL-GFS predictions were 207 and 270 µg m⁻³, respectively, while they were 171 and 185 µg m⁻³ in the GFS-GFS predictions, respectively. The NMB values at the JCZ, RK/DL/JT and YK sites improved from GFS-GFS to FNL-GFS (from -5.1% to -2.4%, -2.2% to -0.8%, and -3.4% to -1.4%, respectively), and the corresponding NME values improved from 28.8% to 26.2%, 28.3% to 22.8%, and 31.0% to 26.8%, respectively. Meanwhile, the spatiotemporal distributions for these two days in eastern China in Fig. S3 also show that the simulations of FNL-GFS were more similar to FNL-FNL than those of GFS-GFS, and that GFS-GFS under-predicted the peak O3 concentrations in most eastern cities (see Fig. S3). Fig. 7 shows time series comparing the observations with the results of the different chemical IC in the FNL-GFS and GFS-GFS simulations for O3 and its precursors (NOx and VOC) at the super station in Tai'an city for each day in June 2021. As can be seen, the chemical IC in the FNL-GFS simulations were closer to the observations for O3, NOx and VOC than those in the GFS-GFS simulations in terms of both NMB and NME.
The NMB (NME) values for O3, NOx and VOC in the FNL-GFS simulations were -6.1% (13.3%), -12.7% (20.3%) and 93.9% (44.3%), respectively, while they were -14.2% (31.9%), -44.3% (28.8%) and 125% (100.5%) in the GFS-GFS simulations, respectively. Similar results were found for both O3 and NO2 in the 13 cities in eastern China, with better performance of the chemical IC in the FNL-GFS simulations than in the GFS-GFS simulations (see Tables S2 and S3).
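The wind-direction analysis earlier in this section (wind frequencies and O3 levels per direction, as in Fig. 5(b)) can be sketched as follows. The 16-point compass binning is an assumption inferred from the sector names used in the text (NE, ENE, ESE, WSW), and the function names and sample records are hypothetical.

```python
from collections import defaultdict

SECTORS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
           "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def sector(wd_deg):
    """Map a wind direction in degrees (0 = north, clockwise) to a 16-point sector."""
    return SECTORS[int(((wd_deg % 360) + 11.25) // 22.5) % 16]


def wind_rose(records):
    """records: iterable of (wind_direction_deg, o3) pairs.

    Returns {sector: (wind frequency in %, mean O3 in that sector)}.
    """
    grouped = defaultdict(list)
    for wd_deg, o3 in records:
        grouped[sector(wd_deg)].append(o3)
    total = sum(len(values) for values in grouped.values())
    return {name: (100.0 * len(values) / total, sum(values) / len(values))
            for name, values in grouped.items()}


# Made-up hourly records (direction in degrees, O3 in µg m⁻³).
sample = [(45, 100.0), (50, 120.0), (270, 60.0), (68, 80.0)]
for name, (freq, mean_o3) in sorted(wind_rose(sample).items()):
    print(name, freq, mean_o3)
```

Running the same function separately on observed and simulated daytime and nighttime records yields the frequencies and sector means that the comparison relies on.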

Ozone source apportionment analyses for the three episodes
As shown in Fig. 6, two O3 episodes with high concentrations (episode I: June 4-7 and episode II: June 20-23, 2021) and a non-episode period without precipitation and with very low concentrations (June 14-17, 2021) were selected for comparative analyses. The average values of O3 concentrations and meteorological parameters for these episodes are listed in Table S5. The observed mean concentration of O3 in the non-episode period was 78.5 µg m⁻³, while the values in episodes I and II were 148.8 and 172.5 µg m⁻³, respectively. Table S5 shows that higher T and lower RH were observed for both pollution episodes compared to the non-pollution episode. The predicted results also captured these features well. As mentioned in Section 3.2.2, high T and low RH values were accompanied by strong UV radiation, which favored the photochemical oxidation reactions that form O3. Therefore, meteorological conditions can be a good explanation for the O3 episodes. The results of the FNL-GFS predictions for both high and low O3 concentrations were closer to the observations than those of GFS-GFS, as indicated in Table S4. Table 5 summarizes the average contributions of source regions to O3 concentrations in the receptor Tai'an city for the three episodes as estimated by the three cases. Table 5 indicates that different episodes in Tai'an city had different relative contributions of source regions, although BCON was the largest contributor for all three episodes. For example, for the non-episode case on the basis of the FNL-FNL simulations, BCON was the largest contributor (46.9%), followed by SD (18.0%), JS (14.7%), OTH (11.4%), TA (4.0%), BTH (3.5%), and HN (1.6%), while for episode II, BCON was the largest contributor (33.8%), followed by SD (25.5%), HN (12.9%), OTH (8.1%), JS (8.0%), BTH (7.4%), and TA (4.2%).
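One way to make the statement that episodes differ in their source-region profiles concrete is to rank the shares and compute a simple distance between two profiles. The sketch below uses only the FNL-FNL percentages quoted above; the mean absolute difference is an illustrative choice of similarity measure, not a metric taken from the study, and the function names are hypothetical.

```python
# Source-region shares (%) quoted in the text for the FNL-FNL simulations.
NON_EPISODE = {"BCON": 46.9, "SD": 18.0, "JS": 14.7, "OTH": 11.4,
               "TA": 4.0, "BTH": 3.5, "HN": 1.6}
EPISODE_II = {"BCON": 33.8, "SD": 25.5, "HN": 12.9, "OTH": 8.1,
              "JS": 8.0, "BTH": 7.4, "TA": 4.2}


def ranked(shares):
    """Region tags ordered from the largest to the smallest contribution."""
    return sorted(shares, key=shares.get, reverse=True)


def mean_abs_difference(a, b):
    """Mean absolute difference (percentage points) between two profiles."""
    if set(a) != set(b):
        raise ValueError("profiles must cover the same regions")
    return sum(abs(a[key] - b[key]) for key in a) / len(a)


print(ranked(NON_EPISODE))
print(ranked(EPISODE_II))
print(round(mean_abs_difference(NON_EPISODE, EPISODE_II), 2))
```

The same distance could be computed between the FNL-GFS or GFS-GFS profile and the FNL-FNL profile of a given episode to express how closely a forecast case reproduces the retrospective apportionment.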
The comparisons of the results for three cases in Table 5 [...] According to the results of the ISAM source apportionment, the mean contributions of different source sectors to the O3 concentrations in Tai'an city for the three episodes were estimated by the three cases and are summarized in Table 6. Table 6 indicates that industry emissions were the largest contributor, followed by transportation, power plants and residential emissions, for all three episodes and all three simulation cases without exception. For example, for episode II on the basis of the FNL-FNL simulations, industry emissions were the largest contributor (33.4%), followed by transportation (26.9%), power plants (22.7%) and residential (17.0%) emissions. The comparisons of the results for the three simulation cases in Table 6 reveal that the simulations of FNL-GFS were more similar to FNL-FNL than those of GFS-GFS, as expected. For example, for the episode I,

CONCLUSIONS
In this study, we applied a real-time air quality forecasting (AQF) system based on the WRF/CMAQ model to predict air quality in June 2021 in eastern China. The influences of different chemical IC on the O3 forecasts and their comparisons to both chemical and meteorological observations in Tai'an and 13 other cities in eastern China are presented. On the basis of the meteorological data in the 13 cities, the simulated T and RH values of WRF-FNL were closer to the actual observations, and WRF-FNL captured T, RH and WS in most cities better than WRF-GFS. For the CMAQ model performances of O3 simulations in the 13 cities for Case 1 (GFS-GFS), Case 2 (FNL-GFS) and Case 3 (FNL-FNL), it was found that the O3 concentrations forecasted by FNL-GFS were similar to those of FNL-FNL in all cities, and that the NMB values of both FNL-FNL and FNL-GFS met the NMB benchmark (±15%) in eight cities (Hangzhou, Hefei, Jinan, Nanjing, Shijiazhuang, Taiyuan, Wuhan and Zhengzhou), while the NMB values of GFS-GFS in Hangzhou and Shijiazhuang did not meet the benchmark. The results for MDA8 O3 concentrations in the 13 cities also indicate that the NMB and NME values of both FNL-FNL and FNL-GFS in all cities except Beijing and Fuzhou met the benchmark (±15% for NMB and 30% for NME), and that the simulations of FNL-GFS were closer to observations and more similar to FNL-FNL than those of GFS-GFS for all 13 cities, as expected. The model performances for both meteorological and O3 data in Tai'an city were similar to those in the 13 cities, with better results for FNL-GFS than GFS-GFS. The chemical IC of O3 and its precursors NOx and VOC in Tai'an for FNL-GFS also showed better results than those for GFS-GFS.
The comparisons of contributions of source regions to O3 in the receptor Tai'an city for the three episodes indicate that different episodes had different relative contributions of source regions, although BCON was the largest contributor for all three episodes, and that the simulations of FNL-GFS were more similar to FNL-FNL than those of GFS-GFS, as expected. The comparisons of contributions of different source sectors to O3 in Tai'an city for the three episodes show that industry emissions were the largest contributor, followed by transportation, power plants and residential emissions, for all three episodes and all three simulation cases without exception. For the source sectors as well, the simulations of FNL-GFS were more similar to FNL-FNL than those of GFS-GFS, as expected.