Using Fine-Resolution Satellite Data in the Megacity of Beijing, China

Estimating ground-level PM2.5 in urban areas from satellite-retrieved AOD data is limited because of the coarse resolution of the data. The spatial resolution of recent MODIS Collection 6 aerosol data has increased from 10 km to 3 km. Taking advantage of this new AOD dataset, we used a mixed effects model to calibrate the day-to-day relationship between satellite AOD and ground-level PM2.5 concentrations. Regional daily PM2.5 concentrations were estimated from the AOD from March 1, 2013, to February 28, 2014, in the megacity of Beijing. Compared with the simple linear regression model, the accuracy of the PM2.5 prediction improved significantly, with an R² of 0.796 and a root mean squared error of 16.04 μg/m³. The results showed high PM2.5 concentrations in the intra-urban region of Beijing because of local emissions. The PM2.5 concentrations were relatively low in the northern rural area but high in the southern rural area, which was close to the industrial sector in Hebei Province. We found that the 3 km AOD produces detailed spatial variability in the Beijing area but introduces somewhat large biases due to missing AOD pixels.


INTRODUCTION
PM2.5 is defined as particulate matter with an aerodynamic diameter of less than 2.5 µm. A large number of epidemiological studies have shown that exposure to fine particles can increase the incidence of heart disease, cardiovascular disease and lung cancer (Pope, 2000; Peters et al., 2001; Zanobetti, 2005; Hu, 2009). Ground-based measurements provide continuous and accurate data for epidemiologic studies and air quality assessments. However, ground sites are usually sparse or unavailable in many developing areas; thus, the corresponding research has large uncertainties.
Previous studies found that the satellite-derived aerosol optical depth (AOD) is closely related to the surface PM2.5 concentration, which can be used to predict PM2.5 at the regional scale (Chu et al., 2003; Engel-Cox et al., 2004; Gupta et al., 2006; Van Donkelaar et al., 2010). Because AOD denotes the integrated amount of particle extinction over the entire vertical column and PM2.5 is the mass concentration of dry particles measured near the surface, the AOD-PM2.5 relationship is impacted by several factors, such as the vertical distribution of the aerosols, aerosol components, and the hygroscopic growth of particles (Gupta et al., 2009a). Numerous studies have been conducted to obtain reliable AOD-PM2.5 relationships by eliminating these influences (Toth et al., 2014; Wang et al., 2010). However, it is usually difficult to obtain observational parameters, such as the aerosol vertical distribution, that regulate the AOD-PM2.5 relationship, particularly at the regional scale. Thus, statistical models and atmospheric chemistry models have been widely used to eliminate these influences and to obtain a more accurate AOD-PM2.5 relationship (Liu et al., 2004).
Early research mostly focused on establishing the AOD-PM2.5 relationship by simple linear regression (Liu et al., 2005). Meteorological and environmental parameters were then incorporated to improve the correlation between AOD and PM2.5 (Pelletier et al., 2007; Gupta et al., 2009a). Recently, several advanced statistical models, such as artificial neural networks (Gupta et al., 2009b), general additive models (Liu et al., 2009; Strawa et al., 2013), and geographically weighted regression (Hu et al., 2013; Ma et al., 2014), were developed to improve the satellite-based PM2.5 predictions. However, these models did not adequately consider the time-varying property between the satellite AOD and ground PM2.5 measurements. Lee et al. (2011) developed a mixed effects model to establish a daily specific AOD-PM2.5 relationship with an R² = 0.62. Yap et al. (2013) expanded this model by establishing the monthly AOD-PM10 relationship to improve the predictive power of the MODIS AOD. To establish a reliable AOD-PM2.5 relationship, these models require abundant ground PM2.5 measurements. With rapid economic development, Beijing, the capital of China, has been experiencing severe air pollution (Yu et al., 2011; Tao et al., 2012, 2014b). So far, there is limited research on satellite PM2.5 prediction in Beijing due to the lack of ground monitoring networks (Wang et al., 2010; Guo et al., 2014). The Environmental Protection Agency (EPA) of China began to monitor PM2.5 (i.e., particle pollution) in major cities in January 2013.
Most PM2.5 estimation studies have used the AOD from the MODerate Resolution Imaging Spectroradiometer (MODIS) because of its daily global coverage and consistent accuracy. However, the 10 km resolution of the conventional Collection 5.1 (C5.1) MODIS AOD is coarse for predicting PM2.5 at the urban scale (Chudnovsky et al., 2013). The spatial resolution of the recent C6 MODIS AOD has increased to 3 km for resolving fine-scale AOD gradients and point sources (Munchak et al., 2013). In this paper, we assessed the ability of the MODIS 3 km AOD to predict PM2.5 in a typical megacity of China using satellite observations and ground measurements in 2013. The PM2.5 estimation was conducted in a relatively limited temporal scope of only one year. When longer records of ground observations become available, further analyses, such as the seasonality of the AOD-PM2.5 relationship, can be conducted. In this study, the correlations of the 3 km and 10 km MODIS AOD with PM2.5 were compared. We used the mixed effects model to calibrate the day-to-day AOD-PM2.5 relationship. The uncertainties in the PM2.5 prediction were also evaluated. This paper provides the first study of PM2.5 estimations using the C6 3 km AOD in urban areas of China.

Ground-Level PM 2.5 Data
The EPA of China began to publish real-time hourly pollutant concentrations in January 2013. There are 35 PM2.5 monitoring sites in the Beijing area (Fig. 1). To determine the particle pollution in different areas, the PM2.5 monitoring sites are located near busy roads, urban regions, suburbs, and rural background regions. The monitoring sites are mainly located in urban areas, whereas rural areas have little coverage. The ground PM2.5 concentration is measured by the tapered element oscillating microbalance (TEOM) method or the beta-attenuation method in accordance with the Chinese National Ambient Air Quality Standard (GB3095-2012, http://kjs.mep.gov.cn/). Daily average PM2.5 measurements in Beijing from March 1, 2013, to February 28, 2014, were collected from the EPA of China for this study.

MODIS-Derived AOD
The MODIS sensors on the Terra and Aqua satellites provide daily global information of the Earth-atmosphere system in 36 spectral bands (0.4-14 µm) with a swath width of ~2330 km. The C5.1 AOD is retrieved by a dark-target algorithm with an accuracy of ±(0.05 + 15%) over land (Levy et al., 2010). AOD values are often missing due to high surface reflectance and cloud cover. The 3 km AOD is retrieved by the same dark-target algorithm, but it can better resolve aerosol gradients and pixels closer to clouds, coastlines and small water bodies, although the accuracy is slightly lower, ±(0.05 + 20%), over land (Levy et al., 2013; Munchak et al., 2013; Remer et al., 2013). Because the 3 km AOD is not available for Terra MODIS data, the Aqua C5.1 (MYD04) and C6 (MYD04_3K) aerosol data were used to analyze the relationship between the AOD and PM2.5. Details on the MODIS AOD retrieval were reported by Levy et al. (2013) and Remer et al. (2013). To minimize the influence of the AOD inaccuracy, only the MODIS AOD with the best quality (quality flag = 3) was used.
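As a rough illustration of the two accuracy envelopes and the quality screening described above, the following Python sketch computes the expected AOD uncertainty over land and keeps only retrievals with quality flag 3. The function names and the `(aod, quality_flag)` tuple layout are assumptions made for this example, not part of the MODIS products or of the study's processing code.

```python
# Relative error terms quoted in the text for the dark-target AOD over land.
RELATIVE_ERROR = {"10km": 0.15, "3km": 0.20}


def expected_error(aod, product):
    """Return the +/- uncertainty envelope, 0.05 + relative term * AOD."""
    return 0.05 + RELATIVE_ERROR[product] * aod


def keep_best_quality(retrievals, required_flag=3):
    """Keep AOD values whose quality flag equals required_flag.

    retrievals is a list of (aod, quality_flag) tuples; this layout is a
    hypothetical stand-in for however the granules are actually read.
    """
    return [aod for aod, flag in retrievals if flag == required_flag]
```

For an AOD of 1.0, the envelope is about ±0.20 for the 10 km product and ±0.25 for the 3 km product, which shows how the 3 km retrieval trades some accuracy for spatial detail.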

Data Processing and Analysis
The 3 km AOD was collocated with ground PM2.5 measurements by averaging the AOD values within a 3 × 3 pixel window centered on the ground monitoring site. Because diffusion of particle pollution usually occurs within a particular distance over a short time (Gupta et al., 2007, 2009a), the satellite AOD exhibited a close relationship with the daily PM2.5 (Lee et al., 2011; Hu et al., 2014). Here, we correlated the satellite AOD with the daily averaged PM2.5. To match the 3 km AOD pixel box, corresponding 10 km AOD data were selected. Invalid PM2.5 or AOD was not considered. There were 2,777 pairs of PM2.5 and 10 km AOD values in Beijing (35 sites and 364 days) and 3,098 pairs of values for the 3 km AOD. For comparison, we selected the 3 km and 10 km AOD-PM2.5 pairs for the same days and locations. Afterward, 2,147 pairs of AOD-PM2.5 points were retained.
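The collocation step can be sketched in Python as follows. This is a minimal illustration that assumes the AOD granule is already a two-dimensional list with `None` or NaN for missing pixels and that the row and column of the pixel containing the site are known; both the data layout and the function name are hypothetical.

```python
import math


def window_mean(grid, row, col, half=1):
    """Mean of the valid AOD values in a (2*half+1) x (2*half+1) window.

    The window is centred on (row, col) and clipped at the grid edges.
    Missing pixels (None or NaN) are skipped; None is returned when no
    valid pixel remains, so the site-day can be dropped as invalid.
    """
    values = []
    for r in range(max(0, row - half), min(len(grid), row + half + 1)):
        for c in range(max(0, col - half), min(len(grid[r]), col + half + 1)):
            v = grid[r][c]
            if v is not None and not math.isnan(v):
                values.append(v)
    return sum(values) / len(values) if values else None
```

With `half=1` this gives the 3 × 3 pixel window used for the 3 km AOD. Whether partially filled windows should be accepted, as they are here, is a design choice the text does not specify.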

Statistical Model and Validation
Although statistical models do not need to consider particular physical processes, the AOD-PM2.5 relationship is influenced by a combination of daily changes in the wind, relative humidity and boundary layer height. Lee et al. (2011) developed a mixed effects statistical model that accounts for this time-varying behavior by estimating the AOD-PM2.5 relationship separately for each day, assuming that the relationship has little spatial variability within the study region. Yap et al. (2013) expanded this model to calibrate the AOD-PM10 relationship in the Malaysian Peninsula by changing the daily parameter to a monthly parameter. Particle pollution and emission sources in China are very different from those in the United States and other countries (Philip et al., 2014). We investigated the AOD-PM2.5 relationship using a mixed effects model with the 3 km MODIS AOD (MYD04_3K) in our study region. The model is described as follows:

$$PM_{mn} = (\alpha + \mu_m) + (\beta + \nu_m) \times AOD_{mn} + s_n + \varepsilon_{mn}$$

where PM_mn is the 24 h PM2.5 average concentration on day m and at site n; AOD_mn denotes the MODIS AOD value at the corresponding site on day m; α and β are the fixed intercept and slope, respectively, which explain the effect of the long-term AOD on the PM2.5 for all days; µ_m and ν_m are the random intercept and slope, which adjust the fixed intercept and slope each day and explain the daily variation in the relationship between PM2.5 and AOD; s_n is the random intercept at site n; and ε_mn denotes the random error on day m and at site n. Details on the model were reported by Lee et al. (2011).
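To make the structure of the model concrete, the short Python sketch below evaluates the prediction for one site-day once the fixed and random effects are known. It only applies the model; it does not fit it. The argument names are chosen for this example, and the day-specific and site-specific values in the usage note are invented for illustration.

```python
def predict_pm25(aod, fixed_intercept, fixed_slope,
                 day_intercept=0.0, day_slope=0.0, site_intercept=0.0):
    """Predict PM2.5 for one site-day from the mixed effects terms.

    The fixed intercept and slope are shared by all days; the day terms
    shift them for a given day, and the site term shifts the intercept
    for a given monitoring site.
    """
    intercept = fixed_intercept + day_intercept
    slope = fixed_slope + day_slope
    return intercept + slope * aod + site_intercept
```

For example, with a fixed intercept of 46.967 and a fixed slope of 34.111, an AOD of 1.0 on a day and at a site with zero random effects gives about 81.1 µg/m³; hypothetical random effects such as `day_intercept=5.0` or `site_intercept=2.0` would shift that value accordingly.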
The performance of the model was tested by cross validation (CV). We selected one of the 35 PM2.5 monitoring sites as a test site, and the mixed effects model was fitted with data from the remaining sites. The fitted model was then used to predict the PM2.5 concentrations for the test site. Finally, we repeated the process for each monitoring site. The root mean square error (RMSE) was calculated for every cross validation.
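The leave-one-site-out procedure can be sketched as below. To keep the example self-contained, an ordinary least-squares line stands in for the mixed effects model, so this shows the validation loop and the RMSE calculation rather than the authors' actual fit. The `(site, aod, pm25)` record layout is an assumption.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def leave_one_site_out(records):
    """Hold out each site in turn and return {site: RMSE at that site}.

    records is a list of (site, aod, pm25) tuples. The stand-in model is
    a simple regression line fitted on all other sites.
    """
    scores = {}
    for held_out in sorted({site for site, _, _ in records}):
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        intercept, slope = fit_line([r[1] for r in train],
                                    [r[2] for r in train])
        predictions = [intercept + slope * r[1] for r in test]
        scores[held_out] = rmse(predictions, [r[2] for r in test])
    return scores
```

Holding out whole sites, rather than random site-days, tests how well the model transfers to locations without a monitor, which is the situation the regional PM2.5 map relies on.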

Comparison between MODIS 10 km and 3 km AOD
In this section, we assessed the ability of the 3 km AOD data to predict PM2.5 in the Beijing area. Fig. 2 shows two PM2.5 monitoring sites near busy roads (sites YDMN and QM) and one urban PM2.5 monitoring site (site TT) that fall within one 10 km AOD pixel but in different 3 km AOD pixels. The 3 km AOD data can capture finer spatial variability than the 10 km AOD. The 10 km and 3 km AOD showed different details, although they revealed similar aerosol trends at the large scale (Munchak et al., 2013). Because the 10 km AOD was derived from 30% of the dark pixels with moderate values in a 20 × 20 window of 500 m pixels (Levy et al., 2007), it can smooth both high and low values and thus miss fine features. However, there is a slight decrease in the spatial coverage of the 3 km AOD because the 10 km AOD may also include 3 km cloudy pixels.
Fig. 3 shows the linear regression of the AOD-PM2.5 relationship. The R² of the AOD-PM2.5 relationship was 0.361 for the 10 km AOD and 0.3613 for the 3 km AOD. Both the 3 km and 10 km AOD appear to exhibit a direct correlation with the ground-level PM2.5 concentration. The correlation is nearly the same for both the 10 km and 3 km AOD. Note that the dots are scattered, and there is a large offset in the AOD-PM2.5 relationship that may be caused by the vertical distribution of the aerosols and hygroscopic growth. To obtain a high accuracy in the estimation of the regional PM2.5, effort should be made to eliminate these effects.
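For a simple linear regression, the R² quoted above equals the squared Pearson correlation between AOD and PM2.5. The following Python sketch computes it from paired values; the function name and input layout are assumptions for illustration.

```python
def r_squared(aod, pm25):
    """Coefficient of determination of a simple linear regression.

    For one predictor this is the squared Pearson correlation between
    the paired AOD and PM2.5 values.
    """
    n = len(aod)
    mean_x, mean_y = sum(aod) / n, sum(pm25) / n
    sxx = sum((x - mean_x) ** 2 for x in aod)
    syy = sum((y - mean_y) ** 2 for y in pm25)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(aod, pm25))
    return sxy ** 2 / (sxx * syy)
```

Applying the same function to the matched 10 km and 3 km pairs is one way to reproduce the kind of comparison shown in Fig. 3.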

Descriptive Statistics
Table 1 shows the mean PM2.5 concentration measured at the 35 PM2.5 monitoring sites from March 1, 2013, to February 28, 2014. The mean PM2.5 concentration ranged from 62.08 µg/m³ to 116.67 µg/m³. The annual mean PM2.5 concentration at the two sites (sites BDL and MYSQ) in . The PM2.5 concentration in urban regions was obviously higher than that in the northern rural areas but was mostly lower than that in the southern rural areas. The average concentration in the southern suburb adjacent to the industrial regions in Hebei was much higher than that in the northern suburbs, indicating that the southerly transport of industrial pollutants from Hebei Province significantly influenced the air quality in Beijing (Tao et al., 2014). The PM2.5 concentration at 10 sites exceeded 100 µg/m³; most of the sites are located in the southern suburbs and rural areas. Three of the sites are located in urban Beijing, with one site in the south and the other two near busy roads.

Model Fitting and PM 2.5 Prediction
The mixed effects model generated 188 daily AOD-PM2.5 relationships for the 3 km AOD. The fixed intercept and slope of the model were 46.967 (SE = 2.886) and 34.111 (SE = 4.808), respectively. The p-value was less than 0.0001, suggesting statistical significance. The random effects of the intercept and slope of the AOD varied daily. We compared the fitted PM2.5 concentration with the observed PM2.5 values at the 35 monitoring sites. The result is shown in Fig. 4 (R² = 0.8466; slope = 0.831; RMSE = 13.88 µg/m³); the R² is much higher than that obtained by simple linear regression (Fig. 3).
The slope was close to 1, indicating that the PM2.5 concentration predicted by the mixed effects model was highly consistent with the observed values. Furthermore, we compared the PM2.5 obtained from the CV procedure with the ground measurements (R² = 0.796; slope = 0.807; RMSE = 16.04 µg/m³). The CV test also indicated that the result is reliable, although the model slightly overfits the data. Fig. 5 shows the yearly mean AOD-derived PM2.5 in the Beijing area in 2013. The mean PM2.5 ranged from 40-60 µg/m³ in the northern and western rural mountain areas to 60-80 µg/m³ in most of the urban, southern and eastern regions. High-level PM2.5 in the southern urban region and in the rural towns can be clearly seen, indicating the existence of local emission sources. In contrast, the air quality was relatively better in the northern rural areas. The distribution of the high PM2.5 concentration was consistent with the ring roads in the urban region, which could be associated with the intense vehicle emissions in Beijing (Yu et al., 2011, 2013).
Generally, the regional PM2.5 was within a reasonable range and was strongly correlated with the ground measurements. The distinct differences in the PM2.5 in Beijing indicate that the 3 km AOD reveals fine-scale gradients of the exposure levels in urban regions. The satellite-estimated PM2.5 was slightly lower than that of the ground measurements, partly because the satellite-derived PM2.5 denotes the average values at the 3 × 3 km scale.

Analysis of the Satellite Estimation Errors
The RMSE of the PM2.5 estimation in Beijing was 16.04 µg/m³, which is much higher than that in the United States (~5 µg/m³) (Lee et al., 2011; Hu et al., 2014). However, the RMSE of our estimation was much lower than that reported for China overall (32.98 µg/m³) (Ma et al., 2014), mainly due to the much denser ground sites in the Beijing area. The high RMSE in Beijing can be explained by several factors. First, the AOD-PM2.5 relationship was negative on some days due to dense aerosol layers aloft (high AOD) with low PM2.5 or very low boundary layers (low AOD) with high PM2.5, which sometimes appeared in northern China (Tao et al., 2014a). Second, the PM2.5 ranged from 3 to 510 µg/m³ in our study region; these values were much higher than those in the United States. The large variations in the PM2.5 may be overlooked due to the missing satellite AOD. The model may underestimate the PM2.5 concentrations at high concentrations. In addition, the complicated aerosol sources and location of the ground sites may also contribute to the high RMSE (Lee et al., 2011).
To evaluate the influence of the sampling frequency of the satellite AOD, we calculated the mean PM2.5 of all of the ground measurements ("ALL" hereafter), the mean PM2.5 of the ground measurements when AOD values were available ("SAT") and the mean PM2.5 predicted by the CV mixed effects model ("CV"). Fig. 6 shows that there were high biases between the mean PM2.5 on all days and the value when the satellite AOD is available. The biases ranged from 15 to 40 µg/m³, most of which exceeded 25 µg/m³. The high biases demonstrate that many days with high PM2.5 values were excluded because of missing AOD values. The MODIS dark-target aerosol retrieval is not valid for heavy haze pollution and bright surfaces in winter (Tao et al., 2012), when high PM2.5 values usually occur. Missing satellite retrievals on these polluted days could significantly contribute to the underestimation of the AOD-derived PM2.5.
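The ALL, SAT and CV means for a single site can be computed as in the Python sketch below. The `(pm25, aod, cv_prediction)` tuple layout, with `None` marking days without a valid retrieval, is an assumption, and the numbers in the usage note are invented to show the effect rather than taken from the study.

```python
def sampling_means(days):
    """Return the ALL, SAT and CV mean PM2.5 and the ALL-minus-SAT bias.

    days is a list of (pm25, aod, cv_prediction) tuples for one site;
    aod and cv_prediction are None on days without a valid retrieval.
    At least one day with a valid retrieval is assumed.
    """
    all_values = [pm for pm, _, _ in days]
    sat_values = [pm for pm, aod, _ in days if aod is not None]
    cv_values = [cv for _, aod, cv in days if aod is not None]
    mean_all = sum(all_values) / len(all_values)
    mean_sat = sum(sat_values) / len(sat_values)
    mean_cv = sum(cv_values) / len(cv_values)
    return {"ALL": mean_all, "SAT": mean_sat, "CV": mean_cv,
            "sampling_bias": mean_all - mean_sat}
```

For instance, if two heavily polluted days (200 and 250 µg/m³) have no retrieval while two cleaner days (100 and 50 µg/m³) do, the ALL mean is 150 µg/m³ but the SAT mean is only 75 µg/m³, which illustrates how missing retrievals on polluted days lower the satellite-sampled mean.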
Our finding is significantly different from the result presented by Gupta et al. (2008), who argued that the low sampling from satellites is not a major problem for PM2.5 predictions, with a bias of less than 2 µg/m³ in the southeastern United States. As shown in Fig. 6, the missing AOD values can lead to large deviations in the PM2.5 estimation. The mean PM2.5 in China, where heavy pollution is frequent, is much higher than that in the United States. The PM2.5 concentration in northern China can range from several to hundreds of µg/m³ within 1-2 days (Tao et al., 2014a, b); this range causes large deviations from the mean value when the AOD is unavailable. As shown in Table 2, the available MODIS 3 km AODs did not span more than one-third of the entire year at nearly all the sites; thus, many episodes may have been missed.
Table 2 shows the annual mean PM2.5 concentrations from the measured and CV-predicted values when the satellite AOD was available. The bias between the CV-predicted and measured values ranged from 0 to 20 µg/m³. All monitoring sites except DL, LLH, MYSK, NSH, YF and YLD had a bias below 10 µg/m³. The large positive biases at the DL and MYSK sites in the northern rural areas may be due to the high AOD value but low PM2.5 concentration on particular days. In contrast, the satellite-estimated PM2.5 exhibited obvious negative biases near busy roads and southern rural areas, indicating an underestimation of heavy particle pollution.
Generally, PM2.5 in the Beijing area can be predicted by the 3 km AOD at fine scales and with low relative deviations. However, the satellite AOD should be used cautiously in PM2.5 estimations over Beijing due to the missing samples in heavy pollution cases. As more PM2.5 data from the network become available in the future, further studies should consider additional factors to obtain higher accuracies.

CONCLUSIONS
Satellite AODs have been widely used in predicting regional PM2.5 concentrations. However, the coarse resolution (~10 km) of the conventional MODIS AOD usually limits research in urban areas. In this study, we estimated the regional distribution of PM2.5 over the Beijing area using one year of C6 3 km MODIS AOD data in 2013. The 3 km AOD exhibited nearly the same correlation with the PM2.5 as the 10 km AOD, but it could reveal finer spatial characteristics. A mixed effects model was used to calibrate the 3 km AOD for the PM2.5 estimation in the Beijing area, where heavy particle pollution is common. The AOD-PM2.5 relationship significantly improved when day-to-day variability was considered; the results had a much higher accuracy (R² = 0.846) than the simple linear regression results (R² = 0.361) in PM2.5 prediction. The mean PM2.5 concentration in Beijing was 60-80 µg/m³ in the southern and eastern areas and 40-60 µg/m³ in the western and northern regions. In contrast to clean-air regions, such as the United States, large deviations can be introduced into the PM2.5 over eastern China when the sampling frequency of the satellite AOD is low.

Fig. 1. Spatial distribution of the 35 PM2.5 monitoring sites in the Beijing area.

Fig. 3. Comparison of the AOD-PM2.5 correlation between the MODIS 10 km (left) and 3 km (right) AOD when the 10 km AOD, 3 km AOD and PM2.5 measurements are all available.

Fig. 4. Comparison of predicted PM2.5 and ground measurements by the mixed effects model (left) and the CV mixed effects model (right).

Fig. 6. The mean PM2.5 values from all ground measurements (ALL), the ground measurements when AOD values are available (SAT) and the mean PM2.5 predicted by the CV mixed effects model (CV).

Table 1. PM2.5 concentration (µg/m³) observed at the 35 monitoring sites over the whole year.

Table 2. The measured PM2.5 and CV-predicted PM2.5 concentrations for the 35 PM2.5 monitoring sites.