Application of Trajectory Clustering and Source Apportionment Methods for Investigating Trans-boundary Atmospheric PM 10 Pollution

A modeling framework was proposed to investigate the impact of trans-boundary air pollutant transport on regional air quality. The framework combines the HYSPLIT trajectory model, the CAMx air quality model, and the MM5 meteorological model. Atmospheric PM 10 pollution in Guangzhou, within the Pearl River Delta (PRD) region of southern China, was used as a case study. The HYSPLIT and MM5 models were used to qualitatively identify the dominant PM 10 transport pathways that led to PM 10 pollution events in Guangzhou, with five clusters of air mass trajectories being examined. The emission source contribution through each transport pathway to Guangzhou's PM 10 concentration was then quantified using an MM5-CAMx modeling system. The results illustrated that trans-boundary PM 10 transport played a critical role in the formation of PM 10 pollution events in Guangzhou, with a mean contribution ratio of nearly 49%. In particular, two air mass trajectory clusters that originated from Guangzhou's surrounding regions were found to be the main pollutant transport pathways, and three surrounding cities (Foshan, Dongguan and Huizhou) had a total emission contribution of nearly 30% to Guangzhou's PM 10 concentration through these two pathways. The emissions from these three cities also accounted for 70 to 94% of the total trans-boundary contributions from Guangzhou's nine surrounding cities through the five transport pathways. As a result, improving Guangzhou's air quality requires a coordinated effort to reduce emissions in both Guangzhou itself and its three surrounding cities. It is expected that the presented modeling approach can be applied to air quality studies in many other regions.


INTRODUCTION
The Pearl River Delta (PRD) region is located at the Pearl River estuary, where the river enters the South China Sea. With many developed cities on both sides of the Pearl River, the PRD region is the manufacturing and distribution center of southern China. This region has undergone rapid urbanization during the past two decades due to a significant influx of migrants. The increasing population and rapid industrialization of this area have seriously degraded air quality (Wei et al., 2007). For example, haze has been frequently observed in the PRD region due to high levels of airborne particulates (Tan et al., 2009a). As the geographical center of the PRD region, Guangzhou is the largest city in southern China. In recent years, air pollution in Guangzhou has become a concern because of its potential adverse effects. Air pollution can decrease atmospheric visibility, change meteorological conditions, and negatively impact human health (Sun et al., 2006; Chan and Yao, 2008; Tie and Cao, 2009; Xiao et al., 2011; Lin et al., 2012). In particular, atmospheric visibility in Guangzhou has declined significantly, at a reported rate of 0.3 km per year over the past four decades (Huang et al., 2008). Increasing particulate matter levels have also been linked to a significant increase in lung cancer occurrence in Guangzhou over the past few years (Tie et al., 2009). As a result, there is a pressing need to understand and control air pollution in Guangzhou.
It has been recognized that air pollution is not only a local but also a regional problem (Oh et al., 2011). Some pollutants can be transported regionally over hundreds or even thousands of kilometers (Bergin et al., 2005; Wang et al., 2010; Zheng et al., 2010). Guangzhou is surrounded by many large cities in the PRD region, including Foshan, Dongguan, Shenzhen, and Hong Kong. Consequently, the trans-boundary transport of air pollutants from these cities may contribute greatly to Guangzhou's air pollution problem. For example, Wang et al. (2005) examined the characteristics of air pollutant transport among the cities in the PRD region and found that Guangzhou's air pollution can be significantly affected by emissions from its surrounding regions. Lee et al. (2007) reported that the enrichment of atmospheric heavy metals in Hong Kong and Guangzhou was closely associated with air masses from the north and the northeast that originate from northern China. Such trans-boundary air pollutant transport may hamper the efforts that Guangzhou has made to address its air pollution problem. In order to develop an effective air quality management strategy for Guangzhou, it is critically important to identify the dominant air pollutant transport pathways and to quantify the effect of trans-boundary transport on air pollution in the city. However, very few studies have addressed this issue so far. In particular, PM 10 pollution has been recognized as a serious air quality problem in Guangzhou and in the PRD region (Tan et al., 2009a, b). Although some previous works have revealed the existence of trans-boundary air pollutant transport from the surrounding areas to Guangzhou (Wang et al., 2005; Lee et al., 2007; Zheng et al., 2010), the dominant transport pathways and their quantitative effects on atmospheric PM 10 pollution in Guangzhou remain poorly characterized.
The objective of this study was to investigate the impact of trans-boundary air pollutant transport on atmospheric PM 10 pollution in Guangzhou through both qualitative and quantitative modeling approaches. The PM 10 pollution in the autumn of 2006 was analyzed as a case study. The HYSPLIT (Hybrid Single-Particle Lagrangian Integrated Trajectory) model (Draxler and Hess, 1998) and the MM5 meteorological model (Dudhia et al., 2004) were used to track the air masses and analyze the dominant transport pathways, while a k-means clustering algorithm was applied to group the air mass trajectories into different transport pathways. The emission source contribution to the ambient PM 10 pollution in Guangzhou from each city in the PRD region through each trans-boundary transport pathway was then quantified using the CAMx air quality model. The MM5 model was used to provide meteorological inputs for the CAMx model, and the particulate matter source apportioning technology (PSAT) implemented within the CAMx model was used to quantify the emission source contribution. In this way, the effects of trans-boundary air pollutant transport through different transport pathways on Guangzhou's PM 10 pollution were characterized. This provides a sound basis for developing effective air quality management strategies in the study region.

Trajectory Calculation and Clustering
Generally, air pollutant transport pathways can be identified qualitatively through air trajectory clustering, which groups trajectories with similar air mass movement (Abdalmogith and Harrison, 2005). This method has been widely applied to identify homogeneous groups of air mass transport patterns that affect air quality in urban regions (Prapat and Nguyen, 2007; Wang et al., 2010). In this study, air mass trajectory clustering was used to reveal the relationship between the atmospheric transport pathways and the PM 10 pollution levels in Guangzhou in the autumn of 2006 (September 1 to November 30), which is usually the most polluted season. An overview of air pollution, meteorology, and topography within the PRD region and Guangzhou can be found in a previous study (Chan and Yao, 2008). The HYSPLIT model is the most widely used air trajectory model for establishing source-receptor relationships over long distances (Wang et al., 2010), and was thus utilized in this study. Version 4.8 of the model was applied to calculate 24-hour back trajectories (one starting at each hour of the day) arriving at an altitude of 30 m above sea level in Guangzhou (23.16°N, 113.23°E) (Draxler and Hess, 1998). Twenty-four-hour back trajectories were selected because this study focused on regional air pollutant transport within the PRD region rather than long-distance transport from far-away areas. The starting altitude for the trajectory calculation was chosen because the sampling height of the PM 10 monitoring station in Guangzhou was 30 m above sea level.
The MM5 meteorological model was used to provide the meteorological data fields required to run the HYSPLIT model. A detailed description of the MM5 model can be found in Dudhia et al. (2004). In this study, the MM5 model was configured using a two-level nested modeling domain on a Lambert map projection centered at (23°N, 113.5°E). The outer modeling domain had a spatial resolution of 27 km × 27 km covering most of southeastern China, and was established with a dimension of 100 × 100 grid cells. The inner domain had a spatial resolution of 9 km × 9 km covering all of the cities within the PRD region, and was set up with a dimension of 58 × 44 grid cells (Fig. 1). Vertically, 35 unevenly spaced layers were defined from the ground level to an altitude of 15 km, of which 20 layers were distributed within 3 km of the ground in order to resolve the planetary boundary layer in detail. The initial conditions (IC) and boundary conditions (BC) for the MM5 model were obtained from the NCEP data every 6 h with a 1° × 1° resolution. The vertical configuration used in the HYSPLIT model was the same as that used in the MM5 model. Based on the results of the HYSPLIT model, the air mass trajectories were assigned to distinct clusters according to their moving speed and direction using a k-means clustering algorithm (Hartigan and Wong, 1979). For the k-means algorithm, the results may change when a different number of clusters is used. Since the concern of this study was the pollutant transport pathway, the clustering result with the smallest possible number of clusters should be selected (Wang et al., 2010). In this study, various pre-selected numbers of clusters (e.g., 3, 4, 5, 6, and 7) were tested, and it was found that 5 clusters best represented the meteorological characteristics of the transport pathways in Guangzhou. As a result, this number was selected as the number of air mass trajectory clusters. A more detailed description of the clustering procedure using the k-means algorithm can be found in Wang et al. (2010).
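The clustering step can be illustrated with a minimal sketch. It is not the HYSPLIT clustering tool or the Hartigan-Wong variant cited above: it uses the simpler Lloyd iteration, a deterministic farthest-point initialisation, and synthetic trajectories, all of which are assumptions made only for illustration. Each trajectory is flattened into a vector of hourly (latitude, longitude) endpoints so that both direction and speed influence the distance between trajectories.

```python
import numpy as np

def farthest_point_init(X, k):
    """Deterministic seeding (an assumption): start from X[0], then repeatedly
    add the point farthest from the seeds chosen so far."""
    idx = [0]
    for _ in range(1, k):
        d2 = ((X[:, None, :] - X[idx][None, :, :]) ** 2).sum(axis=2).min(axis=1)
        idx.append(int(d2.argmax()))
    return X[idx].copy()

def assign(X, centroids):
    """Index of the nearest centroid (squared Euclidean distance) for each row."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans_trajectories(trajs, k, n_iter=100):
    """Lloyd's k-means on trajectories of shape (n_traj, n_steps, 2)."""
    n_traj, n_steps, n_coord = trajs.shape
    X = trajs.reshape(n_traj, n_steps * n_coord)  # one vector per trajectory
    centroids = farthest_point_init(X, k)
    for _ in range(n_iter):
        labels = assign(X, centroids)
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    labels = assign(X, centroids)
    return labels, centroids.reshape(k, n_steps, n_coord)

# Synthetic 24-hour back trajectories ending at the receptor (23.16 N, 113.23 E).
hours = np.arange(24)

def synthetic(direction, i):
    # direction=+1: air arriving from the east; -1: from the west (hypothetical data)
    lon = 113.23 + direction * 0.05 * hours * (1 + 0.1 * i)
    lat = np.full_like(lon, 23.16)
    return np.stack([lat, lon], axis=1)

trajs = np.array([synthetic(+1, i) for i in range(5)] +
                 [synthetic(-1, i) for i in range(5)])
labels, centroids = kmeans_trajectories(trajs, k=2)
print(labels)
```

In practice the same routine would be run for each candidate number of clusters (3 to 7 in this study) and the smallest number that still separates physically distinct pathways would be retained.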

CAMx Model with PSAT
The atmospheric PM 10 concentration in Guangzhou and its surrounding areas was simulated using the Comprehensive Air Quality Model with Extensions (CAMx). CAMx is a publicly available Eulerian photochemical dispersion model that allows for the integrated assessment of gaseous and particulate air pollutants over many scales (Environ, 2006). One of the unique features of CAMx is its mass tracking module, the particulate matter source apportioning technology (PSAT), for studying source apportionment among different emission source categories and regions (Dunker et al., 2002; Yarwood et al., 2005; Ward et al., 2012). It can conduct source apportionment for particulate matter (PM) species, including sulfate, particulate nitrate, ammonium, particulate mercury, secondary organic aerosol, and six categories of primary PM (elemental carbon, primary organic aerosol, crustal fine, other fine, crustal coarse, and other coarse). The source apportionment is calculated in parallel with the main CAMx program that computes pollutant concentrations, which means that selecting PSAT does not affect the CAMx simulation results (Koo et al., 2009; Huang et al., 2010).
After identifying the dominant air pollutant transport pathways to Guangzhou using the HYSPLIT model, the emission contribution to Guangzhou's PM 10 concentration through each transport pathway was calculated using the CAMx model with the PSAT module. As with the HYSPLIT model, the meteorological data fields required to run the CAMx model were provided by the MM5 modeling results. The simulation domain for the CAMx model was based on the inner domain of the MM5 model, but was trimmed by 5 grid cells along each lateral boundary in order to minimize boundary effects of the meteorological modeling domain (Wang et al., 2010). The 35 vertical layers of the MM5 model were also collapsed into 12 layers for the vertical domain of the CAMx model, with 9 of them distributed within 3 km of the ground (Wang et al., 2010). As a result, the CAMx modeling domain was set up with a dimension of 48 × 34 grid cells, covering all of the administrative divisions in the PRD region. A total of 11 emission source areas were defined, including nine cities (Guangzhou, Huizhou, Foshan, Dongguan, Jiangmen, Zhongshan, Zhaoqing, Zhuhai, and Shenzhen), one special administrative region (Hong Kong), and all other areas outside the boundary of the PRD region within the CAMx modeling domain, as shown in Fig. 2.
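The relationship between the MM5 inner domain and the CAMx domain can be checked with a short calculation. The sketch below assumes that the buffer of 5 grid cells is removed from every lateral edge, which is the reading consistent with the reported dimensions (58 × 44 reduced to 48 × 34); the function name is hypothetical.

```python
def trim_domain(nx, ny, buffer_cells):
    """Remove `buffer_cells` from each of the four lateral edges of a grid."""
    return nx - 2 * buffer_cells, ny - 2 * buffer_cells

mm5_inner = (58, 44)      # MM5 inner domain, 9 km cells
resolution_km = 9

camx_domain = trim_domain(*mm5_inner, buffer_cells=5)
extent_km = (camx_domain[0] * resolution_km, camx_domain[1] * resolution_km)

print(camx_domain)  # grid cells in x and y
print(extent_km)    # horizontal extent of the CAMx domain in km
```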

Emission Data
The locations of the nine cities within the PRD region are shown in Fig. 1. An emission inventory for the year 2006, with high temporal and spatial resolution, was compiled for the PRD region using a "bottom-up" investigation approach. A seasonal coefficient was used to adjust the annual emission inventory to individual seasons in order to provide the emission inventory for air quality simulation within the study period of September 1 to November 30, 2006. The emission inventory covers the major emission sources, which were divided into six categories based on the main emission types in the PRD region: power plant, industrial, traffic, biogenic, VOC product-related, and other sources (such as residential fuel consumption, waste burning, and biomass burning). A total of 2292 point emission sources (Fig. 1) were included in this study, comprising industrial and non-industrial stationary equipment or processes. The emission data of area sources were obtained at the county level. Table 1 lists the PM 10 emission rates from both point sources and area sources in the PRD region in 2006. Fig. 1 illustrates that the air pollutant emissions in the PRD region were mainly distributed over the central-southern areas. For areas outside the PRD region, emission inventories were taken from those prepared by Streets et al. (2003). The biogenic and VOC emission data sets were obtained from the GEIA (Global Emission Inventory Activity) emission inventories. The Sparse Matrix Operator Kernel Emissions (SMOKE) model was used to convert the emission inventory data into the formatted emission files required by the CAMx model (Houyoux and Vukovich, 1999; Borge et al., 2008).

Meteorological and Air Quality Monitoring Data
A PM 10 monitoring station at Luhu park, located in the urban area of Guangzhou, was used to represent the PM 10 pollution characteristics [...] (Dudhia et al., 2004).

Simulation Performance of the MM5-CAMx Modeling System
The simulation performance of the MM5-CAMx modeling system is of critical importance for its effective application to air quality management. In this study, the hourly PM 10 concentrations in the first vertical layer of the CAMx modeling domain during the study period were simulated using the MM5-CAMx modeling system. The simulation results in the grid cells containing the 16 air quality monitoring stations in the PRD region were then compared with the corresponding observations at these stations. Table 2 lists the relative modeling errors and the Pearson correlation coefficients at the 0.01 significance level. The relative error between the simulated and observed PM 10 concentrations ranged from 24 to 36%, with an average value of 30%. The Pearson correlation coefficient ranged from 0.58 to 0.72, with an average value of 0.64. These results indicate that the MM5-CAMx modeling performance was generally satisfactory and acceptable (Eder and Yu, 2006).
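The two evaluation statistics can be sketched as follows. The exact definition of the relative error is not given in the text, so the normalized mean error (sum of absolute differences divided by the sum of observations) is assumed here; the concentration values are made-up numbers in mg/m³ used only to exercise the functions.

```python
import numpy as np

def normalized_mean_error(obs, sim):
    """Assumed relative-error metric: sum(|sim - obs|) / sum(obs)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return np.abs(sim - obs).sum() / obs.sum()

def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated series."""
    return float(np.corrcoef(obs, sim)[0, 1])

# Hypothetical hourly PM10 concentrations at one station (mg/m^3)
observed = [0.05, 0.08, 0.10, 0.06]
simulated = [0.06, 0.07, 0.12, 0.05]

nme = normalized_mean_error(observed, simulated)
r = pearson_r(observed, simulated)
print(f"relative error = {100 * nme:.1f}%, r = {r:.2f}")
```

Applied station by station, statistics of this kind would populate a table such as Table 2.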

Characteristics of Various Air Pollutant Transport Pathways
Fig. 3 presents the five clusters of air pollutant transport pathways to Guangzhou obtained from the HYSPLIT model. The air mass of cluster 1 came from the eastern coastal areas of the PRD region, mainly Dongguan, Huizhou, Shenzhen, and Hong Kong, and moved westward to arrive in Guangzhou. This transport pathway occurred for 636 hours out of the 2184 simulated hours in the autumn of 2006, an occurrence frequency of 29%. The air mass of cluster 2 came mainly from Foshan city in the western part of the PRD region, and its occurrence frequency was 20%. Clusters 3, 4 and 5 came mainly from the northeastern areas of Guangdong province, and had a total occurrence frequency of 51%.
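The occurrence frequency quoted for cluster 1 follows directly from the length of the study period, as the short check below shows. One back trajectory per hour from September 1 to November 30, 2006 gives the total number of hours; the variable names are illustrative.

```python
from datetime import date

start = date(2006, 9, 1)
end_exclusive = date(2006, 12, 1)   # day after November 30

total_hours = (end_exclusive - start).days * 24   # one trajectory per hour
cluster1_hours = 636                               # reported for cluster 1

cluster1_freq = 100 * cluster1_hours / total_hours
print(total_hours, round(cluster1_freq, 1))  # 2184 29.1
```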
Table 3 lists the monitored mean PM 10 concentration in Guangzhou during the occurrence of each transport pathway and the corresponding meteorological conditions during the three autumn months of 2006. Clusters 1 and 2 were associated with higher PM 10 concentrations in Guangzhou, while the other three clusters corresponded to lower PM 10 concentrations. Fig. 4 presents the variation of the observed hourly PM 10 concentration in Guangzhou and the occurrence of the different transport pathways from September 1 to November 30, 2006. It is evident that the PM 10 pollution in Guangzhou was closely related to air mass trajectory clusters 1 and 2. The occurrence of these two transport pathways generally corresponded to increasing and high PM 10 concentrations in Guangzhou. Fig. 3 shows that clusters 1 and 2 were associated with a shorter transport distance than the other three clusters. Their air masses mainly moved through the highly industrialized and densely populated areas within the PRD region, such as Dongguan, Huizhou, and Foshan. The emissions from these cities were transported to Guangzhou with the air mass movement, thus causing elevated pollution levels in Guangzhou as shown in Fig. 4.
The results in Table 3 indicate that the emissions from Guangzhou's surrounding areas contributed significantly to its PM 10 pollution levels through trans-boundary transport. Similar observations were made in a haze formation study in Guangzhou (Tan et al., 2009b). Moreover, Table 3 shows that the PM 10 pollution in Guangzhou was closely related to its meteorological conditions. The occurrence of clusters 1 and 2 was associated with relatively lower wind speed, higher temperature, and higher relative humidity than the other three air mass trajectory clusters. This illustrates that the dispersion and transport of PM 10 were affected by the meteorological parameters (wind speed, wind direction, relative humidity, and temperature). Heavier air pollution was associated with lower wind speed, since windy weather is conducive to the dispersion of pollutants. In summary, the atmospheric PM 10 pollution in Guangzhou had a direct relationship with the air mass transport pathway, and air mass trajectory clusters 1 and 2 (Fig. 3) represent the main trans-boundary transport pathways that influenced Guangzhou's PM 10 concentrations. During their occurrence in the autumn of 2006, the average PM 10 concentration in Guangzhou was 0.081 mg/m³, which is 32.7% higher than that (i.e., 0.061 mg/m³) during the occurrence of the other three transport pathways.

Trans-Boundary Emission Contribution from Each Surrounding City
Fig. 5 presents the calculated average ratios of PM 10 emission contributions to the hourly PM 10 concentration in Guangzhou from the local emission sources and the surrounding cities, as well as from the initial conditions (ICs) and boundary conditions (BCs) during the autumn of 2006.
The ICs represent the gridded initial pollutant concentration fields in the modeling domain, and the BCs represent the gridded concentration fields on the faces of the grid boundary (e.g., north, south, east, west, top). Both ICs and BCs contribute to the simulated air pollutant concentrations. The system default initial conditions and boundary conditions were used in the modeling setup. The emission source contribution ratio was defined as the ratio of the simulated PM 10 concentration in Guangzhou after removing that emission source from the inventory to the simulated PM 10 concentration in Guangzhou using the original emission inventory (Cheng et al., 2007). The mean contribution ratio of the local emission source to Guangzhou's PM 10 concentration in the study period was 51%, implying that the mean trans-boundary emission contribution ratio was 49% (i.e., including contributions from Guangzhou's nine surrounding cities, the other areas outside the boundary of the PRD region within the CAMx modeling domain, and the ICs and BCs). This result is similar to the findings of a previous SO 2 source apportionment study, which showed that the mean annual contribution to the SO 2 concentration in Guangzhou from its surrounding areas could reach as high as 50% (Wang et al., 2005). It is evident from the simulation results that trans-boundary emission contributions played a critical role in the formation of PM 10 pollution events in Guangzhou. Fig. 5 shows that the major trans-boundary emission contributors were Huizhou (8%), Foshan (6%), and Dongguan (8%), with a total emission contribution ratio of 22%, which accounts for 79% of the total trans-boundary contributions from the nine surrounding cities of Guangzhou in the PRD region. The emission contribution ratios of the other six cities in the PRD region ranked in descending order as Shenzhen, Zhaoqing, Zhongshan, Jiangmen, Hong Kong and Zhuhai. The areas outside the boundary of the PRD region accounted for an emission contribution ratio of 18%.
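The reported percentages can be combined into a rough budget of Guangzhou's mean PM 10 concentration. The sketch below uses only the rounded values quoted above; the nine-city total and the remainder attributed to ICs and BCs are back-calculated estimates, not figures reported in the study, and should be read with that rounding in mind.

```python
# Reported mean contribution ratios (%), rounded values from the text
local = 51
three_cities = {"Huizhou": 8, "Foshan": 6, "Dongguan": 8}
three_city_share_of_nine = 0.79   # share of the nine-city total
outside_prd = 18

trans_boundary = 100 - local
three_city_total = sum(three_cities.values())

# Back-calculated (assumption): nine-city total implied by the 79% share
nine_city_total = three_city_total / three_city_share_of_nine

# Back-calculated (assumption): remainder left for ICs and BCs
ic_bc_remainder = trans_boundary - nine_city_total - outside_prd

print(trans_boundary, three_city_total)
print(round(nine_city_total, 1), round(ic_bc_remainder, 1))
```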

Emission Contribution from Each Surrounding City through Each Transport Pathway
The results of the trans-boundary emission contributions were in agreement with the air trajectory analysis results described above. The trajectory analysis indicated that clusters 1 and 2 represented the major transport pathways. In order to further investigate the impact of different transport pathways on Guangzhou's PM 10 pollution, the contributions from different emission source areas to Guangzhou's PM 10 concentration through each transport pathway were simulated using the MM5-CAMx modeling system. Table 4 presents the modeling results. Through the transport pathways of air mass trajectory clusters 3, 4, and 5, about 26% of the PM 10 concentration in Guangzhou could be attributed to emissions from areas outside the PRD region. However, this number was reduced to about 11% through the transport pathways of clusters 1 and 2. Through cluster 1, the upwind areas (including Huizhou, Dongguan, Shenzhen, and Hong Kong) accounted for a total emission contribution ratio of 31%. Through cluster 2, the upwind areas (including Foshan, Zhaoqing and Dongguan) accounted for 28% of the emission contribution to Guangzhou's PM 10 concentration. It can be found from Table 4 that the trans-boundary contribution to Guangzhou's PM 10 concentration from its nine surrounding cities was 34, 36, 11, 25, and 13% through air mass trajectory clusters 1, 2, 3, 4, and 5, respectively. The emissions from Foshan, Dongguan, and Huizhou greatly affected Guangzhou's air quality. Through the transport pathways of clusters 1 and 2, their total emission contribution to Guangzhou's PM 10 was nearly 26%. Among the total trans-boundary contributions within the PRD region, the emissions from these three cities accounted for 77, 70, 91, 88, and 92% through the five transport pathways, respectively. Such information is valuable for developing effective air quality management strategies in the study region. It implies that improving air quality in Guangzhou requires a coordinated effort between Guangzhou and its surrounding areas, especially Foshan, Dongguan, and Huizhou. To obtain more detailed information for effective decision making, source apportionment by emission category may be required and is worth further examination. This would help characterize the contribution from different emission categories within the various source cities to the PM 10 concentration in Guangzhou through each transport pathway; the proposed modeling approach remains appropriate for calculating such contributions.
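The per-pathway figures quoted from Table 4 can be cross-checked by multiplying the nine-city contribution through each cluster by the share attributed to Foshan, Dongguan, and Huizhou. The sketch below uses the rounded percentages from the text, so the products are approximate; they agree with the stated value of nearly 26% for clusters 1 and 2.

```python
clusters = [1, 2, 3, 4, 5]

# Contribution of the nine surrounding cities to Guangzhou's PM10 (%), per cluster
nine_city_contribution = [34, 36, 11, 25, 13]

# Share of that contribution due to Foshan, Dongguan, and Huizhou (%), per cluster
three_city_share = [77, 70, 91, 88, 92]

# Implied three-city contribution to Guangzhou's PM10 (%), from rounded inputs
three_city_contribution = [
    round(total * share / 100, 1)
    for total, share in zip(nine_city_contribution, three_city_share)
]

for c, value in zip(clusters, three_city_contribution):
    print(f"cluster {c}: {value}%")
```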

CONCLUSIONS
In this study, air trajectory clustering and source apportionment technology were used to examine the atmospheric PM 10 pollution problem in Guangzhou, within the Pearl River Delta region of China. The HYSPLIT model, driven by the MM5 meteorological model, was used for trajectory clustering, and the impact of the air mass transport pathways on PM 10 pollution in Guangzhou was analyzed using the CAMx model with the PSAT module. Five air pollutant transport pathways were identified, represented by five air mass trajectory clusters. Among them, two air mass trajectory clusters that originated from Guangzhou's surrounding cities in the PRD region, with a total occurrence frequency of 49%, were found to be the main transport pathways affecting Guangzhou's PM 10 pollution. The simulation results showed that nearly 49% of the PM 10 concentration in Guangzhou was due to trans-boundary emission contributions. In particular, the cities of Foshan, Huizhou, and Dongguan were the most important trans-boundary contributors to Guangzhou's PM 10 concentration. Their contributions accounted for 77, 70, 91, 88, and 92% of the total trans-boundary contributions from all of Guangzhou's surrounding cities through the five air pollutant transport pathways, respectively. Therefore, in addition to emission reduction within Guangzhou itself, the regulation of emissions within the three surrounding cities is important for improving Guangzhou's air quality. Since there have been few applications of the HYSPLIT, MM5 and CAMx models within a general modeling framework to examine emission contributions to regional air pollution problems both qualitatively and quantitatively, the proposed modeling approach has the potential to be applied to other urban or regional air pollutant emission investigations and decision analysis studies.

Fig. 1. The administrative divisions of the PRD region (the yellow dots represent the locations of point emission sources, the blue flags represent the automatic meteorological observation stations, and the red towers represent the air quality monitoring stations).

Fig. 3. The simulated back trajectories arriving at Guangzhou in the autumn of 2006: (a) five clusters of trajectories (cyan for cluster 1, blue for cluster 2, green for cluster 3, red for cluster 4, and yellow for cluster 5), (b) trajectories in cluster 1, (c) trajectories in cluster 2.

Table 1. PM 10 emissions from different sources within the PRD region in 2006.

At the Luhu park station, which was used to represent the PM 10 pollution characteristics, a Beta Gauge method was used to measure the PM 10 concentration at a sampling height of 9 m above ground level (i.e., 30 m above sea level). Hourly PM 10 concentration data were also obtained from 15 other air quality monitoring stations (12 in the PRD region and 3 in Hong Kong, as shown in Fig. 1) for evaluating the modeling performance.

Table 2. Simulation error of the hourly PM 10 concentration and the Pearson correlation coefficient.

Table 3. Monitored mean PM 10 concentration in Guangzhou during the occurrence of each transport pathway and the corresponding meteorological conditions in autumn 2006.

Table 4. Emission contribution ratio (%) to Guangzhou's PM 10 concentration through each transport pathway.