Application of a Combined Model to Study the Source Apportionment of PM10 in Taiyuan, China

Twenty-four-hour average samples of ambient PM10 were collected in Taiyuan, China, from April 2001 to January 2002. A total of 14 chemical species in PM10 were analyzed, and a combined receptor model (PCA/MLR-CMB) was applied to the speciation data to determine the contributions of major source categories. In stage 1, two factors were extracted by Principal Component Analysis/Multiple Linear Regression (PCA/MLR), while some unknown sources were excluded from the model as un-extracted factors. Each factor might contain more than one actual source category and was identified as an extracted complex source (EC-source). The actual source categories contained in each EC-source were investigated according to the factor loadings and the emission inventory. In stage 2, the two EC-sources were separately used as new receptors, and their corresponding actual sources were apportioned by Chemical Mass Balance (CMB). Although near collinearity existed among some source profiles, a total of eight sources were still well estimated: resuspended dust (about 26%), coal combustion (about 18%), cement dust (about 5%), steel manufacture (about 12%), soil dust (about 7%), vehicle exhaust (about 13%), secondary sulfate (about 12%) and secondary nitrate (about 4%). The combined model resolved 97% of the PM10 mass concentration, and the evaluation analysis showed that the results obtained by the combined model were reasonable.


INTRODUCTION
Many studies have demonstrated that ambient particulate pollution is associated with health effects, including mortality and morbidity (Ministry of Health, 1954), and with environmental effects such as reduced visibility (Polissar et al., 2001). To control particulate pollution, many studies have identified the sources of particulate matter (PM) in particular areas by using receptor models, such as CMB (Chemical Mass Balance; US-EPA, 1987; Wang and Chen, 2008; Zhang et al., 2008), PMF (Positive Matrix Factorization; Paatero and Tapper, 1994; Thimmaiah et al., 2009), and PCA/MLR (Principal Component Analysis/Multiple Linear Regression; Thurston and Spengler, 1985; Srivastava et al., 2008).
In practice, each model has its own strengths and weaknesses (Watson et al., 2008). CMB is the most widely used receptor model worldwide. It apportions ambient PM to each source category identified by field study. However, reasonable results may not be obtained because of near collinearity among source profiles (Henry, 1992) when the sources and the receptor are incompatible (Shi et al., 2009a). PCA/MLR and PMF, which are factor analysis methods, are also often used to identify sources. The main advantage of factor analysis is that it identifies the major sources without prior knowledge while certain unknown sources are excluded, but an identified source may contain more than one actual source (Guo et al., 2004). Thus, the results obtained by factor analysis often cannot directly guide air quality management.
Recently, a new combined model was developed (Shi et al., 2009b). It is a combination of factor analysis (PCA/MLR or PMF) and CMB, in which factor analysis reduces the impact of unknown sources and thereby improves the balance between the ambient data and the source data used in CMB. The combined model was successfully applied to determine the major sources contributing to PM10 in Zhengzhou, China, in our prior study (Shi et al., 2009b). In this work, the combined model (PCA/MLR-CMB) is used to obtain the contributions of major sources of PM10 in another city of northern China, Taiyuan, and the results are evaluated by a statistical method.

Study Area
Taiyuan (110°30'-113°09'E, 37°27'-38°25'N) is the capital of Shanxi province, China. It is to the east of the Loess Plateau and to the north of the Jinzhong Basin. It has an area of 6988 km² and a population of about 3.4 million. The climate of Taiyuan is of the continental temperate type: dry and sandy in spring and rainy in summer. The mean annual precipitation is 395 mm. The prevailing wind is from the NW. The highest wind speed, 4.0 m/s, is from the WNW, and the annual average wind speed is 2.1 m/s.
Taiyuan is one of the oldest national heavy industrial bases. The main industries are energy, metallurgy, machinery and chemicals. The city contains a steel manufacturing industrial estate, a coal mine and construction industrial estate, and a chemical estate. The main energy source is coal, with a total annual consumption of more than 20 million tons.

Ambient Sampling
Ambient PM10 samples were collected from April 2001 to January 2002. Two medium-volume PM10 air samplers were used: one was fitted with quartz-fiber filters for ion and carbon component analysis, and the other was fitted with polypropylene membrane filters for element analysis. The samplers were operated at a flow rate of 100 L/min. A total of 125 samples (24-h sampling duration) were obtained. The sampling method is described in our prior work (Bi et al., 2007).

Source Sampling
The investigation of PM10 sources was based on the air pollutant emission inventory from official reports or field study. Eight potential source categories were identified: soil dust, coal combustion, cement dust, resuspended dust, vehicle exhaust, steel manufacture, secondary sulfate and secondary nitrate.
Resuspended dust (Zhao et al., 2006) was collected from windowsills at heights of 5-15 m in six administrative districts of Taiyuan, including three industrial estates. Coal combustion source samples were collected from particulate pollution control devices (electrostatic precipitators, fabric filters or wet scrubbers). Soil dust was collected from croplands, dry riverbeds, wasteland and orchards. Cement dust was sampled from the roofs and stairs of buildings around construction sites, or from the production lines of nearby cement factories. Steel manufacture and vehicle exhaust sources were sampled from steel manufacturing plants and exhaust pipes, respectively. The chemical profiles of SO₄²⁻ and NO₃⁻ were established according to the composition of pure (NH₄)₂SO₄ and NH₄NO₃.
The number of samples in each source category is listed in Fig. 1.
Further details about the source sampling have been previously described (Bi et al., 2007).

PCA/MLR-CMB
PCA/MLR is a useful mathematical tool for source profile reconstruction (Caselli et al., 2006). Ambient data usually exhibit many large correlations among parameters, and PCA yields a much more compact representation of their variations (Ho et al., 2006). With no prior knowledge of the sources, PCA can still identify the major sources by extracting factors. Some of the extracted factors can be identified as a single actual source, while some complex factors may contain more than one emission source that covary in time (Viana et al., 2008). The un-extracted factors, in turn, can be regarded as a part of the unknown sources.
For CMB, the presence of unknown sources makes the investigated sources and the receptor incompatible (Shi et al., 2009a), which makes the multicollinearity problem more serious (negative estimated contributions may be obtained). Using the results obtained by PCA/MLR, CMB can determine the contributions of near-collinear sources mixed in one factor because the balance between the sources and the receptor is improved. In brief, PCA/MLR-CMB can identify the contribution of each major source even though serious collinearity may exist among the source profiles. A detailed description of the combined model is given elsewhere (Shi et al., 2009b).
The PCA/MLR-CMB model is composed of three stages.

Stage 1: reducing unknown sources from the original receptor by PCA/MLR
In stage 1, PCA/MLR is used to extract factors (identified as sources) with the original receptor as input. Under a chosen criterion (such as eigenvalue > 1.0), a limited number of factors are extracted by PCA. A factor identified as a single source category, such as residual oil combustion, is called an extracted simplex source (ES-source). A factor containing more than one emission source is called an extracted complex source (EC-source). For example, a factor that is usually identified as a crustal source but actually contains soil dust, coal combustion or cement dust can be identified as a complex source. Then, the source contributions and profiles of all sources, including both the ES-sources and the EC-sources, are obtained by multiple linear regression of the total mass. A more detailed description of PCA/MLR can be found elsewhere (Thurston and Spengler, 1985; Guo et al., 2004).

Stage 2: applying CMB model to analyze the secondary receptor
In this stage, the EC-source is applied in CMB as a new receptor (called the secondary receptor). The potential sources contributing to each EC-source, called Sub-sources, are obtained by investigation of the real world. Then, the contribution of each Sub-source contained in the EC-source is determined by CMB.

Stage 3: combining the results of stages 1 and 2
The final apportionment results, based on the results of both stage 1 and stage 2, consist of the contributions of both the ES-sources and the Sub-sources.
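The stage 1 calculation can be illustrated with a minimal sketch. It assumes an unrotated PCA on the correlation matrix, the eigenvalue > 1.0 criterion, and absolute principal component scores in the spirit of Thurston and Spengler (1985); the function name `pca_mlr`, the synthetic profiles and the synthetic source strengths are hypothetical, and the Varimax rotation is omitted for brevity.

```python
import numpy as np

def pca_mlr(conc, mass, eig_min=1.0):
    """Unrotated PCA on standardized species data, then MLR of total mass.

    conc: (n_samples, n_species) concentrations; mass: (n_samples,) total mass.
    Returns factor loadings, per-sample factor contributions, and the
    regression intercept (mass not explained by the extracted factors).
    """
    mean = conc.mean(axis=0)
    std = conc.std(axis=0, ddof=1)
    z = (conc - mean) / std
    corr = np.corrcoef(conc, rowvar=False)
    eigval, eigvec = np.linalg.eigh(corr)           # eigenvalues ascending
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]
    keep = eigval > eig_min                          # extraction criterion
    loadings = eigvec[:, keep] * np.sqrt(eigval[keep])
    score_coef = eigvec[:, keep] / np.sqrt(eigval[keep])
    # Absolute scores: subtract the score of a zero-concentration sample.
    z_zero = (0.0 - mean) / std
    apcs = (z - z_zero) @ score_coef
    design = np.column_stack([np.ones(len(mass)), apcs])
    coef, *_ = np.linalg.lstsq(design, mass, rcond=None)
    contributions = apcs * coef[1:]                  # mass per factor, per sample
    return loadings, contributions, coef[0]

# Synthetic demonstration (all numbers hypothetical): two independent sources,
# each emitting three of six species.
rng = np.random.default_rng(0)
strengths = rng.uniform(10.0, 100.0, size=(200, 2))
profiles = np.array([[0.30, 0.20, 0.10, 0.00, 0.00, 0.00],
                     [0.00, 0.00, 0.00, 0.25, 0.15, 0.05]])
conc = strengths @ profiles + rng.normal(0.0, 0.001, size=(200, 6))
mass = strengths.sum(axis=1)
loadings, contributions, intercept = pca_mlr(conc, mass)
```

In this sketch the intercept plays the role of the mass left in the un-extracted factors, while each column of `contributions` corresponds to one extracted factor that could then serve as a secondary receptor in stage 2.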

Ambient Data and Source Profiles
The average ambient data and source profiles of PM10 in Taiyuan are presented in Table 1 and Fig. 1, respectively. As shown in Table 1, the average concentration of ambient PM10 was about 305 μg/m³, which far exceeds the National Ambient Air Quality Standard of China (100 μg/m³). OC (about 53 μg/m³) was the most abundant species in the ambient data and also accounted for the largest fraction in the profiles of both coal combustion and vehicle exhaust. This indicated that Taiyuan was heavily polluted by coal combustion or vehicle exhaust. The species with the second highest concentration was Si (about 40 μg/m³), a main element of the earth's crust. Soil dust was therefore an important source of PM10, possibly along with resuspended dust, which also has Si as its main component. The concentration of SO₄²⁻ (about 35 μg/m³) was also very high in the receptor samples. It is primarily derived from the gaseous precursor SO₂, which is also strongly connected with coal combustion. The average concentration of Ca was about 27 μg/m³. Although Ca is a major crustal element, its fraction in soil dust is far lower than in cement dust, which suggested that cement dust was a large contributor to PM10. The other main species in the ambient PM10 were Al, Fe, NH₄⁺ and NO₃⁻. Al is mainly related to coal combustion, and most Fe comes from steel manufacturing factories; NO₃⁻ is often formed by the oxidation of NOₓ, which is mainly emitted from vehicles.

Balance between the Sources and the Receptors
To investigate the balance between the sources and the receptor, the maximum potential emission (MPE) of each species was studied. The MPE is defined as the maximum potential concentration of a species in the receptor, under the assumption that all PM in the receptor is contributed by the single source category with the highest fraction of that species.
The MPE of each measured species was calculated as (Table 1):
$$\mathrm{MPE}_i = M \times P_i$$
where M is the average total mass of the ambient PM10 and P_i is the maximum percentage that the ith species takes in the eight sources shown in Fig. 1. By comparing the MPEs of certain species with their measured concentrations in the receptor, the balance between the sources and the receptor can be studied indirectly. As Table 1 presents, the MPEs of K, Zn and Pb were less than the measured concentrations. This demonstrated that, in addition to the eight significant sources, some unknown sources with relatively high concentrations of K, Zn or Pb might exist in the real world. Thus, the balance between the sources and the receptor is disturbed by the presence of the unknown sources.
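As a small worked illustration of the MPE check, the sketch below takes the MPE of a species to be the total mass multiplied by the largest fraction of that species in any source profile, which is how the definition above reads. The total mass (305 μg/m³) and the measured Si and Ca values are the rounded figures quoted in the text; the profile fractions and the measured K value are hypothetical placeholders, not values from Fig. 1 or Table 1.

```python
def max_potential_emission(total_mass, profiles):
    """Return {species: MPE}, with MPE = total mass x largest source fraction."""
    species = {sp for prof in profiles.values() for sp in prof}
    return {
        sp: total_mass * max(prof.get(sp, 0.0) for prof in profiles.values())
        for sp in species
    }

# Hypothetical mass fractions for two of the source categories.
profiles = {
    "soil dust": {"Si": 0.25, "Ca": 0.04, "K": 0.01},
    "cement dust": {"Si": 0.10, "Ca": 0.30, "K": 0.005},
}
mpe = max_potential_emission(305.0, profiles)

# Measured concentrations in ug/m3; the K value is hypothetical.
measured = {"Si": 40.0, "Ca": 27.0, "K": 5.0}

# Species whose measured level exceeds the MPE point to unknown sources.
unexplained = sorted(sp for sp in measured if measured[sp] > mpe[sp])
```

With these placeholder numbers only K is flagged, mirroring the kind of result reported for K, Zn and Pb.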

Coefficient of Divergence
To assess the potential for near collinearity among the source profiles, similarities between different source profiles of PM10 were also calculated using the coefficient of divergence (CD) (Wongphatarakul et al., 1998; Shi et al., 2009c). The CD is defined mathematically as:
$$\mathrm{CD}_{fj} = \sqrt{\frac{1}{p}\sum_{i=1}^{p}\left(\frac{x_{if}-x_{ij}}{x_{if}+x_{ij}}\right)^{2}}$$
where x_if is the average concentration of the ith species in the fth source category; f and j represent two source categories, and p is the number of species. A CD of zero means there are no differences between two source profiles, while a value approaching one indicates maximum differences (Wilson et al., 1998). The CDs between pairs of source profiles are listed in Table 2.
As Table 2 shows, the profiles of coal combustion, soil dust, resuspended dust, cement dust and steel manufacture had relatively low CDs with one another compared with their CDs with vehicle exhaust. The profiles of soil dust and resuspended dust were the most similar, with a CD of 0.310, which was reasonable because soil was the greatest source of resuspended dust (Zhao et al., 2006). Soil dust and resuspended dust also had certain similarities with coal combustion in their profiles; the CDs were 0.392 and 0.444, respectively. As for vehicle exhaust, the CDs between its profile and the others were more than 0.680.
As described above, some unknown sources had not been identified, and certain similarities existed among the potential sources. To evaluate the multicollinearity problem, CMB alone was applied to the PM10 data, but reasonable results could not be obtained. This indicated that the balance between the sources and the receptor was poor and that there was near collinearity among some source profiles. Thus, the combined model was needed to apportion the data in this study.
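The CD calculation can be sketched as follows, assuming the usual form attributed to Wongphatarakul et al. (1998): the root mean square, over the p species, of (x_if - x_ij)/(x_if + x_ij). The function name and the example profiles are hypothetical; treating a species absent from both profiles as contributing zero difference is an assumption of this sketch.

```python
import math

def coefficient_of_divergence(profile_f, profile_j):
    """CD between two source profiles given as equal-length sequences."""
    if len(profile_f) != len(profile_j):
        raise ValueError("profiles must cover the same species")
    total = 0.0
    for a, b in zip(profile_f, profile_j):
        if a + b > 0:                      # 0/0 terms are counted as zero
            total += ((a - b) / (a + b)) ** 2
    return math.sqrt(total / len(profile_f))

# Hypothetical two-species profiles.
cd_same = coefficient_of_divergence([0.3, 0.1], [0.3, 0.1])
cd_disjoint = coefficient_of_divergence([1.0, 0.0], [0.0, 1.0])
cd_partial = coefficient_of_divergence([0.3, 0.1], [0.1, 0.1])
```

Identical profiles give a CD of zero and profiles with no species in common give a CD of one, which matches the interpretation of the two limits given in the text.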

Source Apportionment by the Combined Model
Stage 1: PCA/MLR
Varimax-rotated PCA was applied to the 125 ambient samples, and the results are presented in Table 3. Two factors were extracted based on the criterion that the corresponding eigenvalues must be greater than 1.0.
Factor 1 presented high loadings for Mg, Al, Si, K, Ca, Fe, Ti and Ba. Because Mg, Al, Ca, Si, Fe and Ti are markers for crustal sources, this factor would usually be interpreted as a crustal source, which may contain soil dust, resuspended dust or cement dust. However, as mentioned previously, a steel manufacturing industrial estate and a coal mine estate are situated in Taiyuan, and Al and Fe are also markers for coal combustion (Hopke, 1985) and steel manufacture, respectively. So factor 1 should be identified as an EC-source containing soil dust, resuspended dust, cement dust, coal combustion and steel manufacture.
Factor 2 presented high loadings for secondary species such as SO₄²⁻ and NO₃⁻. They are secondary aerosols and the byproducts of combustion. A moderate loading of OC was also observed in factor 2. OC primarily comes from vehicle exhaust. So factor 2 could be identified as another EC-source containing secondary sulfate, secondary nitrate and vehicle exhaust.
Using the grouping results of PCA, source contributions and profiles were then calculated by multiple linear regression of the particle mass concentrations. The standard deviations of the selected species in each source were determined based on the assumption that the standard deviation of a species in a given source is linearly dependent on the contribution proportion of the species in that source (Shi et al., 2009b). They were estimated as follows:
$$\sigma_i = \delta_i \times \frac{C_{i(\mathrm{secondary})}}{C_{i(\mathrm{original})}}$$
where σ_i is the standard deviation of C_i(secondary); δ_i is the standard deviation of C_i(original); C_i(secondary) is the estimated average concentration of the ith species in the secondary receptor (EC-source); and C_i(original) is the average concentration of the ith species in the original receptor.
The estimated source profiles and standard deviations are shown in Table 4.
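A one-line worked example may make the scaling assumption concrete. It assumes the linear relation takes the form σ_i = δ_i × C_i(secondary) / C_i(original), which is one reading of the assumption stated above; the function name and all numbers except the Si average of about 40 μg/m³ are hypothetical.

```python
def secondary_std(delta_original, c_secondary, c_original):
    """Scale the original-receptor standard deviation by the species share."""
    if c_original <= 0:
        raise ValueError("original concentration must be positive")
    return delta_original * c_secondary / c_original

# Hypothetical case: Si averages 40 ug/m3 in the original receptor with a
# standard deviation of 10 ug/m3, and 30 ug/m3 is assigned to one EC-source.
sigma_si = secondary_std(delta_original=10.0, c_secondary=30.0, c_original=40.0)
```

Under this reading, a species that is assigned three quarters of its original concentration in an EC-source also carries three quarters of the original standard deviation.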
Table 5 presents the total mass of each species calculated by PCA/MLR and the residual concentrations in the un-extracted factors, which might be considered a part of the unknown sources. The percentages of K, Zn and Pb in the residual portion were about 39%, 54% and 56%, respectively. This agreed with the earlier finding that certain unknown sources containing relatively large portions of K, Zn and Pb also contributed to the ambient receptor. In addition, about 49% of OC was also present in the residual portion. This portion of OC could not be extracted by PCA because it contributed to the receptor irregularly in the time series. In the real world, the sources of OC are complex and diverse. Therefore, this portion of OC might come from many insignificant unknown sources (e.g., meat cooking and natural gas combustion). Although each insignificant unknown source might contribute little OC, their sum can be relatively high. Thus, with the first step of PCA/MLR, some unknown sources were excluded as un-extracted factors.

Stage 2: CMB
As discussed above, two factors were extracted by PCA/MLR in stage 1. Each factor was identified as an EC-source. Based on the factor loadings and the emission inventory, factor 1 might contain soil dust, resuspended dust, cement dust, coal combustion and steel manufacture, and factor 2 might contain secondary sulfate, secondary nitrate and vehicle exhaust.
Next, the two EC-sources were separately used as new receptors for the CMB model in stage 2.
When the first EC-source (factor 1) was applied in the model, its Sub-sources were soil dust, resuspended dust, cement dust, coal combustion and steel manufacture. When the second EC-source (factor 2) was applied, the Sub-sources were secondary sulfate, secondary nitrate and vehicle exhaust. Then, the contribution of each Sub-source was calculated.
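The CMB step applied to a secondary receptor can be sketched with the commonly used effective-variance weighted least-squares solution. This is an illustrative sketch, not the CMB software used in the study: the function name, the two-source profile matrix and the 10% uncertainties are hypothetical, and no non-negativity constraint or collinearity diagnostic is included.

```python
import numpy as np

def cmb_effective_variance(c, sigma_c, f, sigma_f, n_iter=20):
    """Solve c = f @ s for source contributions s by iterated weighted LS.

    c: (n_species,) receptor concentrations (here, an EC-source profile).
    f: (n_species, n_sources) source profile fractions.
    sigma_c, sigma_f: matching uncertainty arrays.
    """
    s = np.zeros(f.shape[1])
    for _ in range(n_iter):
        # Effective variance combines receptor and profile uncertainties.
        v_eff = sigma_c**2 + (sigma_f**2) @ (s**2)
        w = 1.0 / v_eff
        normal = f.T @ (w[:, None] * f)
        s = np.linalg.solve(normal, f.T @ (w * c))
    s_unc = np.sqrt(np.diag(np.linalg.inv(normal)))
    return s, s_unc

# Hypothetical secondary receptor built from two Sub-sources.
f = np.array([[0.25, 0.10],
              [0.04, 0.30],
              [0.08, 0.02],
              [0.03, 0.01],
              [0.01, 0.05]])
true_s = np.array([60.0, 40.0])          # ug/m3, hypothetical
c = f @ true_s
s_est, s_unc = cmb_effective_variance(c, 0.1 * c, f, 0.1 * f)
```

Because the synthetic receptor is an exact mixture of the two profiles, the sketch recovers the assumed contributions; with real data the fit is only approximate and near-collinear profiles inflate the uncertainties.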

Stage 3: Final results
Fig. 2 shows the final results estimated by the combined model PCA/MLR-CMB. Resuspended dust was the greatest contributor to the ambient PM10 in Taiyuan (about 26%). This is also a common phenomenon in many cities of North China (Zhao et al., 2006). Coal combustion contributed about 18%, which made it the second largest contributor. In Taiyuan, coal accounted for more than 99.5% of total energy consumption, and consumption increases greatly, especially in the heating season. The contributions of vehicle exhaust and secondary sulfate were about 13% and 12%, respectively. In 2001, the total number of motor vehicles reached 243,000 and oil consumption was 232,000 tons in the city. Steel manufacture contributed about 12% of the total mass, which was higher than in many other studied cities (Kunio, 1991; Li et al., 2003; Ogulei et al., 2006). Taiyuan Iron and Steel Company, the steel manufacturing base with the highest productivity in the world, is situated in the northeast of the city. It emitted 7886.53 tons of dust and was the top emitter among all the industrial sources in 2001. The other contributors were soil dust (about 7%), cement dust (about 5%) and secondary nitrate (about 4%). The remaining 3% of the ambient PM10 was contributed by other sources.
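The reported shares can be cross-checked with simple arithmetic. The sketch below uses only the rounded percentages and the average PM10 concentration of about 305 μg/m³ given in the text; the grouping by factor follows the Sub-source assignment of stage 2, and the variable names are illustrative.

```python
# Rounded contributions (%) of the Sub-sources, grouped by EC-source.
factor1 = {"resuspended dust": 26, "coal combustion": 18, "steel manufacture": 12,
           "soil dust": 7, "cement dust": 5}
factor2 = {"vehicle exhaust": 13, "secondary sulfate": 12, "secondary nitrate": 4}

factor1_total = sum(factor1.values())        # share resolved through factor 1
factor2_total = sum(factor2.values())        # share resolved through factor 2
resolved = factor1_total + factor2_total     # share resolved by the combined model
other = 100 - resolved                       # share left to other sources

# Approximate mass contributions in ug/m3, from the ~305 ug/m3 average.
pm10 = 305.0
mass = {name: pm10 * pct / 100 for name, pct in {**factor1, **factor2}.items()}
```

The eight shares add up to 97%, consistent with the fraction of PM10 mass that the combined model is stated to resolve, with 3% left to other sources.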

The Evaluation of Model Performance
To evaluate the performance of the PCA/MLR-CMB model, the correlation between the measurements and the modeling results was estimated statistically. The calculated versus measured species concentrations are plotted in Fig. 3. The calculated values were close to the measured values, with a slope of 0.91 and a squared correlation coefficient of 0.97, which showed that the results obtained by the combined model were reasonable.
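The two evaluation statistics can be computed as in the sketch below, assuming an ordinary least-squares line (with intercept) of calculated against measured concentrations; whether the original fit included an intercept is not stated. The function name and the data are hypothetical and are not the values behind Fig. 3.

```python
def slope_and_r2(measured, calculated):
    """Least-squares slope and squared correlation of calculated vs. measured."""
    n = len(measured)
    mean_x = sum(measured) / n
    mean_y = sum(calculated) / n
    sxx = sum((x - mean_x) ** 2 for x in measured)
    syy = sum((y - mean_y) ** 2 for y in calculated)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measured, calculated))
    return sxy / sxx, sxy ** 2 / (sxx * syy)

# Hypothetical species concentrations (ug/m3): the model recovers 90% of each.
measured = [5.0, 10.0, 20.0, 40.0]
calculated = [0.9 * x for x in measured]
slope, r2 = slope_and_r2(measured, calculated)
```

A slope below one with a squared correlation close to one, as in this synthetic case, indicates that the model tracks the measured pattern well while slightly underestimating the concentrations.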

CONCLUSIONS
In this study, a combined receptor model (PCA/MLR-CMB) was applied to determine the source contributions to PM10 in Taiyuan, China. Although near collinearity existed among some source profiles, eight sources were identified: resuspended dust, coal combustion, cement dust, steel manufacture, soil dust, vehicle exhaust, secondary sulfate and secondary nitrate. The results showed that resuspended dust (about 26%) was the largest contributor to PM10 in Taiyuan, which is similar to findings reported for other cities in northern China. The second largest contributor was coal combustion (about 18%), which was reasonable because coal accounted for more than 99.5% of total energy consumption. Steel manufacture contributed about 12% of the total mass, which was relatively high. This might result from the large steel manufacturing industries that have long existed in the city. The final analysis showed that the results obtained by the combined model were reasonable.

Table 1. Average ambient data of PM10 in Taiyuan.

Table 2. CDs between two source profiles of PM10 in Taiyuan.

Table 4. Estimated source profiles and standard deviations of each species (μg/m³).

Table 5. Total mass of each species measured, calculated and un-extracted.