A Comparative Study of Air Quality Index Based on Factor Analysis and USEPA Methods for an Urban Environment

Many different air quality indexes are used to represent the urban air pollution situation around the world. Although the index proposed by USEPA gives an overall assessment of air quality, it does not include the combined effects (or synergistic effects) of the major air pollutants (Shenfeld, 1970; Ott and Thom, 1976; Thom and Ott, 1976; Murena, 2004). An attempt is therefore made to calculate an Air Quality Index based on Factor Analysis (NAQI) which addresses the deficiencies of the USEPA method. The daily, monthly and seasonal air quality indexes were calculated by using both these methods. It is observed that a significant difference exists between NAQI and EPAQI. However, NAQI followed the trends of EPAQI when plotted against time. Further, the indexes were used to rank various seasons in terms of air pollution. A higher index value indicates more pollution in relative terms. Moreover, the index may be used for comparing the daily and seasonal pollution levels at different sites.


INTRODUCTION
Air pollution is a well-known environmental problem associated with urban areas around the world. Various monitoring programmes have been undertaken to know the quality of air by generating vast amounts of data on the concentration of each air pollutant (e.g., SPM, CO, NOx, SO2, etc.) in different parts of the world. The large data often do not convey the air quality status to the scientific community, government officials, policy makers, and in particular to the general public in a simple and straightforward manner. This problem is addressed by determining the Air Quality Index (AQI) of a given area. AQI, which is also known as Air Pollution Index (API) (Shenfeld, 1970; Ott and Thom, 1976; Thom and Ott, 1976; Murena, 2004) or Pollutant Standards Index (PSI) (Ott and Hunt, 1976; EPA, 1994), has been developed and disseminated by many agencies in U.S., Canada, Europe, Australia, China, Indonesia, Taiwan, etc. (Cairncross et al., 2007; Cheng et al., 2007).
An "Air Quality Index" may be defined as a single number for reporting the air quality with respect to its effects on human health (Thom and Ott, 1976; Bortnick et al., 2002; Murena, 2004). In its most elaborate form, it combines the concentrations of many pollutants in some mathematical expression to arrive at a single number for air quality.
AQI is an integral part of the Environmental Quality Index (EQI), which was developed and used by the National Wildlife Federation of U.S. in the late 1960s (Inhaber, 1976). In 1971 the EQI, with a numerical index scale from 0 to 100 (0 for complete environmental degradation and 100 for perfect environmental conditions), had seven components: soil, water, air, living space, minerals, timber and wildlife. Inhaber (1975) suggested a set of air quality indexes for Canada. These indexes were an index of specific pollutants (e.g., SO2, SPM, CO, total oxidants, NOx, and coefficient of haze), an index of inter-urban air quality (considering the visibility readings at airports) and an index of industrial emissions (taking the total emission and population of the area).
In 1976 the USEPA established the PSI, which rated air quality from 0-500 with 100 equal to the National Ambient Air Quality Standards (NAAQS). The daily PSI is determined by the highest value of one of the five main air pollutants: PM10, O3, SO2, CO and NO2 (EPA, 1997; EPA, 1999). Lohani (1984) applied a factor analysis approach to find an environmental index for Taiwan. He compared the air quality index based on the factor analysis method with that based on the Pindex method. The ratings (or trends) obtained by both these methods are exactly the same, but the AQI based on factor analysis shows a wider range, which indicates that it is a better approach. Bezuglaya et al. (1993) developed an integral air pollution index (IAPI), which is the sum of individual air pollution indexes calculated by normalising the pollution concentrations to the maximum permissible concentration (MPC).
Since the main objective of AQI is to measure the air quality in relation to its impact on human health, the Environmental Protection Agency (EPA) of U.S. revised the previous method to calculate daily AQI in 1999. The EPA method is based on concentrations of five criteria pollutants: carbon monoxide (CO), nitrogen dioxide (NO2), ozone (O3), particulate matter (PM) and sulphur dioxide (SO2). The concentration values are converted into numerical indexes. The overall AQI is calculated by considering the maximum AQI among the monitored pollutants corresponding to a site or station. The scale of the index (0-500) is subdivided into six categories that are associated with various health messages. Cheng et al. (2004) proposed a revised EPA air quality index (RAQI) by introducing an entropy function to include the effect of the concentrations of the pollutants other than the pollutant with maximum AQI. Further, the application of RAQI in Taiwan has shown that the suspended particulates have a significantly greater impact on the PM2.5/PM10 ratio in the southern parts than in the central and northern areas, and these ratios are higher as a whole compared to many other countries (Cheng et al., 2007).
Other prominent studies related to various aspects of air quality indexes are those of Kassomenos et al. (1999), Malakos and Wong (1999), Swamee and Tyagi (1999), Trozzi et al. (1999), Khanna (2000), Cogliani (2001), Bortnick et al. (2002), Murena (2004), Jiang et al. (2004), Longhurst (2005), Landulfo et al. (2007), Mayer and Kalberlah (2008), and Elshout et al. (2008). In the Indian context, studies on AQIs have been carried out for the cities of Mumbai (Sharma, 1999), Delhi (Sengupta et al., 2000) and Kanpur (Sharma et al., 2003). The mathematical functions for calculating these indexes are based on the health criteria of the EPA and Indian air quality standards. In the same study, Sengupta et al. (2000) examined the Oak Ridge Air Quality Index (ORAQI), based on an additive function of sub-indexes, for Delhi.
However, it was found that this index suffered from the eclipsing effect, i.e., one pollutant exceeds its standard without the index exceeding its critical value (Thom and Ott, 1976). Further, it was observed that for more than 90 percent of the time the index indicated that the air quality fell within acceptable limits even though the air quality standard for some pollutants was violated. The maximum operator concept (MOC), which is generally used by the EPA of U.S. to calculate EPAQI, is suggested to overcome this problem. The MOC only considers the maximum value of any of the sub-indexes to define the overall AQI. This method has the following limitations. First, it discards the values of the other sub-indexes and the harmful levels associated with other pollutants (Radojevic and Hassan, 1999). Second, the index does not include the additive or synergistic effects of pollutants together on human health.
In addition to the above limitations, the EPA method has other drawbacks. First, the break points for calculating the index value for NO2 concentrations less than 0.65 ppm are not defined. This deficiency is also associated with tropospheric ozone concentrations. Second, the sub-index describes the pollution level of the pollutant on an ordinal scale. This same scale is used for finding the aggregate index, so the severity of the pollution level described by the aggregate index is not linear. Third, a greater part of the world is not able to adopt the AQI system, mainly because of the lack of PM2.5 measurement capability (Cheng et al., 2007). To address the above shortcomings, we propose an AQI based on a simple statistical approach: a New Air Quality Index (NAQI) based on the factor analysis technique assisted by principal component analysis (PCA). The details of materials and methods, results and discussion, and conclusions are given in the subsequent sections.
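The eclipsing effect and the maximum operator concept described above can be illustrated with a short sketch. The arithmetic mean used here is only a hypothetical stand-in for an additive index (it is not the ORAQI formula), and the sub-index values and the critical value of 100 are invented for illustration.

```python
# Illustrative sketch only: the additive index here is a plain mean of the
# sub-indexes (an assumption, NOT the ORAQI formula), and the sub-index
# values below are invented.

CRITICAL_VALUE = 100.0  # assumed: sub-index of 100 corresponds to the standard


def additive_index(sub_indexes):
    """Hypothetical additive aggregation: mean of all sub-indexes."""
    return sum(sub_indexes.values()) / len(sub_indexes)


def max_operator_index(sub_indexes):
    """Maximum operator concept: the overall index is the largest sub-index."""
    return max(sub_indexes.values())


# One pollutant (PM10) violates its standard, the others are low.
sub_indexes = {"PM10": 160.0, "SO2": 20.0, "NO2": 30.0, "CO": 25.0, "O3": 15.0}

additive = additive_index(sub_indexes)      # low overall value hides the violation
maximum = max_operator_index(sub_indexes)   # violation is reported

print(f"additive index = {additive:.1f}, eclipsed = {additive < CRITICAL_VALUE}")
print(f"max-operator index = {maximum:.1f}")
```

In this example the additive value stays below the critical value although PM10 exceeds it, which is the eclipsing behaviour; the maximum operator reports the violation but ignores the other four sub-indexes entirely.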

Sampling of Air Pollutants
The sampling site in the present study was at the Jawaharlal Nehru University (JNU) Campus, an area situated in the south of Delhi (Fig. 1). This site is sandwiched between the natural vegetation of JNU and a major road carrying vehicular traffic. The daily traffic density is moderate to high, with peak periods found during morning and evening hours. Delhi, the capital city of India, with a population of 13.8 million in 2001 (Goyal and Sidhartha, 2003; India, 2006) and having approximately 3.5 million vehicles, is widely known to be one of the most polluted cities of the world.

In factor analysis, we have used the technique of Principal Component Analysis (PCA). The basic purpose of PCA is to account for the total variation among the 'n' number of subjects (variables) in p-dimensional space by forming a new set of orthogonal and uncorrelated composite variates. Each member of the new set of variates is a linear combination of the original set of measurements (Henry and Hidy, 1979; Lioy et al., 1989). The linear combinations are generated in such a manner that each of the successive composite variates will account for a smaller portion of the total variation. The first composite (principal component) will have the largest variance, the second will have a variance smaller than the first but larger than the third, and so on. If the first few principal components (or eigenvector-eigenvalue pairs) account for more than 60% of the total variance, then there is hardly any need to take more principal components (PCs) (Harman, 1968; Johnston, 1978; Dunteman, 1994) to compute the composite (overall) Air Quality Index. The higher order PCs explain only minimal amounts of total variance and are, therefore, treated as noise (Johnston, 1978; Dunteman, 1994; Kim and Mueller, 1994; Srivastava et al., 2008).

AQI by Using EPA Method (EPAQI)
The AQI measures the daily pollution index of the pollutants for which EPA has established National Ambient Air Quality Standards (NAAQS). The index combines the NAAQS with an epidemiological function to determine a descriptor of human health effects due to short-term exposure (24 hours or less) to each pollutant (EPA, 1994, 1997). The index for a pollutant is calculated using the mathematical expression (EPA, 1999):

$$I_P = \frac{I_{Hi} - I_{Lo}}{BP_{Hi} - BP_{Lo}} \left(C_P - BP_{Lo}\right) + I_{Lo} \qquad (1)$$

where $I_P$ = the index value for pollutant P; $C_P$ = the truncated concentration of pollutant P; $BP_{Hi}$ = the breakpoint that is $\geq C_P$; $BP_{Lo}$ = the breakpoint that is $\leq C_P$; $I_{Hi}$ = the AQI value corresponding to $BP_{Hi}$; $I_{Lo}$ = the AQI value corresponding to $BP_{Lo}$.
The indexes for each of the pollutants NO2, O3, PM10, CO and SO2 were obtained from Eq. (1) using their respective break points and associated AQI values (EPA, 1999). Having calculated $I_P$ for each pollutant, the EPAQI is taken as the maximum index value ($I_P$) among the individual pollutants.
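The sub-index calculation and the maximum operator can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sub-index is a linear interpolation between the breakpoints that bracket the truncated concentration, and the PM10 breakpoint table is assumed to follow EPA (1999) and should be checked against that document. The function names and the CO sub-index in the usage lines are hypothetical.

```python
# Minimal sketch of an EPA-style sub-index and the maximum operator.
# Assumption: the PM10 (24-hour, ug/m3) breakpoints below follow EPA (1999);
# verify them against the guideline before any real use.
# Each row is (BP_lo, BP_hi, I_lo, I_hi).
PM10_BREAKPOINTS = [
    (0, 54, 0, 50),
    (55, 154, 51, 100),
    (155, 254, 101, 150),
    (255, 354, 151, 200),
    (355, 424, 201, 300),
    (425, 504, 301, 400),
    (505, 604, 401, 500),
]


def sub_index(concentration, breakpoints):
    """Linear interpolation between the breakpoints that bracket C_P."""
    c = int(concentration)  # truncate; integer precision is assumed for PM10
    for bp_lo, bp_hi, i_lo, i_hi in breakpoints:
        if bp_lo <= c <= bp_hi:
            return (i_hi - i_lo) / (bp_hi - bp_lo) * (c - bp_lo) + i_lo
    raise ValueError(f"concentration {concentration} is outside the table")


def epaqi(sub_indexes):
    """Maximum operator: return (overall index, responsible pollutant)."""
    dominant = max(sub_indexes, key=sub_indexes.get)
    return sub_indexes[dominant], dominant


# Usage with one computed and one hypothetical sub-index.
indexes = {"PM10": sub_index(180.4, PM10_BREAKPOINTS), "CO": 42.0}
overall, pollutant = epaqi(indexes)
print(f"EPAQI = {overall:.1f} (determined by {pollutant})")
```

Only the dominant pollutant influences the result, which is the property of the maximum operator that the factor analysis approach is intended to relax.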
The method of principal components (PCs) can be applied by using the original values of the variables ($X_j$, where j = 1, 2, 3, …, n), or their deviations from their means ($x_j = X_j - \bar{X}_j$), or the standardized variables $z_j = (X_j - \bar{X}_j)/S_j$, where $S_j$ is the standard deviation (Harman, 1968; Johnston, 1978; Kim and Mueller, 1994; Koutsoyiannis, 2001).
In our analysis, we have used the raw data $X_j$ (j = 1, 2, 3, …, n). In the principal component model, $X_j$ is expressed as a linear combination of the principal components as

$$X_j = \sum_{i=1}^{n} a_{ji} P_i \qquad (2)$$

where j = 1, 2, …, n; $P_i$ is the i-th principal component; and $a_{ji}$ is the factor loading of the j-th variable on the i-th principal component. Basically, the factor loading is the j-th component of the i-th eigenvector of the correlation matrix multiplied by the square root of the corresponding eigenvalue. The principal components (Lohani, 1984) are given by

$$P_i = \sum_{j=1}^{n} \frac{a_{ji} X_j}{\lambda_i} \qquad (3)$$

where $\lambda_i$ is the eigenvalue associated with $P_i$.
After obtaining the principal components (PCs), the New Air Quality Index (NAQI) is computed using the expression given below:

$$NAQI = \frac{\sum_{i=1}^{n} P_i E_i}{\sum_{i=1}^{n} E_i} \qquad (4)$$

where n = 3; $P_1$, $P_2$, $P_3$ are the three principal components for which the cumulative variance is more than 60%; and $E_1$, $E_2$, $E_3$ are the initial eigenvalues ($\geq 1$) with respect to the 'percentage of variance'.
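The computation described in this section can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions, not the SPSS procedure used in the study: loadings are taken as eigenvectors of the correlation matrix scaled by the square root of their eigenvalues, each component score is the loading-weighted sum of the raw variables divided by its eigenvalue, components with eigenvalue of at least 1 are retained, and the index is the average of the retained scores weighted by their percentage of variance. The sign convention for the eigenvectors and the function name `naqi` are assumptions introduced here.

```python
import numpy as np


def naqi(X, min_eigenvalue=1.0):
    """Sketch of a PCA-based index for X of shape (samples, pollutants).

    Returns (index per sample, eigenvalues, loadings). Illustrative only.
    """
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)          # correlation matrix of pollutants
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # re-order to descending
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]

    # Assumption: eigenvector signs are arbitrary, so orient each one so that
    # its elements sum to a non-negative value.
    signs = np.sign(eigvecs.sum(axis=0))
    signs[signs == 0] = 1.0
    eigvecs = eigvecs * signs

    loadings = eigvecs * np.sqrt(eigvals)     # a_ji = e_ji * sqrt(lambda_i)

    keep = eigvals >= min_eigenvalue - 1e-12  # retain eigenvalue >= 1
    scores = (X @ loadings[:, keep]) / eigvals[keep]   # P_i = sum_j a_ji X_j / lambda_i
    weights = 100.0 * eigvals[keep] / eigvals.sum()    # E_i as % of variance
    index = (scores * weights).sum(axis=1) / weights.sum()
    return index, eigvals, loadings


# Usage on synthetic data (48 hourly records of 5 pollutants; values invented).
rng = np.random.default_rng(0)
common = rng.normal(size=(48, 1))
X_demo = 100.0 + 20.0 * common + rng.normal(scale=5.0, size=(48, 5))
index_demo, eigvals_demo, _ = naqi(X_demo)
print(index_demo[:3], eigvals_demo)
```

Because the weights are normalised by their sum, using eigenvalues or percentages of variance as $E_i$ gives the same index.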

RESULTS AND DISCUSSION
Using the one-hour average concentration data for each pollutant, NAQI and EPAQI have been determined. While computing the NAQI by the PCA method, we have retained components with eigenvalue ≥ 1.0 that together account for more than 60% of the variance. The components with eigenvalue < 1.0 explain less variance and are hence discarded (Johnston, 1978; Kim and Mueller, 1994). The results of PCA were obtained by using SPSS 10.0 software and the NAQI was calculated by using Eq. (4).

Seasonal Variation of NAQI and EPAQI
The variations of NAQI and EPAQI with respect to (w.r.t.) time (hours of the day) are depicted in Figs. 2-11. A perusal of these graphs reveals that NAQI follows almost the same trend as EPAQI for all the months. To characterise the air quality based on EPAQI during the days of various seasons, the frequency of occurrence of EPAQI in each index interval is determined. The percentage durations of the different categories of air quality are presented in Table 1.
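The frequency-of-occurrence calculation behind Table 1 can be sketched as below. The category names and upper bounds are assumed to be the six EPA (1999) categories on the 0-500 scale; the hourly values in the usage lines are invented, and the function names are hypothetical.

```python
from collections import Counter

# Assumed EPA (1999) categories as (upper bound of index interval, name).
CATEGORIES = [
    (50, "Good"),
    (100, "Moderate"),
    (150, "Unhealthy for sensitive groups"),
    (200, "Unhealthy"),
    (300, "Very unhealthy"),
    (500, "Hazardous"),
]


def categorise(aqi):
    """Return the category whose index interval contains the AQI value."""
    if aqi < 0:
        raise ValueError("AQI cannot be negative")
    for upper, name in CATEGORIES:
        if aqi <= upper:
            return name
    raise ValueError(f"AQI {aqi} is beyond the 0-500 scale")


def category_percentages(hourly_aqi):
    """Percentage of hours of the day spent in each category."""
    if not hourly_aqi:
        raise ValueError("no index values supplied")
    counts = Counter(categorise(v) for v in hourly_aqi)
    total = len(hourly_aqi)
    return {name: 100.0 * counts.get(name, 0) / total for _, name in CATEGORIES}


# Usage with invented hourly EPAQI values.
hourly = [40, 120, 160, 180, 210, 160, 90, 155]
for name, pct in category_percentages(hourly).items():
    print(f"{name}: {pct:.1f}%")
```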

Seasonal Variation of EPAQI
It is clearly seen from Table 1 and Figs. 2-11 that for the major part of the day, the air quality remains 'unhealthy' if not worse in all the seasons with the exception of monsoon.

Seasonal Variation of NAQI

Winter Season
The highest index values observed in the winter season (Figs. 5-8) can be attributed first to the meteorological conditions prevailing in north India and second to the low rate of removal of gaseous pollutants. This region is dominated by high pressure, usually centred over western China, causing increased atmospheric stability. This condition allows less general circulation and more stagnant air masses (Padmanabha Murthy, 1984; Aneja et al., 2001). Further, the rate of chemical removal of gaseous pollutants (e.g., NO2, CO, etc.) in the form of inorganic compounds such as HNO3, H2CO3, etc. is quite low due to the low concentration of OH-radicals, as the formation of OH-radicals depends on the intensity of light, which is generally moderate to weak in the winter season (Varshney and Singh, 2002; Kumar et al., 2008).

Summer Season and Monsoon Season
The high index values during the summer months (Figs. 9-11) can be attributed to the frequent severe dust storms covering the atmosphere of Delhi, in spite of the fact that during the summer season the mixing height is at its maximum (PROBES, 2003). During the monsoon month of September, the NAQI is comparatively very low (Fig. 2) due to precipitation, high wind velocities and changes in the general wind direction. The precipitation helps in wet deposition of pollutants. The changes in wind velocity and the reversal of its direction carry the pollutants away from sources and also increase the dilution of pollutant concentrations.

Post-monsoon season
In the post-monsoon months (Figs. 3-4), although the air quality is relatively better compared to the winter and summer months, there is an exception in the month of October (Table 2), which has a rather high value of NAQI (= 29.72). This is mainly on account of the lighting of massive fireworks during the two Hindu festivals, namely Dussehara and Deepawali (Attri et al., 2001; Kulshrestha et al., 2004), which happen to fall in the month of October.

Trends between NAQI and EPAQI
A careful examination of the trends of the EPAQI and NAQI reveals that the variation in NAQI closely follows that of EPAQI. Whereas the magnitude of the EPAQI is solely determined by the maximum value of the sub-index of a given pollutant, the value of NAQI takes into account the variances of the concentrations of the different pollutants. In a case where a particular pollutant is dominant (i.e., the concentration of the pollutant far exceeds the permissible standards) and also has large variances in its concentration values, both indexes are expected to have higher magnitudes. Since in the present study the dominant pollutant is PM10, which is associated with large variances in its concentration values, it is not surprising that the trends of EPAQI and NAQI are similar.

NAQI
The average values of NAQI and EPAQI over a month have also been computed from the hourly values of these two indexes. These results (i.e., the average index values) for different seasons are summarised in Table 2. The values of the two indexes for the prescribed standards of pollutants as per Central Pollution Control Board (CPCB) norms are also given in the last column of Table 2. The various months have been ranked in terms of NAQI with respect to the rank of the standard NAQI (taken as 1). An examination of NAQI values indicates that the air quality is worst in the winter months of December (rank 11 with NAQI = 36.12) and March (rank 10 with NAQI = 30.11), followed by the summer months (April, May, June) and the post-monsoon months (October, November). The least value of NAQI (= 12.97) is observed for the monsoon month of September.

CONCLUSIONS
The proposed index (NAQI) is basically an air stress index with no established standards, i.e., the index would not show a pronounced relation to the health of the people. Therefore, it is not possible to characterise the air quality associated with the values of NAQI or to draw any definitive inferences about the category of air quality as in the case of EPAQI. But it has the advantage of self-consistency, as it combines the synergistic effects of all the five criteria pollutants. To evaluate its self-consistency w.r.t. the health outcome matrix, we applied this index by using disability adjusted life years (DALYs) as the common metric of health effect, which is widely used in burden of disease estimates (Cohen et al., 2005). However, this calculation requires considerable additional information on the health status of the exposed population. The NAQI is very useful in defining the status of air in relative terms. Similarly, by comparing NAQI values, one could evaluate the air quality status of different locations in relative terms. For instance, if the value of NAQI has increased at a given location, it would mean worsening of the air quality, and vice versa. NAQI can also be used to ascertain whether the air quality has worsened or improved over the months in different seasons as represented by their

Fig. 1. Map of the study area.

Fig. 5. Variation of NAQI and EPAQI w.r.t. time (hours of the day) on 28.12.03.

Table 1. Frequency of occurrence of air quality categories during the day in different seasons (frequency of occurrence during the day, in %).

Table 2. Comparison of average index values of EPAQI and NAQI over a month in different seasons for Delhi.