Concentration at a Traffic Site in Delhi: Source Identification Using Persistence Analysis and Nonparametric Wind Regression

The sources and origin of the ambient PM2.5 concentration at a traffic site in Delhi were studied using persistence analysis and the nonparametric wind regression (NWR) technique. The analysis was performed for the original PM2.5 data observed during 2007–2009, for the data after removing seasonal and trend patterns (PM2.5-AR1), and for the exceedance time series. Detrended fluctuation analysis showed strong persistence in the original and exceeded PM2.5 time series. This behavior was linked with the self-organized criticality of the process generating PM2.5 concentrations. NWR analysis was carried out to infer the sources of PM2.5 concentrations in the area. Power plants and medium scale industries, along with local transport emissions, were found to be responsible for PM2.5 emissions at the site. Analysis of seasonal variations showed significant contributions from medium scale industries and power plants in winter, and from dust storms and industries in summer. An analysis of the results obtained during calm conditions suggests the dominance of local transport emissions along with the above sources of PM2.5 concentrations at the site.


INTRODUCTION
Fine particulate matter poses a serious health risk to exposed individuals. These particles penetrate into the lungs and can cause respiratory illness and lung-related diseases (Dockery et al., 1993). The study of trends and variability in the concentration of these particulates in ambient air helps in implementing proper control plans. Identifying the sources of particulate matter likewise helps in developing mitigation strategies. Source apportionment is an important task in air pollution management and control, which can lead to better policy options for controlling concentration levels. Several techniques exist to apportion the sources of air pollutants present in an area, including chemical mass balance (CMB), factor analysis with multiple regression (FA-MR), Unmix, positive matrix factorization (PMF), back-trajectory analysis, conditional probability function and nonparametric wind regression (Bilkis et al., 2010). The first four techniques have been extensively used in the literature for particulate matter source apportionment. They require information on the chemical composition of particulate matter in ambient air, near the source, or both, and use it to apportion the sources present in an area. The latter techniques mainly use meteorological data to apportion the sectors contributing to the pollution in an area; from this sector apportionment one can infer the sources present in the prevailing sector. Meteorological and concentration data at the receptor are also used together to locate the sources influencing the local air quality of an area. Along these lines, Henry et al. (2009) described a hybrid source-receptor model that locates and quantifies local sources of air pollution through nonparametric regression of 1 h average atmospheric concentrations of a pollutant on hourly resultant wind speed and direction. The advantage of the model, termed 'nonparametric wind regression', is that it uses only wind velocity data and requires no knowledge of chemical composition or emission inventory.
The study of the temporal evolution of air pollutant concentrations is important as it gives insight into the internal dynamics. Several approaches, including fractal analysis, correlation integral, rescaled-range analysis and detrended fluctuation analysis (DFA), have been used in the recent past to infer the intrinsic behavior of such time series (Windsor and Toumi, 2001; Lee, 2002; Varotsos et al., 2005; Lu and Wang, 2006; Weng et al., 2008; Chelani, 2009). These studies have shown that time series of air pollutant concentrations possess fractal (Lee, 2002), persistent (Windsor and Toumi, 2001) and scale-invariant behavior.
In India, fine particulate concentration has reached alarming levels, specifically in urban areas. The increase in the number of vehicles and power plants, coal combustion and resuspension of road dust, along with other anthropogenic activities, are raising the fine and coarse particulate concentrations. Delhi (28°35'N; 77°12'E), the capital of India, has witnessed an increase in population from 9.5 to 13.8 million during the last decade due to major political activities and urbanization. The number of vehicles was approximately 35 lakhs in 2001 and increased to 60 lakhs in 2009 (Kumar and Anand, 2012). This tremendous vehicular growth has resulted in high concentrations of air pollutants. The region has a tropical semi-arid climate with extremely hot summers (average temperature 46°C). Heavy rainfall in the monsoon months (~73 cm) and extreme cold in the winter months (average temperature 10°C) are generally observed in the area. Winds from the NE-NW and SE-SW sectors prevail in winter and summer, respectively. The major sources of particulate matter pollution are industries, power plants, vehicular emissions and dust storms from Rajasthan. Poor ventilation, low wind speed and temperature inversion cause the air pollutants, emitted mainly from traffic, to remain trapped near the ground during winter. Very few studies have been conducted on fine particulate matter in the ambient air of Delhi. Kumar et al. (2007) examined the relationship between aerosol optical depth (AOD) estimated from satellite data and the PM2.5 monitored in the Delhi metropolitan area. Chowdhury et al. (2007) inferred that primary emissions from fossil fuel combustion (coal, diesel and gasoline) were responsible for about 25-33%, and biomass combustion for 7-20%, of PM2.5 mass in Delhi. In a study conducted by the Central Pollution Control Board, New Delhi and the National Environmental Engineering Research Institute, Nagpur at 10 locations in Delhi, the domestic emission contribution to PM2.5 was observed to be dominant over traffic and industries (http://www.cpcb.nic.in/Source_Apportionment_Studies.php). The study was, however, restricted to very few samples and requires further analysis and inference. Kaushar et al. (2013) developed a system for air quality forecasting and research (SAFAR) for coarse and fine particulate matter during the 'Commonwealth Games, 2010' in Delhi. They observed that the 24 h PM2.5 concentration was around or below the national ambient air quality standard (60 µg/m³) at some sport complexes, whereas at the other sites it fluctuated between 60 and 80 µg/m³.
In this study, the 24-hourly PM2.5 concentration observed over 2007-2009 at a traffic site in Delhi is statistically analyzed to infer its sources and temporal characteristics. Two approaches, nonparametric wind regression and detrended fluctuation analysis, are applied to the 24-hourly PM2.5 concentrations to probe the local and regional sources contributing to fine dust pollution. The intent behind the persistence analysis is to detect significant correlation between adjacent points; if such correlation exists, it can be used to approximately infer the nature of the sources contributing to the PM2.5 pollution in an area. The underlying notion is that combining two independent techniques, one based on temporal analysis and the other on source identification, enables a better understanding of the behavior and origin of particulate matter when information on its chemical composition is not available.

STUDY AREA AND DATA USED
The Central Pollution Control Board (CPCB), New Delhi (www.cpcb.nic.in) continuously monitors particulate and gaseous pollutant concentrations at a few sites in Delhi. The 24-hourly PM2.5 concentration data observed during 2007 to 2009 at the traffic site named 'ITO', located in the north of the city, are selected for the analysis. The site has congested traffic most of the time and has intersections connecting four major roads. The sampling site is located on the road named Bahadur Shah Zafar Marg, which has two lanes of width 7.5 m in each direction. Approximately 113,000-176,000 vehicles pass along this road daily. The meteorological data, specifically wind speed and wind direction at the nearby airport, are obtained from the India Meteorological Department, New Delhi.

NONPARAMETRIC WIND REGRESSION
For nonparametric wind regression, the pollutant concentration C at a receptor is expressed as a function of the wind direction d observed during the corresponding time period. The expected value of C for a particular d is given by

$$\bar{C}(d, \sigma) = \frac{\sum_{i=1}^{N} K\left(\frac{d - D_i}{\sigma}\right) C_i}{\sum_{i=1}^{N} K\left(\frac{d - D_i}{\sigma}\right)}$$

where K is the kernel function, represented by the Gaussian or the Epanechnikov function, given respectively as

$$K(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^{2}}{2}\right), \quad -\infty < x < \infty$$

$$K(x) = 0.75\,(1 - x^{2}), \quad -1 \le x \le 1$$

Here C_i is the observed pollutant concentration, σ is the smoothing parameter, and D_i is the predominant wind direction (in degrees) for the i-th observation, with N being the total number of observations. The smoothing parameter σ can be obtained by cross-validation, computing the sum of squared differences between the measured and estimated concentrations while leaving out one observation at a time (Henry et al., 2002). The details of the procedure and the selection criteria for the kernel function are given in Henry et al. (2002). The expected value estimated using the Gaussian and Epanechnikov kernels is shown to be the true estimator of concentration C under certain conditions (Henry et al., 2002). The results are, however, insensitive to the choice of the kernel function (Henry et al., 2009).
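The kernel estimator and the leave-one-out selection of σ described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the function names are hypothetical, and wrapping the direction difference onto [-180°, 180°) is an assumption made here so that, for example, 350° and 10° are treated as 20° apart.

```python
import math


def gaussian_kernel(x):
    """Gaussian kernel K(x) = exp(-x^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def epanechnikov_kernel(x):
    """Epanechnikov kernel K(x) = 0.75 * (1 - x^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0


def angular_difference(a, b):
    """Signed difference a - b in degrees, wrapped onto [-180, 180).

    Assumption of this sketch: wind directions are treated as circular.
    """
    return (a - b + 180.0) % 360.0 - 180.0


def nwr_estimate(d, directions, concentrations, sigma,
                 kernel=gaussian_kernel, skip=None):
    """Kernel-weighted mean concentration for wind direction d (degrees).

    `skip` optionally names one observation index to leave out.
    """
    numerator = 0.0
    denominator = 0.0
    for i, (d_i, c_i) in enumerate(zip(directions, concentrations)):
        if i == skip:
            continue
        weight = kernel(angular_difference(d, d_i) / sigma)
        numerator += weight * c_i
        denominator += weight
    return numerator / denominator if denominator > 0.0 else float("nan")


def loo_cv_error(directions, concentrations, sigma, kernel=gaussian_kernel):
    """Sum of squared leave-one-out errors for a given smoothing parameter."""
    total = 0.0
    for i, (d_i, c_i) in enumerate(zip(directions, concentrations)):
        estimate = nwr_estimate(d_i, directions, concentrations, sigma,
                                kernel, skip=i)
        total += (c_i - estimate) ** 2
    return total
```

A smoothing parameter could then be chosen by evaluating `loo_cv_error` over a grid of candidate σ values and keeping the one with the smallest error.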

DETRENDED FLUCTUATION ANALYSIS
The presence of long-range correlations, or persistence, in a time series can be detected using DFA. A persistent time series tends to follow the direction of its previous observations, whereas an anti-persistent time series tends to reverse direction. DFA permits the detection of intrinsic self-similarity embedded in a nonstationary time series and avoids the spurious detection of apparent self-similarity (Shi and Liu, 2009). It calculates the root-mean-square fluctuation of the integrated and detrended time series. To apply the DFA algorithm, the time series y(i), i = 1, 2, …, N, is first integrated as (Peng et al., 1994)

$$z(k) = \sum_{i=1}^{k} \left[ y(i) - \bar{y} \right]$$

where \bar{y} is the mean of y(i) over all the samples, k = 1, 2, …, N, and N is the length of the time series. The integrated time series is divided into segments of equal length n and a least-squares line is fitted to the data in each segment. The y-coordinate of the straight-line segments is denoted by z_n(k), which is used to detrend the integrated time series z(k) as z(k) - z_n(k) in each segment.
The root-mean-square fluctuation of the integrated and detrended time series is calculated by (Peng et al., 1994)

$$F(n) = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left[ z(k) - z_n(k) \right]^{2}}$$

Repeating the computation for all segment sizes gives the relationship between the average fluctuation F(n) and the segment size n, which is typically an increasing function. A linear relationship on a log-log graph indicates the presence of scaling, i.e., F(n) ~ n^α, where the scaling exponent α is obtained as the slope of the line fitted over all the segment sizes. The scaling exponent indicates the nature of the time series. A value 0 < α < 0.5 indicates power-law anti-correlations in the time series, whereas 0.5 < α < 1 suggests long-range power-law correlations. The time series corresponds to white noise if α = 0.5. Sometimes α lies between 1 and 1.5, which suggests stronger long-range correlations.
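The DFA steps described above (integration, segment-wise linear detrending, root-mean-square fluctuation, and the log-log slope) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the code used in the study: the function names are hypothetical, non-overlapping segments are used, and samples left over after the last full segment are simply dropped.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_fluctuation(y, n):
    """Root-mean-square fluctuation F(n) for segment length n."""
    mean_y = sum(y) / len(y)
    # Integrate the mean-subtracted series: z(k) = sum_{i<=k} [y(i) - mean].
    z = []
    running = 0.0
    for value in y:
        running += value - mean_y
        z.append(running)
    n_segments = len(z) // n  # assumption: trailing remainder is dropped
    xs = list(range(n))
    squared = 0.0
    for s in range(n_segments):
        segment = z[s * n:(s + 1) * n]
        slope, intercept = linear_fit(xs, segment)
        # Subtract the local linear trend z_n(k) and accumulate squares.
        squared += sum((segment[j] - (slope * xs[j] + intercept)) ** 2
                       for j in range(n))
    return math.sqrt(squared / (n_segments * n))


def dfa_exponent(y, segment_sizes):
    """Scaling exponent: slope of log10 F(n) against log10 n."""
    log_n = [math.log10(n) for n in segment_sizes]
    log_f = [math.log10(dfa_fluctuation(y, n)) for n in segment_sizes]
    alpha, _ = linear_fit(log_n, log_f)
    return alpha
```

For uncorrelated noise the exponent returned by `dfa_exponent` should be close to 0.5, while a random walk built from the same noise should give a value near 1.5.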

RESULTS AND DISCUSSION
The PM2.5 concentration depicted in Fig. 1 shows strong periodicity, with high concentrations during winter (Dec, Jan and Feb) and lower concentrations in monsoon (July, Aug, Sept). The annual average PM2.5 concentration is 103 ± 78 µg/m³, whereas during winter, summer, monsoon and post monsoon it is 168 ± 78 µg/m³, 88 ± 56 µg/m³, 42 ± 20 µg/m³ and 128 ± 86 µg/m³, respectively. The observed values are comparable with the concentrations observed in various studies at urban sites in Delhi, which are given in Table 1. Apparent non-stationarity with fluctuations may affect the results of the statistical analysis. Hence, the analysis is performed both for the original PM2.5 data and after removing seasonal and trend patterns. For this, an AR1 process is fitted to the data and the generated time series is subtracted to remove any periodicities (the result is termed PM2.5-AR1). The persistence analysis is also carried out for the exceedance time series. For this, the standard threshold of 60 µg/m³ provided by the Central Pollution Control Board, New Delhi has been used (http://cpcb.nic.in/National_Ambient_Air_Quality_Standards.php). The concentrations exceeding this standard form a new time series termed the 'exceeded PM2.5' time series. Around 42% exceedances were observed; the resulting time series also showed fluctuations, though less pronounced than in the original one.
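The two derived series described above can be illustrated with a short sketch. The paper does not state how the AR1 process was estimated, so the least-squares fit of x(t) = c + φ·x(t-1) used here is an assumption, as are the function names; the 60 µg/m³ default threshold is the standard quoted in the text.

```python
def fit_ar1(x):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (phi, c).

    Assumption of this sketch: the estimation method is not given in the text.
    """
    previous = x[:-1]
    current = x[1:]
    n = len(previous)
    mean_prev = sum(previous) / n
    mean_curr = sum(current) / n
    sxx = sum((p - mean_prev) ** 2 for p in previous)
    sxy = sum((p - mean_prev) * (c - mean_curr)
              for p, c in zip(previous, current))
    phi = sxy / sxx
    return phi, mean_curr - phi * mean_prev


def remove_ar1(x):
    """Subtract the fitted AR1 series; returns residuals for t = 1..N-1."""
    phi, c = fit_ar1(x)
    return [x[t] - (c + phi * x[t - 1]) for t in range(1, len(x))]


def exceedance_series(x, threshold=60.0):
    """Keep only the concentrations strictly above the threshold (ug/m3)."""
    return [value for value in x if value > threshold]
```

Both derived series can then be passed to the same persistence analysis as the original concentrations.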
The results of the detrended fluctuation analysis are given in Fig. 2, which shows log n against log F(n), i.e., the root-mean-square fluctuations. The slope of the straight line fitted to the curve of log n vs. log F(n) is > 0.5, suggesting the presence of strong persistence in the PM2.5 concentration even after removing the periodicity. α > 0.5 is also observed for the exceeded time series and for the AR1-removed data. The presence of strong persistence suggests long memory (1/f noise) or temporal dependence in the original PM2.5 and 'exceeded PM2.5' time series. To investigate the significance of the crossover in the log-log plot of root-mean-square fluctuations against n, the slope is calculated before and after the crossover. Before the break, α is observed to be 1.1645, 0.6126, 1.0353 and 0.7242 for the PM2.5, PM2.5-AR1, exceeded PM2.5 and exceeded PM2.5-AR1 time series, respectively. After the break point, α is found to be < 0.5 for all four time series. Long memory is observed up to 330 (~1 year), 160 (~5 months), 120 (~4 months) and 180 (~6 months) days in the PM2.5, PM2.5-AR1, exceeded PM2.5 and exceeded PM2.5-AR1 time series, respectively. These results suggest that even if the periodicity is removed, persistence remains in the PM2.5 concentrations, although over a shorter range. This means the observed persistence is not due to linear correlations but represents the inherent temporal correlation structure.
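The comparison of slopes before and after the crossover can be sketched by fitting two separate least-squares lines to the log-log points on either side of a chosen break. The function names are hypothetical, and the break size is supplied by the user in this sketch (for example, read off the plot); the paper does not describe how the break point was located.

```python
import math


def fit_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def crossover_slopes(segment_sizes, fluctuations, break_size):
    """Scaling exponents before and after a user-chosen crossover.

    Points with n <= break_size feed the first fit, points with
    n >= break_size the second, so the break point is shared by both.
    """
    before_x, before_y, after_x, after_y = [], [], [], []
    for n, f in zip(segment_sizes, fluctuations):
        log_n, log_f = math.log10(n), math.log10(f)
        if n <= break_size:
            before_x.append(log_n)
            before_y.append(log_f)
        if n >= break_size:
            after_x.append(log_n)
            after_y.append(log_f)
    return fit_slope(before_x, before_y), fit_slope(after_x, after_y)
```

Each side of the break needs at least two segment sizes for the corresponding slope to be defined.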
The presence of power-law scaling and persistence has been linked to self-organized criticality (SOC) in many systems (Shi and Liu, 2009). In SOC, many interacting factors tend to organize the system around a critical point, and the internal dynamics of the system play a major role. PM2.5 concentration is the result of many interacting factors such as the variety of source emissions, meteorology, climate and geography. The nature of the interactions among these factors and the presence of background concentrations govern the temporal variations in PM2.5 concentration levels. The presence of persistence in the time series of PM2.5 concentration suggests uniformity in the mechanism generating the concentrations over time. This in turn may signify uniformity of the sources over time. It also suggests that the dilution capability of the atmosphere is uniform over time and that calm conditions prevail during the period. Calm conditions were observed for approximately 64% of the PM2.5 observations during the study period. Another conjecture is that, because the conditions governing the pollutant levels are uniform, the concentrations tend to multiply over the scale of measurement: the pollutants emitted from the source remain in the air and multiply over time. However, this multiplication of pollutant concentration occurs only up to a certain limit, beyond which self-organization occurs.
Although the nature of the sources of pollutants cannot be judged from the persistence in the time series of pollutant concentration, persistence gives an indication of uniformity in the temporal variations. In order to infer the nature of the sources of PM2.5 concentrations in the area, NWR analysis is carried out. For this, the daily mean wind speed along with the predominant wind direction for the day is considered. A pollution rose is also plotted to show the distribution of the pollutant across directions. It can be observed from Fig. 3(a) that the PM2.5 concentration is not markedly higher in any one direction; rather, the concentrations are distributed almost equally in all directions and are therefore locally originated. For PM2.5 > 60 µg/m³ and PM2.5 > 100 µg/m³, the major contribution is from sources located in the NW-NE sector and SE directions (Fig. 3(b) and 3(c)).
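A pollution-rose style summary of the kind described above can be sketched by binning the daily records into 16 direction sectors and averaging the concentration in each, while counting calm days separately. The record layout, the function names and the 0.5 m/s calm cutoff are assumptions of this sketch; the paper does not state the wind speed below which a day was classed as calm.

```python
SECTORS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
           "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def sector_of(direction_deg):
    """Map a wind direction in degrees to one of 16 sectors of 22.5 degrees."""
    index = int(((direction_deg % 360.0) + 11.25) // 22.5) % 16
    return SECTORS[index]


def pollution_rose(records, calm_speed=0.5, threshold=None):
    """Mean concentration per sector and the fraction of calm records.

    records: iterable of (wind_speed, wind_direction_deg, concentration).
    calm_speed: hypothetical cutoff (m/s) below which a record is calm.
    threshold: if given, only concentrations above it are considered.
    """
    sums = {}
    counts = {}
    calm = 0
    total = 0
    for speed, direction, concentration in records:
        if threshold is not None and concentration <= threshold:
            continue
        total += 1
        if speed < calm_speed:
            calm += 1  # calm records carry no meaningful direction
            continue
        name = sector_of(direction)
        sums[name] = sums.get(name, 0.0) + concentration
        counts[name] = counts.get(name, 0) + 1
    means = {name: sums[name] / counts[name] for name in sums}
    calm_fraction = calm / total if total else float("nan")
    return means, calm_fraction
```

Calling `pollution_rose(records, threshold=60.0)` restricts the summary to days above the 60 µg/m³ standard, mirroring the exceedance subsets discussed in the text.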
To obtain a clearer picture of the wind directions associated with the concentrations, NWR is carried out. A smoothing parameter of σ = 22.7 is used with the Gaussian kernel function over the 16 wind directions. The results of the NWR analysis are given in Fig. 4, where the calculated concentration values are shown on the y-axis. The NWR plot for the whole PM2.5 time series shows the significance of the NE and NW directions. The contribution from NE and NW for PM2.5 > 60 µg/m³ and from NE, SE and SW-W for PM2.5 > 100 µg/m³ is observed in Fig. 4(a) and 4(b). For the PM2.5 > 200 µg/m³ time series, the SE-S sector is found to be dominant. The high concentrations of PM2.5 in the study area come from these directions.
The site is a major traffic intersection and receives emissions from transport, nearby industries and power plants. The outer ring road is located within ½ km of the site. On the NE side, medium scale industries such as the Ghaziabad industrial estate and the Shahdara industrial estate are located, whereas on the NW side, small scale industries are located. Hence, in addition to the local vehicular contributions, which are evident from the frequent calm occurrences (64.7%), industrial contributions to PM2.5 mass during 2007-2009 are significant at the site. For PM2.5 > 60 µg/m³, similar findings are noticeable (Fig. 4(a)).
Various power plants (e.g., Badarpur power station) and medium scale industries such as the Okhla industrial estate are located to the SE of the site (see Fig. 5 for the location of major industries). Emissions from the power plants are known to be the major contributor to the particulate load in Delhi (Gurjar et al., 2004). The Thar desert is located to the west. Although the dust storms from the nearby desert bring mainly coarser particles into the area, the possibility that they also bring fine particulates cannot be ruled out. Power plants, along with medium scale industries and desert dust, contribute to PM2.5 > 100 µg/m³.
Comparing the findings with other studies in Delhi, which are given in Table 1, Tiwari et al. (2009) observed coarse particulate emissions from the Rajasthan desert along with continental dust emissions. Emissions from the NW and west regions were also observed by Goyal and Sidhartha (2002). To confirm the above observations, the NWR is also plotted for high PM2.5 levels, i.e., PM2.5 > 300 µg/m³, which suggested the prevalence of calm conditions (around 77%) along with a predominant wind direction from W-NW. The desert and a large number of small-scale industries are located in the W-NW direction. Tiwari et al. (2008) observed, using back-trajectory analysis, that the air parcels were impacted by emissions from the surrounding industrial locations, originating from the west and north-west and from other locations in the east where power plants are located. In this study, the contribution of local sources is the highest, followed by power plants and industries, with the lowest contribution from the nearby desert, as suggested by the high percentage of calm conditions, which is approximately 67% for PM2.5 > 60 µg/m³, 74% for PM2.5 > 100 µg/m³, 84% for PM2.5 > 200 µg/m³ and 77% for PM2.5 > 300 µg/m³. Guttikunda (2009) observed the primary nature of air pollutant concentrations at the study site. Here the percentage of calms for the whole study period is approximately 64% and the corresponding average contribution is 112 µg/m³, which is the highest among all the directions. The results of the NWR analysis are also compared with back-trajectory analysis, which is widely used for source or origin apportionment. For this, the back-trajectories are computed using the Hybrid Single Particle Lagrangian Integrated Trajectory (HYSPLIT) model (Draxler and Rolph, 2011) developed by NOAA (http://www.arl.noaa.gov/ready/hysplit4.html). The analysis was performed with the GDAS meteorological dataset, a starting time of 0000 UTC, an altitude of 500 m above ground level and a total run time of 24 h for the few high values of PM2.5 > 300 µg/m³. The results are given in Fig. 6, which shows the arrival of winds predominantly from the NW direction, confirming the findings of the NWR analysis.

Fig. 5. Location of major industries (legend: sampling site, medium scale industries, small scale industries, power plant).

NWR Analysis of Seasonal Variations
NWR is also applied to the seasonal time series to infer the season-wise emission sources. For this, four seasons, viz., winter (Jan, Feb, Dec), summer (Mar-June), monsoon (July-Sep) and post monsoon (Oct-Nov), are considered. The NWR plot (Fig. 7) for winter depicts almost similar variations for the whole PM2.5, PM2.5 > 60 µg/m³ and PM2.5 > 100 µg/m³ series, with significant contributions from the NE and SW directions. For PM2.5 > 200 µg/m³, the S and SE directions are observed to contribute significantly to the PM2.5 mass. Medium scale industries in the NE and power plants in the SW contribute to the PM2.5 pollution in Delhi. During summer, significant contributions are observed from the NE, W and NW directions for the PM2.5 and PM2.5 > 60 µg/m³ time series. For PM2.5 > 100 µg/m³ and PM2.5 > 200 µg/m³, no significant contributions from any direction are observed. This suggests that the dust storms from the western Thar Desert, emissions from the small-scale industries located in the NW and emissions from the medium and large industries in the NE region are the contributors of PM2.5 in summer. During monsoon, S and southwesterly winds prevail for the PM2.5 time series during the study period, whereas for PM2.5 > 60 µg/m³, the SE and NW directions prevail, but to a lesser extent. During post monsoon, no wind direction is significant.
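The seasonal split defined above can be expressed as a simple month-to-season lookup, so that each season's records can be passed to the wind regression separately. The record layout and function name are hypothetical; only the month groupings come from the text.

```python
# Month groupings as defined in the text: winter (Dec, Jan, Feb),
# summer (Mar-June), monsoon (July-Sep), post monsoon (Oct-Nov).
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "summer", 4: "summer", 5: "summer", 6: "summer",
    7: "monsoon", 8: "monsoon", 9: "monsoon",
    10: "post monsoon", 11: "post monsoon",
}


def split_by_season(records):
    """Group records by season.

    records: iterable of (month, wind_direction_deg, concentration),
    with month given as an integer 1..12 (hypothetical layout).
    Returns a dict mapping season name to a list of
    (wind_direction_deg, concentration) pairs.
    """
    seasons = {"winter": [], "summer": [], "monsoon": [], "post monsoon": []}
    for month, direction, concentration in records:
        seasons[SEASON_BY_MONTH[month]].append((direction, concentration))
    return seasons
```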

CONCLUSION
The sources and origin of the PM2.5 concentration observed during 2007-2009 at a traffic site in Delhi are identified using persistence analysis and the nonparametric wind regression technique. The analysis is performed for the original PM2.5 data, for the data after removing seasonal and trend patterns (PM2.5-AR1), and for the exceedance time series. Detrended fluctuation analysis suggested the presence of strong persistence in the original and exceeded PM2.5 time series. Long memory is observed up to 330 (~1 year), 160 (~5 months), 120 (~4 months) and 180 (~6 months) days in the PM2.5, PM2.5-AR1, exceeded PM2.5 and exceeded PM2.5-AR1 time series, respectively. The presence of persistence is linked with the self-organized criticality of the process generating the time series of PM2.5 concentrations, which suggests uniformity in the mechanism generating the concentrations over time. As a result, the concentrations tend to multiply over the scale of measurement: the pollutants emitted from the source remain in the air and multiply over time. NWR analysis is carried out to infer the nature of the sources of PM2.5 concentrations in the area. The power plants and medium scale industries in the SE and NE and the Thar desert to the west, along with local transport emissions, are found to be responsible for PM2.5 emissions at the site. Analysis of seasonal variations showed significant contributions from medium scale industries and power plants in winter, and from dust storms and industries in summer. The analysis of calm conditions suggested the dominance of local transport emissions along with the above sources of PM2.5 concentrations at the site. The combination of two different techniques thus facilitates source apportionment even if data on the chemical composition of particulate matter are not available and the only reliable sources of information are meteorological data, specifically wind velocity, and air pollutant concentrations over time. The approach can also be applied to other pollutants.

ACKNOWLEDGEMENT
The author is thankful to the anonymous reviewers for constructive comments that helped improve the manuscript. The author is also thankful to CPCB, New Delhi for providing valuable data.

Fig. 6. Back-trajectories at a site in Delhi for a few days of PM2.5 > 300 µg/m³ during 2007-2009, with a starting time of 0000 UTC and an altitude of 500 m AGL, calculated using the NOAA HYSPLIT model with the GDAS meteorological data set, a total run time of 24 h and location (28.54°, 77.188°). The origin of the trajectories is marked with a dot and the trajectories are marked in red.

Table 1. Studies on PM2.5 concentrations in Delhi, India.