Evaluation of PM2.5 Surface Concentrations Simulated by NASA’s MERRA Version 2 Aerosol Reanalysis over India and its relation to the Air Quality Index

PM2.5 (particulate matter with a diameter ≤ 2.5 μm), an essential component of air pollution, is closely linked to adverse effects on human health, including premature mortality following prolonged exposure. However, limited surface measurements and the lack of monitoring with adequate spatial resolution hamper studies of air pollution and its impact on various societally relevant issues. More recently, the National Aeronautics and Space Administration (NASA)’s Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2) has begun estimating the global distribution of PM2.5 mass concentrations at high spatio-temporal resolutions, but the applied estimation methodologies must be carefully evaluated in order to understand their strengths and weaknesses. This study assesses MERRA-2’s PM2.5 results by comparing them with ground-based measurements conducted at 20 stations across the Indian region between 2015 and early 2018. Our analysis shows that MERRA-2 generally underestimates PM2.5 in terms of both the mass concentration and the number of exceedance days. While the Central Pollution Control Board (CPCB) measured exceedances of the national ambient air quality standards (NAAQS) on 34% of the days, MERRA-2 predicted exceedances on only 11%, and its estimate of the annual average PM2.5 concentration across all of the sites was also negatively biased, by ~27 μg m⁻³. Correlations of 0.96 and 0.6 were found between the estimates and the measurements for the monthly and the daily averaged concentrations, respectively; these numbers can be dramatically improved by applying a simple bias correction. Overall, our evaluation reveals that MERRA-2’s raw estimates of PM2.5 on a monthly time scale or longer are helpful in long-term air quality studies.


INTRODUCTION
Air quality has emerged as one of the most critical issues in recent times due to the increase in the vehicular population (Shrivastava et al., 2013), industrialization (Begum and Harikrishna, 2010), rising energy demand and emissions from coal power plants (Guttikunda and Jawahar, 2014), residential cooking (Massey et al., 2009, 2012) and dust transport from semi-arid and arid regions (Prospero et al., 2002; Dey et al., 2004; Huang et al., 2018). On a global scale, 4-8% of premature deaths are attributed to PM2.5 (Smith and Jantunen, 2002). PM2.5 can also cause a wide range of diseases, such as asthma attacks and lung cancer, which may significantly shorten human life (Kampa and Castanas, 2008; Kim et al., 2015). Particulate pollution also poses a challenge to the satellite retrieval of surface properties such as greenness volume, skin temperature, and the urban fraction (Frick and Tervooren, 2019; Gogoi et al., 2019). India's Air (Prevention and Control of Pollution) Act, 1981 imposes stringent air quality measures based on national ambient air quality standards (NAAQS). However, most of the Indian region, especially the Indo-Gangetic Plain (IGP), faces heavy air pollution throughout the year. Pollutant levels sometimes reach several times the national standards (CPCB, 2012). Of the several air pollutants, PM2.5 is a significant component, especially over the Indian region, where it leads to a significant deterioration of air quality. Studies have shown that this particulate pollution is linked to increased mortality over the Indian region (Apte et al., 2015). In the central part of India, the average value of PM2.5 remains as high as ~150 µg m⁻³ (Massey et al., 2009). In Delhi, PM values exceed the NAAQS 85% of the time (Sahu and Kota, 2017), and the city ranks among those with the worst air quality in the world (Gupta et al., 2006).
In India, the average loss of life expectancy due to PM2.5 is about 3.4 ± 1.1 years, with Delhi showing the largest decrease, 6.3 ± 2.0 years. In 2011, about 570,000 premature deaths were attributed to PM2.5 pollution (Ghude et al., 2016; Balakrishnan et al., 2019).
Despite the significant role of PM2.5 in the overall air quality and health of the population, few long-term surface-based measurements of it are available in the public domain over the Indian region. Studies have explored the possibility of using satellite-based aerosol optical depth (AOD) to estimate ground-level PM2.5 and its variability at high temporal frequency and spatial resolution. However, uncertainties related to aerosol retrieval errors over land, the vertical distribution of aerosols, and aerosol growth due to water uptake limit these PM2.5 retrieval efforts (Sorek-Hamer et al., 2013). On the other hand, reanalysis datasets have generated PM2.5 fields at high spatio-temporal resolutions (e.g., MACC and MERRA). Since these estimates are derived from the simulation of PM2.5 using a complete aerosol life cycle with state-of-the-art models, they are less prone to the errors that affect satellite-based retrievals. However, the accuracy of a model or chemical reanalysis depends primarily on its ability to simulate the meteorology, the aerosols, and their interaction (Zhang et al., 2016). In addition, processes that are less constrained in models, such as rainfall, could alter both the column (Pandey et al., 2017; Moteki et al., 2019; Yu et al., 2019) and the ground-level aerosol mass concentration, which partly constitutes the PM2.5 (Wu et al., 2018). In this regard, it may be mentioned that an earlier version of MERRA, MERRA-1, has been evaluated over the USA, Europe, Taiwan, Israel, and India (Buchard et al., 2016; Provençal et al., 2017a, b; Mahesh et al., 2019). However, MERRA-2, an advanced version of MERRA, has not been evaluated, especially over the highly dynamic Indian region. One of the primary reasons for the absence of such a careful evaluation over the Indian region is the lack of quality-assured and quality-controlled PM2.5 surface measurements in the public domain.
The recent initiative of the Central Pollution Control Board (CPCB) to open up PM2.5 data for various Indian cities now makes it possible to evaluate MERRA-2-estimated PM2.5 over the Indian region. In this paper, an attempt is made to evaluate PM2.5 from the MERRA-2 reanalysis over the Indian region using surface-based measurements. Furthermore, MERRA-2 PM2.5 is used to evaluate the seasonality in terms of the air quality index (AQI), exceedance days, and spatial pattern. After that, a simple bias correction is applied to evaluate the usefulness of the AQI generated using the MERRA-2 reanalysis in comparison to the observations.

Data
The CPCB monitors criteria air pollutants over various cities of India. PM2.5 mass concentration is measured using the tapered element oscillating microbalance (TEOM), which uses a small vibrating glass tube whose frequency changes with the mass of PM2.5 deposited on it. Daily mean PM2.5 mass concentrations from 20 cities were utilized for the evaluation. The stations/cities were selected on the criterion that at least one year of continuous data is available. The locations of the different stations used in this analysis are shown in Fig. 1.
NASA's Global Modeling and Assimilation Office (GMAO) developed the Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2), an update of MERRA-1 based on the Goddard Earth Observing System (GEOS) model. MERRA-1 was available between 1979 and February 2016; after its discontinuation, the GMAO introduced MERRA-2 (https://gmao.gsfc.nasa.gov/reanalysis/MERRA-2) as an advanced version that replaces the previous reanalysis. Both reanalyses have the same resolution and products (Gelaro et al., 2017).
The aerosol optical depth at 550 nm from the Moderate Resolution Imaging Spectroradiometer (MODIS) on board NASA's Aqua and Terra satellites is the primary input to the MERRA-2 assimilation for aerosol products (Randles et al., 2017). The Goddard Global Ozone Chemistry Aerosol Radiation and Transport (GOCART) model is coupled with the GEOS atmospheric model to simulate the sources, transport, sinks, and concentrations of five dominant aerosol species: dust (DU), sea salt (SS), black carbon (BC), sulfate (SO4) and organic carbon (OC). The 550 nm AOD is a column-species-integrated optical quantity, namely the sum over species of the product of each species' extinction coefficient, derived from the Optical Properties of Aerosols and Clouds dataset, and its mass concentration. MERRA-2 aerosol data assimilation is done globally at a resolution of 0.5° × 0.625° with 73 vertical levels, starting from 1980 (Chin et al., 2002; Colarco et al., 2010; Randles et al., 2017).

Reconstruction of PM2.5 Mass Concentration
The major aerosol species considered in the MERRA-2 reanalysis are SO4, BC, DU2.5, SS2.5, and OC. It is possible to reconstruct the total mass of PM2.5 using these subspecies. The general form of the equation to arrive at the total PM mass for any size is given as follows:

Total PM = Inorganic ions + Organic matter + Black carbon + Dust + Sea salt

The mass of organic matter is calculated from organic carbon by multiplying it by a coefficient derived from various experiments: 1.6 ± 0.2 for urban particles and 2.1 ± 0.2 for aged (non-urban) particles. The coefficient is sometimes as high as 2.6 for biomass burning particles (Turpin and Lim, 2001; Chow et al., 2015). SO4, NO3, and NH4 are the major inorganic ions. NO3 and NH4 are sometimes omitted when these measurements are lacking or unreliable. In such cases, SO4 is multiplied by 1.375, since (NH4)2SO4 is composed of 73% SO4 by mass (Malm et al., 1994; Chow et al., 2015). To account for dust and sea salt, MERRA-2 resolves these two species by size, which allows the fraction within the 2.5 µm diameter range to be selected.
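The sulfate scaling factor quoted above can be checked with a short calculation. The molar masses below are standard rounded values supplied here for illustration, not taken from the cited works:

```latex
\[
\frac{M_{\mathrm{SO_4}}}{M_{\mathrm{(NH_4)_2SO_4}}}
  = \frac{96.06\ \mathrm{g\,mol^{-1}}}{132.14\ \mathrm{g\,mol^{-1}}}
  \approx 0.727,
\qquad
\frac{M_{\mathrm{(NH_4)_2SO_4}}}{M_{\mathrm{SO_4}}}
  = \frac{132.14}{96.06}
  \approx 1.376
\]
```

The result is consistent with the 73% mass fraction and with the conventional factor of 1.375 used in the text.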
For this study, the following equation has been used to reconstruct the PM2.5 (Hand et al., 2011). The mass of PM2.5 is reconstructed using the above equation and compared with the mass of PM2.5 given by the ground-based monitoring of CPCB over 20 cities for the period January 2015-March 2018 (Fig. 1). It may be mentioned that this expression is widely used for PM2.5 estimates over Asia and Europe (Provençal et al., 2017a, b).
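A reconstruction of the general form described in this section can be sketched as follows. This is a minimal illustration, not the study's exact expression. The function name and the sample concentrations are hypothetical. The organic-carbon-to-organic-matter factor is deliberately a required argument, because its value in the study's equation is not recoverable from the text here. The sulfate factor of 1.375 is the one quoted above.

```python
def reconstruct_pm25(so4, oc, bc, dust25, ss25, oc_to_om,
                     so4_to_ammonium_sulfate=1.375):
    """Sum speciated surface mass concentrations into total PM2.5.

    All inputs must share the same unit (e.g. ug m^-3).
    oc_to_om is an assumption the caller has to make (the text quotes
    1.6 for urban and 2.1 for aged particles).
    """
    inorganic_ions = so4_to_ammonium_sulfate * so4   # sulfate as (NH4)2SO4
    organic_matter = oc_to_om * oc                   # OC scaled to organic matter
    return inorganic_ions + organic_matter + bc + dust25 + ss25


# Hypothetical values, chosen only to show the arithmetic.
example = reconstruct_pm25(so4=8.0, oc=10.0, bc=3.0, dust25=12.0, ss25=1.0,
                           oc_to_om=1.6)
print(round(example, 2))
```

With gridded reanalysis output, the same sum would be applied element-wise to the surface concentration fields of each species.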

Statistical Metrics Used for Evaluation
Various statistical evaluators were applied to the daily, weekly, and monthly mean PM2.5 concentrations to evaluate the MERRA-2 data. These are the correlation coefficient (CC), root mean square error (RMSE), mean absolute error (MAE), absolute error in percentage (abs. error), mean bias ($B = \overline{C_s - C_o}$), and mean fraction ($F = \overline{C_s / C_o}$), where $C_s$ and $C_o$ indicate the simulated and observed concentrations, respectively. The values of metrics such as RMSE, MAE, absolute error and bias should be as low as possible, whereas the mean fraction should be close to one for accurate PM2.5 concentrations. To relate simulated data directly to the observed data, Chang and Hanna (2004) proposed the Factor of Two (FAC2) index for evaluating air quality models: the proportion of data falling within the range 0.5 ≤ $C_o/C_s$ ≤ 2.0 should be equal to or greater than 0.5 for the model performance to be considered good.
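The metrics listed above can be computed from paired simulated and observed values as in the sketch below. It assumes the usual textbook definitions (Pearson correlation; bias and fraction averaged over pairs), which may differ in detail from the study's equations. The function name and sample values are hypothetical. The percentage absolute error is left out because its exact definition is not given in the text.

```python
import math


def evaluation_metrics(simulated, observed):
    """Return CC, RMSE, MAE, mean bias, mean fraction and FAC2 for paired data."""
    if len(simulated) != len(observed) or len(simulated) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    if any(s <= 0 or o <= 0 for s, o in zip(simulated, observed)):
        raise ValueError("concentrations must be positive for the ratio metrics")

    n = len(simulated)
    mean_s = sum(simulated) / n
    mean_o = sum(observed) / n
    diffs = [s - o for s, o in zip(simulated, observed)]

    # Pearson correlation from the covariance and the two variances.
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(simulated, observed))
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    var_o = sum((o - mean_o) ** 2 for o in observed)

    return {
        "cc": cov / math.sqrt(var_s * var_o),
        "rmse": math.sqrt(sum(d * d for d in diffs) / n),
        "mae": sum(abs(d) for d in diffs) / n,
        "bias": sum(diffs) / n,  # mean of (C_s - C_o)
        "fraction": sum(s / o for s, o in zip(simulated, observed)) / n,
        # FAC2: share of pairs with 0.5 <= C_o / C_s <= 2.0
        "fac2": sum(1 for s, o in zip(simulated, observed)
                    if 0.5 <= o / s <= 2.0) / n,
    }


# Hypothetical daily means in ug m^-3.
metrics = evaluation_metrics([20.0, 40.0, 60.0, 30.0], [40.0, 60.0, 100.0, 20.0])
print(metrics)
```

Because the FAC2 interval is symmetric under inversion, testing $C_o/C_s$ or $C_s/C_o$ selects the same pairs.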

Comparison of MERRA-2 PM2.5 Mass Concentration to CPCB
Statistical evaluation of the reconstructed PM2.5 mass concentration was performed against ground-based monitoring stations. A set of performance statistics was calculated for daily, weekly, and monthly mean PM2.5 mass concentrations (Table 1).
We chose different temporal scales to understand the effect of averaging on the evaluation statistics. The bias, RMSE (Eq. (2)), and MAE (Eq. (4)) remain almost the same across scales, whereas the correlation coefficient improves with longer averaging periods. For example, the correlation of PM mass concentrations between CPCB stations and MERRA-2 was ~0.6 for daily means, whereas it was ~0.96 for monthly means. Fluctuations in the values are reduced in the monthly means (FAC2 daily < FAC2 monthly), which yields an overall increase in correlation and a slight decrease in the mean fraction. However, the overall bias remains unchanged. The absolute error reduces by ~5% as the averaging period is increased from daily to monthly scales. In the analysis performed by Mahesh et al. (2019) on MERRAero (MERRA-1) over five cities of India, the values of the correlation coefficient and FAC2 are lower than those in the present study.

The average observed concentration of PM2.5 (over 20 cities) is ~80 µg m⁻³, double the NAAQS limit, owing to the clustering of most observational sites in the highly polluted IGP region. The average simulated annual mass concentration by MERRA-2 is ~35 µg m⁻³, a considerable underestimation that places the PM2.5 mass well below the annual standard. The mean bias is -27.27 µg m⁻³, and values are underestimated by 34%. Fig. 2 shows the scatter between MERRA-2 and CPCB for monthly PM2.5 mass concentrations. The scatter becomes wider for higher loading conditions. It is also observed that for extremely low (high) loading conditions, MERRA-2 overestimates (underestimates) the mass concentrations.
The observed monthly means are based on surface PM2.5 mass concentrations measured by CPCB at 20 different stations (locations are shown in Fig. 1). There is a clear seasonality in the particulate matter mass concentration, with the highest values (> 100 µg m⁻³) observed during winter and the lowest (~30 µg m⁻³) during the monsoon period. The grey shaded region represents the mean inter-station variability, which is also influenced by the large spatial variability in mass concentrations. MERRA-2 simulations systematically show lower concentrations on average than the observations. The blue shaded area represents the variability in PM2.5 mass concentrations simulated by MERRA-2. There is a low bias in terms of both the mean and the variability. The monthly mean values from observations show PM2.5 mass concentrations mostly above ~60 µg m⁻³, indicating high loading conditions beyond the annual NAAQS standard for PM2.5. It may be noted that CPCB and MERRA-2 exceed the daily NAAQS value on ~37.5% and ~11.6% of days on an annual scale, respectively. Except for summer, the CPCB monthly averaged value always crossed the 60 µg m⁻³ threshold, whereas the MERRA-2 monthly averaged value always remained below 60 µg m⁻³ and almost matched CPCB in the summer season. However, as mentioned earlier, MERRA-2 is unable to capture these high loading conditions during most months. Although the simulated PM2.5 mass and variability were biased low, the temporal variability (involving multiple years of monthly means) is well captured by MERRA-2, as indicated by the high correlation (~0.96) on monthly time scales. On an annual basis, MERRA-2 concentrations were biased low by ~30%. Fig. 3 shows the seasonal loading of total PM2.5 mass using MERRA-2. A clear seasonality was observed in the spatial pattern of PM2.5 mass concentration over India.
Throughout the year, particulate loading is higher over the Indo-Gangetic Plain than over the rest of the country. The pre-monsoon (post-monsoon) season is observed to have the lowest (highest) PM2.5 concentration. The monsoon and winter seasons have a moderate level of PM2.5 mass loading. The highest PM2.5 concentration during the post-monsoon season is attributed to stubble burning over the western IGP (Mittal et al., 2009; Mukherjee et al., 2018; Jethva et al., 2019), the emissions from which may spread all over India (Cusworth et al., 2018; Sarkar et al., 2018). In addition, over eastern and peninsular India, coal burning in thermal power plants and industry is also an essential contributor (Venkataraman et al., 2018). The southern part was observed to be relatively clean compared to northern India. An apparent mismatch between the observed (from CPCB) and simulated (from MERRA-2) concentrations is evident in almost all the seasons (Fig. 3). This mismatch is more prominent during the pre-monsoon season, especially over the IGP. The observed and simulated PM2.5 concentrations in southern India were in relatively good agreement. This agreement is even closer during the monsoon season, when the observed concentration is 20-30 µg m⁻³. It may be noted that in the post-monsoon season, when almost all of the Indian region has the highest fine particulate loading, MERRA-2 can capture the spatial pattern. The observed mismatch in the pre-monsoon season may be attributed to MERRA-2's inability to simulate the size-resolved dust concentration, as the pre-monsoon season over the IGP is characterized by high dust loading (Tiwari et al., 2015; Pandey et al., 2016, 2017). A past study by Kramer et al. (2018) reported an underestimation of the concentration of dust particles smaller than 2 µm in the MERRA-2 reanalysis. Having evaluated MERRA-2's ability to simulate the PM2.5 concentration, its applicability for air quality purposes was tested.

Air Quality Index Using MERRA-2
The national air quality index (AQI) of India is defined based on 8 major pollutants, including PM2.5. The AQI for PM2.5 is divided into 6 major classes: Good (0-30 µg m⁻³), Satisfactory (31-60 µg m⁻³), Moderate (61-90 µg m⁻³), Poor (91-120 µg m⁻³), Very Poor (121-250 µg m⁻³), and Severe (250+ µg m⁻³) (CPCB, 2015). The PM2.5 data from MERRA-2 were categorized into these classes and are shown along with the observed values for the city of Delhi (Fig. 4), color-coded according to CPCB.
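The classification of daily PM2.5 values into AQI classes can be sketched as below. The upper bounds of 30, 60, 90, 120 and 250 µg m⁻³ are taken here as the 24-hour PM2.5 breakpoints of the CPCB national AQI and should be checked against CPCB (2015). The function names and the sample series are hypothetical.

```python
from collections import Counter

# (upper bound in ug m^-3, class name); values above the last bound are "Severe".
# Breakpoints assumed from the CPCB national AQI for 24-hour PM2.5.
AQI_CLASSES = [
    (30.0, "Good"),
    (60.0, "Satisfactory"),
    (90.0, "Moderate"),
    (120.0, "Poor"),
    (250.0, "Very Poor"),
]


def pm25_aqi_class(pm25):
    """Map a daily mean PM2.5 concentration (ug m^-3) to its AQI class name."""
    if pm25 < 0:
        raise ValueError("concentration cannot be negative")
    for upper_bound, name in AQI_CLASSES:
        if pm25 <= upper_bound:
            return name
    return "Severe"


def class_percentages(daily_pm25):
    """Percentage of days falling in each AQI class."""
    counts = Counter(pm25_aqi_class(value) for value in daily_pm25)
    names = [name for _, name in AQI_CLASSES] + ["Severe"]
    return {name: 100.0 * counts[name] / len(daily_pm25) for name in names}


# Hypothetical daily means for one station.
shares = class_percentages([25.0, 45.0, 75.0, 110.0, 180.0, 300.0, 55.0, 58.0])
print(shares)
```

Applying `class_percentages` separately to observed, simulated and bias-corrected series gives the kind of per-class comparison discussed in this section.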
In order to study the ability of MERRA-2 to quantify AQI, a bias correction was performed by adding the monthly bias value to the respective month. From Fig. 5, it is found that the MERRA-2 PM2.5 estimates are improved and capture the spatial pattern as well as the seasonal variability of AQI. Before the correction, MERRA-2 was unable to capture AQI classes beyond Moderate. After the bias correction, however, MERRA-2 can capture Very Poor AQI near the Indo-Gangetic Plain. Uncorrected MERRA-2 does not capture the higher AQI conditions, which mainly occur in the winter season. After applying the bias correction, however, MERRA-2 shows a good match during the winter season (though still biased low in terms of mass concentrations). Before the correction, MERRA-2 showed 53% Satisfactory AQI, a substantial underestimation of the actual AQI, for which Satisfactory conditions were present only 22% of the time. After the bias correction, MERRA-2 shows ~24% Satisfactory conditions, which is much closer to the 22% reported by CPCB.
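A monthly additive correction of this kind can be sketched as follows. The sketch assumes the offset for each calendar month is the mean observed minus the mean simulated concentration, so that adding it raises a low-biased simulation. The sign convention, the zero floor, the function names and the sample records are illustrative choices rather than details confirmed by the text.

```python
from collections import defaultdict


def monthly_offsets(records):
    """Mean (observed - simulated) per calendar month.

    records: iterable of (month, simulated, observed) tuples for days
    on which both values exist.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for month, simulated, observed in records:
        sums[month][0] += observed - simulated
        sums[month][1] += 1
    return {month: total / count for month, (total, count) in sums.items()}


def apply_correction(daily_simulated, offsets):
    """Add the month's offset to each daily simulated value.

    daily_simulated: iterable of (month, simulated) tuples.
    Results are floored at zero (an added safeguard, not from the text).
    """
    return [max(0.0, simulated + offsets[month])
            for month, simulated in daily_simulated]


# Hypothetical records: (month, MERRA-2 value, CPCB value) in ug m^-3.
records = [(1, 50.0, 120.0), (1, 70.0, 140.0), (7, 25.0, 30.0), (7, 35.0, 30.0)]
offsets = monthly_offsets(records)
corrected = apply_correction([(m, s) for m, s, _ in records], offsets)
print(offsets, corrected)
```

In this toy case the January offset is large while the July offset is zero, which mirrors the seasonal contrast in the bias described above.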
The same bias correction is applied over all the 20 cities, as shown in Fig. 5. It can be observed that cities over the IGP show the lowest number of good air quality days. Cities in the southern part of India and Maharashtra show the highest percentage (~30+%) of good air quality days. PM2.5 from MERRA-2 is negatively biased compared to CPCB observations: most of the data lie within the Satisfactory AQI class, and the Moderate and Poor classes are not captured. For north Indian cities like Gaya, Kanpur, Patna, and Varanasi, where the percentage of good air quality days was zero, MERRA-2 shows a significant occurrence of good air quality days (higher than 20%). This indicates MERRA-2's inability to capture highly polluted episodes. The application of bias correction results in a reasonable representation of Poor AQI conditions and also improves the spatial pattern of AQI (Fig. 6) throughout the country. The percentages of days in each AQI class reported by CPCB over Delhi in this study agree with past studies (Sahu and Kota, 2017). Fig. 6 shows the improvement in the spatial pattern in MERRA-2 due to bias correction. MERRA-2 failed to capture Moderate and Very Poor air quality conditions over the IGP in the winter and post-monsoon seasons. However, the bias correction provides some ability to capture the spatial patterns. MERRA-2 captures the AQI in southern India well during the pre-monsoon and monsoon seasons without any bias correction, but over cities like Kanpur, Varanasi, and Gaya, it fails even during the monsoon season, when the AQI is the lowest. The bias correction not only improves the performance in the least polluted seasons but also captures the average AQI over the extremely polluted cities in the high AQI seasons. The study by Rajput et al. (2018) suggests that the AQI over Lucknow falls in the Very Poor class, especially in the post-monsoon and winter seasons, which matches well with CPCB and with bias-corrected MERRA-2.
Fig. 7 shows the exceedance of different air quality classes over the IGP on a seasonal basis with and without bias correction. The CPCB observations show 0% and < 5% good days over the IGP and the whole of India, respectively, in the winter season, whereas MERRA-2 shows more than 10% and 20% good days, as discussed earlier. The MERRA-2 bias correction leads to 0% good days over both the IGP and India during winter. Similarly, more than 50% and 20% of the days fall in the Very Poor AQI class over the IGP and India, respectively; this class is absent in uncorrected MERRA-2, but the bias correction helps it to reproduce these results over both regions.
It may be mentioned that the bias correction implemented in this work is based on biases between monthly mean observations and simulations, which were then applied at the daily scale. Although this is admittedly quite a simple method of bias correction, it has nevertheless improved MERRA-2's ability to provide both the spatial and the temporal patterns of air quality at daily scales. This suggests that more advanced bias correction methods could substantially improve the usability of MERRA-2 chemical reanalysis datasets for impact studies over a complex region such as South Asia. Such studies are urgently required over the Indian region, large swathes of which have been affected by repeated episodes of poor air quality in recent times.

CONCLUSIONS
1. MERRA-2 underestimates the actual PM2.5 concentration by 34%, although 79% of the simulated values fall within FAC2 of the observed value.
2. On an annual time scale, the daily mean concentrations measured by CPCB and predicted by MERRA-2 exceed the NAAQS limit on 37.5% and 11.65% of days, respectively.
3. The simulations diverge when predicting higher mass concentrations, highlighting MERRA-2's inability to estimate the PM2.5 level during pollution episodes.
4. The temporal and spatial pattern of the daily air quality is accurately predicted by MERRA-2 after a simple bias correction, which is based on the bias observed in the monthly means, is applied.
5. The bias-corrected MERRA-2 results capture the number of exceedance days reasonably well.

ACKNOWLEDGMENTS
The authors wish to thank the Central Pollution Control Board (CPCB) for providing open access to PM2.5 data and the State Pollution Control Board (SPCB), Odisha, for partially supporting some components of this study. We would also like to acknowledge the team behind NASA's Modern-Era Retrospective Analysis for Research and Applications, Version 2 for their efforts to make this data available for public use.

DISCLAIMER
Reference to any company or specific commercial products does not constitute financial and personal conflicts of interest.

Table 2. Percentage of days in each AQI class, where A is CPCB data, B is MERRA-2 simulated data, and C is MERRA-2 with bias correction. Cities are listed in the order of least good days as per CPCB data. IGP cities are in bold letters.