Ranil Dhammapala1, Ashani Basnayake2, Sarath Premasiri3, Lakmal Chathuranga3, Karen Mera4
1 South Coast Air Quality Management District, CA 91765, USA
2 Verité Research, Colombo 00005, Sri Lanka
3 National Building Research Organization, Colombo 00005, Sri Lanka
4 Embassy of the United States, Colombo 00003, Sri Lanka
Received: October 25, 2021
Revised: March 21, 2022
Accepted: March 22, 2022
Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.
Cite this article: Dhammapala, R., Basnayake, A., Premasiri, S., Chathuranga, L., Mera, K. (2022). PM2.5 in Sri Lanka: Trend Analysis, Low-cost Sensor Correlations and Spatial Distribution. Aerosol Air Qual. Res. 22, 210266. https://doi.org/10.4209/aaqr.210266
The South Asian island nation of Sri Lanka did not have any permanent PM2.5 monitors sharing data publicly in near-real-time until the U.S. Embassy installed a Beta Attenuation Monitor (BAM) in September 2017. This research aims to better understand the PM2.5 distribution in Sri Lanka by analyzing data collected by that BAM, and leveraging low-cost sensors, model and remote sensing data. BAM data show PM2.5 levels were “Unhealthy for Sensitive Groups” or “Unhealthy” according to the U.S. classification system, for at least 50% of the time between each November and the following February. This coincides with the northeast monsoon when stable air masses reduce dispersion of pollutants. Back trajectory analyses suggest long range transport also contributes to elevated PM2.5 during these months. Although slightly cleaner than regional embassies, this location has exceeded the Sri Lankan 24-hr standard for PM2.5 (50 µg m–3) every year since 2018. The area has met Sri Lanka’s annual standard (25 µg m–3) since 2019. We used PurpleAir (PA) and Atmos low-cost PM2.5 sensors co-located with the Embassy BAM, to develop correction factors to transform raw sensor data to BAM-like data. The influence of meteorological variables and the performance of different statistical models were considered and the regression coefficients of the most applicable models are presented. We also compared our PA correction factor against user-selectable options on the PurpleAir.com website. The Australian “Woodsmoke” correction can be applied to quickly visualize a reasonably accurate estimate of PM2.5 concentrations. We applied our PA correction factor to six other PurpleAir sensors operated around the country, to understand Sri Lanka’s PM2.5 distribution. With these corrected data, we interpolated satellite and model-derived PM2.5 annual averages at 1 km intervals. The most populated Western Province had the highest concentrations with elevated levels extending offshore. 
The sparsely populated southeast had the cleanest air.
ABSTRACT
Keywords:
PurpleAir, Atmos, BAM, U.S. Embassy, Colombo
PM2.5 is the most extensively monitored ambient air pollutant in the world because of the many negative public health consequences associated with it (Xing et al., 2016; Pope and Dockery, 2006; Schwartz et al., 1996). Yet, only two dozen countries have more than three PM2.5 monitors per million people, and about 140 countries that are home to 18% of the world’s population have no PM2.5 monitors (Martin et al., 2019). The lack of reliable, publicly accessible near-real-time PM2.5 data makes it harder for people to protect themselves as they go about their day-to-day outdoor activities. It is also difficult to assess the true public health burden and economic costs associated with air pollution induced health effects in the absence of such data. Sri Lanka has a population of approximately 21.5 million. 7260 Sri Lankans are estimated to have succumbed to the adverse effects of ambient PM2.5 pollution in 2019 (IHME, 2019). A recent review paper by Ileperuma (2020) provides a comprehensive overview of historical efforts to monitor air quality in the country. There was little to no continuous, publicly reported long term monitoring of PM2.5 anywhere in the country prior to 2017. A few research studies involving PM2.5 data collection (Seneviratne et al., 2011; Seneviratne et al., 2017) were used to quantify contributing source categories (Table 1) but their PM2.5 data were not reported to the public in near-real-time. Colombo’s source apportionment is based on measurements or estimates made over a decade ago. However, it is very likely that major emission source categories identified earlier, especially traffic, industry, biomass burning and road dust continue to be significant contributors to PM2.5 today. In September 2017, the U.S. Embassy in the Sri Lankan capital of Colombo installed a Beta Attenuation Monitor (BAM) and began reporting hourly PM2.5 data to the internet. Current and historical data are available online (AirNow, 2021). 
The main objective of monitoring air quality at many U.S. embassies worldwide is to provide U.S. citizens and diplomatic staff in the area with actionable health information related to air pollution. This resource also proved to be valuable to the Sri Lankan public. We were unable to locate any other official source of PM2.5 monitoring data in Sri Lanka, reported to the internet in near-real-time. As such, the U.S. Embassy BAM remains the primary source of timely, publicly available PM2.5 data in the country to date. BAMs belong to a class of PM2.5 monitors that are considered Federally Equivalent Monitors (FEMs), meaning they are certified to be reasonably accurate if operated correctly (Chung et al., 2001) and can be used for regulatory applications. Since FEMs are expensive and require specialized training and equipment handling, it is not feasible to deploy many of them around the country. Low cost sensors (LCS) are increasingly relied upon to fill in data gaps and improve spatial coverage of pollutant characterization (examples: Barkjohn et al., 2021; Chan et al., 2021; Hagan et al., 2019; Owoade, 2021; Zheng et al., 2019). Since environmental factors and aerosol characteristics affect LCS performance, comparisons against reference grade instruments such as BAMs are made and correction factors developed prior to field deployment (Zheng et al., 2018; Zimmerman et al., 2018; Malings et al., 2020). This is accomplished by co-locating a LCS alongside an FEM for a reasonable amount of time and comparing the data. U.S. Embassy FEMs have aided such co-location experiments elsewhere (McFarlane et al., 2021a in Ghana; McFarlane et al., 2021b in Uganda and the Democratic Republic of Congo; Raheja et al., 2022 in Togo). There are no universally applicable correction factors for a particular brand of LCS. Not all platforms sharing LCS data online have been corrected. 
One of the largest sources of global air quality data (AirVisual) performs dynamic, cloud-based corrections of LCS data with the nearest FEM. OpenAQ does not adjust LCS data. PurpleAir offers users a dropdown list of selectable correction factors, mostly developed with U.S. data. Ideally, LCS correction factors should be developed under conditions representative of environmental factors and aerosol characteristics in the respective airsheds (WMO, 2018). It is important to note that LCS data do not become regulatory grade data when correction factors are applied. It is also important to acknowledge that subsequently deploying the sensor to a climatically dissimilar area, or to an airshed that is impacted by aerosols with very different physical and chemical characteristics could render correction factors less applicable. Though several LCS co-location experiments have been conducted in neighboring India (Plantower sensor- Zheng et al., 2018; Shinyei sensor- Johnson et al., 2018; Alphasense sensor- Hagan et al., 2019), we were unable to find any published studies showing LCS vs. FEM comparisons in Sri Lanka. LCS correction factors need to be developed first before deploying them to monitor PM2.5 across Sri Lanka. Satellite observations of atmospheric aerosols are regularly used to further expand the mapping of PM2.5 worldwide (van Donkelaar et al., 2006; Al-Hamdan et al., 2009; Jin et al., 2019). Satellites do not measure ground level PM2.5 directly. Various algorithms process their data to yield the total light extinction by aerosols in the entire atmospheric column (known as the Aerosol Optical Depth, or AOD). The Multi-Angle Implementation of Atmospheric Correction (MAIAC) is a widely used algorithm that uses satellite data to produce twice-daily AOD estimates at 1 km intervals globally. MAIAC performs reasonably well in South Asia (Mhawish et al., 2019). 
Since AOD depends on the chemical makeup of aerosol, its optical properties and ambient moisture content, the correlation between ground level PM2.5 and AOD varies nonlinearly in time and space (Jin et al., 2019). These relationships can be established by scaling the satellite-derived AOD by PM2.5/AOD ratios in air quality models (van Donkelaar et al., 2006; Jin et al., 2019). This scaling approach is easier to implement than alternative statistical methods (example: Chen et al., 2021), and is defensible since the air quality model internally accounts for the spatial and temporal variability of PM2.5 sources, chemical components, meteorology, terrain and chemistry. BAM and corrected LCS data can be interpolated with this scaled AOD field.
This research aims to better understand the PM2.5 distribution in Sri Lanka by:
We downloaded hourly FEM data files covering September 2017–July 2021 from the AirNow website (AirNow, 2021). We only used hourly data that passed AirnowTech’s quality control checks described in Dhammapala (2019). Quantile-quantile plots (not shown) confirmed that the FEM data mostly followed a lognormal distribution. Where necessary, PM2.5 concentrations were converted to air quality categories to aid interpretation. For hourly data, this involves using the NowCast concentration reported in the data file.
The US-based National Climatic Data Center maintains a comprehensive global archive of weather observations sourced through the World Meteorological Organization and other partners. CSV files of hourly data are available freely at ftp://ftp.ncdc.noaa.gov/pub/data/noaa/isd-lite/. The only representative meteorological station is about 2 km away from the FEM, and reports temperature (T), pressure, relative humidity (RH), wind speed and direction once every 3 hours. Although it is not possible to know the site’s QC regime, we carried out some basic consistency checks to determine data validity.
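The conversion of PM2.5 concentrations to air quality categories mentioned above can be illustrated with a minimal sketch. It assumes the U.S. EPA PM2.5 breakpoints that were in force during the study period (the pre-2024 AQI table) and the usual convention of truncating concentrations to 0.1 µg m–3. The function name `pm25_category` is hypothetical and is not taken from the authors' workflow. The input may be an hourly NowCast value or a 24-hr average.

```python
import math

# Upper (inclusive) PM2.5 breakpoints in µg/m3 for each category.
# Assumption: the U.S. EPA AQI table used before the 2024 revision.
BREAKPOINTS = [
    (12.0, "Good"),
    (35.4, "Moderate"),
    (55.4, "Unhealthy for Sensitive Groups"),
    (150.4, "Unhealthy"),
    (250.4, "Very Unhealthy"),
    (math.inf, "Hazardous"),
]


def pm25_category(conc):
    """Return the air quality category for a PM2.5 concentration (µg/m3).

    Returns None for missing or negative input.
    """
    if conc is None or conc < 0:
        return None
    truncated = math.floor(conc * 10) / 10  # truncate to 0.1 µg/m3
    for upper, name in BREAKPOINTS:
        if truncated <= upper:
            return name


# Illustrative values only (not data from this study)
examples = {c: pm25_category(c) for c in (10, 20, 40, 60, 200, 300)}
```

Keeping the breakpoints in a single table makes it straightforward to swap in a different national classification scheme if required.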
Long range transport of pollutants from other parts of South Asia (Begum et al., 2011; Seneviratne et al., 2011) can contribute to elevated concentrations. We investigated the path that air masses traversed prior to arriving at the Embassy monitor, using the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model (Stein et al., 2015). Three-day back trajectories ending 50 m above the location of the Embassy monitor were generated every 3 hours between September 17, 2017 and July 31, 2021, using global reanalysis meteorological fields with a spatial resolution of 2.5°. To identify source regions, we ran cluster analysis on the trajectories (Xu et al., 2015) and also generated concentration weighted trajectories (CWT; Hsu et al., 2003; Seibert et al., 1994) to further aid interpretation.
Even though Sri Lanka had COVID-19 related lockdowns and curfews that resulted in less traffic, we did not attempt to decipher their effect on PM2.5. This is mainly because (i) the relatively short pre-COVID data record does not facilitate robust comparisons, and (ii) the meteorological data available are insufficient to develop the statistical models needed to account for and remove the influence of day-to-day meteorological variability on PM2.5 (example: Talbot et al., 2021).
The National Building Research Organization (NBRO) provided BAM PM2.5 data collected between November 2020 and March 2021, about 1.5 km away at the Colombo Municipal Council (CMC). Although two different LCS were co-located with this BAM for 1–2 months, various data quality issues prevented their use in correction factor development. Therefore, we compared the CMC and Embassy BAM data to check for spatial gradients of PM2.5.
Two types of LCS were available for co-location experiments with the Embassy-operated BAM: PurpleAir (PA-II-SD) and Atmos. PA is US-manufactured and uses two Plantower PMS5003 laser counters to measure particulate matter, with each channel alternating every 5 seconds.
Data from each channel are averaged over two minutes and the second PA channel is used to check data precision. The second LCS used was the Indian made Atmos sensor, which has a single Plantower PMS7003 laser counter and works the same as the PA (Zheng et al., 2019). A PA sensor owned by the Overseas School of Colombo ran alongside the Embassy BAM from January 14, 2020–July 7, 2021, although data from September and October 2020 were lost. PA sensors provide two separate two-minute average estimates of PM2.5. We used the higher of the two (labeled “CF=1” after a recent firmware update) as an independent variable when developing PA correction factors, since “CF=1” is not adjusted with a calibration factor developed in China (Malings et al., 2020). An Atmos sensor owned by Verité Research ran alongside the Embassy BAM from December 9, 2020–July 7, 2021. We used the two-minute averaged “pm25raw” output from the sensor as an independent variable in this work. PA and Atmos sensors also measure T and RH. These parameters are required since T and RH have been found to influence the performance of sensors (Bulot et al., 2019; DeWitt et al., 2020; Malings et al., 2020). Though these internal measurements can differ from ambient T and RH measurements, they have been shown to be strongly correlated with reference T and RH (Malings et al., 2020), and are more representative of the sensor’s internal conditions which dictate how its circuitry functions. Recent literature was examined for different data handling methods and statistical procedures to ensure data analysis and correlations are robust. Ideally, performance comparisons should be based on daily averages, since the FEM’s performance certifications are based on the comparison of its daily average concentrations. However, this reduces the number of available data points and narrows concentration ranges over which correlations are applied. 
Therefore this manuscript examines correlations based on both hourly and daily averaged data.
There are several PA monitors operated by different agencies and individuals around Sri Lanka that report data to the PA website. We downloaded their hourly data and applied the BAM vs. PA correlation to the sites with at least twelve months of data, to obtain an estimate of the spatial distribution of PM2.5. Their distributions and diurnal profiles were also examined. The Supplemental Materials contain a picture of the Embassy sampling site (Fig. S1) and a map of PurpleAir sensors around Sri Lanka.
LCS report concentrations every few minutes (i.e., sub-hourly data) and these were averaged over the hour as follows:
Data outside these ranges cannot be legitimate measurements. The Atmos sensor’s RH module recorded data that were clearly erroneous, so the PA sensor’s RH readings were substituted. With the exception of RH measured by one PA sensor in Nawalapitiya, all other sensors’ T and RH showed the expected diurnal and seasonal variations. We followed the steps explained in Fig. S2 of the Supplemental Materials to adjust Nawalapitiya RH data, but it had minimal impact on the corrected PA reading.
Sensor data were treated as the independent variable and different statistical models were fitted to minimize the difference between model predictions and FEM data (the dependent variable). We restricted the model choices to the following three:
Malings et al. (2020) showed improved correlations and smaller errors when the interaction of first order MLR terms was considered. Models must not overfit data (Murphy, 2012), and this was addressed by performing a five-fold cross validation on statistical models (Hastie et al., 2009). Variables that were not statistically significant (p > 0.05) were dropped and the regression coefficients of the remaining explanatory variables were re-computed iteratively, until all non-significant variables were removed.
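The five-fold cross validation described above can be sketched for the simplest case, a linear correction with the raw sensor reading as the only predictor. This is a standard-library illustration under stated assumptions, not the authors' R code: the data are synthetic, the helper names `fit_line` and `kfold_rmse` are hypothetical, and the additional T and RH terms of an MLR model are omitted for brevity.

```python
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def kfold_rmse(x, y, k=5, seed=0):
    """Mean out-of-fold RMSE of the linear fit across k folds."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    rmses = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        squared = [(y[i] - (a + b * x[i])) ** 2 for i in held_out]
        rmses.append((sum(squared) / len(squared)) ** 0.5)
    return sum(rmses) / k


# Synthetic data for illustration: a sensor that reads roughly twice the FEM
rng = random.Random(42)
fem = [rng.uniform(5, 80) for _ in range(200)]
sensor = [2.0 * c + 3.0 + rng.gauss(0, 2.0) for c in fem]

intercept, slope = fit_line(sensor, fem)  # FEM is the dependent variable
cv_rmse = kfold_rmse(sensor, fem)
```

A cross-validated RMSE that is much larger than the RMSE on the full training set would be one sign of the overfitting the text warns against.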
Model performance statistics were compiled and the model that showed the least dependence on other variables is proposed. Since co-location experiments with FEMs are conducted prior to deploying low-cost sensors elsewhere, it is assumed that the correlations developed are applicable in the new location. The validity of this assumption was investigated using a subset of the co-location data to check how well a correlation established over a shorter duration mimics FEM concentrations in the future. Since there are several gaps in PA data from across the country, we assembled a representative year at each site by combining all available data from June 1, 2019–July 31, 2021, and constructing averages for each of the twelve months, before averaging them together. This ensures data from all months are only represented once, and minimizes over-sampling in some months. We used the Google Earth Engine (GEE) cloud-based platform (Gorelick et al., 2017) to retrieve geo-referenced archives of 1 km MAIAC AOD and the Copernicus Atmospheric Monitoring Service global air quality model (CAMS; https://atmosphere.copernicus.eu) with a horizontal resolution of about 40 km. The date range was restricted to June 1, 2019–July 31, 2021, to overlap with the period when the greatest number of PA sensors operated across the country. Since the Terra and Aqua satellites pass over Sri Lanka at approximately 10:30 AM and 1:30 PM local time respectively, only CAMS AOD and PM2.5 estimates during this period were used for scaling MAIAC AODs. CAMS data were interpolated so they could be spatially merged with MAIAC AODs. Since cloud cover interferes with remote sensing of AODs, this technique is best applied to obtain time aggregated spatial patterns. Nevertheless, we compared daily MAIAC-CAMS-PM2.5 averaged over a 3 km × 3 km grid around each monitor, against the corrected PA and BAM data. 
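The representative-year averaging described above, in which all available data are pooled by calendar month before the twelve monthly means are averaged, can be illustrated with a short sketch. The function name and the input layout (pairs of date and concentration) are assumptions made for this example only.

```python
from collections import defaultdict
from datetime import date


def representative_annual_mean(records):
    """Average by calendar month first, then average the monthly means.

    records: iterable of (date, pm25) pairs, possibly spanning several
    years and containing gaps. Returns (annual_mean, months_covered),
    or (None, 0) when no valid data are supplied.
    """
    by_month = defaultdict(list)
    for day, value in records:
        if value is not None:
            by_month[day.month].append(value)
    if not by_month:
        return None, 0
    monthly_means = [sum(v) / len(v) for v in by_month.values()]
    return sum(monthly_means) / len(monthly_means), len(monthly_means)


# Illustrative values only: January is sampled three times, July once
sample = [
    (date(2020, 1, 5), 60.0),
    (date(2020, 1, 7), 60.0),
    (date(2021, 1, 6), 60.0),
    (date(2020, 7, 1), 20.0),
]
annual, months = representative_annual_mean(sample)
naive = sum(v for _, v in sample) / len(sample)
```

In this toy case the naive mean of all records is 50, while the month-balanced mean is 40, which shows how the approach limits over-sampling of well-covered months. The second return value lets a caller check that all twelve months are represented.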
The “daily” averages of the measured data were in fact averages between 10 AM and 2 PM, to remain consistent with satellite overpass times. A two-year average of MAIAC-CAMS-PM2.5 at each 1 km grid cell was also calculated in GEE using the same monthly averaging approach as for the nationwide PAs and exported for subsequent interpolation of measured data. We used a two-year average to minimize data loss due to cloud cover, and to obtain a more robust spatial pattern.
We employed Residual Kriging (RK) for fusing monitoring data and MAIAC-CAMS-PM2.5 data. RK has been used in similar applications before (Schulte et al., 2020), and takes the log-difference between monitored and MAIAC-CAMS-PM2.5 at monitor locations and builds a variogram. A variogram shows the relative difference between two measurements as a function of distance between the two monitoring sites. The basic premise is that measurements made further apart will vary more than measurements made closer to each other. The most appropriate mathematical model is then fit to describe the variogram and that model is used to interpolate measurements to grid cells lacking measurements. For simplicity, we used the default variogram fitting parameters. Finally, exponents of these interpolated residuals are multiplied by gridded MAIAC-CAMS-PM2.5, resulting in annual average PM2.5 estimates at every grid cell.
Apart from GEE, all data processing and plotting was conducted using R Statistical Software (R Core Team, 2019). The Supplemental Materials, consisting of four figures, a table and an interactive map, are hosted at https://tinyurl.com/pv6h458u.
The Sri Lankan 24-hr and annual PM2.5 standards are 50 and 25 µg m–3 respectively (CEA, 2008). Unlike the US 24-hr PM2.5 standard of 35 µg m–3 that is based on the three-year average of the annual 98th percentile of the daily average, these are maximum permissible concentrations.
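The difference between the two forms of 24-hr standard can be made concrete with a small sketch. It uses a simple nearest-rank 98th percentile, which is an assumption made for brevity (the official U.S. procedure selects the ranked value from a lookup table based on the number of valid samples). The function names and the synthetic data are hypothetical.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile; pct is an integer between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def percentile_form_value(daily_by_year, pct=98):
    """Average of the annual percentiles across the supplied years."""
    annual = [percentile_nearest_rank(v, pct) for v in daily_by_year.values()]
    return sum(annual) / len(annual)


def exceeds_maximum(daily_by_year, limit):
    """Maximum-permissible form: flag any year with a daily mean above limit."""
    return {year: max(v) > limit for year, v in daily_by_year.items()}


# Synthetic daily averages (µg/m3): 97 clean days plus three elevated days
one_year = [20.0] * 97 + [30.0, 40.0, 60.0]
daily = {2018: list(one_year), 2019: list(one_year), 2020: list(one_year)}

design_value = percentile_form_value(daily)   # compare against 35 µg/m3
flags = exceeds_maximum(daily, limit=50.0)    # compare against 50 µg/m3
```

With these made-up numbers the percentile form yields 30 µg m–3, below the 35 µg m–3 level, while every year is flagged under a 50 µg m–3 maximum-permissible limit because of a single 60 µg m–3 day. A standard written as a maximum is therefore sensitive to isolated high days that a percentile form disregards.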
The annual averages in 2018, 2019 and 2020 were 31.9, 23.5 and 19.5 µg m–3 respectively, exceeding the local standard only in 2018. However all years exceeded the U.S. and World Health Organization annual standards of 12 and 10 µg m–3, respectively. The Sri Lankan 24-hr standard was exceeded every year (2018: 86.1, 2019: 62.3, 2020: 55.0 µg m–3). These concentrations are slightly lower than those recorded at other US embassies in the region (Dhammapala, 2019). The monthly variation of Colombo’s PM2.5 is shown in Fig. 1. Overall, air quality is “Good” or “Moderate” at least 80% of the time from April to October. Between November and February, the public encounters poorer air quality around half the time on average. These seasonal fluctuations largely depend on the monsoons. The northeast monsoon from December to February is governed by semi-permanent wintertime high pressure over the Himalayas (Goswami, 2005). This forces cooler, stable air over Sri Lanka, thereby retarding pollutant dispersion. The southwest monsoon brings rain and cleaner marine air to Colombo from May to September, resulting in good dispersion and low pollutant levels. Seasonal measurements of ventilation indices (mixing layer height × wind speed) in nearby Chennai (India) confirm lower atmospheric mixing in the northern hemispheric winter and more in the summer (Swamy et al., 2020). The first and second inter-monsoonal periods of March–April and mid-October–early-November are transition periods when air quality begins to improve or deteriorate, respectively. Diurnal weekday (Monday–Friday) and weekend (Saturday–Sunday) PM2.5 variations during different seasons are shown in Fig. 2. The ranges of “normal” NowCast concentrations are shown as the hourly median ± 95% confidence intervals (CI), overlaid on air quality categories to add context. 
Average morning rush hour and nighttime conditions from November–February are more likely to be USG, whereas other months are mostly Moderate throughout the day. Lower weekend emissions, most notably a drop in the morning rush hour peaks, are evident (light blue band). The general diurnal pattern is typical for a site influenced by traffic and domestic sources, and is easily explained by diurnal emission patterns and evolution of the mixing layer (Seinfeld and Pandis, 1998; Stull, 2000). At sunrise and shortly thereafter, the nocturnal inversion breaks up slowly, keeping morning rush hour emissions trapped at the surface until mid-morning. As solar heating improves vertical mixing, pollutant levels then drop through the afternoon. As evening approaches, the inversion layer starts to re-form but is not very shallow until later at night. This enables evening rush hour emissions to mix throughout a deeper layer than the morning emissions, leading to a comparably smaller increase in concentrations. Pollutants from domestic activities are trapped within a shallower nocturnal inversion layer, leading to higher nighttime concentrations. Biomass burning, identified as a significant contributor to Colombo’s PM2.5 (Table 1), can contribute throughout the day as people use firewood for cooking and burn yard waste. The pollution roses in Fig. 3 show how seasonal wind flows affect PM2.5 levels. The predominantly northeast winds in the more polluted months clearly stand out. Even though high PM2.5 levels are seen during other relatively infrequent wind directions between November and February, they occur when wind speeds are relatively light. Such winds tend to meander and likely transport pollutants from many nearby areas. The cleaner conditions during the remainder of the year are mostly accompanied by southwest winds, as explained in Section 3.1.1. CWTs in Fig. 4(a) show average concentrations measured at the monitor when air masses pass over the respective regions. 
Stronger winds (evidenced by the 72-hour back trajectories originating further away) help reduce PM2.5 buildup. Areas nearer to the monitor contribute to elevated concentrations during lighter winds (associated with shorter trajectories), which reduce pollutant dispersion. Air masses traversing Bangladesh and India (including their offshore waters) can contain up to an average of 45 µg m–3 of PM2.5. However, it must be noted that such air masses travel across Sri Lanka en route to Colombo and are likely to include PM2.5 entrained from domestic sources. The aggregated trajectory clusters in Fig. 4(b) offer some insights into the contribution of long range transport and locally generated pollution, although a proper quantification requires a detailed modeling study. Lighter east/northeast winds are about 8 µg m–3 less polluted compared to those associated with long range transport during the same season, from approximately the same general direction (green minus blue line). Therefore under average conditions, long range transport may add about 8 µg m–3 of PM2.5 to that from nearby sources. Lighter southwest winds are associated with about 10 µg m–3 more PM2.5 than stronger southwest winds (red minus purple lines). This could be the approximate contribution of PM2.5 from Colombo’s sources. The Embassy and CMC BAMs track linearly and are usually within 10% or 5 µg m–3, whichever is more (Fig. 5(a)). Median differences in hourly concentrations show a morning rush hour spike at the CMC site that is about 6 µg m–3 larger, while evenings are about 5 µg m–3 more polluted around the Embassy site (see boxplot medians in Fig. 5(b)). These are likely driven by differences in local traffic patterns. Fig. 6 shows how linear, MLR and QR statistical models of the 24-hr averaged raw data compared against the FEM. Models with interaction terms are not shown to prevent overcrowding. Raw PA readings over-estimate PM2.5 by larger margins than Atmos sensors. 
The goal of a model is to transform the raw data as close to the 1:1 line as possible. While all models tested appear to accomplish this, their suitability was examined more closely by considering model performance, dependence on other variables and duration of the co-location experiment.
Three desirable model performance criteria are high correlation coefficient (R2), small normalized root mean square error (N-RMSE), and the ability to reflect the same air quality category as the FEM. These criteria are examined in Fig. 7. Fig. 7 helps isolate a few candidate models for closer consideration: for Atmos sensors, the linear and QR models have high % correct and low N-RMSEs respectively, with comparable R2. For PA, the MLR and QR models have the highest % correct, lowest N-RMSEs and high R2. When differences between simpler and more sophisticated models are small, the simpler model is preferred.
Model bias (value predicted by the statistical model minus FEM) was also examined as a function of time, PM2.5 concentrations and meteorological variables. Ideally, the bias should not be correlated with these variables, although a high bias can be expected at low concentrations when monitors can be noisy. Low concentrations in turn occur in stormy months when RH is high. Fig. S3 of the Supplemental Materials shows how the Atmos sensor bias varies by hour of day, and how the model/FEM ratio depends on daily average FEM concentration, RH and temperature. Statistical models with interaction terms did not improve performance. The linear model was considered satisfactory since the largest biases occur during the least polluted times. Similar plots for PA correlations (not shown) suggest the MLR model is the least sensitive to these variables.
Fig. 8 shows how well PA correlations derived from 30-day, 60-day and 90-day co-location experiments perform afterwards. The gray shaded areas are the training data used to establish the correlation.
The accuracy of the values predicted by different models thereafter is seen by comparing them against the FEM (black) trace. What matters appears to be less a minimum co-location duration than whether the co-location includes a representative period; without one, the statistical model’s ability to handle a future event of similar magnitude is compromised. After the PM2.5 decline in March 2020 was included in the correlation (60-day co-location), the projections thereafter improved. However, the QR model showed several deviations from the FEM even after 90 days of co-location data, whereas the MLR model remained a stable predictor. Similar plots for the Atmos sensor (not shown) confirmed that longer co-location periods alone do not guarantee better model performance; the training data set needs to include data representative of conditions encountered later. The linear model tracked the FEM more closely than the QR model on most polluted days.
Based on data presented in this section and examining the consistency between hourly and daily model performance, the correction factors in Table 2 are recommended for PA and Atmos sensors in Sri Lanka.
Neither our Atmos nor PA correction factors compared well with formulae derived in neighboring India. A three-month study conducted with Atmos sensors at multiple sites across New Delhi showed a comparable slope at one site with a larger intercept (visually estimated from Fig. 6 in Zheng et al., 2019), resulting in higher FEM-like concentrations. Zheng et al. (2018) reported a seasonally-varying linear fit with a temperature dependence for Plantower sensor data collected in Kanpur, India. Applying this relationship to our data resulted in large under-estimations. Our own analysis of PurpleAir data co-located at the US Embassy in New Delhi showed a larger slope and smaller intercept, resulting in over-estimated FEM-like data.
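The three performance criteria used in this section (R2, N-RMSE and agreement in air quality category) can be computed as in the following sketch. Two points are assumptions rather than details taken from the text: N-RMSE is taken here as the RMSE divided by the mean FEM concentration, and categories use the pre-2024 U.S. EPA PM2.5 breakpoints without truncation. All function names are hypothetical and the numbers are illustrative.

```python
from bisect import bisect_left

# Upper (inclusive) breakpoints in µg/m3; assumed pre-2024 U.S. EPA table
UPPER = [12.0, 35.4, 55.4, 150.4, 250.4]
NAMES = ["Good", "Moderate", "USG", "Unhealthy", "Very Unhealthy", "Hazardous"]


def category(conc):
    """Air quality category for a PM2.5 concentration in µg/m3."""
    return NAMES[bisect_left(UPPER, conc)]


def r_squared(pred, ref):
    """Squared Pearson correlation between corrected and reference data."""
    n = len(ref)
    mean_p, mean_r = sum(pred) / n, sum(ref) / n
    spr = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    spp = sum((p - mean_p) ** 2 for p in pred)
    srr = sum((r - mean_r) ** 2 for r in ref)
    return spr ** 2 / (spp * srr)


def nrmse(pred, ref):
    """RMSE normalized by the mean reference concentration (assumed form)."""
    n = len(ref)
    rmse = (sum((p - r) ** 2 for p, r in zip(pred, ref)) / n) ** 0.5
    return rmse / (sum(ref) / n)


def pct_same_category(pred, ref):
    """Percentage of records where both series fall in the same category."""
    same = sum(category(p) == category(r) for p, r in zip(pred, ref))
    return 100.0 * same / len(ref)


# Illustrative daily averages only (µg/m3)
fem = [10.0, 20.0, 40.0, 60.0]
corrected = [11.0, 19.0, 36.0, 50.0]
scores = (r_squared(corrected, fem), nrmse(corrected, fem),
          pct_same_category(corrected, fem))
```

In this toy example the correlation is very high, yet one record in four lands in the wrong category. This is why category agreement is worth reporting alongside R2 and N-RMSE.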
We also compared the correlations available on the Purpleair.com website to determine which readily available formulae worked best to quickly obtain a reasonably accurate estimate of PM2.5 levels from PA sensors in Sri Lanka. Despite differences in climate and possibly aerosol characteristics, results in Table S1 of the Supplemental Materials show that applying the “Woodsmoke” correlation derived in Australia yields concentrations whose N-RMSE is only 4.2% higher than our findings, with the U.S. EPA correlation tracking closely behind, at 5.9% higher. Applying a correction factor derived in Colombo to sites across the country may introduce some uncertainty, and this can only be ascertained by conducting multiple co-location experiments around the country. This is an onerous, time-consuming task. However correction factors have been applied up to 2000 km away in data-deficient environments (McFarlane et al., 2021b) if the application does not require a high level of accuracy, and aerosol masses are expected to be somewhat similar. Fig. 9 uses corrected PA data from across Sri Lanka collected since June 2019, to compare the (a) distribution of daily averages and (b) diurnal profiles across sites. Data from the Embassy BAM collected over the same period are included for comparison. The box width in Fig. 9(a) scales by the amount of data available. The difference in medians between the cleanest and the most polluted site is about 12 µg m–3, most of which is likely to be from nearby local sources. Good or Moderate air is present at least three quarters of the time at all sites, while Unhealthy conditions occur under 5% of the time. As with the BAM, November–February is more polluted at all sites. Except for their relative magnitudes and a few differences discussed below, the diurnal trends at all sites are broadly similar to the general patterns shown in Fig. 2. The text preceding that figure explaining the general diurnal patterns also applies to each of the PA sites. 
The Jayawadanagama area experiences a slight midday increase, possibly due to traffic congestion when nearby Montessori schools end their school day. This monitor and those at Thalangama and the Overseas School are clustered around Sri Lanka’s administrative capital of Sri Jayawardanapura-Kotte. Surrounding residential areas probably account for the slightly elevated concentrations in the evenings and nights. This area also sees more rush hour traffic than the vicinity of the Embassy monitor.

The Puttalam site is located along the main Colombo-Mannar road, about 10 km downwind of the Lakvijaya coal-fired power plant. However, the intermittent operation of the facility probably reduces PM2.5 impacts at the monitor. The diurnal profile shows a small increase during the daytime that is sustained until evening. This is consistent with traffic along the main road and probably some light, occasional power plant plume impacts. Akurana is located along the Kandy-Matale road, which may explain why the midday concentrations do not drop as much as at other sites relative to their own peaks. Nawalapitiya is a rural area with little urban influence.

Pie charts at the location of each PA monitor in the interactive map of the Supplemental Materials show the percentage of time they measured air quality in the different categories. The size of each pie chart scales with the site's representative annual average PM2.5. Clicking on the pie charts shows popups of diurnal trends similar to Fig. 2.

The Q-Q plots in Fig. S4 of the Supplemental Materials suggest that MAIAC-CAMS-PM2.5 consistently over-estimates PM2.5. This confirms the need for “ground-truthing” the MAIAC-CAMS-PM2.5 product by interpolating it with observational data. The bottom right panel of the same figure shows the variogram and the best fit (Gaussian, in this case) developed for kriging.

The kriged PM2.5 map shown in Fig. 10 fits our conceptual understanding of the spatial distribution of PM2.5 in Sri Lanka: the most populated and industrialized Western Province has the highest concentrations, and the Embassy BAM is located at the edge of an area that meets the annual standard. A few other pixels around the country also exceed the annual standard, but investigating these is beyond the scope of this work. Elevated concentrations extend over the offshore waters of the western coastline. This is expected during the more polluted northeast monsoonal months, and when land breezes transport pollutants from the Western Province offshore. Care must be taken not to read too much into ‘hotspots’, especially within 1–2 pixels of coastlines and in areas near salt pans, as AODs there can be biased high due to uncertainties in the MAIAC algorithm (Lyapustin and Wang, 2018). The sparsely populated southeastern part of the country has the lowest concentrations and even meets the US annual standard of 12 µg m–3.

Uncertainties in the kriged MAIAC-CAMS-PM2.5 product can be reduced by using re-analyzed model data instead of forecasts. PM2.5 monitors in Sri Lanka’s northern and eastern areas would help, as would the use of more advanced interpolation techniques. Although other researchers have constructed global maps of PM2.5 in the past, to our knowledge this is the first product showing spatial gradients of PM2.5 in Sri Lanka. The map in the Supplemental Materials contains an interactive version of Fig. 10 as a user-selectable layer.

This manuscript presents the first analyses of PM2.5 trends in Sri Lanka. The seasonal, diurnal and directional dependencies are consistent with our conceptual understanding of airshed behavior. The northeast monsoon coincides with poor air quality because of reduced dispersion and long range transport. Diurnal trends across the country show the typical rush hour peaks and more accumulated pollution at night.
Sites close to busy roads do not see a very pronounced midday PM2.5 drop-off. Long-term trends can be assessed when more years of data become available.

Plans are afoot to expand PM2.5 monitoring in Sri Lanka, but resource constraints limit the number of FEMs that can be deployed. The LCS correction factors developed here can be used to deploy affordable devices in previously unmonitored areas to better characterize PM2.5 in the country, identify pollution hotspots and aid with public health warnings. If different brands of LCS are used, co-location experiments should be run and representative correction factors developed. The map of PM2.5 annual averages in Sri Lanka, constructed by fusing satellite, model and all available monitoring data, can be useful to aid planning efforts.

The authors thank the Overseas School of Colombo for providing a PurpleAir sensor, and the staff at the US Embassy in Colombo for operating low-cost sensors alongside the BAM.

1 INTRODUCTION
1.1 Continuous PM2.5 Measurements in Sri Lanka with Regulatory Grade Monitors
1.2 Cost-effective Monitors to Fill PM2.5 Data Gaps
1.3 PM2.5 Mapping with Remotely Sensed Data
2 METHODS
2.1 BAM Data Analysis
2.1.1 Meteorological data to help with analysis
2.1.2 Comparing against another temporary BAM in Colombo
2.2 LCS Data Collection
2.2.1 LCS data quality control
2.2.2 Correlations with FEMs
FEM PM2.5 = intercept + slope × sensor PM2.5
FEM PM2.5 = intercept + slope1 × sensor PM2.5 + slope2 × T + slope3 × RH + [slope4 × sensor PM2.5 × T + slope5 × sensor PM2.5 × RH + slope6 × T × RH + slope7 × sensor PM2.5 × T × RH].
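The two model forms above can be fitted by ordinary least squares. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`design_matrix`, `fit_correction`, `apply_correction`) are hypothetical, the data are synthetic, and the coefficients used to generate them are made up purely to check that the fit recovers them.

```python
import numpy as np


def design_matrix(sensor, temp, rh, interactions=False):
    """Columns: intercept, sensor PM2.5, T, RH, plus optional interaction terms."""
    sensor, temp, rh = (np.asarray(v, dtype=float) for v in (sensor, temp, rh))
    cols = [np.ones_like(sensor), sensor, temp, rh]
    if interactions:
        # The bracketed terms of the multivariate model (slope4 to slope7).
        cols += [sensor * temp, sensor * rh, temp * rh, sensor * temp * rh]
    return np.column_stack(cols)


def fit_correction(sensor, temp, rh, fem, interactions=False):
    """Least-squares estimate of [intercept, slope1, slope2, ...]."""
    X = design_matrix(sensor, temp, rh, interactions)
    coef, *_ = np.linalg.lstsq(X, np.asarray(fem, dtype=float), rcond=None)
    return coef


def apply_correction(coef, sensor, temp, rh):
    """Convert raw sensor readings to FEM-like values with fitted coefficients."""
    interactions = len(coef) == 8
    return design_matrix(sensor, temp, rh, interactions) @ coef


# Synthetic, noise-free data generated with made-up coefficients.
rng = np.random.default_rng(0)
sensor = rng.uniform(5, 120, 200)   # raw sensor PM2.5, µg/m³
temp = rng.uniform(24, 33, 200)     # temperature, °C
rh = rng.uniform(55, 95, 200)       # relative humidity, %
fem = 3.0 + 0.55 * sensor + 0.10 * temp - 0.05 * rh

coef = fit_correction(sensor, temp, rh, fem)
corrected = apply_correction(coef, sensor, temp, rh)
```

In practice the fit would be made on a training subset of the co-location data and judged on held-out data, as with the 20% testing set referred to in Fig. 7.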
2.3 AOD to PM2.5 Mapping and Spatial Interpolation of Measured Data
3 RESULTS AND DISCUSSION
3.1 BAM Trends
3.1.1 Seasonal trends
Fig. 1. Monthly profiles of Embassy PM2.5 from September 2017–July 2021. USG = “Unhealthy for Sensitive Groups”.
3.1.2 Diurnal trends
Fig. 2. Diurnal profiles of Embassy PM2.5. Background colors are the same air quality categories shown in the legend of Fig. 1.
3.1.3 Directional trends
Fig. 3. Seasonal PM2.5 pollution roses at the Colombo Embassy monitor. USG = “Unhealthy for Sensitive Groups”.
Fig. 4. Analysis of trajectories up to 72 hours prior to arriving at the U.S. Embassy monitor (black dot): (a) concentration weighted trajectories, and (b) trajectory clusters.
3.1.4 Comparing with another temporary BAM
Fig. 5. CMC and Embassy PM2.5 BAM comparison from Oct 30, 2020–Mar 8, 2021 (93 days with overlapping data). (a) daily averages with diagonal lines showing the 1:1 line ±10% or ±5 µg m–3 differences, and (b) boxplot of diurnal differences between the sites.
3.2 LCS Correction Factors
Fig. 6. Comparing daily averages of raw and corrected Colombo Embassy sensor data against BAM.
3.2.1 Statistical model performance
Fig. 7. Performance metrics of different daily average models for each sensor, calculated from the 20% testing data set.
3.2.2 Model performance as a function of concentration and meteorological variables
3.2.3 Duration of co-location experiments
Fig. 8. Using part of a co-location study (shaded area = training data) to assess PA correlation. Models with interaction terms are omitted for clarity.
3.2.4 Comparing our LCS correction factors against others
3.3 Applying the Correction to PA Data from Other Parts of Sri Lanka
Fig. 9. Comparing all corrected PA data in Sri Lanka, June 2019–July 2021. (a) Boxplot of daily averages, and (b) diurnal variation of NowCast medians. As in Fig. 1, stacked colors indicate air quality categories.
3.4 Sri Lanka’s PM2.5 Spatial Pattern Estimated from Satellite, Model and Measured Data
Fig. 10. Kriged MAIAC-CAMS-PM2.5 concentrations at 1 km intervals over Sri Lanka.
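For readers unfamiliar with kriging using a Gaussian variogram, the idea can be sketched as generic ordinary kriging in NumPy. This is an illustration only and does not reproduce the authors' workflow or fitted parameters: the function names, the variogram convention (no factor of 3 in the exponent), and the four-point example are all assumptions.

```python
import numpy as np


def gaussian_variogram(h, nugget, psill, range_):
    """Gaussian model: nugget + psill * (1 - exp(-(h / range_)^2)), with gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + psill * (1.0 - np.exp(-((h / range_) ** 2)))
    return np.where(h > 0, gamma, 0.0)


def ordinary_krige(xy, values, targets, nugget, psill, range_):
    """Ordinary kriging estimates at `targets` from observations at `xy`."""
    xy = np.asarray(xy, dtype=float)
    values = np.asarray(values, dtype=float)
    targets = np.asarray(targets, dtype=float)
    n = len(values)

    # Left-hand side: semivariances between observations, bordered by the
    # unbiasedness constraint (weights sum to one, Lagrange multiplier).
    d_obs = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gaussian_variogram(d_obs, nugget, psill, range_)
    A[n, n] = 0.0

    # Right-hand side: semivariances between observations and each target.
    d_tgt = np.linalg.norm(targets[:, None, :] - xy[None, :, :], axis=-1)
    b = np.ones((n + 1, len(targets)))
    b[:n, :] = gaussian_variogram(d_tgt, nugget, psill, range_).T

    weights = np.linalg.solve(A, b)[:n]  # drop the Lagrange multiplier row
    return weights.T @ values


# Hypothetical example: four monitors at the corners of a unit square.
sites = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pm25 = [10.0, 20.0, 30.0, 40.0]
estimates = ordinary_krige(sites, pm25, [[0.0, 0.0], [0.5, 0.5]],
                           nugget=0.0, psill=1.0, range_=1.0)
```

With a zero nugget the estimate at a monitor location reproduces the observation, and at the centre of the square symmetry gives each monitor an equal weight. Gaussian variograms can yield ill-conditioned systems when points are close together relative to the range, so a small nugget is often added in practice.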
4 CONCLUSIONS
ACKNOWLEDGEMENTS
REFERENCES