Validation of In-field Calibration for Low-Cost Sensors Measuring Ambient Particulate Matter in Kolkata, India

Low-cost sensors (LCS) provide opportunities for neighborhood-level air pollution data collection, yet significant knowledge gaps remain regarding the accurate application and interpretation of LCS. In this study, we present an in-field calibration of a network of 20 low-cost ambient particulate matter sensors in greater Kolkata, India, operating between October 2018 and April 2019. In order to understand LCS performance in relation to local reference-grade PM2.5 monitors (RGMs), three of these LCS were co-located with RGMs operated by the West Bengal Pollution Control Board at Rabindra Bharati University (RBU), Victoria Memorial (VICTORIA), and Padmapukur (Howrah, PDM). Data from the co-locations were used to calibrate the LCS network using random forest regression and multiple linear regression approaches. Measured relative humidity and temperature were significant model features. Agreement between the LCS and RGM for 24-h averaged PM2.5 measurements was strongest at RBU, with an uncalibrated root mean squared error (RMSE) of 27.1 µg m -3, followed by PDM (32.6 µg m -3) and VICTORIA (50.7 µg m -3). Multiple linear regression was used to derive calibration models. Cross-calibration between co-located LCS-RGM pairs was tested. After cross-calibration, the LCS data correctly identified days as being in or out of attainment with the 24-h National Ambient Air Quality Standard of 60 µg m -3 91% of the time. The corrected data identified days with an India scale Air Quality Index of "poor" or worse 94% of the time. This suggests that LCS can be a useful supplement to RGM networks for air quality management. Diurnal trends and a high level of correlation across the hybrid LCS-RGM network suggest that regional and secondary sources of PM2.5 are important in Kolkata.


INTRODUCTION
Ambient air pollution is a major environmental health issue.Atmospheric fine particulate matter, or PM2.5 (particles with aerodynamic diameters less than 2.5 micrometers), is one of the leading causes of premature mortality and morbidity worldwide (Cohen et al., 2017).PM2.5 exposure causes an estimated 1.56-year decrease in life expectancy in South Asia, more than in any other region (Apte et al., 2018).Although the World Health Organization has set guideline values for PM2.5 at 5 µg m -3 annual mean and 15 µg m -3 24-hour mean, pollutant levels remain many-fold higher than this value in most places, particularly in low-and middle-income countries (LMICs) (WHO, 2021).Continuous measurements of PM2.5 are needed in order to establish baseline conditions, quantify the local negative impacts of pollution, identify pollution sources, plan policies to comply with set air quality goals, and track air quality improvements (McNeill, 2019;World Bank, 2017).While the density of ground-based reference-grade PM2.5 monitors across India has increased since 2016 under the National Air Quality Monitoring Programme (CPCB, 2022;Sethuraman et al., 2021;McNeill and Nunes, 2017), data are not yet available at high spatial resolution.Low-cost sensors (LCS) provide opportunities for neighborhood-level data collection, enabling the identification of air pollution "hotspots" and the quantification of local health impacts (Pinder et al., 2019).LCS are lower-fidelity sensors that generally operate on optical principles for PM2.5 detection and require less power and maintenance than reference-grade monitors (RGMs).
Although the use of LCS for air pollution monitoring and air pollution research has proliferated in the past decade, significant knowledge gaps and caveats remain regarding the accurate application and interpretation of LCS (Malings et al., 2020; Giordano et al., 2021; Hagler et al., 2018). LCS that measure ambient PM2.5 often underperform under the environmental conditions typical of air pollution events in India (high humidity, high pollution loadings, light-absorbing particles) (Di Antonio et al., 2018; Jayaratne et al., 2018). Sensor performance may also degrade in harsh environments (Amegah, 2018). The impact of environmental conditions and particle characteristics such as size, shape, and composition on different LCS technology remains a knowledge gap.
In-field calibration of LCS has emerged as a solution for improving the accuracy of data from LCS networks (Malings et al., 2020; Giordano et al., 2021). Including RGMs in an air quality network provides a reference for LCS calibration. By co-locating LCS and RGMs, a calibration for the LCS network may be developed. Several studies have focused on local calibrations of LCS distributed in the U.S. (Malings et al., 2020; Zimmerman et al., 2018) and African cities (McFarlane et al., 2021a, 2021b; Raheja et al., 2022) with machine learning approaches, but to date this approach has had limited application in urban environments in India (Gupta et al., 2022).
In this manuscript, we describe a sensor network deployed by Enviome Research in collaboration with The World Bank between October 2018 and January 2019 in central Kolkata, India, which collected data until summer 2019. The network consisted of twenty low-cost PM2.5 sensors (Clarity, Inc.), and included three co-locations with reference-grade PM2.5 monitors under the operation of the West Bengal Pollution Control Board (WBPCB). The design of this network allowed for the analysis of the performance of these light-scattering-based LCS in the Indian urban environment. Using the network, we were able to establish a baseline assessment of local air quality along two major transportation corridors targeted for transition to electric vehicle public transportation (World Bank, 2021) (Fig. 1). The multiple co-locations also enabled a robust test of the principle of field calibration by allowing calibration and cross-check across co-location pairs.

NETWORK DESCRIPTION AND METHODOLOGY
In this section, we describe the sensors, the network design, and the analysis approach. Twenty Clarity Node S air quality monitors were deployed in Kolkata and Howrah, India starting in Fall 2018 (Fig. 1). Table 1 provides a complete list of sensor locations. The sensor network was designed to characterize baseline air pollution levels along the two busiest bus corridors in central Kolkata, and to compare LCS performance to RGMs in three areas of the city. The bus corridors studied were route S9 (Belgharia to Jadavpur) and S12 (Newtown to Howrah), which span the city from far North to South, and far East to West. Clarity Node S devices were placed at 2-3 km intervals along these routes. Five more Clarity Node S devices were placed near existing PM2.5 RGMs (WBPCB and U.S. Diplomatic Post). Out of the five devices, three were placed in sufficiently close proximity to the RGMs (i.e., on the enclosures housing the RGMs) to be considered co-located for calibration purposes. These three were PDM: Padmapukur, RBU: Rabindra Bharati University, and VICTORIA: Victoria Memorial. Sensors were installed 12-18 feet from the ground. Each Clarity Movement Node S monitor consisted of a Plantower PMS 6003 dual laser light scattering PM sensor, an NO2 electrochemical cell sensor (110-508, SPEC Sensors), and a Bosch BME280 sensor to estimate pressure, relative humidity (RH), and temperature (T) inside the sensor housing. The Node S reported measurements of PM2.5, PM10, NO2, RH, and T at a default interval of 15 minutes and uploaded the data via cellular signal to the Clarity cloud system. Data were processed, including data cleaning, by Clarity prior to data storage in the Clarity Cloud. No additional cleaning of Clarity data was performed in this study; data were used as received from the Clarity Dashboard. The present analysis focuses on the PM2.5 data. The Clarity sensors were co-located as a group in a controlled environment in Kolkata (SDF building) and checked for consistent performance prior to deployment.
WBPCB PM2.5 monitors (RGMs) are Beta Attenuation Monitors (MP101M, Envea Global). These instruments were housed in enclosures roughly 4.2 m × 3.5 m × 2.5 m high (WBPCB, 2018). Co-located LCS were installed on poles extending 3-4 feet from the roof of these enclosures. The RGMs collected sample data every fifteen minutes and uploaded the data to the online data collection and reporting web portal as hourly averages. WBPCB instruments are certified on a 24-hour basis. WBPCB performed data cleaning prior to storage, but we also screened the WBPCB datasets for values of 0 and 999 µg m -3 (< 1% of data points). These values were discarded before averaging and further analysis.
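The screening and averaging step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' processing code: the function name `daily_means`, the record layout, and the sample values are all hypothetical, and only the two screened values (0 and 999 µg m -3) come from the text.

```python
from collections import defaultdict
from datetime import datetime

# Placeholder readings screened out of the reference data (values from the text).
INVALID_VALUES = {0.0, 999.0}


def daily_means(records):
    """Average hourly (timestamp, PM2.5) records by calendar day.

    Invalid placeholder values are discarded before averaging.
    """
    by_day = defaultdict(list)
    for timestamp, pm25 in records:
        if pm25 in INVALID_VALUES:
            continue  # drop the reading before it can bias the daily mean
        by_day[timestamp.date()].append(pm25)
    return {day: sum(values) / len(values) for day, values in by_day.items()}


# Hypothetical hourly reference readings (µg m-3), for illustration only.
hourly = [
    (datetime(2019, 1, 5, 0), 180.0),
    (datetime(2019, 1, 5, 1), 999.0),
    (datetime(2019, 1, 5, 2), 220.0),
    (datetime(2019, 1, 6, 0), 0.0),
    (datetime(2019, 1, 6, 1), 150.0),
]
means = daily_means(hourly)
```

Discarding the placeholders before averaging matters: a single 999 µg m -3 value left in a day's record would shift that day's mean by tens of µg m -3.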
Calibration analysis was performed using the scikit-learn package in Python (Müller and Guido, 2017). Basic features of the datasets included PM2.5 measured by RGM and Clarity Monitor, as well as T and RH measured by the Clarity Monitor. Regression was performed for individual co-located Clarity-RGM pairs using Clarity PM2.5, T, and RH as explanatory variables. The 24-hr averaged datasets consisted of 188 points (i.e., 24-hr averages for 188 days) for RBU and 194 points for PDM. A 75:25 train:test split with random assignment was used. The generalizability of the calibration was tested by cross-calibrating between Clarity-RGM pairs (i.e., the model was trained on data from co-location pair 1, applied to the Clarity Monitor data from co-location 2, and the result was compared to the RGM at location 2). Additional details are available in the Results section.
The algorithms tested for calibration were multiple linear regression and Random Forest regression. Random Forest regression is attractive for this application because it is powerful while making it possible to avoid overfitting. Multiple linear regression, if it provides enough accuracy, is valuable in that it produces an analytical expression for the calibration as follows, simplifying calibration of the wider network (Malings et al., 2020; McFarlane et al., 2021a):
$$\mathrm{PM}_{2.5,\mathrm{corr}} = \beta_0 + \beta_1\,\mathrm{PM}_{2.5,\mathrm{Clarity}} + \beta_2\,T + \beta_3\,\mathrm{RH}$$
where the $\beta_i$ are fitting parameters. The default settings in scikit-learn were applied for linear_model.LinearRegression() (Müller and Guido, 2017). For RandomForestRegressor(), we used 100 estimators, a maximum tree depth of 10, a minimum of 10 samples per leaf node, and a fixed random state of 5.
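The model setup described above might look like the following in scikit-learn. This is a sketch under stated assumptions, not the study's code: the data are synthetic (generated from made-up coefficients), and the `random_state` passed to `train_test_split` is an assumption, since the text fixes a random state only for the Random Forest. The Random Forest hyperparameters are the ones quoted in the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a co-location dataset (NOT study data).
rng = np.random.default_rng(0)
n_days = 200
pm_lcs = rng.uniform(20, 400, n_days)   # sensor PM2.5, µg m-3
temp = rng.uniform(15, 35, n_days)      # temperature, deg C
rh = rng.uniform(30, 90, n_days)        # relative humidity, %
# Made-up "reference" signal with noise, used only to exercise the code.
pm_ref = 40 + 0.6 * pm_lcs - 0.5 * temp - 0.3 * rh + rng.normal(0, 5, n_days)

X = np.column_stack([pm_lcs, temp, rh])

# 75:25 random train:test split (the split seed is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, pm_ref, test_size=0.25, random_state=5
)

# Multiple linear regression with scikit-learn defaults.
mlr = LinearRegression().fit(X_train, y_train)

# Random Forest with the hyperparameters quoted in the text.
rf = RandomForestRegressor(
    n_estimators=100, max_depth=10, min_samples_leaf=10, random_state=5
).fit(X_train, y_train)

beta0 = mlr.intercept_   # fitted intercept
beta = mlr.coef_         # fitted coefficients for PM2.5, T, RH (in that order)
r2_mlr = mlr.score(X_test, y_test)   # coefficient of determination on test set
r2_rf = rf.score(X_test, y_test)
```

The fitted `beta0` and `beta` are all that is needed to apply the linear calibration to other sensors, which is the practical advantage of MLR noted above; the Random Forest has no comparable closed-form expression.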
The raw data and the regression results were evaluated based on their agreement with the WBPCB reference data, as measured by the coefficient of determination (r²), the root mean squared error (RMSE), and the normalized RMSE (NRMSE). RMSE is calculated according to:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \hat{x}_i\right)^2}$$
where $x_i$ is the series of observed values, $\hat{x}_i$ is the expected value, and $N$ is the number of points in the series. NRMSE, a unitless metric, is calculated by normalizing the RMSE with the range of the variable, i.e.,
$$\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{x_{\max} - x_{\min}}$$
The corrected data were also evaluated for their accuracy in diagnosing a day as in or out of attainment with the Indian 24-h National Ambient Air Quality Standard (NAAQS) of 60 µg m -3, or placing the day in the correct Indian Air Quality Index (AQI) category.
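For concreteness, the two error metrics can be computed as in the short sketch below. The function names and sample values are hypothetical. The text says only that RMSE is normalized by "the range of the variable"; taking that to be the range of the observed series is an assumption made here.

```python
import math


def rmse(observed, expected):
    """Root mean squared error between two equal-length series."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, expected)) / n)


def nrmse(observed, expected):
    """RMSE divided by the range of the observed series.

    Assumption: 'the range of the variable' is read as max - min of `observed`.
    """
    return rmse(observed, expected) / (max(observed) - min(observed))


# Hypothetical 24-h averages (µg m-3), for illustration only.
observed = [100.0, 200.0, 300.0]
expected = [110.0, 190.0, 320.0]
error = rmse(observed, expected)
normalized_error = nrmse(observed, expected)
```

Because NRMSE divides by the data range, a subset with a narrow range (for example, only low-loading days) can show a higher NRMSE than the full dataset even when its RMSE is lower.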
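Spatial variability in the data was analyzed by calculating the Pearson correlation coefficient, r, between datasets obtained at different sites. For datasets A and B,
$$r(A,B) = \frac{1}{N-1}\sum_{i=1}^{N}\left(\frac{A_i - \mu_A}{\sigma_A}\right)\left(\frac{B_i - \mu_B}{\sigma_B}\right)$$
where $N$ is the size of each dataset, $\mu_j$ is the mean, and $\sigma_j$ is the standard deviation of dataset $j$.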
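A minimal sketch of the Pearson correlation calculation follows, using only the Python standard library. It assumes the sample standard deviation (N - 1 normalization); the function name and the sample series are hypothetical.

```python
from statistics import mean, stdev


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length series.

    Uses the sample standard deviation, so the sum is divided by N - 1.
    """
    n = len(a)
    mu_a, mu_b = mean(a), mean(b)
    sd_a, sd_b = stdev(a), stdev(b)
    total = sum(((x - mu_a) / sd_a) * ((y - mu_b) / sd_b) for x, y in zip(a, b))
    return total / (n - 1)


# Hypothetical hourly PM2.5 series from two sites (µg m-3).
site_a = [120.0, 150.0, 180.0, 210.0]
site_b = [100.0, 130.0, 155.0, 190.0]
r_ab = pearson_r(site_a, site_b)
```

Pearson's r is insensitive to a constant offset or scale factor between sites, so two sites can be highly correlated while still differing in absolute PM2.5 level.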

RESULTS
Data collection spanned the post-monsoon season 2018 (October-November), winter 2018/2019 (December-February), and spring/summer 2019 (March-July).Sensor installation took place between November 2018-January 2019 (Table 1).Comparisons between the WBPCB RGM data and the uncorrected Clarity data at the three co-location sites are shown in Fig. 2. We used 24-hour averaged data since this is the basis upon which the RGM was certified.Typical of Plantower-based instruments, the Clarity sensors showed qualitative agreement with the RGMs, with some high bias for higher PM2.5 loadings (> 100 µg m -3 ).Agreement between the LCS and RGM was strongest at RBU, with an uncalibrated RMSE of 27.1 µg m -3 (NRMSE = 0.070), followed by PDM (RMSE = 32.6 µg m -3 , NRMSE = 0.086) and VICTORIA (RMSE = 50.7 µg m -3 , NRMSE = 0.122).The RBU site is located inside the university campus, away from traffic and other sources (167 m away from the nearest major roadway).PDM is in a primarily residential area near a pond, 15 m from a minor roadway, 167 m from a major roadway, and 373 m from the Mumbai-Kolkata Highway.VICTORIA is in a centrally located green zone near a pond, near a minor roadway, and 200-230 m from two major roadways, so, humidity and local source effects are possible.PDM and VICTORIA showed higher average RH than the rest of the network.Only the RBU and PDM datasets were used for calibration analysis due to the lower Clarity-RGM agreement and higher variability in the Clarity data at VICTORIA.
In order to investigate the sensitivity of the sensor performance to environmental factors, we analyzed the Clarity:RGM agreement after splitting the dataset based on RH and/or PM2.5 levels (Table 2).This analysis was done using hourly averaged data in order to capture diurnal variations in RH.Performance of Plantower-based sensors has been reported to degrade for RH > 75% (Jayaratne et al., 2018).We split the RBU, PDM, and VICTORIA datasets into RH > 75% and RH < 75% groups.The results varied by co-location site, with VICTORIA showing significant degradation in sensor performance for RH > 75%.PDM also showed worse Clarity:RGM agreement for RH > 75%, although the difference was not as great as observed at VICTORIA.No significant RH effect was observed at RBU.
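The humidity segregation described above amounts to grouping paired readings by an RH threshold and computing the error metric within each group. The sketch below illustrates the idea with hypothetical names and values. Readings at exactly 75% RH are placed in the lower group, which is an assumption, since the text does not say how that boundary was handled.

```python
import math


def rmse(pairs):
    """RMSE over a list of (reference, sensor) pairs."""
    return math.sqrt(sum((ref - lcs) ** 2 for ref, lcs in pairs) / len(pairs))


def rmse_by_rh(records, rh_threshold=75.0):
    """Split (reference_pm, sensor_pm, rh) records at an RH threshold
    and return the RMSE of each non-empty group."""
    groups = {"high_rh": [], "low_rh": []}
    for ref, lcs, rh in records:
        key = "high_rh" if rh > rh_threshold else "low_rh"
        groups[key].append((ref, lcs))
    return {key: rmse(pairs) for key, pairs in groups.items() if pairs}


# Hypothetical hourly records: (reference PM2.5, sensor PM2.5, RH %).
records = [
    (100.0, 110.0, 60.0),
    (200.0, 190.0, 70.0),
    (100.0, 140.0, 80.0),
    (200.0, 230.0, 90.0),
]
split_rmse = rmse_by_rh(records)
```

The same grouping pattern applies to a split on PM2.5 loading by keying on the reference concentration instead of RH.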
Splitting the dataset on PM2.5 = 100 µg m -3 showed significantly better Clarity:RGM agreement for lower PM2.5 loadings, and worse sensor performance for higher loadings, for all three co-location sites (Table 2). Deterioration of sensor performance for high PM2.5 loadings > 100 µg m -3 is consistent with studies of low-cost optical particle counter performance in Delhi (Crilley et al., 2020) and Plantower PMS3003 sensors in Kanpur (Zheng et al., 2018). Seasonal variation in PM2.5 in Kolkata is strong enough that splitting the data at PM2.5 = 100 µg m -3 is effectively similar to segregating the data by seasons, since loadings are generally < 100 µg m -3 outside the post-monsoon and winter seasons (Fig. 2). There are many other reasons to characterize sensor performance with the changing seasons, including varying meteorological conditions, varying sources, and possible degradation of sensor performance with time. However, because of the timing of the sensor deployment in this study (deployment beginning in post-monsoon, when PM2.5 is high, and ending in summer, when PM2.5 is lower), it is difficult to distinguish the effects of these influences on the sensor performance from the strong effect of PM2.5 loading.
Fig. 2. 24-hr averaged PM2.5 data from WBPCB RGMs (orange) and uncorrected Clarity Monitor data (as received from Clarity Cloud) (blue) for the RBU, PDM, and VICTORIA co-locations during the study period. 1:1 lines are shown on the right hand panels as a guide to the eye. Refer to Table 2 and Table 3 for performance metrics.
Table 2. Uncorrected Clarity node:RGM agreement for the three co-location sites. Shown are root mean squared error (RMSE (µg m -3)) and normalized RMSE (NRMSE, unitless, in parentheses), on a 24-hr or hourly averaging basis, and for the full dataset, or segregated based on relative humidity level or fine particle mass loading.
Random Forest (RF) and Multiple Linear Regression (MLR) analyses were performed on the 24-hr averaged RBU and PDM datasets, because the WBPCB monitors are certified on a 24-hour basis. Factors tested were PM2.5, T, and RH (Table 3). PM2.5 was the most significant explanatory variable in the RF regression, followed by RH and T, consistent with the results of the data segregation analysis (Table 2). The RF and MLR approaches yielded similarly satisfactory agreement with the reference data for both co-location sites, with R² > 0.9 in each case. Since MLR yields an analytical expression for the calibration model, which is straightforward to apply to the rest of the sensor network, MLR was used in the remainder of the study. Malings et al. (2020) used a piecewise MLR approach for Pittsburgh, USA data, splitting the data at PM2.5 = 20 µg m -3. We tested this approach, splitting the data at PM2.5 = 100 µg m -3. Using 24-hr averaged data, the segregated datasets were not large enough for the calibration analysis (N < 50 for PM2.5 > 100 µg m -3), so hourly averaged data were used. An alternative calibration was developed using the hourly averaged PDM co-location data (PM2.5,corr = 111 + 0.596 × PM2.5,Clarity − 0.861 × T − 0.801 × RH, RMSE = 27.8 µg m -3, NRMSE = 0.0732). The piecewise calibration model showed improved performance for PM2.5 < 100 µg m -3 (RMSE = 16.5 µg m -3, NRMSE = 0.165) but performance was worse for PM2.5 > 100 µg m -3 (RMSE = 44.5 µg m -3, NRMSE = 0.091), and both models underperformed compared to the MLR model developed with the full 24-hr averaged PDM dataset (Table 3). Therefore, we opted not to use piecewise calibration.
In order to test the robustness of applying the calibrations developed at a single Clarity/RGM co-location site to another distant site (11.3 km apart) in the network, the following cross-calibration test was performed using the calibrations developed using the 24-hr averaged co-location data: the MLR calibration developed for the RBU site (Table 3) was applied to the PDM dataset, and the output was compared to the PDM WBPCB reference data.Likewise, the MLR calibration developed for PDM was applied to RBU and compared to the RBU WBPCB reference data.When the PDM MLR calibration was applied to the RBU dataset, agreement for the corrected data with the RBU WBPCB reference data improved (RMSE = 20.1 µg m -3 ) compared to the raw Clarity data (RMSE = 27.1 µg m -3 ), but not as much as when the locally developed calibration was applied (RMSE = 15.3 µg m -3 ).Similar results were observed when the RBU MLR model was applied to the PDM dataset: RMSE was equal to 24.2 µg m -3 , improved compared to the raw Clarity data (RMSE = 32.6 µg m -3 ), but not as much as when the locally developed MLR calibration was applied (RMSE = 10.2 µg m -3 ).The cross-calibration corrected Clarity Monitor data for PDM and RBU accurately diagnosed days as being in or out of attainment with the 24-hour mean Indian NAAQS of 60 µg m -3 , as compared to the WBPCB reference monitor data, 91% of the time.The corrected data identified days with an India scale AQI of "poor" or worse (PM2.5 > 90 µg m -3 ) in agreement with the reference grade monitors 94% of the time.
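The attainment check reported above is a threshold-agreement score: for each day, the reference and corrected values are each compared to the threshold, and the day counts as a match if both fall on the same side. A minimal sketch follows; the function name and sample values are hypothetical, and treating a value exactly equal to the threshold as "in attainment" is an assumption. The thresholds of 60 and 90 µg m -3 are those quoted in the text.

```python
NAAQS_24H = 60.0   # Indian 24-h PM2.5 standard, µg m-3 (from the text)
AQI_POOR = 90.0    # lower bound of AQI "poor", µg m-3 (from the text)


def threshold_agreement(reference, corrected, threshold):
    """Fraction of days on which reference and corrected data fall on the
    same side of `threshold` (exceedance defined as strictly greater)."""
    matches = sum(
        (ref > threshold) == (cor > threshold)
        for ref, cor in zip(reference, corrected)
    )
    return matches / len(reference)


# Hypothetical 24-h averages (µg m-3), for illustration only.
reference = [50.0, 70.0, 100.0, 55.0]
corrected = [55.0, 58.0, 95.0, 40.0]
naaqs_score = threshold_agreement(reference, corrected, NAAQS_24H)
poor_score = threshold_agreement(reference, corrected, AQI_POOR)
```

This score can be high even when RMSE is substantial, because only days near the threshold are at risk of misclassification.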
Once the calibration method was established, the 24-hr average based MLR model derived from the PDM dataset (Table 3) was applied to the entire sensor network to derive a corrected dataset. Average corrected PM2.5 values for the months of November 2018, January 2019, and April 2019, as representative of the post-monsoon, winter, and spring/summer seasons, are shown for each Clarity Monitor in Table 4. Where data are not shown, no data are available for that month for that sensor. Note that co-location data are not available for November 2018, so the calibration was not developed using data from that period; however, we expect the calibration model performance for November to be similar to that for January, since both fall in the higher PM2.5 loading period. Averages for the 6 geographic zones and network-wide averages are also shown. Consistent with regional trends for the Indo-Gangetic Plain (Guttikunda and Gurjar, 2012; Bhowmik et al., 2021), pollution is highest in January, when citywide average PM2.5 is 205 ± 27 µg m -3, followed by November, with an average value of 153 ± 21 µg m -3. These values exceeded the NAAQS and fell in the "Very Poor" category of the Indian AQI. PM2.5 levels are significantly lower in April, with an average of 47 ± 8 µg m -3. PM2.5 levels were highest at Howrah Bus Depot and lowest at Camac Street (U.S. Consulate area). Among the zones, Howrah had the highest average PM2.5 levels, although data were somewhat limited for that zone. There were only 3 sensors in the Howrah zone. Ghusuri had limited data due to loss of the sensor after November 2018, and PDM was not installed until WBPCB approval for the co-location, which was obtained in January 2019.
In order to investigate diurnal patterns in PM2.5 across the network, the calibration developed using the hourly averaged PDM co-location data (PM2.5,corr = 111 + 0.596 × PM2.5,Clarity − 0.861 × T − 0.801 × RH, RMSE = 27.8 µg m -3, NRMSE = 0.0732) was applied to hourly average data for the network, as shown in Fig. 3. The diurnal trend varied seasonally. Generally, in November and January, maximum PM2.5 was observed in the late-night hours (midnight-1 AM), with an additional minor peak in the morning (7-10 AM), while in April, PM2.5 varied more smoothly throughout the day, with a maximum in the afternoon (1-4 PM). This post-monsoon/wintertime diurnal pattern was consistent with what had been observed earlier for other large cities in the Indo-Gangetic Plain (IGP) (Guttikunda and Gurjar, 2012; Gani et al., 2020). The nocturnal maximum in PM2.5 could be attributed to low boundary layer height at night during post-monsoon and winter seasons. Nighttime emissions such as residential burning for heating or cooking may also contribute. The April pattern of an afternoon maximum with relatively little influence from morning and evening traffic suggests regional non-traffic sources and/or secondary aerosol formation. We note that this trend observed in the corrected Clarity data, while unusual for cities in the IGP, is corroborated by the WBPCB RGM data. Gani et al. (2020) reported that secondary components of PM1, consisting of oxygenated organic aerosol, ammonium, nitrate, and sulfate, show a similar diurnal profile, with a single peak in the afternoon, in Delhi during all seasons.
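As a worked illustration, the hourly calibration expression quoted above can be applied to a single sensor reading as follows. The coefficients are taken from the text; the function name and the input values are hypothetical.

```python
def correct_pm25_hourly(pm_clarity, temp_c, rh_percent):
    """Apply the hourly PDM-based linear calibration quoted in the text.

    pm_clarity: uncorrected sensor PM2.5 (µg m-3)
    temp_c: temperature inside the sensor housing (deg C)
    rh_percent: relative humidity inside the sensor housing (%)
    """
    return 111 + 0.596 * pm_clarity - 0.861 * temp_c - 0.801 * rh_percent


# Hypothetical hourly reading: 200 µg m-3 at 20 deg C and 60% RH.
corrected = correct_pm25_hourly(200.0, 20.0, 60.0)
```

With these inputs, the terms are 111 + 119.2 − 17.22 − 48.06, so the corrected value is about 164.9 µg m -3. The slope below 1 on the sensor term offsets the sensor's high bias at elevated loadings, and the negative RH coefficient lowers the estimate under humid conditions.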
We further investigated the spatial variability in PM2.5 by calculating the Pearson correlation coefficient (r) between the datasets shown in Fig. 3 (RBU, Howrah Depot, Beleghata, Camac Street, and Jadavpur) for November, January, and April. The results are shown in Fig. 4. PM2.5 is highly linearly correlated among these sites for all seasons. This high level of correlation among these sites with differing local sources underscores the importance of regional and secondary sources of aerosol in Kolkata and Howrah. Although still highly correlated (r ≥ 0.79), Howrah Depot showed lower correlation with the other sites, consistent with strong local sources. Correlation was slightly weaker in April as compared to November and January. The high degree of correlation of PDM with the other sites lends additional confidence in the choice to calibrate the network using the calibration developed based on the PDM co-location data.

DISCUSSION AND IMPLICATIONS
Air quality is a major public health issue in Kolkata. Previously published studies have shown that chronic exposure to ambient air pollution in Kolkata adversely affects pulmonary and cardiovascular health of its residents (Lahiri et al., 2000; Roy et al., 2001; Dutta and Ray, 2013). It is important to be empowered with low-cost tools and knowledge to assess local air pollution and the associated health risks to advocate relevant policy measures in addressing environmental pollution issues affecting local communities. Hence, we undertook this study to understand and validate the performance of LCS in Kolkata, which could then be used to supplement existing RGM networks for better air quality management.
During the study period, which covered the post-monsoon, winter, and spring/summer seasons of 2018-2019, PM2.5 levels exceeded the NAAQS 45% of the time (and nearly 100% of the time during the post-monsoon and winter months).This observation is consistent with other analyses of PM2.5 in the IGP region (see, e.g., Zheng et al., 2018;Gani et al., 2020;Gupta et al., 2022).The highest average PM2.5 values in the network were observed at Howrah Bus Depot, suggesting that  idling buses may be a significant local source of PM2.5 that could be reduced with the introduction of electric buses.The diurnal trends and high correlation among sites in different zones suggest that regional and secondary sources are also very important.Winds are mostly north-westerly during post-monsoon and winter seasons (October-February), bringing airmasses from the Indo-Gangetic Plain, transporting regional haze as well as pollution from regional thermal power plants, mining and steel industries.Based on Hybrid Single-Particle Lagrangian Integrated Trajectory model (HYSPLIT) analysis, Mallik et al. (2014) showed that, during April, prevailing winds in Kolkata generally come from the south (Bay of Bengal and coastal India) and transport occurs near surface level.A future multi-season investigation involving aerosol composition and gas measurements would provide necessary insight into the sources of PM2.5 in Kolkata.
Many of the Clarity Monitor sites in the network were near or on roadways or bus stands, whereas the co-location sites, particularly RBU, were selected by WBPCB to be farther from roads for security reasons and to characterize the urban background pollution.With the exception of Howrah Depot, the difference in average PM2.5 levels for the roadside vs. urban background sites in each geographic zone is not consistently distinguishable within the error, even when filtering for expected high traffic times (i.e., weekdays, 9-11 AM).However, some differences can be discerned in the diurnal variation, e.g., the afternoon peak time during summer (Fig. 3).Sites located closer to

Fig. 1. Sensor placement. Markers indicate the locations of Clarity Monitors. Red markers indicate Clarity Monitors co-located with WBPCB reference grade PM2.5 monitors. S12 (purple) and S9 (green) bus routes are indicated. See text for details. Background map © Google, 2023.

Fig. 3. Hourly average corrected Clarity PM2.5 data for representative locations, for months representing the post-monsoon, winter, and summer seasons in Kolkata. Thin grey lines represent +/- one standard deviation.

Fig. 4. Correlation between hourly average corrected PM2.5 measurements at different representative locations in the Clarity network, for months representing the post-monsoon, winter, and summer seasons in Kolkata.

Table 1. Sensor Network Details.

Table 3. Calibration results for RBU and PDM co-located Clarity Monitor/WBPCB pairs on a 24-hr average basis. RMSE: root mean squared error. RF: random forest regression. MLR: multiple linear regression. PM2.5 is in units of µg m -3, temperature is in degrees Celsius, and relative humidity is in percent.

Table 4. Monthly average corrected Clarity PM2.5 data (µg m -3) and standard deviation for November 2018, January 2019, and April 2019, months representing the post-monsoon, winter, and summer seasons in Kolkata. Data shown for each site, geographic zonal averages, and citywide average.