Long-term Field Evaluation of Low-cost Particulate Matter Sensors in Nanjing

Low-cost particulate matter (PM) sensors can be widely deployed to measure aerosol concentrations at higher spatial and temporal resolutions than traditional instruments, but they need to be carefully calibrated under ambient conditions. In this study, a long-term field experiment was conducted from December 2015 to May 2017 at a site in Nanjing to evaluate the capabilities of in-house built low-cost PM monitors using the Shinyei PPD42NS sensor for ambient PM2.5 monitoring. A BAM-1020 particulate monitor was co-located with the low-cost sensors to provide reference readings. Least-squares regressions with linear and power-law functions and an artificial neural network (ANN) technique were used to convert electrical instrument readings to ambient aerosol concentrations. Applying the ANN technique resulted in the best estimation of the hourly PM2.5 (R2 = 0.84, mean normalized bias (MNB) = 12.7% and mean normalized error (MNE) = 29.7%). The low-cost sensors performed relatively well at high aerosol concentrations but showed larger errors at concentrations below 35 μg m–3. High humidity (RH > 75%) can cause a larger MNE for these sensors, but the impact of temperature was negligible in this study. A clear sensor deterioration trend was observed during the 18-month field calibration. High correlations were found between the data from a single low-cost sensor and the data from the BAM-1020 when the low-cost sensor was individually calibrated, but the correlations between measurements taken by different low-cost sensor units were only moderate, possibly due to internal sensor variations. The results suggest that these low-cost sensors can measure ambient PM2.5 concentrations with an acceptable level of accuracy, which can and should be improved by calibrating each sensor individually. Special attention should be paid to the accuracy of these sensors after long-term application and in highly humid environments.


INTRODUCTION
Fine particulate matter (PM2.5) is a major air pollutant in China (Chan and Yao, 2008; Wang et al., 2014; Zhang and Cao, 2015). PM2.5 imposes a serious health burden, resulting in over 3.2 million premature deaths per year globally (Lim et al., 2012; Atkinson et al., 2016; Brook et al., 2017; Khreis et al., 2017) and over 1.3 million premature deaths in China in 2013 (Lelieveld et al., 2015; Liu et al., 2016; Hu et al., 2017). Air quality monitoring networks have been established in many countries to monitor PM2.5 concentrations (Elkamel et al., 2008; Ferradás et al., 2010). The Chinese Ministry of Environmental Protection started to monitor ambient PM2.5 concentrations in January 2013. The monitoring network consisted of about 500 monitoring stations in 74 major cities in China in 2013 (Hu et al., 2014a; Wang et al., 2014), and expanded to nearly 950 stations in 190 cities in 2015 (Zhang and Cao, 2015).
Filter-based inertial/gravimetric techniques combined with a well-constructed size-selection inlet are generally considered the most accurate and are used as the reference method for evaluating other, indirect measurement techniques. However, the filter-based reference method is usually limited in temporal resolution and requires constant human attendance, including filter changing and weighing. Several indirect techniques, such as the tapered element oscillating microbalance (TEOM) and the beta attenuation monitor (BAM), have passed strict field evaluations demonstrating that they can provide concentration measurements equivalent to the United States Federal Reference Method (Chung et al., 2001; Larsen and Benders, 2014; EPA, 2016). These equivalent-method instruments can provide PM measurements at higher temporal resolution but are still relatively expensive. Traditionally, they are deployed operationally only at the central monitoring sites within a monitoring network. However, recent studies have found that using data from the central monitors alone leads to incorrect population exposure estimates in complex urban environments where spatial heterogeneity in aerosol concentration is significant (Hu et al., 2014b, c).
Low-cost PM sensors have been developed in recent years to fill the data gaps in the existing regulatory monitoring networks (Li and Biswas, 2017). Because of their relatively low cost, they can provide particle concentrations at much higher spatial and temporal resolutions. These aerosol monitors can also be used to locate pollution hotspots and generate three-dimensional maps of PM concentrations (Rajasegarar et al., 2014). However, many of these low-cost monitors use inexpensive particle sensors, which may deviate significantly from their nominal specifications, and they do not strictly control sampling and environmental conditions such as flow rate, temperature and relative humidity. Careful calibration and evaluation of these low-cost PM sensors is therefore needed for data quality assurance.
Recent studies have shown that the data quality of low-cost PM sensors needs to be evaluated under realistic ambient conditions. Under controlled laboratory conditions, calibrated low-cost PM sensors often achieve a high correlation (R2 > 0.85) with the reference instruments. However, low-cost PM sensors typically perform less accurately under field conditions. Carvlin et al. (2017) studied the performance of a low-cost sensor in Imperial County, California, and found that the agreement of the sensor with a BAM was only moderate to high (R2 = 0.35-0.81). Johnson et al. (2018) conducted field calibration of several low-cost PM sensors in high- and low-concentration urban environments. They found that the sensor-reported concentrations were only weakly correlated with the TEOM monitor (R2 ≤ 0.30) at the low-concentration Atlanta site, but a higher correlation with the BAM monitor (R2 > 0.80) was found at a more polluted site. These findings indicate that the performance of low-cost PM sensors under field conditions is quite different from that under laboratory conditions. Therefore, laboratory-calibrated sensors may not be used directly for field measurements (Rai et al., 2017), and it is important to perform field calibration of low-cost PM sensors before they are deployed for ambient monitoring.
Although previous field calibration studies have been conducted in various locations, they have all been short-term, ranging from a few days (Genikomsakis et al., 2018) to a few weeks (Zheng et al., 2018) or several months (Mukherjee et al., 2017; Crilley et al., 2018; Sayahi et al., 2019). Even though many low-cost PM sensors are intended to be deployed in the field with minimal maintenance over a long period of time, long-term studies that cover at least an annual cycle are rarely reported in the literature, and the stability of low-cost PM sensors over long periods has not yet been addressed. In addition, most studies used a simple linear regression technique in the calibration process to relate raw sensor readings to the actual aerosol concentrations. Some studies show that a simple linear function is generally sufficient to provide moderate to high correlation coefficient values (Rai et al., 2017). However, a number of studies (Kelly and Sukhatme, 2010; Austin et al., 2015; Wang et al., 2015; Manikonda et al., 2016) have reported that the sensor response starts to saturate at high particle concentrations (higher than 50-100 µg m–3), so that other data fitting methods, such as power-law or higher-order polynomial functions, might be needed to convert raw sensor readings (Rai et al., 2017).
Studies have also investigated the impacts of environmental factors, such as relative humidity, temperature and light, on the performance of low-cost sensors (Austin et al., 2015; Wang et al., 2015; Manikonda et al., 2016; Rai et al., 2017). A few studies reported that relative humidity had a large effect on the sensor outputs (Wang et al., 2015). Relative humidity affects the performance of particle sensors in different ways. Water in the air may absorb infrared radiation and cause an overestimation of particle mass concentration because of the lessened light intensity. Also, high water vapor content in the air could cause the sensors to malfunction. In addition, the reference instrument may also generate inaccurate outputs under high humidity conditions (Wang et al., 2015). However, other investigations have shown that environmental factors such as temperature and relative humidity have a negligible effect on sensor output (Bart et al., 2014; Holstius et al., 2014; Jiao et al., 2015). Holstius et al. (2014) found that variability in hourly BAM output cannot be explained by light or temperature, although hourly relative humidity had some ability to predict hourly BAM responses.
The overall objective of the present study is to evaluate the performance of low-cost PM sensors under ambient conditions over a long-term period. Several low-cost PM monitors based on the Shinyei PPD42NS aerosol sensor were built, and a long-term (18-month) field experiment (from December 2015 to May 2017) was performed with a co-located BAM-1020. Three data conversion techniques, namely linear least-squares regression, non-linear least-squares regression with a power-law equation, and an artificial neural network (ANN), were used to convert the raw electrical signal from the PPD42NS to ambient aerosol concentrations. The performance of the low-cost PM sensors was examined in terms of sensor stability and consistency. In addition, the influence of environmental factors, including temperature and humidity, on the performance was investigated.

Low-cost PM Monitor
The low-cost monitor system (Fig. 1) includes a Shinyei PPD42NS aerosol sensor, a temperature and humidity sensor (DHT22), an OLED display module (128 × 64 pixels), a real-time clock (RTC) module (DS3231) and an SD-card module for data recording. These components are controlled by a microcontroller board (Arduino Mega 2560). All hardware was installed in a 155 × 101 × 95 mm, 250 g polylactic acid case and was powered by a single 5 V DC input from a high-quality USB wall charger. One end of the case is perforated, and a small fan (AD0405MB-G72 ADDA, with a nominal flow rate of 5.7 ft3 min–1) on the other end draws air into the case. The overall cost is about $80.
The Shinyei PPD42NS sensor is based on the principle of light scattering. When particles are detected in the aerosol chamber, the voltage on the output pin drops from 4.5 V to 0.7 V, creating a low-voltage pulse with a width of 10-90 ms. The standard operation of the sensor requires counting the low pulse occupancy time for at least 30 s; the fraction of low pulse occupancy time during that accumulation period (i.e., the low pulse occupancy ratio (LOR)) shows a distinct dependence on the number concentration of particles. In actual operation, a 2-minute accumulation period is used to further reduce the noise in the signal. These 2-minute LOR values are further averaged to 1-hour values during post-processing before they are used in the data analysis. The PPD42NS has two LOR output channels, one for particles greater than ~1 µm (LOR1) and the other for particles greater than ~2.5 µm (LOR2). The recommended operating conditions are a temperature of 0-45°C and a relative humidity below 95%. The PPD42NS sensor is mounted vertically on one side of the monitor case. The opening of the PPD42NS sensor is parallel to the air flow, and the bucket shielding prevents wind from blowing directly into the instrument, so that the air flow rate into the instrument is controlled by the fan.
The DHT22 is a reliable and stable low-cost temperature and humidity sensor. It operates on 5 V DC (accepts 3.3-6 V). The temperature measurement has a resolution of 0.1°C and an accuracy of < ±0.5°C. The RH measurement has a resolution of 0.1% and an accuracy of ±5% (https://www.sparkfun.com/datasheets/Sensors/Temperature/DHT22.pdf). The operating ranges of the temperature and RH sensors are -40°C to 80°C and 5-99% RH, respectively.
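The LOR definition and the hourly averaging described above can be illustrated with a minimal Python sketch. The function names, the pulse widths and the choice to drop an incomplete trailing hour are assumptions made here for illustration only; the monitor firmware itself is not given in the text.

```python
from statistics import mean


def low_pulse_occupancy_ratio(low_pulse_ms, window_s=120.0):
    """Fraction of the accumulation window during which the output pin was low."""
    return sum(low_pulse_ms) / (window_s * 1000.0)


def hourly_mean_lor(two_minute_lors, per_hour=30):
    """Average consecutive 2-minute LOR values into hourly means.

    Assumption: only complete hours (30 values each) are returned and a
    trailing partial hour is dropped.
    """
    n_hours = len(two_minute_lors) // per_hour
    return [mean(two_minute_lors[h * per_hour:(h + 1) * per_hour])
            for h in range(n_hours)]


# Hypothetical low-pulse widths (ms) observed in one 2-minute window.
pulses_ms = [40.0, 85.0, 12.0, 63.0]
lor = low_pulse_occupancy_ratio(pulses_ms)  # 200 ms low out of 120 000 ms

# Two hypothetical hours of 2-minute LOR values.
hourly = hourly_mean_lor([0.01] * 30 + [0.03] * 30)
```

The same two functions would be applied separately to the LOR1 and LOR2 channels.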

Experimental Settings
Nanjing, the capital city of Jiangsu Province, is located in the Yangtze River Delta. The experimental site was set at the atmospheric observation field of Nanjing University of Information Science and Technology (NUIST) (Fig. 2).
The NUIST site is located in the northern suburb of Nanjing (32°12′N, 118°42′E). Three low-cost PM monitors were placed next to a trailer in which the BAM-1020 (Met One Instruments) is installed. The three monitors are given the IDs A001-A003 in the following discussion. The A001 monitor is the first prototype and collected data from December 2015 to May 2017. A002 and A003 were built later and collected data in July 2016 and from November 2016 to June 2017. The BAM-1020 is a Federal Equivalent Method (FEM) β-attenuation monitor as defined by the U.S. EPA. Studies have shown that the BAM-1020 can measure PM2.5 with an accuracy similar to that achieved by filter-based gravimetric methods (Chung et al., 2001). However, this instrument has a few possible limitations and artifacts, including (1) heating of the inlet line to a temperature of ~30°C to reduce relative humidity to below 60%, which may lead to underestimation of PM2.5 concentrations when large amounts of volatile particulate matter are present; (2) a slight sensitivity to the hydrogen ion concentration in airborne particles; and (3) fluctuations of the sample flow rate due to pressure, relative humidity and temperature variations (Chung et al., 2001). The BAM-1020 is widely used in the National Ambient Air Quality Monitoring Network for monitoring PM2.5 in China, so it was chosen as the reference monitor to make the evaluation more practically relevant.

Analytical Methods
Linear least-squares regression, power-law regression and ANN techniques were used to convert the raw LOR1 readings into atmospheric PM2.5 concentrations, using the BAM-1020 readings as references. The linear regression and power-law regression use only the hourly average LOR1 as the independent variable and the hourly BAM-1020 concentration as the dependent variable, as shown in Eqs. (1) and (2):
$$P_i = a \cdot \mathrm{LOR1}_i + b \qquad (1)$$
$$P_i = a \cdot \mathrm{LOR1}_i^{\,b} \qquad (2)$$
where P_i represents the hourly concentration of PM2.5 measured by the BAM-1020, LOR1_i is the corresponding hourly average LOR1, and a and b are parameters to be determined from the regression analysis.
The ANN technique is a powerful tool with proven efficiency in dealing with complex non-linear problems (Hogrefe et al., 2001; Taspinar, 2015). Many studies have used ANNs to predict air quality (Díaz-Robles et al., 2008; Kim et al., 2012; Zhang et al., 2013; Mishra et al., 2015; Zu et al., 2017; Park et al., 2018). The back-propagation (BP) neural network is one of the most widely used ANNs in data fitting. In this study, the Neural Network Toolbox of MATLAB (version R2014b) was used to set up and train the ANN. The ANN is a two-layer feed-forward BP network with one hidden layer of twenty neurons. The detailed configuration of the ANN method is shown in Table 1. The low-cost monitor raw data (LOR1, LOR2, temperature, RH and hour of the day) are used as inputs and related to the BAM-1020 measured concentrations. The fitnet and train functions in the toolbox were used to create and train the network, respectively. The Levenberg-Marquardt algorithm was selected as the training function to determine the weight and bias parameters of the network (Sportisse et al., 2007). The number of hidden neurons was determined through a series of tests with 1 to 100 neurons; model performance was best with twenty neurons. The entire set of ANN model training data was partitioned randomly into 70% training data and 30% cross-validation data. Cross-validation is an effective way of reducing over-fitting during ANN training. The parameters of the trained ANN ensemble were saved in order to estimate PM2.5 from other sensors.
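As an illustration of the two regression-based conversions, the following Python sketch fits a linear function P = a·LOR1 + b and a power-law function P = a·LOR1^b to paired hourly values. These functional forms are assumed here from the description of Eqs. (1) and (2). Note also that the power-law fit below is a log-log linear fit, a simpler stand-in that minimizes error in log space; it is not the non-linear least-squares procedure used in the study, and all function names and data are hypothetical.

```python
import math


def fit_linear(x, y):
    """Ordinary least-squares fit of y = a * x + b; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def fit_power_law_loglog(x, y):
    """Fit y = a * x**b by linear regression of log(y) on log(x).

    Assumption: all x and y are strictly positive. This is an
    approximation to a true non-linear least-squares fit.
    """
    slope, intercept = fit_linear([math.log(v) for v in x],
                                  [math.log(v) for v in y])
    return math.exp(intercept), slope


# Hypothetical hourly LOR1 values and reference PM2.5 (µg m–3).
lor1 = [1.0, 2.0, 3.0, 4.0]
pm25 = [3.0, 5.0, 7.0, 9.0]
a_lin, b_lin = fit_linear(lor1, pm25)

# Hypothetical data following y = 3 * x**0.5 exactly.
a_pow, b_pow = fit_power_law_loglog([1.0, 4.0, 9.0, 16.0],
                                    [3.0, 6.0, 9.0, 12.0])
```

A log-space fit weights relative rather than absolute errors, so its parameters can differ from those of a direct non-linear fit on noisy data.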
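The PM2.5 concentrations based on Eqs. (1) and (2) and the ANN model were compared with the PM2.5 concentrations monitored using the BAM-1020. The coefficient of determination (R2, Eq. (3)), mean normalized bias (MNB, Eq. (4)) and mean normalized error (MNE, Eq. (5)) were calculated to provide a statistical description of the accuracy of the estimates:
$$R^2 = \left[\frac{\sum_{i=1}^{N}(P_i-\bar{P})(O_i-\bar{O})}{\sqrt{\sum_{i=1}^{N}(P_i-\bar{P})^2}\,\sqrt{\sum_{i=1}^{N}(O_i-\bar{O})^2}}\right]^2 \qquad (3)$$
$$\mathrm{MNB} = \frac{1}{N}\sum_{i=1}^{N}\frac{P_i-O_i}{O_i} \times 100\% \qquad (4)$$
$$\mathrm{MNE} = \frac{1}{N}\sum_{i=1}^{N}\frac{\left|P_i-O_i\right|}{O_i} \times 100\% \qquad (5)$$
where P_i and O_i represent the predicted and observed PM2.5 concentrations, respectively, overbars denote averages over all data points, and N is the total number of hours with valid observations. Data points with hourly average RH > 95% were excluded from the analysis. For the daily averages, days with fewer than 16 valid hourly values were excluded.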
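The three statistics can be computed with a few lines of Python. This sketch assumes the standard definitions: MNB and MNE as the mean signed and mean absolute relative difference between prediction and observation, and R2 as the squared Pearson correlation between the two series. The function names and sample values are hypothetical, and results are returned as fractions rather than percentages.

```python
def mean_normalized_bias(pred, obs):
    """Mean of (P - O) / O over all pairs, as a fraction."""
    return sum((p - o) / o for p, o in zip(pred, obs)) / len(obs)


def mean_normalized_error(pred, obs):
    """Mean of |P - O| / O over all pairs, as a fraction."""
    return sum(abs(p - o) / o for p, o in zip(pred, obs)) / len(obs)


def r_squared(pred, obs):
    """Squared Pearson correlation between predictions and observations."""
    n = len(obs)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    spo = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    spp = sum((p - mean_p) ** 2 for p in pred)
    soo = sum((o - mean_o) ** 2 for o in obs)
    return spo ** 2 / (spp * soo)


# Hypothetical predicted and observed hourly PM2.5 (µg m–3).
predicted = [11.0, 18.0, 33.0]
observed = [10.0, 20.0, 30.0]
mnb = mean_normalized_bias(predicted, observed)
mne = mean_normalized_error(predicted, observed)
```

Because both MNB and MNE divide by the observed value, hours with very low reference concentrations weigh heavily in these statistics.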

Conversion of Sensor Readings to Ambient PM2.5 Concentrations
Fig. 3 shows the least-squares regression results using the linear (Fig. 3(a)) and power-law (Fig. 3(b)) functions for the hourly averaged LOR1 measured by the A001 monitor and the hourly PM2.5 concentrations measured by the co-located BAM-1020 particle monitor. Both equations yield a moderate coefficient of determination between LOR1 values and PM2.5 concentrations (R2 of 0.75 and 0.73 for the linear equation and power-law equation, respectively).
To evaluate the ability of these three approaches to convert raw hourly sensor LOR values into hourly PM2.5 concentrations, the PM2.5 concentrations predicted from the A001 monitor using the three approaches are compared with the BAM-1020 measured PM2.5 concentrations. The results are shown in Fig. 4. The R2 is 0.75 and 0.71 for the linear and power-law equations, respectively, and 0.84 for the ANN method. The results are comparable to those of a previous study in Xi'an, a high-concentration urban environment (Gao et al., 2015), and are better than those of a study in California, a much cleaner environment (Holstius et al., 2014). The slopes of predicted versus measured PM2.5 for the three methods are all close to 1.0 (1.0 for the linear equation, 1.14 for the power-law equation, and 1.02 for the ANN). MNB and MNE range from 12.67% to 23.38% and from 29.71% to 45.24%, respectively, for the three methods. Of the two regressions, the linear one performs better: it yields a higher R2 (0.75 vs. 0.71 for the power law), a smaller MNB (22.4% vs. 23.4%) and a smaller MNE (41.5% vs. 45.2%). The ANN shows the best agreement, with a higher R2 and lower MNB and MNE values than either regression, so this method is used for the rest of the analyses in this study. Fig. 5 shows the time series of the hourly and daily average PM2.5 concentrations estimated by A001 with the ANN technique and measured by the BAM-1020 during the 18-month study period (data points with hourly average RH > 95% were excluded from the analysis, and days with fewer than 16 valid hourly values were excluded from the daily averages). The two monitors are in excellent agreement. The daily averaged PM2.5 concentrations have an even higher R2 (0.93) and lower MNB (5.43%) and MNE (17.02%) than the hourly results.
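The data screening behind the daily averages can be sketched as follows. The sketch assumes that hours with RH > 95% are removed first and that a day is kept only if at least 16 valid hours remain; the order of these two steps, the record layout and the function name are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_means(hourly_records, max_rh=95.0, min_hours=16):
    """Daily mean PM2.5 from (timestamp, pm25, rh) hourly records.

    Hours with rh > max_rh are discarded; days with fewer than
    min_hours remaining values are dropped.
    """
    valid_by_day = defaultdict(list)
    for timestamp, pm25, rh in hourly_records:
        if rh > max_rh:
            continue
        valid_by_day[timestamp.date()].append(pm25)
    return {day: sum(values) / len(values)
            for day, values in sorted(valid_by_day.items())
            if len(values) >= min_hours}


# Hypothetical records: day 1 has 20 valid hours, day 2 only 10 hours,
# day 3 has 20 hours of which 6 are too humid (14 valid hours remain).
start = datetime(2016, 1, 1)
records = [(start + timedelta(hours=h), 50.0, 60.0) for h in range(20)]
records += [(start + timedelta(days=1, hours=h), 80.0, 60.0)
            for h in range(10)]
records += [(start + timedelta(days=2, hours=h), 30.0,
             97.0 if h < 6 else 70.0) for h in range(20)]
result = daily_means(records)
```

With these inputs only the first day survives the completeness check.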
The input parameters affect the performance of the ANN method. Four sensitivity tests were conducted with the ANN method to examine the impact of different sets of input parameters on the calibration results. The data of A001 were used for these tests, and the results are shown in Table 2. The results indicate that the neural network generates the best estimates, with the highest overall R2 and the lowest MNE, when all five input parameters are included. The neural network performance becomes worse when relative humidity or temperature data are excluded. This is because the size and optical properties of ambient particles are affected by the amount of water on the particles. Both temperature and relative humidity can affect particle water content and thus the size and scattering ability of the particles. Including temperature and humidity in the ANN analysis therefore leads to better correlations, as the ANN can take the effects of RH and temperature into account.

Impact of Ambient Conditions on Calibration Results
According to the manufacturer's datasheet (Shinyei Corp., 2010), the measurement error becomes more substantial at very low concentrations. To examine the ability of low-cost monitors with the PPD42NS sensor to measure different ranges of concentrations, Fig. 6 shows the MNE (%), MNB (%) and the ±20% proportion (i.e., the percentage of paired ANN and BAM-1020 PM2.5 data points whose relative difference is within ±20%) for the concentration ranges 0-35, 35-50, 50-75, 75-100, 100-150 and > 150 µg m–3 during the study period. The results indicate that the low-cost PM sensor has substantial bias when measuring low concentrations of < 35 µg m–3, with both MNB and MNE over 60% and only 22% of data points having a relative difference within ±20%. Performance improves when concentrations are over 35 µg m–3, and when concentrations are over 150 µg m–3 the ±20% proportion reaches 79%. This finding is consistent with the study by Zheng et al. (2018).
Fig. 7 shows the MNB, MNE and the ±20% proportion under different relative humidity and temperature ranges. Even though MNB does not show a very clear trend, MNE gradually increases and the ±20% proportion gradually decreases as RH increases, indicating higher errors when measuring concentrations in more humid environments (RH > 75%). The results suggest that special attention should be paid when applying low-cost PM sensors in high-humidity environments. Relative humidity affects the performance of low-cost sensors in different ways. Water in the air may absorb infrared radiation and cause an overestimation of particle mass concentration because of the lessened light intensity. In addition, water uptake by particles under high humidity leads to changes in particle morphology and size, which affect the refractive characteristics of the particles (Hiranuma et al., 2008).
The performance of the low-cost sensor is quite similar across temperatures above 0°C, as shown in Fig. 7(b): the ranges of MNB, MNE and the ±20% proportion are 10-15%, 27-33% and 46-57%, respectively. However, the performance is best when the temperature is below 0°C (-5°C to 0°C in Nanjing during the study period), with an MNB of 3%, an MNE of 13% and a ±20% proportion of 83%. The PPD42NS sensor is specified to work at temperatures in the range of 0-45°C. The good performance below 0°C might suggest that the sensor also works acceptably a few degrees below 0°C; however, this performance is mainly attributable to the high concentrations that occur during winter in Nanjing, when temperatures are low.
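The range-by-range statistics of the kind reported for Fig. 6 can be computed by grouping the paired data into concentration bins. In this sketch the bins are defined on the reference (observed) concentration with the lower edge inclusive; both choices, together with the function name and sample data, are assumptions, since the text does not state them.

```python
BIN_EDGES = [0.0, 35.0, 50.0, 75.0, 100.0, 150.0, float("inf")]


def binned_errors(pred, obs, edges=BIN_EDGES):
    """MNB, MNE and the within-±20% proportion per concentration bin.

    All statistics are fractions. Pairs with a non-positive observed
    value are skipped to avoid division by zero.
    """
    rows = []
    for low, high in zip(edges[:-1], edges[1:]):
        rel = [(p - o) / o for p, o in zip(pred, obs)
               if o > 0 and low <= o < high]
        if not rel:
            rows.append({"range": (low, high), "n": 0})
            continue
        rows.append({
            "range": (low, high),
            "n": len(rel),
            "mnb": sum(rel) / len(rel),
            "mne": sum(abs(r) for r in rel) / len(rel),
            "within_20": sum(1 for r in rel if abs(r) <= 0.20) / len(rel),
        })
    return rows


# Hypothetical predicted and observed hourly PM2.5 (µg m–3).
pred = [11.0, 40.0, 200.0]
obs = [10.0, 40.0, 160.0]
rows = binned_errors(pred, obs)
```

The same function can be reused for RH or temperature ranges by binning on those variables instead of on the reference concentration.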

Stability of Low-cost PM Sensor over Long-term Periods
The data of A001 collected from December 2015 to May 2017 were divided into six consecutive 3-month periods, and field calibration was performed for each period to further evaluate the stability of the low-cost PM sensor over the 18 months. Fig. 8 shows the changes in MNB, MNE and the ±20% proportion as a function of time. A clear sensor deterioration trend is found: MNB and MNE gradually increase (MNB from 9% to 13% and MNE from 19% to 30%), and the ±20% proportion gradually decreases (from 72% to 52%) with time. The accuracy of the low-cost monitor A001 thus appears to deteriorate in later months. This deterioration might be due to aging of the electrical components and/or accumulation of dust on the surfaces of the optical components, as no maintenance was performed during the operation of the low-cost sensors.
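The division into 3-month periods can be sketched as below. Only the grouping step is shown: timestamps are assigned to periods counted from December 2015, and MNE is computed per period from already-calibrated predictions. The per-period calibration itself is not reproduced, and the function names and sample records are hypothetical.

```python
from datetime import datetime


def period_index(timestamp, start, months_per_period=3):
    """Zero-based index of the period containing the timestamp."""
    months = ((timestamp.year - start.year) * 12
              + (timestamp.month - start.month))
    return months // months_per_period


def mne_by_period(records, start):
    """Mean normalized error (fraction) per period.

    records: iterable of (timestamp, predicted, observed) tuples.
    """
    groups = {}
    for timestamp, predicted, observed in records:
        key = period_index(timestamp, start)
        groups.setdefault(key, []).append(abs(predicted - observed) / observed)
    return {key: sum(v) / len(v) for key, v in sorted(groups.items())}


start = datetime(2015, 12, 1)

# Hypothetical hourly records (timestamp, predicted, observed).
records = [
    (datetime(2015, 12, 10), 55.0, 50.0),
    (datetime(2016, 1, 10), 45.0, 50.0),
    (datetime(2017, 4, 1), 70.0, 50.0),
]
per_period = mne_by_period(records, start)
```

With a December 2015 start, indices 0 to 5 cover the six periods ending in May 2017.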

Consistency among Low-cost PM Sensors
Fig. 9 shows the correlation between the PM2.5 concentrations estimated by A002 and A003 with the ANN technique and the BAM-1020 measurements.
To further verify whether the correlation established for one low-cost PM sensor can be reliably applied to other low-cost PM sensors, all the data of A001 were taken as the training data of the ANN, and all the data collected by A002 and A003 were used as validation data. The results are shown in Fig. 10. The R2 between the ANN-predicted and BAM-1020 observed PM2.5 concentrations is 0.59 and 0.66 for A002 and A003, respectively, which indicates that the ANN trained with A001 data yields only a moderate correlation between A002 and A003 readings and ambient PM2.5 concentrations. The bias and errors can also be significant: MNB ranges from 20.94% to 29.41% and MNE from 42.31% to 52.92%. Therefore, even though a strong correlation is found for individually calibrated low-cost PM sensors, it is recommended that different sensors be calibrated separately under ambient conditions. Using one correlation for all sensors will likely lead to substantial bias in estimating the ambient PM2.5 concentrations.
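Why a calibration transferred between units can be biased is easy to see in a toy example. The sketch below uses a linear calibration as a simple stand-in for the ANN and two hypothetical sensor units whose raw response differs by a constant gain; the numbers are invented for illustration and do not come from the study.

```python
def fit_linear(x, y):
    """Ordinary least-squares fit of y = a * x + b; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def mean_normalized_bias(pred, obs):
    """Mean of (P - O) / O over all pairs, as a fraction."""
    return sum((p - o) / o for p, o in zip(pred, obs)) / len(obs)


# Hypothetical reference PM2.5 (µg m–3) and raw readings of two units.
reference = [20.0, 40.0, 80.0, 160.0]
raw_unit_a = [r / 10.0 for r in reference]  # unit used for calibration
raw_unit_b = [r / 8.0 for r in reference]   # unit with 25% higher gain

# Calibrate on unit A, then apply the same coefficients to unit B.
a, b = fit_linear(raw_unit_a, reference)
pred_a = [a * v + b for v in raw_unit_a]
pred_b = [a * v + b for v in raw_unit_b]

bias_a = mean_normalized_bias(pred_a, reference)
bias_b = mean_normalized_bias(pred_b, reference)
```

In this example the unit-to-unit gain difference carries over directly into the normalized bias of the second unit, whereas an individual calibration of that unit would absorb it.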

CONCLUSION
In this study, low-cost PM monitors were built using the Shinyei PPD42NS aerosol sensor. The monitors were calibrated with a co-located BAM-1020 particulate monitor for long-term measurements from December 2015 to May 2017 on the campus of Nanjing University of Information Science & Technology. Three methods of calibration were used: linear regression, power-law regression, and an ANN technique. Of these methods, the ANN technique yielded the highest correlation between the low-cost sensor estimates and the BAM-1020 PM2.5 measurements (R2 = 0.84), and the lowest MNB (12.66%) and MNE (29.71%). Additionally, including the relative humidity and temperature data improved the results of this method. Therefore, the ANN technique can be used to accurately evaluate the performance of low-cost sensors in the field. The PPD42NS exhibited larger errors when measuring low concentrations in Nanjing, especially those below 35 µg m–3, and relatively high MNE values were also found for measurements taken in very humid environments (RH > 75%), suggesting that the influence of relative humidity on sensor performance must be considered under such conditions. Furthermore, although the low-cost PM sensors in this study showed good agreement with the BAM-1020 following the 18-month calibration with the ANN technique (R2 = 0.84), a clear sensor deterioration trend was observed. Therefore, these sensors must be carefully calibrated after long-term application. Although high correlations were found between the data from a single low-cost sensor and the data from the BAM-1020 when the low-cost sensor was individually calibrated, the correlations between measurements taken by different low-cost sensor units were only moderate, suggesting that the low-cost sensors should be individually calibrated. Our study illustrates the importance of evaluating the long-term field performance of low-cost PM sensors. In real-world applications, calibrating and validating these sensors with regulatory instruments is crucial to obtaining high-quality data. Future studies to develop more reliable and stable sensors with longer lifetimes and higher efficiency in high-humidity environments are recommended.

Fig. 1. The low-cost PM sensor used in this study.

Fig. 2. Field monitoring and overview of the monitoring station. The monitors are covered with plastic washbasins as a rudimentary way to prevent direct wind gusts from entering the monitors and to protect them from the elements.

Fig. 3. (a) Linear regression and (b) non-linear regression for power law fit of LOR1 and BAM-1020 for low-cost PM sensor A001 (number of hourly data points: N = 2833).Data points with hourly average RH > 95% were excluded from the analysis.

Fig. 4. Evaluation of the (a) linear, (b) power-law and (c) neural network approaches in estimating PM2.5 concentrations. A linear fit between the PM2.5 estimated by the low-cost PM sensor using each approach and the BAM-1020 measurements is shown, and the equation, R2, MNB and MNE values are given in each sub-panel.

Fig. 5. Time series of (a) hourly and (b) daily averaged PM2.5 concentrations estimated by low-cost PM sensor A001 with the ANN technique and by BAM-1020 during the study period.

Fig. 8. MNB (%), MNE (%) and the ±20% proportion (%) for PM2.5 measurements in the different observation periods from December 2015 to May 2017.

Fig. 9. The linear correlation between PM2.5 estimated by the low-cost PM sensor (a) A002 and (b) A003 using the artificial neural network, and BAM-1020 measurements.

Fig. 10. Evaluation of the consistency of the low-cost monitors. All data of the low-cost PM sensor A001 are used as the training data of the neural network model, and all the data of (a) A002 and (b) A003 are used as model verification data.

Table 1. Detailed configuration of the neural network.

Table 2. Influence of input parameters on the performance of the ANN in predicting PM2.5 concentrations.