Harmonization of the Long-term PM2.5 Carbon Data from the CSN Sites in New York State

Harmonizing the particulate carbon data from the Chemical Speciation Network (CSN) is necessary for reliable long-term trend and seasonal variability analyses, clean air regulation assessments, and climate change studies. It is challenging, however, because the measurement of the carbonaceous fraction of PM2.5 (particulate matter with a diameter less than or equal to 2.5 µm) underwent several changes in both samplers and analysis protocols. To address this issue, field blanks are used to remove artifacts from samples, an outlier filter is applied to remove anomalies from the dataset, and a regression between data from the retired samplers and the current sampler at two co-located urban sites is used to establish the harmonization. A second comparison between the retired method and Interagency Monitoring of Protected Visual Environments (IMPROVE) network data was carried out at two sites (one urban and one rural) with co-located samplers. These results show no site dependence for organic carbon (OC) concentrations and small but non-negligible differences for elemental carbon (EC), which can be attributed to the relatively greater uncertainty of the low-concentration rural EC measurements. An adjustment criterion that harmonizes the data from the beginning of the sampling period to the present is obtained. The harmonized data show trends and seasonal variability consistent with those of the reported data, with concentrations declining over the period 2001–2018.


INTRODUCTION
To better understand the influence of the chemical composition of fine particulate matter (PM2.5), the Chemical Speciation Network (CSN) was first established in 2000 (EPA, 2003; Schwab et al., 2004), with the ability to measure the PM2.5 total mass concentration, numerous trace elements, major inorganic ions (sulphate, nitrate, ammonium, and more), and carbonaceous components (organic carbon (OC) and elemental carbon (EC)) (EPA, 2014). However, the complex chemical and physical properties of the PM2.5 carbonaceous components have limited the precise quantification of their mass concentrations (Malm et al., 2011; Rattigan et al., 2016).
The CSN system in New York State (NYS) started in 2001 with the Rupprecht and Patashnick Co., Inc. Partisol Model 2300 speciation samplers (R&P) (Solomon et al., 2003, 2014; Rattigan et al., 2011), and then shifted to the MetOne Spiral Aerosol Speciation Sampler (MetOne) between 2006 and 2007 (Rattigan et al., 2011). The MetOne was in turn replaced, for carbonaceous species only, by the URG-3000N sequential particulate speciation system (URG), which was introduced during 2007-2009 and remains in use at present. Particulate carbon samplers collect samples on quartz fiber filters, which are analyzed using thermal/optical methods. The general sampler design and analysis methods can be found elsewhere in the literature (e.g., Macias and Hopke, 1981; Chow et al., 1993; Birch and Cary, 1996).
For filter analysis, URG and IMPROVE samples are analyzed for OC and EC using the IMPROVE_A thermal/optical reflectance (TOR) protocol, while R&P and MetOne samples were analyzed using the National Institute for Occupational Safety and Health (NIOSH) thermal/optical transmittance (TOT) protocol (Chow et al., 2007; Wu et al., 2016). The differences in instruments (Table S1) and analysis protocols between R&P and MetOne on one hand and URG on the other resulted in inconsistencies in the recorded data. These inconsistencies jeopardize the certainty of any research results that depend on the data, which is a serious concern given the importance of particulate carbon (organic and elemental) for the environment (e.g., visibility), health (e.g., cardiovascular and respiratory diseases), and climate (e.g., direct and indirect effects) (Hand et al., 2012).
The CSN network was well maintained in NYS during the past two decades. The instruments above overlapped during some periods, and CSN and IMPROVE samplers were co-located at a few sites, which provides a unique opportunity to harmonize the datasets from different instruments and protocols into a continuous long-term dataset. This will benefit both researchers and policymakers. Malm et al. (2011) and Hand et al. (2012) performed similar studies that scaled CSN data to IMPROVE data and compared them, respectively. Their studies used a larger number of sites than this study but were limited to shorter time periods (2005-2006 for Malm et al. (2011), and 2005-2008 for Hand et al. (2012)). Also, Malm et al. (2011) compared only MetOne data with IMPROVE data, whereas in this study the comparison includes both R&P and MetOne data.

METHODS
The location and period of analysis for the CSN sites used in this study are listed in Table 1, with the location map in Fig. S1. The study included four urban sites in New York City (NYC), two midsize urban sites, one rural site, and one remote site (Table 1). The time period in this study was 2001-2018. The data adjustment was implemented in three steps: 1) correcting for field blanks, 2) developing and applying an outlier filter criterion, and 3) harmonizing TOT-analyzed samples to be consistent with TOR samples, as described in the following sections.

Categorizing Field Blanks
One major tool to reduce the biases due to changes in instruments, location, and operational procedure is the proper handling of the blank filter measurements (Dillner et al., 2009), which represent both in-route (shipping) and in-situ (instrument, location, and setup) artifacts (Schwab et al., 2004). The R&P and MetOne samplers had the same filter type, diameter, and spot size, and their filters were analyzed using the same protocol (Watson et al., 2008). They had different average sampling volumes, flow rates, and face velocities (Malm et al., 2011; Rattigan et al., 2016). URG and IMPROVE both use the same type and size of filters, and both currently use the same analysis protocol (Table S1). Field blanks reported from all samplers show consistent temporal variability and negligible spatial variability for each sampler, indicating that the field blanks are stable and consistent enough to be used for adjusting the measured samples.
Comparing blank-adjusted to unadjusted data for all sites showed no significant differences in observed trend patterns for the R&P and MetOne samples (Fig. 1). Field blank adjustment may therefore affect the value of a single measured sample, but not the general behavior of the data, so using blank filter measurements to adjust carbon data will preserve the data features (Dillner, 2016).

Outlier Filter
Anomalous data could affect any existing trend: they may increase the amplitude of data variability and alter the trend slope. However, over-removal of anomalies may change variability patterns and/or mask trends. It was therefore necessary to establish a criterion for the representativeness of data points with respect to the general stream of data. Only high outliers are removed, as very low values do not qualify as outliers under this method. Two measures are used: 1) the three-month running average and 2) the three-month running standard deviation, both calculated from the reported samples. Any sample that is greater than the sum of the running mean and 3.5 times the running standard deviation is filtered out, according to the following expression:

$$C_i > \bar{C}_{3\mathrm{m}} + 3.5\,\sigma_{3\mathrm{m}} \qquad (1)$$

where $C_i$ is the reported sample concentration, and $\bar{C}_{3\mathrm{m}}$ and $\sigma_{3\mathrm{m}}$ are the three-month running mean and running standard deviation, respectively.
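The filter described above can be sketched in a few lines of Python. This is only an illustrative sketch: the text does not say whether the three-month window is centered or trailing, nor which standard deviation estimator is used, so the centered 90-day window, the population standard deviation, and the function name `filter_high_outliers` are all assumptions.

```python
from datetime import date, timedelta
from statistics import mean, pstdev


def filter_high_outliers(samples, window_days=90, k=3.5):
    """Split samples into (kept, removed) using a running mean + k*std rule.

    samples: list of (date, concentration) tuples.
    Assumption: the running window is centered on each sample and spans
    window_days in total; the sample itself is part of its own window.
    """
    half = timedelta(days=window_days / 2)
    kept, removed = [], []
    for day, value in samples:
        # All reported values within half a window on either side of this sample.
        window = [v for d, v in samples if abs(d - day) <= half]
        threshold = mean(window) + k * pstdev(window)
        # Only high values are removed; low values never exceed the threshold.
        (removed if value > threshold else kept).append((day, value))
    return kept, removed
```

Because the candidate sample contributes to its own running statistics, a single spike can only exceed the threshold when the window holds enough samples (roughly 14 or more for k = 3.5), which is comfortably met by a one-in-three-day schedule over three months.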

Harmonizing TOT Samples to TOR
Co-located samplers using different analysis methods (TOR vs. TOT) have been shown to disagree on the OC and EC fractions (Watson et al., 2005; Rattigan et al., 2011; Malm et al., 2011). Samples analyzed using the TOT method tend to report more pyrolytic carbon (PC), and hence more OC, than the TOR method (Chow et al., 2015; Squizzato et al., 2018). Co-located samples from both MetOne and URG samplers were available for fourteen months at IS52 and ten months at Queens. URG samples were analyzed using both TOR and TOT from the start of URG sampling (2007-2009) until November 2015, which provided ample data to test the adjustment criteria by comparing the adjusted pre-URG TOT-analyzed data to the URG TOT-analyzed data. Bi-square robust regression analyses comparing IMPROVE and CSN data were carried out for two periods, representing the R&P and MetOne periods, at each of the IS52 and PSP sites. For these analyses, the EC, OC, and TC data were all filtered using the criterion 0.5 < (CSN_carbon/IMPROVE_carbon) < 2.0.
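As a rough illustration of the two steps named above, the following Python sketch applies the 0.5 to 2.0 ratio screen and then fits a line by iteratively reweighted least squares with Tukey bi-square weights. It is not the authors' implementation: the tuning constant 4.685, the scale estimate (median absolute residual divided by 0.6745), the absence of a leverage correction, and all function names are assumptions made for the example.

```python
from statistics import median


def ratio_screen(csn, improve, lo=0.5, hi=2.0):
    """Keep (csn, improve) pairs whose CSN/IMPROVE ratio lies strictly between lo and hi."""
    return [(c, i) for c, i in zip(csn, improve) if i > 0 and lo < c / i < hi]


def weighted_line(x, y, w):
    """Weighted least-squares slope and intercept for y = slope * x + intercept."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def bisquare_fit(x, y, c=4.685, iterations=50, tol=1e-10):
    """Robust line fit: start from ordinary least squares, then reweight."""
    slope, intercept = weighted_line(x, y, [1.0] * len(x))
    for _ in range(iterations):
        resid = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
        scale = median(abs(r) for r in resid) / 0.6745  # assumed robust scale
        if scale == 0:
            break  # at least half the points lie exactly on the line
        cutoff = c * scale
        # Tukey bi-square weights: zero beyond the cutoff, smooth inside it.
        w = [(1 - (r / cutoff) ** 2) ** 2 if abs(r) < cutoff else 0.0 for r in resid]
        new_slope, new_intercept = weighted_line(x, y, w)
        done = abs(new_slope - slope) < tol and abs(new_intercept - intercept) < tol
        slope, intercept = new_slope, new_intercept
        if done:
            break
    return slope, intercept
```

In this sketch, `x` would hold the TOT-analyzed CSN values and `y` the co-located TOR-analyzed values, so that the fitted slope and intercept map TOT data onto the TOR scale.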

Field Blanks
Field blank seasonal medians (Fig. 2) and seasonal averages (Fig. S2) were examined. The medians were found to be more consistent and statistically stable. The summer season generally has higher median OC field blank values than the other seasons, which can be attributed to the increase in secondary organic aerosol (SOA) during summer (Rattigan et al., 2010), although this pattern is not entirely consistent. Blanks are uniformly high for the R&P and MetOne samplers.
OC field blanks from the URG sampler are much smaller than those from the retired samplers (compare the y-axes in Figs. 2(c), 2(b), and 2(a)). This large difference would magnify the error in any study if the data were not adjusted correctly. It should be noted that there were some periods with missed or rejected blank measurements, which are described in detail in Table S2. The full set of seasonal median field blank values used for the adjustment is given in Table S3.
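A blank adjustment based on seasonal medians could look like the following Python sketch. Several details are assumptions rather than facts from the text: the meteorological season definition (December to February as winter, and so on), pooling blanks across years for one sampler, blanks expressed in the same concentration units as the samples, and the function names.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Assumed season definition; the text does not state which months form each season.
SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def seasonal_blank_medians(blanks):
    """blanks: list of (date, blank_value) for one sampler -> {season: median blank}."""
    groups = defaultdict(list)
    for day, value in blanks:
        groups[SEASON_OF_MONTH[day.month]].append(value)
    return {season: median(values) for season, values in groups.items()}


def blank_correct(samples, medians):
    """Subtract the matching seasonal median blank from each (date, value) sample."""
    return [(day, value - medians[SEASON_OF_MONTH[day.month]]) for day, value in samples]
```

Using the median rather than the mean keeps a single contaminated blank from shifting the correction for a whole season, which is in line with the observation above that medians are the more stable statistic.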

Regression Analyses
The URG-TOT and URG-TOR analyzed data were regressed against each other, and all sites yielded a similar linear relation (Fig. S3). This consistency lent support to the decision to use the co-located data from MetOne-TOT and URG-TOR at IS52 and Queens, and additionally from R&P-TOT and IMPROVE-TOR at IS52 and PSP, as a guideline to adjust all pre-URG data at all NYS sites.
The similarity in the regression results from IS52 and Queens (Figs. S4 and S5) led to the use of a single combined set of equations for EC and OC at the urban sites:

$$\mathrm{EC}_{\mathrm{MetOne\_TOR}} = 0.72\,\mathrm{EC}_{\mathrm{MetOne\_TOT}} + 0.32 \qquad (3)$$

$$\mathrm{OC}_{\mathrm{MetOne\_TOR}} = 0.62\,\mathrm{OC}_{\mathrm{MetOne\_TOT}} + 0.27 \qquad (4)$$

Figs. S6-S9 show the regression results at the IS52 and PSP sites during the co-located IMPROVE and pre-URG periods for both EC and OC. Regressing OC IMPROVE_TOR data against MetOne_TOT data at IS52 gives a slope of 0.62, slightly lower than the MetOne_TOT versus URG_TOR slope, with a significantly smaller intercept (0.09 compared to 0.27), and a slope of 0.58 with an intercept of 0.34 for R&P. The EC regression shows a significant difference in slopes that is consistent across rural and urban sites, suggesting a sampler-related systematic difference.
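As a worked illustration of Equations (3) and (4), the short Python sketch below converts MetOne TOT values to their TOR equivalents. The coefficients are taken from the equations above and apply to the urban MetOne case only; the dictionary and function names are hypothetical, and the inputs are assumed to be in the same concentration units as the regression data.

```python
# (slope, intercept) pairs from Equations (3) and (4); urban MetOne data only.
URBAN_METONE_TOT_TO_TOR = {"EC": (0.72, 0.32), "OC": (0.62, 0.27)}


def tot_to_tor(value, species, coeffs=URBAN_METONE_TOT_TO_TOR):
    """Map a TOT-analyzed concentration onto the TOR scale with a linear relation."""
    slope, intercept = coeffs[species]
    return slope * value + intercept


# Example: an OC value of 4.0 maps to 0.62 * 4.0 + 0.27 = 2.75,
# and an EC value of 0.5 maps to 0.72 * 0.5 + 0.32 = 0.68.
oc_adjusted = tot_to_tor(4.0, "OC")
ec_adjusted = tot_to_tor(0.5, "EC")
```

With these coefficients the adjusted OC is lower than the reported value whenever the reported value exceeds about 0.71, and the adjusted EC is higher whenever the reported value is below about 1.14, so typical values move in opposite directions for the two species.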
The OC regression results for R&P at PSP are consistent with those at IS52 and Queens, although the slope is smaller than the IS52 slope for MetOne_TOT data. The results from both sites indicate that the relationship between OC measured by R&P_TOT and IMPROVE_TOR samplers (and hence URG) does not show significant site dependence, while the MetOne reported data show more site dependence. The relationship for EC concentrations measured using the pre-URG samplers shows more site-to-site variability. Accordingly, the criterion adopted to adjust the pre-URG data is to use the site-dependent regression factors summarized in Table 2.

Pre-URG Harmonization
Figs. 3 and 4 show the monthly averages of pre-URG_TOT data adjusted to TOR, together with URG_TOR analyzed samples, for EC and OC, respectively, at the Queens, PSP, and WFM sites. The adjusted EC concentrations were generally higher than the reported measurements. The opposite is seen for OC concentrations, which matches the fact that the TOR analysis tends to measure lower OC and higher EC concentrations than the TOT analysis.
The adjusted values track the reported values, and the difference between reported and adjusted concentrations is proportional to the reported concentrations. Moreover, Mann-Kendall trend significance analysis showed no significant differences between reported and adjusted pre-URG data; Table S4 gives the results of this analysis for the adjusted EC and OC data. For the non-adjusted dataset, Fig. 4 shows an unrealistic declining trend for OC during the transition from pre-URG to URG sampling. These features of the adjustment method (tracking across the change in sampler and analysis method, and proportionality of the difference to the reported data) preserve the trend and variability of the dataset and avoid introducing a systematic bias. Although the adjusted pre-URG data preserve the same trends as the originally reported data, the adjustment matters when pre-URG and URG data are analyzed together. For EC, combining the unadjusted data (which are lower than the adjusted data) with the URG data will mask or reverse the trend direction. For OC, the reported data are higher than the adjusted data and will unrealistically magnify the trend. Moreover, the adjusted OC data reveal that there is no statistically significant trend at WFM (specifically since 2008), which cannot be assessed using only the reported unadjusted data.
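For readers unfamiliar with the Mann-Kendall test, a minimal Python sketch of the classic (non-seasonal) version with a tie correction and normal approximation follows. The text does not specify which variant or significance level was used, so this is only an assumed baseline form, and the function name is hypothetical.

```python
import math


def mann_kendall(values):
    """Return (S, Z, two-sided p) for a monotonic trend in an ordered series."""
    n = len(values)
    # S counts later-minus-earlier increases minus decreases over all pairs.
    s = sum(
        (values[j] > values[i]) - (values[j] < values[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    # Variance of S with the standard correction for tied values.
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(t * (t - 1) * (2 * t + 5) for t in counts.values())
    var_s = (n * (n - 1) * (2 * n + 5) - ties) / 18
    # Continuity-corrected Z statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value from the normal tail
    return s, z, p
```

A negative Z with a small p-value indicates a significant declining trend. Because the test depends only on the ordering of values, a linear adjustment with a positive slope leaves S unchanged, which is consistent with the reported and adjusted pre-URG series giving the same trend significance.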
To keep the harmonized dataset as consistent as possible, and given that URG has become the longest-serving sampler, URG data are used whenever co-located data are available. The annual averages of the full harmonized EC, OC, and TC data are shown in Fig. 5 and exhibit a statistically significant declining trend (except for WFM OC and TC), in agreement with other studies (Rattigan et al., 2016; Blanchard et al., 2021).
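The rule of preferring URG data during co-located periods can be expressed as a simple merge, sketched below in Python. The data layout (lists of date and value pairs for one site and species), the source labels, and the function name are assumptions made for illustration.

```python
from datetime import date


def merge_prefer_urg(pre_urg_adjusted, urg):
    """Combine two (date, value) series, keeping the URG value on shared dates."""
    merged = {day: (value, "pre-URG adjusted") for day, value in pre_urg_adjusted}
    # URG entries overwrite adjusted pre-URG entries that fall on the same date.
    merged.update({day: (value, "URG") for day, value in urg})
    return [(day, value, source) for day, (value, source) in sorted(merged.items())]
```

Keeping a source label alongside each value makes it easy to check afterwards where the record switches from adjusted pre-URG data to URG data.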

CONCLUSIONS
A method to adjust the particulate carbon data from retired samplers to match the current samplers is provided to facilitate studies of long-term trends, seasonal variability, source apportionment, regulation assessments, and climate change. In this study, artifacts were removed from samples using the field blanks, and anomalies were removed using an outlier filter. Several regression analyses were used to harmonize the pre-URG CSN carbon data with the URG data. The relationship between TOT-analyzed and TOR-analyzed samples does not show significant environmental dependence for OC, while EC samples show higher uncertainty. The resulting harmonized dataset was found to preserve the main features of the reported data, which show declining trends. More details about long-term trends and seasonal variability are the subject of a future study.