Satellite-based Emission Inventory Adjustments Improve Simulations of Long-range Transport Events

Long-range pollution transport (LRT) events have a wide impact across East Asia, but are often difficult to track due to imprecise emission inventories and changing domain scales as the plume moves from source to receptor locations. This study adjusts a bottom-up emission inventory based on changes in remotely sensed NO2 column densities for a source region of East Asia, then simulates transport of LRT plumes to Taiwan with CMAQv5.2.1. Adjustment of the emissions inventory based on satellite measurements during the COVID-19 lockdown in China led to a ~59% reduction in emissions over the relevant source area in China compared to base emissions. As a result, modeled PM2.5 mass concentrations matched observations (mean fractional bias, MFB, of -13.9% and 18.5% at a remote and an urban station, respectively) as the plume passed through northern Taiwan. Furthermore, the OMI-adjusted emissions simulation brought all of the major PM2.5 components to within 50% of the measured values. Another LRT event from 2018 with more subtle OMI-adjustments to the emissions was also simulated, with improved overall PM2.5 mass concentration at the northern tip of Taiwan (MFB: -91.5%) compared to the base model (MFB: -102.1%) and an acceptable index of agreement (0.78). For the 2018 event, non-sea-salt sulfate concentrations were consistently underpredicted (modeled-to-measured ratios of 0.2-0.4), while nitrate concentrations were overpredicted by up to a factor of 11. Copernicus Atmosphere Monitoring Service (CAMS) reanalysis of the PM2.5 concentrations shows high sulfate concentrations in eastern China in the areas associated with 72-h back-trajectories from northern Taiwan during both events, lending support for future model investigations of sulfate source area production and transport to Taiwan. In order to better track these LRT events out of East Asia and optimize the OMI-adjustment methodology, it is recommended to explore other satellite-based products to map unaccounted-for SO2 sources upstream of Taiwan.


INTRODUCTION
For more than a decade, there have been frequent haze incidents in eastern China during winter and spring, generating and exporting excessive amounts of PM2.5, particulate matter smaller than 2.5 microns in diameter (Fu et al., 2014; Yang et al., 2016). Taiwan is located in the southeast corner of East Asia, with its main island just ~130 km from mainland China and ~650 km from Shanghai, and so it is often affected by outflow pollution during northeast monsoon conditions (Chuang et al., 2008; Zhang et al., 2015; Wang et al., 2016; Chuang et al., 2017; Hsu and Cheng, 2019). During these long-range transport (LRT) events, the northern and western parts of Taiwan, accounting for 95% of its population, are primarily impacted by this transported PM2.5, posing an excess health risk to the island's residents (Song et al., 2011; Loftus et al., 2015; Griffith et al., 2020).
The COVID-19 pandemic has provided an opportunity worldwide to monitor air quality improvements during various levels of lockdowns (He et al., 2020; Le et al., 2020; Muhammad et al., 2020; Pani et al., 2020; Venter et al., 2020), including air quality downwind of strong source regions (Griffith et al., 2020). Our previous research analyzed CMAQ simulation results and OMI-NO2 satellite data during an LRT event to Taiwan in the first week of the COVID-19 lockdown in China, and found that PM2.5 concentrations modeled for northern Taiwan under a uniform 50% emission reduction in China agreed with measured PM2.5 concentrations (Griffith et al., 2020). However, that was a relatively crude emissions adjustment. Past studies have utilized satellite data (such as NOx, SO2, and CO retrievals) to estimate changes in emissions of air pollutants (Qiu et al., 2016; Liu et al., 2018; Zhao et al., 2018; Yang et al., 2021). For instance, Liu et al. (2018) combined OMI-derived SO2 emissions with a bottom-up global emission inventory, HTAP, and improved the normalized mean bias (NMB) from 0.41 (HTAP) to 0.03 (OMI-HTAP) in 2010 and from 0.29 (HTAP) to 0.05 (OMI-HTAP) in 2014. However, all of the aforementioned studies focused on improving the simulation of monthly to annual averages.
Our current study further capitalizes on the severely reduced emissions during a Chinese New Year (CNY) 2020 long-range transport event and adjusts the emission inventory according to changes in OMI-NO2 measurements over a shorter-term period, obtaining an emission inventory closer to the actual situation during the event. Our study goes still further and analyzes the performance of the PM2.5 mass modeling not only at the coastal receptor site during the CNY 2020 event highlighted in Griffith et al. (2020), but also includes a pre-COVID LRT event and data from an urban site in northern Taiwan. Finally, comparison of the measured and modeled PM chemical composition is addressed with support from Copernicus Atmosphere Monitoring Service (CAMS) re-analysis data.

Surface Measurements
This study focuses on two LRT events: Jan. 27-Feb. 2, 2020 (Event 1, coinciding with the first week of COVID-19 lockdowns in China) and Jan. 29-Feb. 3, 2018 (Event 2, pre-COVID). The receptor sites highlighted in this study are Cape Fuguei (FG), a site free of local pollution during northerly winds and located at the tip of the main island of Taiwan, and Banqiao (BQ), a heavily urbanized area in New Taipei City located in the Taipei Basin (Fig. 1). Similar to the FG site during northeast monsoon winds, the BQ site is more heavily impacted by East Asia outflow pollution than sites further south in Taiwan. BQ is also the most northerly site besides FG with PM composition measurements, and thus was chosen as a comparison site. Taiwan pollution observation data (i.e., PM2.5 for this study) were obtained from the Taiwan Environmental Protection Administration website (https://data.epa.gov.tw/) and weather data were obtained from the Central Weather Bureau. PM2.5 composition data at Cape Fuguei were measured from Partisol-Plus 2025 dichotomous aerosol samples; more information about the instrumentation and the sampling and analytical procedures can be found in Chou et al. (2017). PM2.5 composition data at Banqiao were measured from R&P ChemComb Speciation Sampling cartridges; more details based on a similar setup from the same team can be found in Chen et al. (2021). Supporting PM2.5 and wind force scale data from Hangzhou and Shanghai in eastern China were obtained from an online air quality monitoring and analysis platform (https://www.aqistudy.cn). Hangzhou and Shanghai are both located in the Yangtze River Delta (YRD), a highly polluted area that LRT events frequently pass through during the northeast monsoon season (Chuang et al., 2020), as was the case for the two events in this study.

OMI-NO2 Satellite Retrieval Data
Data from the OMI UV-Vis spectrometer were used to remotely measure changes in NO2 (Krotkov et al., 2016) and adjust the emission inventory for China. OMI is mounted on the NASA Aura satellite, and the NO2 vertical column density is calculated from backscattered radiation measured in the 405-465 nm window. OMI makes a pass at approximately 13:45 local time every day, with a minimum pixel size of 13 × 24 km². In order to focus on tropospheric data and eliminate cloud interference, we extracted the Level-3 tropospheric NO2 column density product filtered for cloud fraction < 30%.

Models and Modeling Configuration
This study used the Weather Research and Forecasting (WRF) model (Skamarock and Klemp, 2008) version 3.9.1 and the Community Multiscale Air Quality (CMAQ) model (Byun and Schere, 2006) version 5.2.1. The simulations were conducted with four nested grids of horizontal resolutions 81 km, 27 km, 9 km, and 3 km for domains 1, 2, 3, and 4, respectively. Domain boundaries are shown in Fig. 1. The model has 40 vertical layers based on eta coordinates from sea level to ~24 km, which are more concentrated nearer to the surface: 8 layers below ~1000 meters altitude, 13 layers below ~3000 meters altitude, and the remaining 27 layers distributed over the upper ~21 km.
The meteorological field for WRF was initiated with NCEP FNL Operational Model Global Tropospheric Analyses (horizontal resolution is 1° × 1° in latitude and longitude) (https://rda.ucar.edu/datasets/ds083.2/). Observation data assimilation (i.e., observation nudging), conducted within the WRF Preprocessing System, was used in domain 4 (i.e., Taiwan) to reduce the overestimation of wind speed stemming from Taiwan's complex topography. Four-dimensional data assimilation (FDDA) (i.e., grid nudging), conducted directly within WRF, was used in all four domains. Both observation data assimilation and FDDA were set to nudge at a frequency of every 6 hours. All other settings for observation and grid nudging were set at the default values outlined in the Advanced Research WRF v3.9 user's guide.
For the emissions databases fed into CMAQ, MICS-ASIA III (Li et al., 2017) from an emission base year of 2010 was used for domains 1-3, the primary anthropogenic emissions areas in East Asia, while TEDS 10 (Taiwan Emission Data System, version 10) (TWEPA, 2019, https://teds.epa.gov.tw/), from an emission base year of 2016 was used for Domain 4. This study proportionally adjusted the base year pollutant emissions from 2010 to 2017 in China as described by Zhang et al. (2018), with explicit details provided in our previous studies (Griffith et al., 2020;Kong et al., 2021). For biogenic emissions, MEGAN version 2.1 (Model of Emissions of Gases and Aerosols from Nature, Guenther et al., 2012) was used for Domains 1-3, and BEIS3 version 3.09 (Biogenic Emission Inventory System, Vukovich and Pierce, 2002) was used for Domain 4.
Modeling performance of the resulting wind direction was judged against acceptable values of Wind Normalized Mean Bias (WNMB, within ±10%) and Wind Normalized Mean Error (WNME, ≤ 30%), and wind speed was judged by Root Mean Square Error (RMSE, ≤ 3.0 m s⁻¹).
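As a minimal sketch of how these three wind metrics might be computed, the Python below assumes that wind direction differences are wrapped to the range ±180° and normalized by 360°, and that RMSE is taken on wind speed. These definitions and all function names are assumptions made for illustration; the defining equations are not reproduced in this text.

```python
import numpy as np

def wrap_direction_diff(model_deg, obs_deg):
    """Signed angular difference (model - obs) in degrees, wrapped to [-180, 180)."""
    diff = np.asarray(model_deg, dtype=float) - np.asarray(obs_deg, dtype=float)
    return (diff + 180.0) % 360.0 - 180.0

def wnmb(model_deg, obs_deg):
    """Wind Normalized Mean Bias in percent (assumed normalization by 360 degrees)."""
    return 100.0 * np.mean(wrap_direction_diff(model_deg, obs_deg)) / 360.0

def wnme(model_deg, obs_deg):
    """Wind Normalized Mean Error in percent (assumed normalization by 360 degrees)."""
    return 100.0 * np.mean(np.abs(wrap_direction_diff(model_deg, obs_deg))) / 360.0

def rmse(model, obs):
    """Root mean square error, e.g. for wind speed in m/s."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((model - obs) ** 2)))
```

The wrapping step matters near north: a modeled direction of 10° against an observed 350° counts as a 20° difference rather than 340°.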

Adjusted Emission Inventory by OMI-NO2 Data
In order to further adjust the emission inventory, this study calculated the ratio of the OMI-NO2 column densities between period A and period B, where A is the OMI-NO2 data period from the simulation year and B is the OMI-NO2 data period from the base year (Table 1).
The data periods selected for the OMI-NO2 adjustment are aligned in the base and simulation years relative to the dates of Chinese New Year. In 2020, the LRT event occurred soon after Chinese New Year, while in 2018, the LRT event occurred a few weeks before the holiday. Thus, we placed the OMI averaging periods in a representative time period that spanned the relevant LRT event in the simulation year and yet remained completely before (i.e., 2018 event) or after (i.e., 2020 event) Chinese New Year's Eve in both the base and simulation years. For Event 1, the OMI averaging period began on Chinese New Year's Eve, while for Event 2, the averaging period ended 8 days before the holiday (Table 1). In addition, to improve the data coverage for the satellite data, which can be limited due to cloud cover, we chose to extend the OMI averaging time period. The OMI-NO2 adjustment was applied to all anthropogenic emissions in the bottom-up emissions inventory.
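The adjustment described above can be illustrated with a short Python sketch. It assumes the ratio is formed per grid cell from period-mean NO2 columns, that missing (e.g., cloud-screened) pixels are stored as NaN, and that cells without valid data in either period keep their base emissions. These choices, and the function names, are assumptions for illustration rather than the exact procedure used in this study.

```python
import numpy as np

def period_mean(daily_columns):
    """Mean NO2 column per grid cell over an averaging period.

    daily_columns has shape (n_days, n_cells); NaN marks missing pixels.
    Cells with no valid day are returned as NaN.
    """
    cols = np.asarray(daily_columns, dtype=float)
    valid = np.isfinite(cols)
    counts = valid.sum(axis=0)
    sums = np.where(valid, cols, 0.0).sum(axis=0)
    return np.divide(sums, counts,
                     out=np.full(sums.shape, np.nan), where=counts > 0)

def adjust_emissions(base_emissions, no2_sim_period, no2_base_period):
    """Scale base emissions by the simulation-to-base ratio of NO2 columns."""
    sim = np.asarray(no2_sim_period, dtype=float)
    base = np.asarray(no2_base_period, dtype=float)
    usable = np.isfinite(sim) & np.isfinite(base) & (base > 0)
    # Assumption: where the ratio cannot be formed, leave emissions unchanged.
    ratio = np.ones(base.shape)
    np.divide(sim, base, out=ratio, where=usable)
    return np.asarray(base_emissions, dtype=float) * ratio
```

Extending the averaging period, as done here, increases the number of valid days per cell and so reduces the number of cells that fall back to a ratio of 1.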

CAMS Reanalysis Data
To provide additional data sources for PM composition and complement our modeled concentrations, we extracted PM composition information from CAMS reanalysis data (https://www.ecmwf.int/en/forecasts/dataset/cams-global-reanalysis), as provided by the European Centre for Medium-Range Weather Forecasts (ECMWF). CAMS data consist of 3-dimensional time-consistent atmospheric composition fields, including aerosol and gas-phase species, and are produced using 4DVar data assimilation in Cycle 42r1 of ECMWF's Integrated Forecasting System (IFS).

2020 LRT Event
OMI-adjustment of the emissions inventory during the 2020 long-range transport event at the beginning of the COVID-19 lockdown in China led to ~59% lower NOx emissions (and other pegged emissions) than the base emissions scenario in the area associated with the event back-trajectories (Griffith et al., 2020). Thus, the overall emission reduction was similar to the 50% coarse adjustment applied in Griffith et al. (2020) and resulted in a similar PM2.5 concentration (~5% difference) arriving at Cape Fuguei at the northern tip of Taiwan from Jan. 28-Feb. 1, 2020. Fig. 2 shows the comparison of the hourly measured and modeled results during the 2020 event, with the PM2.5 arriving at Cape Fuguei (Fig. 2(a)) significantly lowered, by a factor of 1.9, compared to the base emissions scenario. To note, the differences between the base emissions scenario in this study and that in Griffith et al. (2020) are due to observation- and grid-nudging of the meteorological data, which changed the average PM2.5 concentration at Cape Fuguei by 2.8% during Jan. 28-Feb. 1, 2020. After nudging, the WNMB, WNME, and RMSE statistics based on the modeled vs. measured wind direction and wind speed values were all within the acceptable ranges.
Although Cape Fuguei is an excellent site to monitor the first encounter of a surface-level LRT event in Taiwan, it is also of interest to monitor the performance downwind in the urban areas that populate northern Taiwan. Fig. 2(b) shows the measurement vs. modeling comparison at Banqiao, located in the heavily populated Taipei Basin. Similar to the Cape Fuguei site, modeled PM2.5 mass concentrations were dramatically reduced (factor of 1.9) by the OMI-adjustment of emissions. This gross improvement is most easily observable with the index of agreement (IOA) statistic, which improved from 0.72 to 0.89 and from 0.51 to 0.82 for the Cape Fuguei and Banqiao sites, respectively. An aspect over-emphasized in the model was the PM2.5 diel variation at Banqiao, which suggested that local emissions were influential to the overall PM2.5 trend at the site; however, this was not readily identifiable in the observation data. This disagreement is evident in the MFB and mean fractional error (MFE) statistics near 100% in the assessment of the Banqiao dataset. Even the MFE for the OMI-adjusted data at both Cape Fuguei and Banqiao was beyond the acceptable range (Table 2), but this was due to the poorly characterized PM2.5 mass concentrations before and after the passage of the main plume. MFB and MFE statistics for the specific 48-h period including this plume were dramatically improved to below 50%, highlighting the ability of the model to capture primary plume transport from China to Taiwan.
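For readers who wish to reproduce statistics of this kind, the sketch below computes MFB, MFE, and IOA from paired hourly model and observation values. It assumes the commonly used fractional bias and error definitions and Willmott's index of agreement; the text does not state the exact formulas applied, so these definitions and the function names should be read as assumptions.

```python
import numpy as np

def mfb(model, obs):
    """Mean fractional bias in percent; ranges from -200% to +200%."""
    m = np.asarray(model, dtype=float)
    o = np.asarray(obs, dtype=float)
    return 100.0 * np.mean(2.0 * (m - o) / (m + o))

def mfe(model, obs):
    """Mean fractional error in percent; ranges from 0% to 200%."""
    m = np.asarray(model, dtype=float)
    o = np.asarray(obs, dtype=float)
    return 100.0 * np.mean(2.0 * np.abs(m - o) / (m + o))

def ioa(model, obs):
    """Index of agreement (Willmott form); 1 indicates perfect agreement."""
    m = np.asarray(model, dtype=float)
    o = np.asarray(obs, dtype=float)
    o_mean = o.mean()
    denom = np.sum((np.abs(m - o_mean) + np.abs(o - o_mean)) ** 2)
    return 1.0 - np.sum((m - o) ** 2) / denom
```

Because MFB averages signed terms while MFE averages their magnitudes, MFE is never negative and is always at least as large as the absolute value of MFB.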

2018 LRT Event
A more challenging scenario to simulate is one in which the OMI-adjustment reflects a more marginal change in emissions during an LRT event. Previous works have shown satellite-adjusted emission inventories can successfully drive monthly to annual measurement vs. modeling agreements (Qiu et al., 2016; Liu et al., 2018; Zhao et al., 2018; Yang et al., 2021), while the 2020 case above is the first to show a successful modeling of long-range transported PM2.5 during a short event period. These same tools were then applied to a 2018 LRT event to Taiwan characterized by backward trajectories similar to those of the 2020 event, but more consistently at a lower altitude (Griffith et al., 2020). Wind direction and wind speed during this 2018 LRT event were reasonably well replicated by the model, except at Cape Fuguei, where the model overpredicted the wind speed and yielded an RMSE of 4.0 m s⁻¹. OMI adjustment of emissions during the 2018 event led to an ~17% increase in pollutant emissions for the emission profile of eastern China. Model PM2.5 spatial distribution maps of the 2020 and 2018 events exhibited a similar characteristic of a high PM2.5 air mass jutting out from eastern China and then swooping down and passing through northern Taiwan (Figs. S1 and S2). Fig. 3 shows the measurement vs. modeling comparison of PM2.5 mass concentrations arriving at Cape Fuguei and Banqiao during the 2018 LRT event. The primary plume during this event spanned more than 24 h and started on the evening of Jan. 31, 2018. However, both the base and OMI-adjusted modeling scenarios only effectively captured the second half of this plume at Cape Fuguei (Fig. 3(a)). Importantly, the OMI-adjusted inventory did yield a net increase in pollutant emissions in eastern China, which then led to a ~28% increase in PM2.5 mass concentrations at Cape Fuguei on the afternoon (13-19 h) of Feb. 1, 2018.
Overall, the statistical performance of the OMI-adjusted scenario at Cape Fuguei was better than the base case with improvements in the MFB, MFE and IOA statistics (Table 2); however, even in the OMI-adjusted scenario, the MFB and MFE were well out of the acceptable range. This was still the case when only accounting for the period from Jan. 31, 2018 at 12:00 LT to Feb. 2, 2018 at 12:00 LT, implicating the stark underestimation of PM2.5 until midday on Feb. 1, 2018 for the poor statistical representation. This underestimation of PM2.5 during the first half of the LRT plume originated upstream, as shown in Fig. S3, where PM2.5 mass concentrations in Hangzhou and Shanghai, both in eastern China (Fig. 1), were also underestimated (by factors of 1.8 and 1.5) in the model during the days leading up to the plume arrival in Taiwan. Moreover, in Fig. S2, the characteristic East Asia spillover of anthropogenic pollution during northeast monsoon conditions was not evident at 0000 on Feb. 1, 2018, when the measurements were registering the initial arrival of the plume. This absence suggests there were also some upstream meteorological characteristics not captured by the model; the arrival of the plume at Cape Fuguei in the model simulations begins ~6 hours later than in the measurements. Although it appears that two PM2.5 concentration peaks characterize the 2018 event at Cape Fuguei (Fig. 3(a)), it was likely a broad peak that was only captured by the model for approximately half of its duration. Nevertheless, the LRT plume was modeled with relative success, earning good correlation coefficient (0.82) and IOA (0.79) statistics. The simulations of the 2018 event arriving at Banqiao reproduced the diel cycles of PM2.5 concentrations reasonably well (r = 0.76 for the OMI-adjusted model), but the OMI-adjusted model yielded worse MFB, MFE, and IOA statistics than the base emissions scenario.
This may have been influenced by the mildly misestimated wind direction (WNMB = -6.9%, WNME = 14.2%) and wind speed (RMSE = 2.0 m s⁻¹) during the LRT period. Based on the reduced wind speed measured at Banqiao during the 2018 event (compared to that at Cape Fuguei), there may have been an impact from the mountains that partially block the north side of the Taipei Basin, a topographical nuance that is difficult for the model to capture. This is further supported by the ~40% decrease in PM2.5 concentrations from Cape Fuguei to Banqiao during the LRT plume. Thus, the misestimated wind field may have led to a greater influence from transported PM2.5 in the model on top of the locally driven PM2.5 variations. Since the OMI-adjusted scenario mildly increased emissions over the base scenario, we may expect the OMI-adjusted model to exhibit worse statistics during the 2018 event at the Banqiao site.
Rather than simply making pairwise comparisons at each timepoint, conducting distributional comparisons of the model and measured data has also been suggested in other studies (Dennis et al., 2010). This may be particularly relevant for our study because we have so few measurement sites included in the model evaluation; thus, imprecision in the meteorology upstream of the receptor sites could readily lead to spatially and temporally misaligned data points between the model and measurements. Compared to Table 2 in the main text, the distributional comparisons (Table S1) preserved the relative differences between the base model and OMI-adjusted model, but the magnitude of some key statistics distinctly changed. For instance, r and IOA were higher by 0.08-0.19 and 0.02-0.10, respectively, for both models in both years compared to the temporally aligned data. On the other hand, MFB and MFE showed more variable changes relative to the temporally aligned data, with the primary improvement occurring in the MFE for Banqiao simulations of the 2018 event (improved from 53.5% to 38.6% for the OMI-adjusted model). Nevertheless, the conclusions based on this operational statistical evaluation did not change.
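One plausible way to implement a distributional comparison is to rank-order the modeled and measured series independently and then compute the same statistics on the rank-matched pairs. The Python sketch below illustrates this reading; it is an assumption about the procedure, not a description of how Table S1 was produced, and the example series are made up.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    return float(np.corrcoef(np.asarray(x, float), np.asarray(y, float))[0, 1])

def rank_matched(model, obs):
    """Sort each series independently so values are paired by rank, not by time."""
    return np.sort(np.asarray(model, float)), np.sort(np.asarray(obs, float))

# Made-up example: the model reproduces the plume but one time step late.
obs_series = [1.0, 2.0, 10.0, 2.0, 1.0]
model_series = [1.0, 1.0, 2.0, 10.0, 2.0]

r_paired = pearson_r(model_series, obs_series)
r_ranked = pearson_r(*rank_matched(model_series, obs_series))
```

In this example the time-paired correlation is poor because the peak is misaligned, while the rank-matched correlation is perfect because both series contain the same set of values. This is the sense in which a distributional comparison is less sensitive to timing errors in upstream meteorology.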

PM2.5 Chemical Composition Comparison
In collaboration with other researchers who were collecting 24-h integrated samples of PM composition at Cape Fuguei and Banqiao, we were able to further evaluate the performance of the model and OMI-adjusted emission inventory method. To note, there were no PM composition measurements available for the 2020 LRT event at Cape Fuguei, so this evaluation is focused on the 2018 event at Cape Fuguei and both the 2018 and 2020 events while passing through Banqiao. Fig. 4 places the modeled PM2.5 composition during the 2018 event side-by-side with the measured composition values at Cape Fuguei, all representing 24-h averages centered around 8:00 pm. Underestimation of the PM2.5 mass concentration was expected as observed in Fig. 3(a), and is now attributed to underestimations of non-sea-salt sulfate (nss-SO4²⁻, calculated as [SO4²⁻] - 0.251 × [Na⁺]), NH4⁺, and sea-salt ions in the 24 hours (daily average centered at 2000 on Jan. 30, 2018) preceding the plume, and of nss-SO4²⁻ and undefined species during the plume passage. During the arrival of the plume (daily average centered at 2000 on Jan. 31, 2018), Fig. 3 showed that the model completely missed the incoming PM2.5 concentrations, and from Fig. 4, we know these were a mix of the sea-salt-laden concentrations observed on Jan. 30 and the anthropogenic pollution (nss-SO4²⁻ and OM) observed later on. Finally, during the third daily average (centered at 2000 on Feb. 1, 2018), severe underestimations by the model, of more than a factor of 3, persisted until 1200 on Feb. 1 (Fig. 3(a)) and likely dramatically affected the model vs. measured PM composition agreement. As mentioned, this period of severe underestimation was likely due to some poorly captured meteorological development that delayed arrival of the plume in the model. To note, the OMI-adjusted model decreased the underestimation by the model on all days during the 2018 event; thus, this misrepresentation was not compounded by applying the OMI adjustment.
In addition, the model overestimated the nitrate concentrations (OMI-adjusted model overestimated by a factor of 11) during the plume passage through Cape Fuguei. The OMI-adjusted model primarily increased the nitrate (19%) and organic matter (11%) concentrations. The estimation of organic matter from the measurements was based on an OC-to-OM conversion factor of 1.8, while the model was estimating a factor close to 2.0; thus, some of the 'undefined' component in the measured composition could be considered OM, and another portion metals.
In Banqiao, modeled PM composition values (24-h average) more closely matched the observations (24-h integrated measurements from 0000 to 2400) during both the 2018 and 2020 LRT events than at Cape Fuguei (Fig. 5), even though the PM2.5 mass concentration was generally overestimated (Fig. 3). Again, the nss-SO4²⁻ concentrations were underestimated (~50%) by both the base and OMI-adjusted models for the 2018 event; OMI-adjustment led to increased OM and undefined components, but little change in nss-SO4²⁻, similar to that observed at Cape Fuguei (Fig. 4). OM concentrations estimated from OC measurements at Banqiao should be considered an upper limit, as the OC-to-OM conversion factor of 1.8 is not as reasonable in a high-emission urban area, even though the conversion factor is still lower than that used in the model. On the other hand, nitrate at Banqiao during the 2018 event was less overestimated than at Cape Fuguei, in part due to local NOx emissions contributing to PM2.5 nitrate at the site. To note, as the samplers at Cape Fuguei did not measure the re-volatilized nitrate, this could have influenced the larger nitrate overestimation there; however, re-volatilized nitrate at Banqiao only contributed up to ~10% of total PM2.5 nitrate. The PM composition comparison of the 2020 event at Banqiao was only available for Jan. 28, two days before the main plume passage. The base model overpredicted all the secondary inorganic components (NO3⁻, nss-SO4²⁻, and NH4⁺) along with organic matter, whereas the OMI-adjustment reduced all of these contributions to within 10-50% of measured levels and primarily overestimated the PM2.5 due to undefined components. Thus, we have found OMI adjustment of the bottom-up inventory is capable of tracking PM2.5 mass concentration and composition changes during major shifts in emissions, but the performance could be improved for more nuanced perturbations.
The generally underpredicted nss-SO4²⁻ (modeled-to-measured ratios of 0.2-0.4) and overpredicted NO3⁻ (1.0-11) at Cape Fuguei lead to distorted PM2.5 nss-SO4²⁻/NO3⁻ ratios, possibly misrepresenting an important indicator for changing emissions in China. While an effective metric for tracking long-term emission changes in East Asia in various media (Itahashi et al., 2014), the nss-SO4²⁻/NO3⁻ ratio can also be used to check shorter-term emission changes from LRT event to LRT event. However, for the 2018 LRT event at Cape Fuguei, the OMI-adjusted scenario nss-SO4²⁻/NO3⁻ ratio (0.4) was 26 times lower than the measured nss-SO4²⁻/NO3⁻ ratio (10.4), while at Banqiao these differences were smaller (modeled ratios only 2.5 and 1.4 times lower than the 2018 and 2020 measurements, respectively), largely due to local NOx emission impacts better captured in the model. In the case of nitrate, overestimated NO2 emissions have been noted in other modeling studies comparing top-down and bottom-up inventories (Yang et al., 2021), with differences attributed to major reductions in provincial-level emissions that are not updated in regional inventories. For evaluating the consistently underpredicted nss-SO4²⁻ during the LRT events in the absence of source area or high-resolution receptor PM composition measurements, CAMS reanalysis data of the PM2.5 nss-SO4²⁻ concentrations at 1000 hPa are shown in Fig. 6. Although the absolute CAMS sulfate concentrations at a particular pressure level should not be heavily relied upon, the relative amounts over time may prove functional when studying LRT events. To note, comparison of PM10 SO4²⁻ from the OMI-adjusted model with CAMS sulfate in the source and receptor areas yielded a range in agreement from R² = 0.2 to 0.6. Of particular note, over the source area (site: Hangzhou), modeled PM10 SO4²⁻ was in good agreement (not shown here) with CAMS SO4²⁻ during the 2020 event but lower than CAMS SO4²⁻ during the 2018 event.
This corresponds to the underprediction of PM2.5 mass concentrations in the source area noted above during the 2018 event. The CAMS time series values at northern Taiwan corresponded well with the primary PM2.5 plume passage measured at Cape Fuguei in 2018 (Fig. 3), and were in general agreement with measured PM2.5 nss-SO4²⁻ values (Fig. 5) (there were no PM composition values during the primary peak of the 2020 event). In both events, large nss-SO4²⁻ concentrations originated in eastern China (the 2020 event was associated with a more northerly trajectory than in 2018) and then migrated southeastward and swiped northern Taiwan as illustrated by the simulations in this study (Figs. S1 and S2). The plume in the upstream location during the 2020 LRT event preceded arrival in northern Taiwan by only about 24 h, while this lag was ~48 h during the 2018 event, likely due to lower surface wind speeds during the latter case. To better track SO2 emissions, Liu et al. (2018) sought individual SO2 point sources through OMI retrievals that were not recorded in bottom-up emission inventories and incorporated these into the HTAP emission inventory, improving the NMB considerably in the model. Identifying SO2 point sources in East Asia that are not incorporated in the MICS-ASIA III inventory and adjusting NOx emissions according to smaller-scale inventories may serve to bring the measured and modeled PM2.5 nss-SO4²⁻/NO3⁻ ratios into agreement and highlight the importance of northern Taiwan sites for tracking LRT pollution from East Asia.
[Figure caption fragment: spatial distributions are averages of the 0000 UTC values for the specified dates.]
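The ratio comparison quoted above for Cape Fuguei during the 2018 event can be checked with a few lines of Python. The helper names are hypothetical and the only numbers used are the ratios stated in the text (10.4 measured, 0.4 for the OMI-adjusted model).

```python
def species_ratio(sulfate, nitrate):
    """nss-sulfate to nitrate mass ratio; both inputs in the same units."""
    if nitrate <= 0:
        raise ValueError("nitrate concentration must be positive")
    return sulfate / nitrate

def fold_difference(measured_ratio, modeled_ratio):
    """How many times lower the modeled ratio is than the measured ratio."""
    return measured_ratio / modeled_ratio

# Ratios stated in the text for Cape Fuguei, 2018 event.
cape_fuguei_fold = fold_difference(10.4, 0.4)  # approximately 26
```

Because the ratio divides an underpredicted quantity by an overpredicted one, errors in the two species compound rather than cancel, which is why the ratio is far more distorted than either species alone.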

Uncertainties and Limitations
For chemical transport models like CMAQ, there are three main areas that introduce uncertainties: (i) the meteorological data used to constrain the meteorological module (e.g. NCEP/FNL reanalysis data was accessed by WRF every 6 hours), (ii) the emissions inventory of the air quality model (e.g., MICS-ASIA III and TEDS 10.0), which was also adjusted by a ratio of OMI-NO2 values in our case, and (iii) the numerical modules (e.g. WRF for meteorology and CMAQ for air quality chemistry and physics).
Evaluations of NCEP/FNL data have been conducted in other studies (Liu et al., 2021) and have shown certain under- or overestimations of observed data, but these are variable depending on the location, time, and the polluted/pristine conditions. We did not do a specific evaluation of the reanalysis data for the time periods utilized in this study, which is a limitation of this work.
Emissions inventories have inherent uncertainties, as they are not constructed with a high time resolution nor updated frequently enough to adequately address short time periods such as those focused on in this work. Adjusting the base emission inventories in this work by the OMI-NO2 values during baseline and event periods then propagates that initial uncertainty of the base emissions through to the OMI-adjusted values. Although uncertainties attached to those base emission inventories are important, particularly for MICS-ASIA III since it was associated with the areas contributing the majority of the pollution during the LRT events in this study, the Domains 1-3 base emissions were adopted from previous studies, which established these emissions as relevant during the base emission years (Zhang et al., 2018). Use of satellite data, a column measurement, to adjust surface-level emissions of course introduces additional uncertainties. As the primary areas of emission adjustments are heavily urbanized locations in East Asia, the vast majority of NO2 should exist at the surface, thus making this application feasible. Another limitation is that we used differences in NO2 measurements to adjust all of the other anthropogenic emissions in the model. Utilizing OMI-SO2 values in tandem with the OMI-NO2 adjustments would be another approach to consider for future revisions of this method.
In a dynamic evaluation of our data (Dennis et al., 2010), we conducted additional sensitivity tests with the emission inventories by turning either the MICS-ASIA III or TEDS 10.0 emissions off, effectively isolating the effects of transboundary pollution from East Asia or local pollution from Taiwan (Fig. S4). At Cape Fuguei, these plots confirm that nearly 100% of the PM2.5 during the study periods was due to transboundary pollution from East Asia (Figs. S4(a) and S4(c)), thus reaffirming that the poor agreement with the measurements in 2018 was due to upstream emissions, the upstream NCEP/FNL data, or upstream chemical/physical processes. At Banqiao, local emissions clearly played a key role, particularly during the times before and after the major plume passages: turning off East Asia emissions (MICS-ASIA III) led to near full agreement with the complete OMI-adjustment scenario during the 2018 event (Fig. S4(d)). On the other hand, the 2020 event at Banqiao was impacted by both transboundary and local pollution for the majority of the study period (Fig. S4(b)).
In this study, the numerical modules were effectively treated as black boxes; we did not conduct a diagnostic evaluation of our model or adjust any of the processes directly (Dennis et al., 2010). We applied observation and grid nudging in WPS and WRF, respectively, as noted in Sec. 2.3, but did not modify any processes such as meteorological physics, chemical mechanisms, or deposition. This is a limitation of this study, since we did not explore any of these processes in relation to the LRT events discussed here. For instance, excessive deposition in the model may have led to some of the underestimations observed, particularly during the 2018 event. Another limitation related to the numerical modules is that the nudging was only conducted every 6 hours; nudging at a shorter interval would reduce the uncertainty in the WRF module, but at greater computational cost. On the other hand, not exploring those processes allowed us to focus on the adjustment of the emissions inventory.
Probabilistic evaluations address uncertainties in the model more directly than the operational, diagnostic, and dynamic evaluations discussed in this section (Dennis et al., 2010). Instead of generating model output that reflects average meteorological and concentration conditions (characteristic of the other evaluations), a probabilistic evaluation generates an ensemble average and represents the variance in the model output. This can be accomplished by applying a Monte Carlo approach to select values from within a probability distribution function (pdf) for different constraints. For instance, the NCEP/FNL data in our case could be assigned a pdf based on the under- or overestimations of the data detailed in other studies, thus producing an ensemble of output values against which to check whether the observations at the receptor site fall within the ensemble range. This approach is computationally expensive and was beyond the scope of the current study, but it should be considered for a more comprehensive LRT modeling study in East Asia.
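A minimal sketch of such a Monte Carlo evaluation is given below. The receptor model is a hypothetical one-line stand-in for a full WRF-CMAQ run, and the lognormal pdf, its width, and the 5th to 95th percentile envelope are assumptions chosen only to illustrate the workflow.

```python
import random
import statistics


def toy_receptor_model(met_scale, upwind_pm=60.0):
    """Hypothetical stand-in for a WRF-CMAQ run; not a physical model."""
    return upwind_pm * min(1.0, 0.5 * met_scale)


def monte_carlo_ensemble(n_members, sigma, seed=0):
    """Run the toy model once per sampled meteorological perturbation."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        # Lognormal perturbation factor with median 1 (assumed pdf).
        met_scale = rng.lognormvariate(0.0, sigma)
        members.append(toy_receptor_model(met_scale))
    return members


def observation_within(members, observed, lower_pct=5, upper_pct=95):
    """Check whether an observation lies inside the ensemble envelope."""
    cuts = statistics.quantiles(members, n=100)  # 99 percentile cut points
    return cuts[lower_pct - 1] <= observed <= cuts[upper_pct - 1]


members = monte_carlo_ensemble(n_members=500, sigma=0.2)
ensemble_mean = statistics.mean(members)
ensemble_variance = statistics.variance(members)
inside = observation_within(members, observed=30.0)
```

In a real application each ensemble member would be a complete model run, which is the source of the computational cost noted above.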

CONCLUSIONS
This study aimed to employ a simple method using OMI satellite data to adjust a bottom-up emission inventory and improve the modeling of long-range transport events in East Asia.
1) During two LRT events that originated in China and impacted Taiwan, OMI-NO2 adjustments resulted in eastern China emission changes of -59% and +17% on average compared to the same time period in 2017 (base emissions year).
2) Evaluation of the OMI-adjusted model simulations against receptor area observations revealed excellent MFB (-13.9% and 18.5%) and MFE (20.1% and 21.5%) statistics for two sites in northern Taiwan during one of the events. These values are particularly notable given the short timeframe of LRT events. Modeling of the other LRT event did not yield MFB (-91.5% and 41.6%) and MFE (93.4% and 52.3%) statistics that were as strong, although it still showed improvement at one site compared to the base model emission scenario.
3) PM composition values from the model were evaluated against observed concentrations. Although some agreement was found during non-peak plume passage or at an urban site, nss-sulfate was underestimated (by ~70%) and nitrate overestimated (up to ~1000%) at a remote site, even after OMI-adjustment. These trends lead to distorted modeled nss-SO₄²⁻/NO₃⁻ ratios upon arrival in Taiwan; this ratio is otherwise an effective indicator of emission changes in source regions.
Unlike previous efforts to incorporate satellite-based measurements into emission inventories, which focused on longer-term average comparisons across months or years, we sought to simulate short-term events that can dramatically change the pollution levels in Taiwan. Outdated NO2 emission inventories in China and unaccounted-for SO2 sources upstream of Taiwan are thought to be the primary reasons for the discrepancies observed in this study. However, a number of areas were left unexplored that could also have contributed to the measured vs. modeled differences, including the chemical and physical processes interacting with the anthropogenic plumes in WRF and CMAQ, which would require a diagnostic evaluation. In the future, in addition to optimizing the OMI-adjustment of anthropogenic emission inventories, satellite-based adjustment methods could be applied to simulate other types of LRT events, including dust storms and biomass burning, and could even be applied to air quality forecasting to obtain better modeling results.
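For readers who wish to reproduce the MFB and MFE statistics quoted above, the sketch below uses the commonly applied definitions, in which each model-observation difference is normalized by the mean of the pair. That the study used exactly this form is an assumption, and the paired concentrations are hypothetical.

```python
def mean_fractional_bias(model, obs):
    """MFB in percent: mean of 2*(M - O)/(M + O) over paired values."""
    pairs = list(zip(model, obs))
    return 100.0 * sum(2.0 * (m - o) / (m + o) for m, o in pairs) / len(pairs)


def mean_fractional_error(model, obs):
    """MFE in percent: mean of 2*|M - O|/(M + O) over paired values."""
    pairs = list(zip(model, obs))
    return 100.0 * sum(2.0 * abs(m - o) / (m + o) for m, o in pairs) / len(pairs)


# Hypothetical paired hourly PM2.5 concentrations (ug/m3).
modeled = [10.0, 20.0, 30.0]
observed = [20.0, 20.0, 20.0]

mfb = mean_fractional_bias(modeled, observed)
mfe = mean_fractional_error(modeled, observed)
```

MFB is bounded between -200% and +200% and MFE between 0% and 200%, so an MFB near -100% corresponds to modeled values of roughly one third of the observations.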