Developing High Spatial Resolution Concentration Maps Using Mobile Air Quality Measurements

Mobile air pollution monitoring offers an opportunity to “map” pollutants with much higher spatial resolution than sparse stationary monitors. We develop a framework to address the challenges and constraints to developing higher spatial resolution maps from mobile data. The challenges include the non-uniform spatial resolution and distribution of the measurements; that measurements are made at slightly different locations in each pass of the mobile monitoring platform along a specific route (each “run”); in some cases, the poor precision of global positioning system coordinate data; potential for over/underweighting data; and varying urban background concentrations. We find that use of a reference grid and piecewise cubic Hermite spline interpolation between measurements to give equal weight to each sampling “run” at each grid reference point addresses many of the challenges effectively. A background correction was implemented to facilitate averaging over several sessions. For 1 s time resolution data collected at normal city driving speeds, we show that concentration maps of 5 m spatial resolution can be obtained by including up to 21% interpolated values. Finally, we use ultrafine particle concentrations to consider the minimum number of sampling runs needed to make a representative concentration map with a specific spatial resolution, finding that generally between 15 and 21 repeats of a particular route under similar traffic and meteorological conditions are sufficient. The concentration maps can afford insights into factors influencing pollutant concentrations at the city block and sub-block scale, information that is useful in urban planning strategies to reduce pollution exposure. Methodical analysis of mobile monitoring data will facilitate meaningful comparison of concentration maps of different routes/studies.


INTRODUCTION
Roadway combustion emits a suite of air pollutants including fine (PM2.5; particle diameter less than 2.5 µm) and ultrafine (PM0.1, UFP; particle diameter less than 0.1 µm) particles; carbon monoxide (CO); nitrogen oxides (NOx); black carbon (BC); polycyclic aromatic hydrocarbons (PAHs) and volatile organic compounds (VOC) including benzene. Numerous air quality studies show that, compared to urban background levels, concentrations of these pollutants are elevated, and in some cases highly elevated, on and near heavily trafficked roadways.
Many epidemiological studies have associated elevated concentrations of air pollutants found on and near roadways with a variety of adverse health outcomes, including asthma and other respiratory diseases (Rice et al., 2015), birth and developmental defects (Stingone et al., 2014a), premature mortality (Caiazzo et al., 2013), cardiovascular diseases (Katsoulis et al., 2014) and childhood cancers such as leukemia (Boothe et al., 2014). Children appear to be particularly vulnerable to these adverse effects of air pollution (HEI, 2010).
An increasing number of air quality studies use mobile air pollution monitoring. Mobile measurements have several advantages over conventional stationary measurements, including the opportunity for better data coverage, efficient collection of data in close proximity to sources and logistical efficiency (Hagler et al., 2012; Hu et al., 2012; Birmili et al., 2013; Choi et al., 2013; Peters et al., 2013; Van Poppel et al., 2013; Brantley et al., 2014; Lähde et al., 2014). Although mobile measurement data can be highly spatially resolved, they are not always presented as high spatial resolution concentration maps. Many studies have presented either data statistics or aggregated data for streets or route segments (Hagler et al., 2012; Hu et al., 2012; Choi et al., 2013; Peters et al., 2013; Van Poppel et al., 2013). Taking advantage of the high spatial resolution of the data offers potential to identify spatial variations and local air pollution hot spots at sub-block scale resolution, which in turn can provide estimates of the exposure of near-road communities, pedestrians and transit users to elevated levels of pollution near roadways. A detailed understanding of exposure risks could influence urban planning strategies, such as placing transit stops and businesses with outdoor seating away from local pollution hot spots, as well as behavioral changes, such as choosing walking routes with minimal pollution exposure.
Recently several studies have presented concentration maps of mobile measurement data (Pirjola et al., 2012; Padró-Martínez et al., 2012; Birmili et al., 2013; Brantley et al., 2014; Lähde et al., 2014; Pattinson et al., 2014; Peters et al., 2014). Pirjola et al. (2012), Lähde et al. (2014) and Padró-Martínez et al. (2012) do not state the spatial resolution of their maps, and provide little description of how the maps were developed; Padró-Martínez et al. (2012) utilized ArcGIS 9.3.1 to develop concentration maps for individual runs. Brantley et al. (2014) and Pattinson et al. (2014) used 1 s time resolution data together with ArcGIS (ESRI) to map the median values of sets of runs and produced 50 m spatial resolution maps. The data along the measurement routes were binned into segments of 50 m and circles of 50 m diameter by Brantley et al. (2014) and Pattinson et al. (2014), respectively. Birmili et al. (2013) took an important step towards preserving the high spatial resolution of the 10 s time resolution mobile monitoring data collected by walking, a configuration that produced data with roughly one measurement every 8 m. They found systematic divergence of global positioning system (GPS) coordinate data from the walking path and, after correcting for this divergence, allocated data into ± 5 m horizontal segments. Their resulting UFP number concentration measurements were presented as maps with 10 m spatial resolution, where each data point was the geometric average of the data from 38 runs. Peters et al. (2014) used a large set of 1 s time resolution data collected using a bicycle as a mobile monitoring platform (MMP), a configuration that produced data with roughly one measurement every 3.2 m. The GPS data were corrected using linear interpolation for short time periods when the GPS data were missing. Data were then projected onto the street based on the shortest distance between the data point and the cycling route. The measured particle concentrations were spatially aggregated based on a Gaussian weighting function to produce concentration maps with 10 m spatial resolution.
The main issues encountered in developing high spatial resolution concentration maps from mobile monitoring data are (i) the non-uniform spatial resolution and distribution of the measurements; (ii) that measurements are made at slightly different locations in each measurement pass along a specific route (in each "run"); and in some cases (iii) errors in GPS coordinate data, which are common in dense urban areas with tall buildings. Additionally, there is the question of how much data is sufficient to provide reliable profiles.
The selection of the spatial scale for data aggregation and averaging must be informed by the spatial resolution of the collected data. In practice, the spatial resolution of the collected mobile measurement data is a combined result of the instantaneous speed of the MMP and the sampling rate/time resolution of the instruments used. Unlike mobile measurements collected by walking or cycling, for which the speed of the MMP can be kept somewhat uniform, for motor vehicle MMPs large irregularities in the travel speed are unavoidable, especially in urban traffic. Consequently, even when instruments sample at a constant rate, the spatial resolution of data can be non-uniform, making the production of high resolution concentration maps challenging. When the time resolution of the data set varies due to the nature of either the instrument or the post-data-processing procedures, determining the spatial resolution of collected mobile measurement data becomes more complex. An example of introduced spatio-temporal variability is the BC measurements made with micro-aethalometers, which can use an optimized noise-reduction averaging algorithm (ONA) that prescribes a varying averaging time (Van den Bossche et al., 2015). However, because variations due to post-data-processing procedures are rare, the spatial resolution of mobile measurement data, and consequently the highest possible spatial resolution of mobile measurement concentration maps, are often limited by the combination of the data sampling rate and the instantaneous speed of the MMP.
Here we propose solutions to the above-mentioned mobile air quality data processing issues using a reference grid to map data points, and a piecewise cubic Hermite spline interpolation between measurements to give equal weight to each sampling run at each grid reference point. A background correction can be used to facilitate averaging concentration data over different days/times. Finally, we address the issue of how many repetitions of mobile measurement runs are needed to make representative UFP concentration maps with high spatial resolution.

MEASUREMENTS
The data used in this study are from a field campaign conducted in and around Downtown Los Angeles (DTLA). This campaign is described in more detail in Choi et al. (2016) and is only briefly described here. The site considered here is a 2-by-2 block area centered on the intersection of Broadway and 7th Street (BW-7th) (34°2'42.70"N/118°15'12.23"W). The area consists of densely packed commercial buildings and parking structures, and a fairly typical street canyon configuration with a building height/street width (aspect ratio) of about 1.7. The block lengths are 190 m and 100 m and the street widths are 26 m and 22 m for BW and 7th streets, respectively. The BW-7th intersection has a tall building at each of the four corners, with a median height of 46 m. The mean building height for the site area is 34 m and the building heights range from 3 to 60 m (Fig. 1).
Measurements of several traffic-related air pollutants including UFP number and size distribution, PM2.5, black carbon, particle bound poly-aromatic hydrocarbons (PB-PAH), NO, CO, and CO2 were collected using fast response instruments fitted inside a MMP, an electric sub-SUV free from self-pollution (Table 1). The instruments in the MMP have different response times due to the characteristics of the instruments and differences in inlet length and flow rates. Air was drawn through a 6" diameter galvanized steel manifold installed through a window of the rear passenger space located 1.5 m above ground level. Sampling ports for each instrument were located in the middle of the manifold with short (0.5-2 m) sampling tubing (1/4" Teflon for gases, 1/4" conductive tubing for particles and 1/2" conductive tubing for FMPS). For each instrument, flow and zero checks were performed before and after each measurement session. To account for any slight day-to-day differences in response time, a time-lag correlation method was used in post-data processing to synchronize the response time of the instruments (Choi et al., 2012). Concentration data and MMP position data were recorded at 1 s time resolution. A complete description of the MMP calibration procedures is available in Hu et al. (2009).
The MMP was driven multiple times along a fixed route (Fig. 1). Each day two data collection sessions were conducted, one in the morning (07:00-10:00) and one in the afternoon (14:00-17:00), with 6-9 and 6-7 runs in the morning and afternoon, respectively (Table 2). MMP turns at intersections were recorded in a time log. The MMP was parked intermittently at various locations for 5 min periods to collect meteorological data from its roof-mounted sonic anemometer. In each measurement session, video recordings of traffic were made at the central intersection using cameras mounted at each of the four corners of the intersection. Detailed information on the traffic signal light changes and traffic counts for all four traffic flow directions was obtained manually by reviewing the video records. All data processing was done in MATLAB R2012a (The Mathworks, Inc.).

DATA ANALYSIS METHODOLOGY
The UFP number concentrations measured using the CPC had the lowest response time (Table 1) and consequently the highest spatial resolution, making it the best data set for examining concentration variations at a high spatial resolution. Here we present the data analysis methodology using the CPC-measured UFP number concentration data set. Results for the data sets of other pollutants are presented in supplementary materials (S1).

Separation of the High-Emitting Vehicle Contribution
Transient concentration spikes from high-emitting vehicles (HEV) are common on roadways. HEV encounters have a strong stochastic element and the associated concentration spikes can deviate as much as 1-2 orders of magnitude above the baseline. As HEV data have the potential to obscure general trends in concentration maps, depending on the question being asked, it may be desirable to remove them.
Here HEV encounters, defined as follows, occurred 10 to 14% of the time, depending on the session. To identify HEV-related spikes, a threshold concentration value must be defined. The method developed by Choi et al. (2013) was used here, in which a site- and session-specific threshold was calculated. The threshold concentration value was defined as the baseline plus three standard deviations of the baseline. First, we subtracted a baseline calculated using a robust smoothing function that employs a local regression of weighted linear least squares and a 2nd-degree polynomial model. The smoothing function assigns lower weight to outliers in the regression and assigns zero weight to data outside six mean absolute deviations. Next, the standard deviation of the baseline-subtracted concentrations was calculated. Then all concentrations above this intermediate threshold of three standard deviations were identified as HEV spikes and removed from the data set. A new standard deviation was calculated from the remaining concentration data, resulting in a new threshold. The process was iterated eight times, until the threshold value converged to a constant value. All concentration values above the calculated final threshold were replaced by the baseline concentration values to obtain the final HEV spike-removed concentration time series used for certain analyses.
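The iterative threshold described above can be illustrated with a short Python sketch (the analysis itself was carried out in MATLAB). This is not the code used in the study: the centered rolling median is a simplified stand-in for the robust local-regression baseline, and the function names, the 61 s window and the convergence test are assumptions made for illustration only.

```python
import numpy as np

def rolling_median(x, window):
    """Centered rolling median (odd window); a simplified stand-in for the
    robust local-regression baseline described in the text."""
    half = window // 2
    padded = np.pad(x, half, mode="edge")
    return np.array([np.median(padded[i:i + window]) for i in range(len(x))])

def remove_hev_spikes(conc, window=61, n_sigma=3.0, max_iter=8):
    """Flag values more than n_sigma standard deviations above the baseline,
    re-estimating the standard deviation from the unflagged data each pass."""
    conc = np.asarray(conc, dtype=float)
    baseline = rolling_median(conc, window)      # hypothetical baseline choice
    residual = conc - baseline                   # baseline-subtracted series
    keep = np.ones(conc.size, dtype=bool)
    threshold = np.inf
    for _ in range(max_iter):
        new_threshold = n_sigma * residual[keep].std()
        if np.isclose(new_threshold, threshold):
            break                                # threshold has converged
        threshold = new_threshold
        keep = residual <= threshold             # drop spikes, then repeat
    is_spike = residual > threshold
    # Spikes are replaced by the baseline value, as in the text.
    cleaned = np.where(is_spike, baseline, conc)
    return cleaned, is_spike, threshold
```

The function returns the spike-removed series, a boolean mask of the flagged samples and the final threshold (relative to the baseline), so the fraction of time affected by HEV encounters can be read from the mean of the mask.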

Correction of Geographical Data
Handheld GPS units are able to obtain coordinates with a horizontal accuracy of approximately 3-5 m when the unit can receive a wide area augmentation system (WAAS) signal. Otherwise, the accuracy is approximately 10-15 m. In urban settings with tall buildings, shadowing effects can result in poor reception of satellite signals (Misra and Enge, 2006) and further decrease position accuracy. In our dataset the GPS data diverged by up to about 30 m from the roadway at times, almost exclusively under slow moving or stationary conditions. This is similar to the divergence reported by Birmili et al. (2013). Often near street corners, the GPS data appeared in clusters or gave sets of position data implying backward movement (Fig. 2).
GPS data were corrected as follows. A time log of the instant the MMP turned at each intersection was used to divide both concentration and position data time series into street segments. Clusters of position data implying backward movement were identified by comparing GPS data to latitude/longitude values along the driving route. As these 'wandering' data clusters were most pronounced under slow moving or stationary conditions, all the concentrations associated with such a set of positions were attributed to the last position which showed a forward movement.
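One way to picture the correction of backward-moving clusters is the Python sketch below. It assumes, purely for illustration, that the fixes of one street segment have already been converted to local planar coordinates in metres and that the street segment is straight; the function names are hypothetical. Holding the last forward position then reduces to a running maximum of the along-street distance.

```python
import numpy as np

def along_route_distance(points, start, end):
    """Project (x, y) fixes onto the straight segment start -> end and
    return the distance from `start` in metres (assumed planar coordinates)."""
    points = np.asarray(points, dtype=float)
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    direction = end - start
    length = np.linalg.norm(direction)
    unit = direction / length
    s = (points - start) @ unit
    return np.clip(s, 0.0, length)   # keep projections on the segment

def hold_last_forward_position(s):
    """Replace any position implying backward movement with the last
    position that showed forward movement (a running maximum)."""
    return np.maximum.accumulate(np.asarray(s, dtype=float))
```

For example, along-street distances of 0, 5, 4, 3, 9, 8 and 12 m become 0, 5, 5, 5, 9, 9 and 12 m, so the concentrations recorded during the apparent backward movement are attributed to the 5 m and 9 m positions.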

Handling Non-Uniform Spatial Resolution of Measurements
To achieve the objective of developing representative mean concentration maps with high spatial resolution, data averaging should be performed in a manner that does not overweight or underweight any of the data. Initially we adapted an areal average (Hagler et al., 2010; Pattinson et al., 2014) in which values within a fixed radius were averaged for data points along the route. For example, if a radius of 4 m was chosen, all points within an 8 m diameter circle were averaged and assigned to the center of the circle. This approach proved to be erroneous at small spatial scales because a large number of data points associated with large divergences of GPS data were excluded.
In a novel approach to handling the non-uniform spatial resolution of mobile monitoring data, we first constructed reference lines to provide a framework with which to organize the data.For one way streets, the reference lines were assigned to the mid-line of the street, and for two lane streets the reference lines were assigned to the mid-line of all the lanes in a single direction.These reference lines provided the framework and acted as placeholders to produce concentration maps at different spatial resolutions.
As described earlier, in practice the spatial resolution of the concentration data is related to the instantaneous speed of the MMP. The average speed of the MMP was about 3 m s⁻¹ for all sessions, with a standard deviation of 2.9 m s⁻¹ and 3.4 m s⁻¹ for morning and afternoon sessions, respectively. This average MMP speed is comparable to the 3.2 m s⁻¹ reported by Peters et al. (2014) for bicycling (with 1 s resolution instruments) but is much faster than the 0.8 m s⁻¹ reported by Birmili et al. (2013) for walking (with 10 s resolution instruments). Fig. 3 shows that about 80% of the time the MMP traveled at speeds below 5 m s⁻¹, corresponding to a spatial resolution of 5 m or finer for 1 s data. However, lower spatial resolution data tend to be clustered in certain areas, such as low trafficked streets and the middles of the blocks. Based on this understanding of the variation of the spatial resolution of data, we decided to look at spatial scales ranging from 2 m to 40 m and investigated the implications of the choice of spatial scale in data aggregation and averaging.
Fig. 3. Cumulative fraction of the binned instantaneous speed of the mobile monitoring platform, calculated for the whole dataset including both morning and afternoon sessions.
After the correction of GPS data, each data value for each run was assigned to the closest line reference point along a particular street. After this step, runs typically have some line reference points with no assigned values and others with many values. In cases where multiple data values were assigned to a reference point for an individual run, the mean of the assigned values was calculated. The rationale for averaging these data points is to avoid over-weighting the MMP stops. In the dense urban area of our study, street lights spend equal amounts of time as red and green (with a brief yellow phase), but the MMP naturally collects more data while stopped at a red light, sometimes as much as 50 times the data collected when it passes through an intersection on a green light. Thus averaging the extended measurements from the red phase gives appropriate weighting to the green and red phases. Not surprisingly, as the spatial resolution of the reference lines was increased, the number of reference points with no assigned data values increased. These empty reference points present a problem because they cause overweighting of the runs that did produce data at a particular reference point. This problem can have a dramatic effect, producing plots that are very noisy. Over-weighting a run is particularly concerning if that run is influenced by a transient emission event. Further, as the urban background concentration (see definition in Estimation of the Urban Background) varies throughout a measurement session, a run that is missing data at a given reference point results in a temporal bias toward the time periods in which runs are available at that point.
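The assignment and within-run averaging step can be sketched in Python as follows. The sketch assumes that positions are expressed as along-street distances in metres from the start of the street, and the function name, the 5 m default spacing and the use of NaN to mark empty reference points are illustrative choices rather than details of the original processing.

```python
import numpy as np

def run_means_on_grid(s, conc, street_length, spacing=5.0):
    """Assign each sample of one run to its nearest reference point and
    average the samples that share a point, so that a long stop counts once.
    Reference points that receive no sample are returned as NaN."""
    s = np.asarray(s, dtype=float)
    conc = np.asarray(conc, dtype=float)
    grid = np.arange(0.0, street_length + spacing / 2, spacing)
    idx = np.rint(s / spacing).astype(int)          # nearest reference point
    idx = np.clip(idx, 0, grid.size - 1)
    sums = np.bincount(idx, weights=conc, minlength=grid.size)
    counts = np.bincount(idx, minlength=grid.size)
    means = np.full(grid.size, np.nan)
    filled = counts > 0
    means[filled] = sums[filled] / counts[filled]
    return grid, means
```

With samples at 0, 1 and 2 m (for instance while waiting at a red light) and single samples at 11 m and 19 m on a 20 m street, the three clustered values collapse to one mean at the 0 m reference point, while the 5 m and 15 m points remain empty.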
As a solution to these biasing issues, to assure the availability of a concentration value at each reference point for each run, the concentration data within individual runs were interpolated at points where a data value was missing for that run. For this interpolation the Piecewise-Cubic-Hermite-Interpolation scheme (PCHIP) was selected. PCHIP uses a third degree polynomial specified in Hermite form to produce a smooth continuous function of the concentration time series (Fritsch and Carlson, 1980). The piecewise calculation not only keeps the calculation 'local', using only four neighboring data points, but also avoids the oscillations in the interpolated data that are associated with higher order polynomial interpolations. Fig. 4 shows that the PCHIP scheme preserves the concentration time series well for individual runs. An investigation of the percentage of interpolated data values required at different spatial scales shows that concentration maps of 4 m spatial resolution can be obtained by including 32% interpolated values; this number falls to 4% for 10 m spatial resolution (Fig. 5). The percentage of interpolated data allowed suggests an upper bound for the spatial resolution of concentration maps. Considering that the horizontal accuracy of the GPS position data is about 3-5 m and that the PCHIP scheme preserves the within-run concentration time series well, we choose to allow up to 21% of values to be interpolated, and present concentration maps with 5 m spatial resolution. Once each run has an appropriate single data value assigned at each reference point, the runs are averaged together to create the desired concentration maps.
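A minimal Python sketch of the gap-filling step is given below, using SciPy's `PchipInterpolator`. It assumes that the empty reference points of a run are marked with NaN; the function name and the decision to leave points outside the measured range unfilled (rather than extrapolate) are assumptions of this sketch.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

def fill_missing_with_pchip(grid, run_values):
    """Fill NaN reference points of a single run by PCHIP interpolation
    between the reference points that do hold a value."""
    grid = np.asarray(grid, dtype=float)
    values = np.asarray(run_values, dtype=float)
    known = ~np.isnan(values)
    if known.sum() < 2:
        raise ValueError("need at least two measured reference points")
    # extrapolate=False leaves points outside the measured range as NaN.
    interpolator = PchipInterpolator(grid[known], values[known], extrapolate=False)
    filled = values.copy()
    filled[~known] = interpolator(grid[~known])
    n_interpolated = np.count_nonzero(~known & ~np.isnan(filled))
    percent_interpolated = 100.0 * n_interpolated / values.size
    return filled, percent_interpolated
```

Because PCHIP is shape preserving, the filled values stay within the range of the neighboring measurements and do not overshoot. Once every run has been filled in this way, stacking the runs and taking the mean along the run axis gives each run equal weight at each reference point, and the returned percentage can be tracked against the chosen limit on interpolated values.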

Estimation of the Urban Background
Day to day and within several hours on the same day, average pollution concentrations often move up and down by a factor of two or more, due to large scale phenomena such as mixing height and turbulence intensity, as well as general traffic trends. These variations in the urban background must be accounted for prior to averaging data from different sessions and days. The urban background can be defined as the ambient air pollution concentration that does  spline over the minima of reasonably narrow time windows, and that the choice of a 5 min or 10 min time window does not make a significant difference. Their study showed that for UFP concentrations, the time series-based background estimates using a spline over minima (after removing background zones from the time series) and the location-based estimate of the median of a background zone were in good agreement.
Here we obtained a time series-based background estimate by fitting a smooth function to the minimum values in 10 min time windows for each morning and afternoon session. By subtracting the spline of minimum values from the measured concentration values, the background-subtracted concentration time series were obtained. This resulted in approximately 6.8% negative values in the concentration time series, which is comparable to the 7.3% reported by Van Poppel et al. (2013) for a location-based background correction method. Background-subtracted time series from different days can be averaged to produce mean concentration maps to probe spatial variations. However, such maps do not represent the measured concentration values, only their variability. To address this issue, a representative urban background can be added back to provide exposure estimates. The percentage contribution of the background to the measured concentration, calculated using the ratio of the mean of the background time series to the mean of the measured time series, was 33% and 26% for morning and afternoon sessions, respectively. This is comparable with an all-session average of 26% reported by Brantley et al. (2014). In order to make the background-corrected concentrations representative of the measurements, we averaged the estimated background splines for different sessions and added the resulting mean background spline to each background-subtracted concentration time series to obtain the background-adjusted concentration time series. These background-adjusted concentration time series from different days were used together with corrected GPS data to assign concentration values to the reference grid points, and then data within individual runs were interpolated at points where a data value was missing for that run, as described in Handling Non-Uniform Spatial Resolution of Measurements. Next, all runs from several days were averaged (morning and afternoon sessions separately) to obtain the mean concentration maps (Fig. 6).
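The background adjustment can be sketched in Python as below. Two simplifications are assumed and should not be read as the procedure used in the study: the smooth function through the window minima is replaced by a piecewise-linear curve (`np.interp`), and the sessions are taken to share a common elapsed-time axis of equal length so that their background curves can be averaged sample by sample. Function names are hypothetical.

```python
import numpy as np

def background_from_window_minima(t, conc, window_s=600.0):
    """Estimate a time-varying background from the minimum of each time
    window (default 10 min); the minima are joined piecewise-linearly here
    as a stand-in for the smooth function used in the text."""
    t = np.asarray(t, dtype=float)      # seconds, assumed sorted
    conc = np.asarray(conc, dtype=float)
    window_id = np.floor((t - t[0]) / window_s).astype(int)
    t_min, c_min = [], []
    for w in np.unique(window_id):
        members = np.flatnonzero(window_id == w)
        k = members[np.argmin(conc[members])]   # sample holding the minimum
        t_min.append(t[k])
        c_min.append(conc[k])
    return np.interp(t, t_min, c_min)

def background_adjust(sessions, window_s=600.0):
    """Subtract each session's own background and add back the mean
    background of all sessions (assumes equal-length sessions)."""
    backgrounds = [background_from_window_minima(t, c, window_s) for t, c in sessions]
    mean_background = np.mean(backgrounds, axis=0)
    return [np.asarray(c, dtype=float) - b + mean_background
            for (t, c), b in zip(sessions, backgrounds)]
```

Samples that lie below the fitted curve between two window minima become negative after subtraction, which is the source of the small fraction of negative values noted above.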

High Spatial Resolution Concentration Maps
The 5 m spatial resolution maps shown in Fig. 6 are the result of careful consideration of several underlying data processing issues of mobile monitoring data. With the use of a background correction, we were able to average data over sessions from different days, and thus over a higher number of runs. After averaging data over varying effects of micro-meteorology, traffic volume, traffic fleet composition and background concentrations over different measurement sessions and different days, the resulting UFP concentration maps retain the robust block and sub-block scale features of the concentration variation, making them a potentially useful tool in identifying pollution hot spots at the block or sub-block scale.
Fig. 6 shows the UFP concentration maps at 5 m spatial resolution for the full data set including HEV-related spikes ("raw", Figs. 6(a) and 6(c)) and also for the data with HEV-related spikes removed ("spikes removed", Figs. 6(b) and 6(d)). The dominant features of the "raw" concentration maps are the 'hot spots' that appear at and near intersections, including both the area where queues form and where vehicles accelerate away from intersections. Once the HEV-related spikes are removed, features appear that reveal more about the influence of the built environment on street level concentrations. While "raw" concentration maps are important in exposure analysis, maps with HEV spikes removed help in understanding various other factors influencing small spatial scale variations of the UFP concentration.
The "spikes removed" data reveal features at both the block and sub-block scales. Fig. 6(d) shows that at the block scale, in the afternoon 6th street shows the highest concentrations despite having low average traffic volume compared to other streets. On 7th street, in both morning and afternoon, there are generally concentrations on the east-bound side compared to the west-bound side, despite having nearly the same traffic flow in both directions. Moreover, Fig. 6(b) shows that at the sub-block scale, in the morning on BW north-bound near the intersection of 8th and BW, the south end of the block has elevated concentration in comparison to the queue-forming north end. A similar situation can be noted on 8th street, just west of the intersection of BW and 8th, where the east end of the block shows elevated concentration in comparison to the queue forming at the west end. Many of these features can be explained by the surface level wind flow patterns that are heavily influenced by the local built environment, traffic patterns and non-vehicle local sources. More detailed analyses of the effects of surrounding building morphology, micrometeorological variations and air flow patterns due to the built environment and traffic patterns on concentration distributions at different scales will be presented separately (Ranasinghe et al., in prep.).

Estimation of the Minimum Number of Runs Needed for Representative Concentration Values
Due to transient and small spatial scale variations in pollution concentrations, a single run of mobile measurements is clearly unable to capture a representative concentration field of an area. This raises the question of how many repeated measurements are needed to estimate a representative concentration field. Clearly this question is dependent on variability in meteorological as well as traffic conditions, features that in some cases might require very large amounts of sampling. Here we are interested in typical morning and afternoon conditions that, in the case of our study site, are the most common by far. The average wind speeds on BW were 1.2 ± 0.2 m s⁻¹ for mornings and 1.3 ± 0.5 m s⁻¹ for afternoons. The most prevalent wind direction on BW was SW in both morning and afternoon sessions. On 7th the average wind speeds were 1.0 ± 0.1 m s⁻¹ for mornings and 1.5 ± 0.1 m s⁻¹ for afternoons. The most prevalent wind directions on 7th were ESE in the mornings and NE in the afternoons (Table 3). To investigate this question, the following data experiment was performed on the UFP number concentration data set.
First, all morning runs and all afternoon runs from the background-corrected concentration data set were collected separately. Each of these sets had runs spanning several days, many with fairly similar meteorological and traffic conditions (Table 2). For mornings, up to 22 runs were available for BW south-bound and 7th east-bound and 24 runs for BW north-bound and 7th west-bound. For afternoons, up to 19 runs were available for BW south-bound and 20 runs for other streets. For each street, at each line reference point, runs were selected at random (without replacement) and the mean concentration was calculated using an increasing number of runs, up to one less than the total number of runs available. This process was repeated 10 times for each street, choosing runs in different random order. For the sets of 10 repeated mean concentration calculations at different reference points and for different numbers of runs averaged, the relative error (standard deviation normalized by mean) was calculated and plotted (Fig. 7(a)). As shown in Fig. 7(a), the rate of decrease in relative error varies among reference points along a given street. For simplicity, the maximum relative error along each street is considered and plotted against the number of runs averaged for HEV spikes removed data (Fig. 7(b)) and for HEV spikes retained data (Fig. 7(c)). The minimum number of runs needed for the relative error to drop below 0.15 is calculated (the green or yellow symbols on each plot in Figs. 7(b) and 7(c)) and considered as the minimum number of runs needed for a representative UFP concentration value.
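The data experiment can be sketched in Python as follows, assuming the runs for one street are stored as an array with one row per run and one column per reference point. The function names, the fixed random seed and the use of a single random ordering per repeat (shared by all reference points of the street) are assumptions of this sketch.

```python
import numpy as np

def max_relative_error_curve(runs, n_repeats=10, seed=0):
    """For k = 1 .. n_runs-1, average k randomly ordered runs (without
    replacement), repeat n_repeats times, and return the maximum along the
    street of the relative error (std / mean of the repeated means)."""
    runs = np.asarray(runs, dtype=float)          # shape (n_runs, n_points)
    n_runs, n_points = runs.shape
    rng = np.random.default_rng(seed)
    counts = np.arange(1, n_runs)[:, None]        # number of runs averaged
    subset_means = np.empty((n_repeats, n_runs - 1, n_points))
    for r in range(n_repeats):
        order = rng.permutation(n_runs)           # new random order each repeat
        cumulative = np.cumsum(runs[order], axis=0)
        subset_means[r] = cumulative[:-1] / counts
    rel_err = subset_means.std(axis=0) / subset_means.mean(axis=0)
    return rel_err.max(axis=1)                    # worst reference point

def min_runs_needed(curve, tol=0.15):
    """Smallest number of runs at which the curve first drops below tol,
    or None if it never does."""
    below = np.flatnonzero(np.asarray(curve) < tol)
    return int(below[0]) + 1 if below.size else None
```

Because the curve is taken as the maximum over reference points, the resulting estimate is governed by the most variable location on the street, which is the conservative choice made in the text.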
The estimate of the minimum number of runs needed for representative UFP concentration values at 5 m spatial resolution varies somewhat from street to street and is dependent on the data filters applied (Figs. 7(b) and 7(c)). For HEV spikes removed data, the maximum relative error along each street vs. number of runs averaged (Fig. 7(b)) initially drops rapidly (in the first 2-7 runs). The maximum relative error along streets also drops rapidly initially, from the initial values of 282-90% to 50% at 4-9 runs, after which it decreases more slowly, reaching 15% at 15-21 runs. Hence the estimate of the minimum number of runs needed for representative concentrations at 5 m spatial resolution ranges from 16 to 21 runs for the mornings and from 15 to 16 runs for the afternoons (Fig. 7(b)). For mornings when 16 runs are included, the average relative error considering all four streets is 11%. For afternoons when 15 runs are included, the average relative error considering all four streets is 9%. The morning sessions usually have low wind speeds (Table 3). Consequently, the turbulent kinetic energy (TKE) and variance of vertical wind velocity (σw) are lower in the mornings in comparison to the afternoons (Table 3), denoting lower atmospheric turbulence and mixing rates. The need for more runs for the morning sessions can be attributed to the lower mixing rates, resulting in a stronger influence of local sources on pollutant concentrations.
The inclusion of transient and large HEV spikes generally increases the minimum number of runs needed for all the streets and for both AM and PM sessions (Fig. 7(c)). Similar to the HEV spikes removed data, morning sessions need more runs compared to afternoon sessions. For HEV spikes retained data, the maximum relative error along each street vs. number of runs averaged drops slowly compared to HEV spikes removed data. The maximum relative error along streets also drops from the initial values of 244-143% to 50% at 8-14 runs and drops below 15% in only 5 of the 8 sessions. For all the streets, the maximum relative error along the streets drops below 22% at 21-23 runs for the mornings and at 17-18 runs for the afternoons. Hence we conclude that the minimum number of runs needed for representative UFP concentrations at 5 m spatial resolution is at least 21-23 runs for the mornings and at least 17-18 runs for the afternoons. These results apply only to UFP concentrations because the minimum number of runs needed for representative concentration values depends on the magnitude of variance of the data set. Hence the results depend on the pollutant considered. We also showed that the results depend on the data filters applied (Fig. 7(c)). The effect of spatial resolution on the estimate of the minimum number of runs needed for representative concentration values is discussed in supplementary material (S2). For both HEV spikes removed and spikes retained data sets, the initial values of the maximum relative error markedly decreased when spatial resolution was decreased to 10 m. The minimum number of runs needed for representative concentration values generally decreased for all the streets and for both AM and PM sessions.
Table 3. Average surface meteorology at BW-7th. Here, u* is the friction velocity, σw is the variance of vertical wind velocity and TKE is the turbulent kinetic energy*. Columns: Date, Temp. (°C), u* (m s⁻¹), σw (m s⁻¹), TKE (m² s⁻²).
In their effort to assess the minimum number of runs needed for representative concentrations at the street scale, Van Poppel et al. (2013) and Peters et al. (2013) used data from moderately sized sets of mobile monitoring runs (20-24 runs), selecting different numbers of runs at random (without replacement) and averaging them to calculate the street means or medians. They used 1 s time resolution data, collected by an MMP travelling at an average speed of 2.7 m s⁻¹. The minimum number of runs needed to obtain representative concentrations was defined as the point at which the mean/median values calculated using a subset of runs came within a certain percentage deviation (15%-25%) of their "representative values". They defined the "representative values" as the mean/median of all available runs. Peters et al. (2013), using a 15% deviation, concluded that for UFP concentrations the number of runs needed was 16 and 18 for the two sites considered. Van Poppel et al. (2013) used a portion of the data set used in the Peters et al. (2013) study and concluded that for UFP concentrations, a 25% deviation could be achieved with 10-16 and 8-16 runs depending on the street, for analysis without and with background correction, respectively. In a continuation of this work, Van den Bossche et al. (2015) used a large dataset (96-256 runs) of BC measurements for a similar exercise. BC was measured at 1 s time resolution but, as discussed earlier, the spatial resolution of these data is variable and complex due to the use of a post-data-processing technique (ONA). Allowing replacement in the random selection of runs and employing a background correction, a trimmed mean and a 25% deviation, they concluded that the number of runs needed is 14-61 depending on the street, and also showed that this rose to 108 runs when considering a spatial resolution of 20 m. The prior UFP studies (Peters et al., 2013; Van Poppel et al., 2013), done with small data sets, differ from this study in both the way in which the minimum number of required runs is defined and the spatial scale considered. Despite these differences, our estimate of the minimum number of runs needed for representative UFP concentration values is comparable with those of these two prior studies.
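The subsampling approach of the prior studies described above can be illustrated with a minimal sketch. It is not the code used in any of the cited studies. The function names, the synthetic per-run street means, the number of random trials and the `required` fraction of subsets that must fall within the deviation bound are all assumptions made for illustration; the cited studies may have used different acceptance criteria, medians or trimmed means.

```python
import random
from statistics import mean


def fraction_within(street_means, k, deviation, trials=1000, replace=False, seed=0):
    """Share of random k-run subsets whose mean lies within `deviation`
    (e.g. 0.15 for 15%) of the "representative value", taken here as the
    mean over all available runs."""
    rng = random.Random(seed)
    representative = mean(street_means)
    hits = 0
    for _ in range(trials):
        if replace:
            subset = rng.choices(street_means, k=k)  # sampling with replacement
        else:
            subset = rng.sample(street_means, k)     # sampling without replacement
        if abs(mean(subset) - representative) <= deviation * representative:
            hits += 1
    return hits / trials


def min_runs_needed(street_means, deviation=0.15, required=0.95, **kwargs):
    """Smallest k for which at least `required` of the random subsets fall
    within the deviation bound. `required` is an assumed criterion."""
    for k in range(1, len(street_means) + 1):
        if fraction_within(street_means, k, deviation, **kwargs) >= required:
            return k
    return None  # possible only when sampling with replacement


# Synthetic street-mean UFP values, one per run (illustrative numbers only).
example_means = [21.0, 35.5, 18.2, 40.1, 27.3, 30.8, 24.6, 55.0, 19.9, 33.4,
                 28.7, 22.5, 46.2, 26.1, 31.9, 29.4, 38.8, 23.3, 25.7, 34.6]

if __name__ == "__main__":
    for dev in (0.15, 0.25):
        print(dev, min_runs_needed(example_means, deviation=dev))
```

Without replacement, the subset mean equals the representative value once all runs are included, so the estimate can never exceed the number of available runs. This is one reason why estimates from small data sets are bounded by the data set size, and why allowing replacement, as in Van den Bossche et al. (2015), can behave differently.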

CONCLUSIONS
Our proposed methodology produces concentration maps that preserve the valuable high spatial resolution of mobile monitoring data. By addressing the issues associated with the non-uniform spatial resolution of measurements and the uncertainties associated with GPS data in a methodical and logical process, we are able to minimize the ambiguity of concentration maps. We showed that, when choosing an appropriate spatial resolution for producing average concentration maps, careful consideration should be given to all the factors influencing the spatial resolution of the underlying data: the time resolution of the instruments, the average speed of the MMP, and post-data-processing procedures. Adopting such a methodical data analysis for mobile monitoring data can facilitate straightforward and meaningful inter-comparison of concentration maps from different studies. The resulting high spatial resolution concentration maps provide a tool to identify pollution variations and hot spots at the block and sub-block scale, information that could be used to develop urban planning strategies to minimize pedestrian exposures in near-road environments.

Fig. 1. The sampling route of the mobile monitoring platform (MMP) in downtown Los Angeles. BW denotes Broadway and EB, WB, NB, and SB represent eastbound, westbound, northbound and southbound, respectively. Map source: Google Earth.

Fig. 2. An example of the divergence of GPS data in an urban street canyon. The colored dots show the position data of four runs, obtained from a GPS device (Garmin GPSMAP 76CS) while driving along Broadway Northbound. The blue squares denote the reference line, a close representation of the actual driving route of the MMP during the data collection.

Fig. 4. An example of the interpolation of concentration data values for an individual run at 2 m spatial resolution. 'Data' points (circles) represent observed concentration values. 'PCHIP data' points (squares) represent the interpolated concentration values. Note that the maps shown in Fig. 6 use 5 m resolution and thus fewer interpolation points than this example.

Fig. 5. The percentage of interpolated points in the data set used to calculate the mapped concentration as a function of the spatial resolution of the reference lines, for data with 1 s time resolution and a mobile monitoring platform mean speed of about 3 ± 3 m s⁻¹.

Fig. 6. Spatial variation of background-corrected UFP concentrations averaged over (a, b) morning and (c, d) afternoon sessions from three days for (a, c) data including HEV-related spikes and (b, d) data excluding HEV-related spikes. The spatial resolution of the maps is 5 m. The heights of the buildings in the nearby area are shown in gray scale.

Fig. 7. (a) The relative error of repeated calculations of the mean concentration of HEV-spikes-removed data, for different numbers of afternoon runs included in the averaging (x-axis), at each reference point along a single example street (BW SB) (y-axis). (b, c) The variation of the maximum relative error along different street segments vs. the number of runs averaged for morning (AM) and afternoon (PM) sessions, (b) for HEV-spikes-removed data and (c) for HEV-spikes-retained data. The green and yellow symbols denote the points at which the relative error is at or below 0.15. The spatial resolution of the maps considered is 5 m.

Table 1. Monitoring instruments on the mobile monitoring platform.