Article pubs.acs.org/est

A Novel Approach in Quantifying the Effect of Urban Design Features on Local-Scale Air Pollution in Central Urban Areas

Georgia Miskell,† Jennifer Salmond,*,† Ian Longley,‡ and Kim N. Dirks§

† School of Environment, Faculty of Science, University of Auckland, Auckland 1010, New Zealand
‡ National Institute of Water and Atmospheric Research, Auckland 1010, New Zealand
§ School of Population Health, Faculty of Medical and Health Sciences, Auckland 1010, New Zealand

ABSTRACT: Differences in urban design features may affect emission and dispersion patterns of air pollution at local scales within cities. However, the complexity of urban forms, interdependence of variables, and temporal and spatial variability of processes make it difficult to quantify determinants of local-scale air pollution. This paper uses a combination of dense measurements and a novel approach to land-use regression (LUR) modeling to identify key controls on concentrations of ambient nitrogen dioxide (NO2) at a local scale within a central business district (CBD). Sixty-two locations were measured over 44 days in Auckland, New Zealand at high density (study area 0.15 km²). A local-scale LUR model was developed, with seven variables identified as determinants based on standard model criteria. A novel method for improving standard LUR design was developed using two independent data sets (at local and “city” scales) to generate improved accuracy in predictions and greater confidence in results. This revised multiscale LUR model identified three urban design variables (intersection, proximity to a bus stop, and street width) as the more significant determinants of local-scale air quality, and had improved adaptability between data sets.



Environmental Science & Technology
Received: January 29, 2015; Revised: July 1, 2015; Accepted: July 7, 2015
DOI: 10.1021/acs.est.5b00476 Environ. Sci. Technol. XXXX, XXX, XXX−XXX
© XXXX American Chemical Society

INTRODUCTION

At local scales, the shape and form of the urban environment can affect air pollutant flow through pedestrian spaces, resulting in the trapping and recirculation of emissions in some locations and the promotion of dispersion in others.1−3 Urban design can modify local pollutant patterns by determining traffic flow (and therefore emissions) with stop-start conditions around intersections, limiting vehicle access to some spaces, and restricting vertical or horizontal dispersion by the presence of buildings, street width, tree canopies, bus shelters and awnings. Local-scale air quality can also be affected by meteorology, regional emissions, and geographical settings, which may not be independent of urban design. As a result, central business districts (CBDs) are characterized by highly complex patterns of pollution, especially for spatially dynamic traffic-dominated and diesel-related pollutants such as nitrogen dioxide (NO2),4 which carry higher health exposure impacts than those from petrol.5

Given this complexity, a combination of measurement and modeling methods is often required to effectively investigate the causes of air pollution at local scales. Given the number of possible urban design features affecting concentrations, quantifying the relations among sites to make meaningful associations requires some type of modeling. Previous work using passive sampling techniques6 to document spatial variability has identified a pervasiveness of spatially stable hotspots of traffic-dominated pollutants such as NO2.7−9

A well-established statistical technique in air quality research is “land-use regression” (LUR), which identifies site-specific variables that correlate with measured concentrations to predict conditions at other points within a domain for which the model assumptions hold.10,11 Identification and quantification of the size and direction of effect for a range of potential variables upon observed concentrations has made LURs popular in air quality science, with the simplicity and accessibility of model construction and interpretation further supporting the use of this model. Physical dispersion modeling studies have also been used to establish the effect of some urban design features on local air pollution.12 However, dispersion models require a large amount of often uncertain or unavailable input data (such as long-term data on wind variables), carry high implementation costs, and rely on mechanical processes associated with dispersion being adequately captured by the model.11 In complex urban environments, LUR models have been found to provide results comparable to predictions from physically based dispersion models13,14 with significantly reduced complexity, computational requirements, and data input. The empirical nature of LURs also means that only very limited a priori assumptions need to be made about atmospheric properties and dispersion processes.

Common urban design features used to account for trends in urban concentrations have often been source-related. Hoek7 reviewed 25 LUR models, all of which identified a traffic-related variable as an important determinant of air quality. Road length and distance to the nearest major road are typical predictors.15 Other variables included land-use, population characteristics, and physical geography,16 which may be related to either emission or dispersion processes, or both. Selection of initial variables for inclusion in LUR models is often determined by data availability, measurable features, and a variable’s prevalence in previous work.

Previous research using LUR methods often used measurement networks which included sites from rural, suburban and urban land-uses combined into one data set, resulting in a multisite model optimized for overall performance.17,18 Impacts of individual variables can be expected to be different for and within each land-use type, making the standard LUR model building criteria inappropriate when seeking to understand specific important local-scale urban designs. The standard LUR model building procedure referred to here is the ESCAPE19 design widely used in other research, and it is referred to in this article as the “standard” model.

LURs strictly describe only the conditions and range of variables within a specific data set and can be sensitive to any deviation, rendering it difficult to apply the model to an independent data set.20 Previous attempts at using independent LUR models have often focused on the site-specific errors of this technique, with no strategy devised to correct or refine them, other than to allow coefficients to change in relation to each data set.21 Common validation methods use internal procedures, such as ‘leave one out cross-validation’ or splitting a sample into two,16 which can result in dependence issues (by network design, or targeted or random splitting of the sample).
This paper presents the results from a spatially dense passive-sampled network in the Auckland CBD area, and applies a standard LUR model designed to describe concentrations at the local scale.19,22 A methodology to improve model accuracy and adaptability is proposed, using local-scale and city-scale data sets to address the data set-specific concern in LUR development.

Figure 1. Annual seasonally adjusted NO2 concentrations for the city-scale data set (μg m−3). Black locations are the local-scale data set for comparison. The red line delineates the CBD area and the star is the central public transport terminus.

MATERIALS AND METHODS

Auckland is the largest city in New Zealand, with a population of around 1.5 million people and 900 000 registered cars.23 High levels of private ownership of old vehicles,24 a public transport system almost exclusively composed of diesel buses, and isolation from other regional sources of NO2 make Auckland a suitable site for a local-scale measurement network and model. Nitrogen dioxide (NO2) sources are predominantly traffic-related in Auckland.25 The CBD is 4.33 km² with one harbor immediately to the northeast. The central public transport terminus is located in the northeast and a number of motorways surround the southern end (Figure 1). The study area was approximately 0.9 by 0.5 km in size, or 0.15 km² for the specific monitored areas (Figure 2).

Measurements were conducted between 19 August and 1 October 2013, during winter/spring, in three repeated campaigns. Palmes diffusion tubes were deployed for 14 ± 2 days at 62 sites over 16 block sections from nine roads, with the second campaign having fewer observations because of limited tube availability (n = 52). Tubes were deployed in duplicate at a height of 2.5 m, with one site in proximity to a reference regulatory air quality monitoring station within the study area for colocation purposes (155 Queen St). Site selection was completed by identifying areas that had a variety of surrounding urban design features, assumed higher concentrations/variability in NO2, and suitable mounting places. Specific sites were determined by availability due to the high spatial density of measurements and minimization of potential vandalism/safety concerns within a CBD area.

Analysis and handling procedures followed the Atomic Energy Authority report26 recommendations, with blanks, travel, and background measurements made for quality assurance. Data were processed based on hourly exposure lengths, and so differences in exposure for the three campaigns were not problematic. Seasonal corrections were applied based on five-year monthly averages from the colocated reference station within the study area to convert tube measurements to estimated annual averages, similar to the methodology in Henderson, Beckerman, Jerrett, and Brauer.27 Monthly averages were compared to the annual average to give a relative ratio, with all concentrations expressed in μg m−3.

Site characteristics were defined by a large set of explanatory variables, with data collected by physical counts/observations, previously collected available data, and GIS data set calculations. A total of 22 commonly selected and measurable variables were used in the initial stages, based on all available variables and motivated by previous study measurements7 (Table 1). Variables were classed either as hypothesized sources of pollution (e.g., the traffic intensity) or as hypothesized modifiers of the relations between sources and concentrations for a site (e.g., awnings), and are presented in Table 1 (as “Source” or “Modifier”, as appropriate). Some variables (e.g., the “number of bus lanes”) were labeled as sources
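The seasonal correction described in this section can be illustrated with a minimal sketch. It assumes that each tube measurement is divided by the ratio of the reference station's five-year monthly mean to its annual mean; the function names and all numbers are hypothetical and are not taken from the study.

```python
def seasonal_ratio(monthly_mean, annual_mean):
    """Ratio of the reference station's monthly mean to its annual mean."""
    return monthly_mean / annual_mean


def annual_estimate(tube_value, monthly_mean, annual_mean):
    """Scale a tube measurement (ug/m3) to an estimated annual average.

    Assumption: the measurement is divided by the monthly/annual ratio, so a
    tube exposed in a month that is typically above the annual mean is
    scaled down.
    """
    return tube_value / seasonal_ratio(monthly_mean, annual_mean)


# Hypothetical reference values (ug/m3), not taken from the study.
reference_monthly_mean = 40.0
reference_annual_mean = 32.0
adjusted = annual_estimate(45.0, reference_monthly_mean, reference_annual_mean)
print(round(adjusted, 2))
```

A tube exposed across two calendar months would need a ratio weighted by exposure time in each month; that weighting is left out of this sketch.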




compared by ranking concentrations for hotspot identification. The top-five sites from each campaign were compared on their level of agreement (with overall ranking acting as baseline).

A separate diffusion tube data set had been collected previously and was used for comparison with the local-scale data set. Auckland Council (AC) measured NO2 at 41 locations using passive sampling over 28-day intervals between 1 June and 1 September 2011. Twenty-one of these sites were located in or around the CBD area (Figure 1), and they broadly covered the same area of the city as the local-scale data set. The AC data set, however, included urban and suburban zones (“city-scale”). As a result, differences in urban design variables, such as longer distances to traffic lights and lower building footprints, were observed in this data set compared to the local-scale data set. Limitations of the AC data set in the context of this research were the small sample size (creating difficulty in isolating individual urban design effects), the lower spatial resolution of the data set, and the historically anomalous regional temperatures observed during the data collection period in 2011. Source and modifier characteristics can be expected to be similar for the two data sets as they are derived for the same area, although the difference in measurement times and seasons arguably makes them independent.

Dispersion simulations were not included in this research as explanatory variables, as larger distances between sites than were present in the local-scale data set are often required to adequately estimate transportation fluxes or source decays.30 As the two models required similar explanatory variables in order to then construct a single multiscale LUR model, dispersion simulations were not carried out for the city-scale data set either.
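The top-five ranking comparison described earlier in this section can be sketched as follows. This is an illustrative reading, assuming that agreement is counted as the number of each campaign's top-ranked sites that also appear in the overall top-ranked set; the function names, site IDs, and concentrations are hypothetical.

```python
def top_sites(concentrations, n=5):
    """Return the set of the n site IDs with the highest concentrations."""
    ranked = sorted(concentrations, key=concentrations.get, reverse=True)
    return set(ranked[:n])


def top_site_agreement(campaigns, overall, n=5):
    """Count campaign top-n sites that also appear in the overall top-n."""
    baseline = top_sites(overall, n)
    matches = sum(len(top_sites(c, n) & baseline) for c in campaigns)
    return matches, n * len(campaigns)


# Hypothetical concentrations (ug/m3) keyed by site ID; top two sites compared.
campaign_one = {"A": 50.0, "B": 40.0, "C": 30.0, "D": 20.0}
campaign_two = {"A": 45.0, "B": 30.0, "C": 44.0, "D": 10.0}
overall = {"A": 48.0, "B": 36.0, "C": 35.0, "D": 15.0}

matches, possible = top_site_agreement([campaign_one, campaign_two], overall, n=2)
print(f"{matches} of {possible} top-ranked sites agree")
```

With three campaigns and the default n=5, the count is expressed out of 15 site rankings.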
Single variable regression analysis was performed to check the direction and significance of each variable on the response, with the adjusted R² noted and ranked. An additive multivariate regression model was developed, with the highest ranking variable selected and the model then rerun using the remaining variables to find the next significant variable. Variable selection was done using a supervised forward-selection stepwise process, as human judgment kept results interpretable by controlling the assumed direction of effect. These model-building criteria were similar to the widely adopted practices from the ESCAPE project.19

The LUR was evaluated for goodness-of-fit to the collected data by maximizing the percentage of accounted-for variability (adjusted R²), improving accuracy by minimizing the difference between modeled and measured values (root-mean-square error, RMSE), Spearman rank correlation performance, bias estimators (Mean Error), and an overall p-value that was statistically significant (3) were assessed. The coefficient of variation (CoV) tested the level of agreement between duplicate readings and had a threshold of 0.25, similar to Mölter.31

Model validation was carried out in two ways: the internal ‘leave-one-out cross-validation’ (LOOCV) method, and the external method, where the model was validated against an independent data set using similar diagnostics as before to assess model fit. Each of the two developed LUR models was used to predict an independent data set: the local-scale LUR was used to predict the city-scale AC data, and the city-scale LUR was used to predict the local-scale data (labeled as “local-scale” and “city-scale” models). In an effort to improve predictive ability for an independent data set, the local-scale model was redesigned using the AC data. This was done with the objective of making a formula that would work for both data sets, and therefore become more adaptable.
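The supervised forward-selection process described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's implementation: the `min_gain` stopping threshold, the variable names `x1` to `x3`, the expected signs, and the data are all hypothetical, and the least-squares fit is a small hand-written solver of the normal equations.

```python
def ols(columns, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(r[a] * r[b] for r in rows) for b in range(k)]
        + [sum(r[a] * yi for r, yi in zip(rows, y))]
        for a in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(k):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def adjusted_r2(columns, y):
    """Return (adjusted R2, coefficients) for a fit of y on the columns."""
    beta = ols(columns, y)
    n, p = len(y), len(columns)
    fitted = [beta[0] + sum(b * col[i] for b, col in zip(beta[1:], columns))
              for i in range(n)]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - (sse / sst) * (n - 1) / (n - p - 1), beta


def forward_select(data, y, expected_sign, min_gain=0.01):
    """Add one variable at a time, keeping only models whose coefficients
    all follow the assumed direction of effect (min_gain is an assumption)."""
    selected, best = [], 0.0
    while True:
        candidates = []
        for name in data:
            if name in selected:
                continue
            names = selected + [name]
            score, beta = adjusted_r2([data[v] for v in names], y)
            if all(b * expected_sign[v] > 0 for b, v in zip(beta[1:], names)):
                candidates.append((score, name))
        if not candidates:
            return selected, best
        score, name = max(candidates)
        if score - best < min_gain:
            return selected, best
        selected.append(name)
        best = score


# Hypothetical site data: y depends on x1 (positive) and x2 (negative) only.
data = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [2, 1, 4, 3, 1, 2, 4, 3],
    "x3": [5, 3, 6, 2, 7, 1, 4, 8],
}
y = [10 + 2 * a - 3 * b for a, b in zip(data["x1"], data["x2"])]
expected_sign = {"x1": 1, "x2": -1, "x3": 1}

selected, best = forward_select(data, y, expected_sign)
print(selected)
```

The sign check is what makes the selection "supervised" in this sketch: a candidate that raises the adjusted R² but reverses an assumed direction of effect is never added.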
Explanatory variables would need to be identified by at least one of the data sets using the ESCAPE method. A variable in a model that had a positive association in one data set and a negative association in the next illustrates uncertainty in how it affects surrounding concentrations. It could be explained by another variable (multicollinearity), be a function of the spatial scale, or be a form of statistical overfitting, which then removes any description of the variable. The first redesign was to allow the local-scale model to have adjustable coefficient values to maximize performance on the AC data set. The second redesign was to then use the standard model construction steps from the ESCAPE project to improve independent data set predictions by finding common urban design factors to make an adaptable “multilevel” model. The steps taken were to evaluate the variable coefficients for the independent data set on assumed direction of effects, as stated in Table 1.

1. Variables that did not follow the direction of effect were removed, one at a time, based on the largest p-values, and the model re-evaluated at each iteration.
2. Once all variables were within the assumed direction of effect, the nonsignificant p-values were ranked (>0.2). The least significant p-value was removed from the model and this continued until all explanatory variables in the final model had p-values of ≤0.2 and in the correct assumed direction of effect.

The remaining variables had flexible coefficient values to maximize specific fits (and improve adaptability). This created three different types of LUR models:

• “Standard”, where a LUR using the ESCAPE criteria has been developed and tested on the same data set.
• “Independent”, where a LUR using the ESCAPE criteria has been developed on one data set and tested on an independent data set.
• “Multiscale”, where a LUR using the ESCAPE criteria has been developed on one data set and then reworked using an independent data set, and then tested using both data sets.

Table 2. Summary Statistics for the Seasonally Adjusted Local-Scale Dataset on Each Campaign and Overall Values (NO2 concentration / μg m−3)

camp.     no. sites   n (days)   ave.    SD     min (site ID)   min (value)   max (site ID)   max (value)
one       62          16         31.18   7.69   103             16.22         11              53.93
two       52          13         40.91   6.79   61              25.7          11              58.71
three     62          15         35.31   8.07   58              18.22         11              56.75
overall   62          44         34.99   7.4    58              17.7          11              56.5
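The two variable-removal steps of the multiscale redesign (first variables with the wrong direction of effect, then variables with p-values above 0.2) can be sketched as a loop. This is an illustrative sketch only: `fit` is a hypothetical callable standing in for refitting the regression on the independent data set, and the variable names, coefficients, and p-values in the stub are invented for the example.

```python
def refine_model(variables, expected_sign, fit, p_max=0.2):
    """Drop one variable per iteration and refit until every remaining
    coefficient has the assumed sign and a p-value of at most p_max.

    fit(names) must return {name: (coefficient, p_value)} for the model
    refitted on the independent data set (hypothetical interface).
    """
    kept = list(variables)
    while kept:
        result = fit(kept)
        # Step 1: variables that contradict the assumed direction of effect.
        wrong = [v for v in kept if result[v][0] * expected_sign[v] < 0]
        # Step 2: only when all signs agree, look at nonsignificant variables.
        pool = wrong or [v for v in kept if result[v][1] > p_max]
        if not pool:
            return kept, result
        # Remove the candidate with the largest p-value, then re-evaluate.
        kept.remove(max(pool, key=lambda v: result[v][1]))
    return kept, {}


def toy_fit(names):
    """Stub refit with fixed, made-up coefficients and p-values."""
    table = {
        "lanes": (2.4, 0.01),
        "bus_stop": (0.8, 0.05),
        "street_width": (-0.04, 0.15),
        "awning": (-1.2, 0.40),      # sign contradicts the assumed direction
        "vegetation": (0.3, 0.60),   # correct sign, but p-value above 0.2
    }
    return {name: table[name] for name in names}


expected_sign = {"lanes": 1, "bus_stop": 1, "street_width": -1,
                 "awning": 1, "vegetation": 1}
kept, result = refine_model(list(expected_sign), expected_sign, toy_fit)
print(kept)
```

In practice `fit` would refit the regression each time, so coefficients and p-values change as variables are removed; the loop rechecks the signs after every removal for that reason.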

Table 2 (continued). IQR of NO2 Concentration (μg m−3) for the Seasonally Adjusted Local-Scale Dataset

camp.     IQR
one       26.87−36.86
two       35.33−44.18
three     31.28−41.02
overall   31.15−39.36

Table 3. Summary Statistics for the Seasonally Adjusted City-Scale Dataset

no. sites   average   SD     min    max    IQR
21          33.91     6.33   23.4   47.6   29.5−39.89

RESULTS AND DISCUSSION

A seasonally adjusted average of 35.0 μg m−3 was observed for the local-scale data set, ranging from 17.7 to 56.5 μg m−3 with a standard deviation of 7.4 μg m−3 (Table 2). The city-scale data set had an average of 33.9 μg m−3 (Table 3). The tubes in proximity to the reference analyzer all had CoV values within 0.05, supporting the reliability of this sampling method (Table 4). The local and city-scale data sets illustrated the hotspot concept, with the maximum measurement in both being located around the northern public transport terminus (56.5 μg m−3 and 47.6 μg m−3, respectively). Local meteorological and emission patterns were comparable for the time of year across the three campaigns, supporting the sampling period being similar to historic conditions.

Table 4. Reference and Unadjusted Passive Concentrations at 155 Queen Street

campaign   continuous (μg m−3)   passive (μg m−3)   CoV (%)
one        42.04                 41.18              1
two        50.03                 47.28              4
three      45.09                 44.11              2

Checks on outliers (max Cook’s Distance = 0.12), normality (p-value = 0.6), variable multicollinearity (max variance inflation factor = 1.8), and nonlinear trends all produced satisfactory results for the use of a multilinear regression model. No significant correlations existed among the explanatory variables when using a 0.6 threshold.32 This gives support to the use of individual variable terms in the LUR models.

The three campaigns had 73% agreement on the top five ranked sites (11 out of 15), illustrating that concentrations can exhibit relative spatial stability at high-concentration sites. This could help in the identification of areas where air quality mitigation would be beneficial, and of the local-scale urban details that could be causing spatial patterning.

The local-scale LUR identified seven urban design variables from Table 1 to collectively explain 62% of the observed variability in the NO2 measurements (Table 5). Model diagnostics were all found to be acceptable, and the internal LOOCV had an overall average adjusted R² of 0.62. The variable with the highest individual correlation to the response was “number of lanes” (V1), with an adjusted R² of 0.37 and p-value […] both directions would have higher concentrations compared to more narrow, single-lane streets. The six other variables associated with increased urban NO2 were

V2: an increase in bus stop density within 100 m radius (“sum of bus stop 100m”).
V3: if an awning or side-walk cover was beside the measurement point (“awning”).
V4: an increase in bus lane density within 100 m (“sum of bus lane 100m”).
V5: no tree or park-space within 25 m of the measurement location (“vegetation”).
V6: an increase in building footprint within 100 m (“building footprint 100m”).
V7: decreased distance in meters toward the next set of traffic lights when obeying traffic direction, or an association with specific corners of an intersection (“distance toward traffic lights”).

Table 5. LUR Coefficient Values from the Three Models for the Local-Scale Datasetᵃ

coefficient   standard   independent   multi-scale
intercept     15.63      26.32         23.85
V1            1.64       −0.01         2.42
V2            0.83       2.91          0.81
V3            4.06       0.44          −0.04
V4            −2.88      1.38
V5            1.44
V6            0.0005
V7            −0.02

ᵃ V1, etc. denotes the different variables (standard, independent and multi-scale results).

Stability of the variables over time was checked by running the standard LUR criteria on each campaign's data and comparing results. The adjusted R² values for all three were lower than the overall result, with the first campaign having the highest at 0.5. Explanatory variables selected were “number of lanes”, “building footprint 25m”, “distance to traffic”, “sum of bus lane 100 m”, and “sum of bus stops 100 m” for the first campaign; “distance to traffic”, “sum of bus lane 100 m”, “awnings”, “sum of bus stops 100 m”, and “vegetation” for the second campaign (adj. R² 0.37); and “number of lanes”, “sum of bus stops 100 m”, “building footprint 100 m”, and “distance to traffic” for the third campaign (adj. R² 0.42). All variables selected were in the final model (when ignoring buffer size). This revealed relative variable stability over time, albeit with higher variation.

Four variables were selected in the city-scale model (Table 6; adjusted R² of 0.84). Variables were similar to the local-scale model, with selected variables:

V1: “Distance toward the next set of traffic lights”.
V2: An increase in bus stop density within 50 m radius (“sum of bus stop 50m”).