Using Building Heights and Street Configuration to Enhance

Sep 3, 2013 - PM10, NOX, and NO2 for 2008−2011 for London, U.K. A separate set of models ... Information on building heights together with streets w...
1 downloads 0 Views 833KB Size
Subscriber access provided by University of Glasgow Library

Article

Using building heights and street configuration to enhance intra-urban PM , NO and NO land use regression models 10

X

2

Robert Tang, Marta Blangiardo, and John Gulliver Environ. Sci. Technol., Just Accepted Manuscript • Publication Date (Web): 03 Sep 2013 Downloaded from http://pubs.acs.org on September 3, 2013

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Environmental Science & Technology is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 27

Environmental Science & Technology

1

Using building heights and street configuration to enhance intra-urban PM10, NOX and

2

NO2 land use regression models

3

Robert Tang1*, Marta Blangiardo1, and John Gulliver1

4 5

1. Small Area Health Statistics Unit, MRC-PHE Centre for Environment and Health, School

6

of Public Health, Imperial College London, St Mary’s campus, London, W2 1PG, UK.

7 8

* Corresponding author:

9

[email protected]

10

Tel: +44 (0)20 7594 5027

11

Fax: +44 (0)20 7594 0768

12

Small Area Health Statistics Unit, MRC-PHE Centre for Environment and Health, School of

13

Public Health, Imperial College London, St Mary’s campus, London, W2 1PG, UK.

1 ACS Paragon Plus Environment

Environmental Science & Technology

14

TOC/Abstract Art

15

2 ACS Paragon Plus Environment

Page 2 of 27

Page 3 of 27

Environmental Science & Technology

16

Abstract

17

Land use regression (LUR) models have been widely used to provide long-term air pollution

18

exposure assessment in epidemiological studies. However, models have rarely offered

19

variables that account for the dispersion environment close to source (e.g. street canyons,

20

position and dimensions of buildings, road width). This study used newly-available data on

21

building heights and geometry to enhance the representation of land use and the dispersion

22

field in LUR. Models were developed for PM10, NOX and NO2 for 2008-2011 for London,

23

UK. A separate set of models using ‘traditional’ land use and traffic indicators (e.g. distance

24

from road, area of housing within circular buffers) were also developed and their

25

performance compared with ‘enhanced’ models. Models were evaluated using leave-one-out

26

(n-1) (LOOCV) and grouped (n-25%) cross-validation (GCV). LOOCV R2 values were 0.71,

27

0.50, 0.66 and 0.73, 0.79, 0.78 for traditional and enhanced PM10, NOX and NO2 models,

28

respectively. GCV R2 values were 0.71, 0.53, 0.64 and 0.68, 0.77, 0.77 for traditional and

29

enhanced PM10, NOX and NO2 models, respectively. Data on building volume within the area

30

common to a 20 m road buffer within a 25 m circular buffer substantially improved the

31

performance (R2 > 13%) of NOX and NO2 LUR models.

32 33

Keywords

34

GIS, land use regression, street canyon, street configurations, building heights

3 ACS Paragon Plus Environment

Environmental Science & Technology

35

1. Introduction

36

Land use regression (LUR) models have been widely used for exposure assessment in

37

epidemiological studies.1-3 The structure of LUR models varies between studies (i.e. number

38

of variables, different distances and buffers sizes) due to location and size of area (i.e. city,

39

national, continental), but typically models include information on traffic, road length, land

40

use, population and altitude. Air pollution is quickly dispersed in open country but elevated

41

levels of pollutants are often seen in densely built-up areas, especially in heavily trafficked

42

streets canyons.4 Source data in LUR models are, however, often relatively coarse in spatial

43

resolution (e.g. 100 m CORINE land cover used in the ESCAPE project5) and thus do not

44

represent the physical characteristics of dispersion environments (i.e. buildings and street

45

configuration). Some dispersion models simulate the physical properties of air pollution

46

dispersion around buildings (e.g. ADMS-Urban), but are usually limited to 10-15 buildings

47

(i.e. one street or road intersection) in any model run. Thus, dispersion modeling with

48

buildings is not practical for city-wide exposure assessment. A few studies have developed

49

‘hybrid’ approaches by combining outputs from dispersion models with other variables in

50

LUR,6-8 but thus far these models have the same limitation of not including the effects of

51

buildings on the dispersion of air pollution.

52 53

Information on building heights together with streets widths has been used in previous LUR

54

studies to calculate aspect ratio to act as an indicator of pollutant trapping.9,10 This has been

55

limited to considering the height of buildings on either side of the street at the point of

56

interest, thus it does not account for the effects of street canyon length or other buildings in

57

the vicinity of sources. One constraint has been the availability of data on building heights,

58

which in the past had to be measured in the field and/or imputed from aerial photography or

59

satellite imagery.9,10

4 ACS Paragon Plus Environment

Page 4 of 27

Page 5 of 27

Environmental Science & Technology

60 61

During recent years, new datasets on building heights and geometry, and detailed urban land

62

use data have been made available in several countries including the UK. The Landmap

63

dataset (www.landmap.ac.uk) provides city-wide data on building heights and geometry

64

(with horizontal and vertical accuracy of +/- 0.5m) for much of urban UK, whilst OS

65

MasterMap (http://www.ordnancesurvey.co.uk/oswebsite/products/os-mastermap/index.html)

66

provides highly detailed topography and land use data, including areas of roads (i.e. road

67

width) and land use at fine spatial scale (i.e. 70% of daily

89

mean concentration available for each year between 2008-2011 were selected. Supporting

90

information about these sites, including site type (as defined by LAQN; i.e. kerbside,

91

roadside, industrial, suburban and urban background), and British National Grid X-Y site

92

coordinates were downloaded from the LAQN website. Site coordinates were checked using

93

online Google Maps (maps.google.com). Daily mean concentrations of pollutants were

94

downloaded then averaged over a four-year period (i.e. 2008-2011) to provide long-term

95

average concentrations (i.e. avoid the influence of meteorology in individual years). There

96

were 42, 57, and 56 monitoring sites for PM10, NOX and NO2, respectively, that passed the

97

site selection criteria. Locations of these monitoring sites can be seen in Figures S1-S3

98

(Supporting Information). The 4-year average concentrations of each pollutant are normally

99

distributed.

100 101

Development of predictor variables

102

A range of predictor variables were derived from GIS datasets for developing the LUR

103

models; a full list of variables can be found in Table S1 (Supporting Information). Traffic,

104

land use and population variables extracted were similar to those used in the ESCAPE study.5

105

The London Atmospheric Emissions Inventory 2008 (LAEI 2008) was used for road and

106

traffic variables. It comprises a digital road geography, attributed with information on type of

107

road (‘motorways’, ‘A-roads’ and some significantly trafficked ‘minor roads’), traffic flows

108

and speeds, with separate categories for fleet type (e.g. buses, light and heavy vehicles),

109

within M25 boundary in Greater London. For land use variables, we used a combination of

6 ACS Paragon Plus Environment

Page 6 of 27

Page 7 of 27

Environmental Science & Technology

110

CORINE land cover for Europe (CLC) and the Land Cover Map of Great Britain 2007

111

(LCM2007) (Source: Centre of Ecology and Hydrology) on a 25m2 grid. The land cover

112

datasets were aggregated into five groups: high density urban, low density urban, industry,

113

ports and railways, and urban green space. Further details on this procedure can be found in

114

Supporting Information. Population variables (i.e. number of inhabitants and households)

115

were extracted from the UK 2001 Census (Source: Office for National Statistics) – the closest

116

available year.

117 118

Four variables on buildings and/or street configuration were offered to create enhanced

119

models, including two on aspect ratio as used in other studies9, 10, 12, and two new variables

120

designed as part of this study: (1) aspect ratio (i.e. average building height / road width); (2)

121

maximum aspect ratio (i.e. maximum building height / road width); (3) building volume (i.e.

122

sum of building area × height); and (4) building volume / road width. The new variables are

123

used to account for building intensity and stature close to roads with traffic information; in

124

other words, they aim to represent both the height and continuity of street canyons. The

125

variable on both building volume and road width is thus an extension of aspect ratio to

126

account for both the shape of the canyon and the density of buildings. Higher values of these

127

variables are expected to have a positive effect (i.e. increase) on pollutant concentrations.

128

Building areas and heights were extracted from Landmap. Road widths and areas of urban

129

green space surfaces were extracted from OS MasterMap.

130 131

The building and road width data were extracted only at locations where a road emission

132

source from LAEI (i.e. with traffic information) is present (i.e. at road- and kerb-side sites),

133

and limited to a distance of 100m to represent distance of influence of buildings and street

134

configuration on pollution trapping in the wake of sources.4,11 Two types of buffers were

7 ACS Paragon Plus Environment

Environmental Science & Technology

135

used: (1) circular buffers as used in other studies, and (2) road buffers - buffers of road

136

center-lines were created, intersected with circular buffers, and features within the area

137

common to both buffers (e.g. buildings alongside roads) were clipped and extracted for each

138

of the four building/street configuration variables. Distances for road buffers were 20, 30, 40

139

and 50m from the center of roads within 25, 50 and 100m circular buffers. The road buffers

140

thus focus on areas in the immediate wake of road sources. Figures S4 and S5 in Supporting

141

Information illustrates how road and circular buffers were created.

142 143

Model development

144

Models were developed for PM10, NOx and NO2 using ArcGIS (v10.0; ESRI) to generate

145

predictor variables and SPSS (v20.0; IBM) for regression analysis. Variables were regressed

146

against monitored concentrations of each pollutant. For traditional models, the predictor

147

variable showing the highest correlation with monitored concentrations was first entered in

148

the model and entry of proceeding variables followed a supervised forward stepwise

149

method.13-15

150 151

Two approaches were used to develop enhanced models. Firstly, the four building and street

152

configuration variables were regressed in turn against the residual from traditional models

153

(i.e. coefficients of the variables in the traditional model were not allowed to change), similar

154

to Eeftens et al..12 Secondly, models were developed from scratch with all variables being

155

offered in the regression analysis. We ensured that building/street configuration variables

156

were only entered into the model if a local traffic variable (i.e. traffic load within a 25 m

157

buffer) was already present in the model. Initially, the best performing variable representing

158

local traffic sources were identified, then stepwise coupled with building/street configuration

159

variables. Once the best combination of a local traffic and building/street configuration

8 ACS Paragon Plus Environment

Page 8 of 27

Page 9 of 27

Environmental Science & Technology

160

variable was selected other variables reflecting background traffic (i.e. road and traffic

161

variables beyond the extent of the local traffic variable) and non-traffic sources (e.g.

162

population, housing, industry, green space) were offered.

163 164

A variable is included in a final model if it provides: (1) greater than 1% increment in

165

adjusted R2; (2) p-value 1%; and (2) p-value < 0.05 are

244

highlighted. For PM10, no enhanced variables were found to be statistically significant (p
0.05, thus

257

model residuals were deemed not to be spatially dependent.

258 259

In order to compare the predictive powers of each variable and their contributions to absolute

260

predicted values, regression coefficient × the (90th percentile – 10th percentile) were

261

calculated.12 Predicted long-term average concentrations associated with each variable by site

262

can be found in compound graphs presented in Figures S7-S9 (Supporting Information).

263 264

Model evaluation

265

Table 2 shows summary statistics from cross-validation. Figure 1 shows predicted

266

concentrations from GCV plotted against monitored concentrations. Summary statistics for

267

each evaluation group in GCV can be found in Table S5.

268 269

For LOOCV, traditional models yielded R2 (MSE-R2 in brackets) values of 0.71 (0.70), 0.50

270

(0.48), 0.66 (0.64), and enhanced models yielded R2 values of 0.73 (0.71), 0.79 (0.78), 0.78

271

(0.77) for PM10, NOX and NO2, respectively (Table 2). For all pollutants, enhanced models

272

outperformed traditional models, with 2%, 29% and 12% higher explained variance (i.e.,

273

LOOCV R2) in monitored concentrations for PM10, NOX and NO2, respectively. The fall in

274

R2 between model development (Table 1) and LOOCV R2 was below 10% (3~7% drop for all

275

models), which is comparable to changes in R2 between model development and evaluation

276

in the recent ESCAPE project (within 15%), indicating stable models.5

277 278

The R2 values obtained from GCV for PM10, NOX and NO2, respectively, were 0.71 (0.69),

279

0.53 (0.51), 0.64 (0.63) for traditional models and 0.68 (0.67), 0.77 (0.77), 0.77 (0.76) for

13 ACS Paragon Plus Environment

Environmental Science & Technology

280

enhanced models. Enhanced models yielded 24% and 13% increase in explained variation in

281

NOX and NO2 concentrations, but the PM10 model was not improved.

282 283

Values of RMSE calculated from LOOCV and GCV were found to be similar, with higher

284

values of RMSE from traditional models, with the exception of PM10 in GCV. IOA ranged

285

from 0.89~0.94 for LOOCV and 0.83~0.94 for GCV. IOAs are higher in enhanced models;

286

with 11% and 5% more agreement between predicted and monitored values in GCV for NOX

287

and NO2, respectively. Values of FB for all models are small (-0.002~0.004 for LOOCV; -

288

0.015~0.009 for GCV), indicating that models are relatively free from bias (i.e. low levels of

289

over- or under-prediction). The highest bias was under-predictions of enhanced PM10 model

290

in GCV (FB = -0.015; i.e. under-prediction of about 1.5%). Values of beta (i.e. regression

291

slopes) fell within 95% confidence interval in all cases.

292 293

Discussion

294

We developed enhanced intra-urban PM10, NOX and NO2 land use regression models for

295

London, using data on building heights and street configuration alongside traditional LUR

296

variables (i.e. land use, population, roads etc.), and compared their performance with

297

traditional models. To our knowledge, this is the first study both to employ city-wide building

298

geometry/heights and detailed land use datasets (with spatial accuracy of +/-1 meter in urban

299

data) to represent the effects of the built environment on the pollutant dispersion in LUR

300

models.

301 302

We offered the enhanced variables in addition to the traditional models using a similar

303

approach to the one applied in the Netherlands.12 We subsequently developed new models by

304

offering enhanced and traditional variables together. Results showed that models with

14 ACS Paragon Plus Environment

Page 14 of 27

Page 15 of 27

Environmental Science & Technology

305

enhanced geographical variables provided substantial improvements in model performance

306

for NOX (MSE-R2 = 26% in GCV) and NO2 (MSE-R2 = 13% in GCV). Although a higher

307

value of R2 was achieved in model building for the enhanced PM10 model, its performance

308

measured against the traditional model in evaluation was mixed.

309 310

LUR model variables and model performance

311

We developed a range of variables using circular and road buffers of varying size to represent

312

building densities and street canyons based on canyon length, height, width and building

313

areas. These variables represented both the immediate street-level dispersion environment

314

and the dispersion field in the surrounding area around monitoring sites. This is an

315

improvement in LUR models over simple street canyon (i.e. no canyon; partial/full canyon)

316

classifications or aspect ratios which typically only consider the height of buildings on

317

opposite sides of streets at the air pollution monitoring site.9,10,12

318 319

In enhanced models, variables extracted with road buffers were retained in some models over

320

traditional circular buffers. Smaller sized road buffers (