Environ. Sci. Technol. 1995, 29, 883-895

Source Attribution of Elevated Residential Soil Lead near a Battery Recycling Site

MITCHELL J. SMALL*
Civil & Environmental Engineering and Engineering & Public Policy, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213

ARTHUR B. NUNN, III
Air Compliance, Inc., Feasterville, Pennsylvania 19053

BARBARA L. FORSLUND AND DANIEL A. DAILY
Advanced GeoServices Corporation, Chadds Ford, Pennsylvania 19317

A statistical methodology is developed to estimate site emission vs urban background contributions to elevated soil lead in areas adjacent to historic lead processing facilities. The methodology is applied in a study of residential soil lead contamination near a former automotive battery recycling facility in Pennsylvania. The site contribution is scaled to deposition estimates from an atmospheric dispersion model with unit emissions. Indicator variables are used to represent the effects of lead-based house paint and other background sources. The statistical model characterizes the observed soil lead concentration at each location as the sum of log-normally distributed components from the site and urban background. The model is used to estimate the spatial profile of mean contributions from the historic site emissions and the probability that the facility caused exceedances of a 500 µg/g criterion for remediation. The analysis helped to determine the extent of responsibility for cleanup and provided a rational basis for the termination of residential sampling at farther distances from the site.

1. Introduction

Lead has been identified as one of the most commonly occurring contaminants at Superfund sites (1, 2 [Table 3-2]). Operating facilities such as lead mines, primary and secondary lead smelters, and battery recycling plants can discharge lead to the surrounding environment through air, surface water, or groundwater pathways. A particularly important pathway for human exposure occurs when fugitive particulates containing lead are emitted to the air and deposited on the surrounding soils. The stability of lead in soils is such that elevated concentrations can persist for many years, even after closure of the facility. In residential areas, exposure to lead in soils occurs directly through inhalation or ingestion associated with gardening activities or childhood play or indirectly through the soil lead contributed to house dust, which is subsequently inhaled or ingested and absorbed into the blood or body tissue (3-6).

While industrial facilities can have a significant localized impact on soil lead concentrations, there are many other sources of lead to the environment, particularly in urban, residential areas. These include vehicular emissions (3), lead-based paint (7-14), and other localized sources such as fuel and trash burning and disposal (3, 7, 14). While automotive lead emissions have been dramatically reduced since 1975, significant residual lead remains on soils in areas with vehicular traffic, and large-scale patterns of soil lead concentration are clearly associated with this factor (12, 15-17). Contributions from lead-based paint are especially high adjacent to older homes where there has been repainting preceded by aggressive scraping, sanding, and/or burning (18).

Thus, while the natural concentration of lead in soils tends to range from 5 to 50 µg/g, with an average of about 20 µg/g (19, 20), concentrations in residential urban areas (without major point sources) typically range from 50 to 1000 µg/g or more, with the number of exceedances of 500 or 1000 µg/g depending on the intensity of the vehicular traffic, lead paint, or residential combustion activity. As soil lead concentrations of 500-1000 µg/g represent the range of current targets for determining a need for remediation in residential areas (21), it is difficult to determine whether exceedances of these targets at moderate distances from a facility are due to the facility release or to the other urban sources. Conflicts over the required extent of remediation may thus arise.

A particular activity that has contributed to major lead contamination is the recycling of automotive lead-acid batteries. Operations at battery recycling facilities consist of battery cracking, the draining of spent acid, separating the electrode lead from the battery casing, and, in some cases, remelting the recovered lead. Unfortunately, many plants were historically operated without the necessary care to ensure full capture and containment of the lead. Currently, 29 of the more than 1200 sites on the Superfund National Priorities List (NPL) are former battery recycling facilities (22).

The subject of our study is typical of the battery recycling facilities described above (see Figure 1). The site is located in Pennsylvania and, when operational, included a dropoff area for battery cracking and draining of spent acid, a melting pot for lead recovery, and a disposal area for broken casings. Very little is known about the operating conditions of the melting pot, and emission estimates are unavailable. The surrounding community extends over an area of 5.7 square miles (14.8 km²), with a population of approximately 4070 (1990). The plant operated from 1963 to 1981, and in 1987, EPA sampling determined that contamination had migrated off site. The current owner of the site and the U.S. EPA entered into a consent agreement and order to perform emergency removal activities. The consent order required that the current owner secure and stabilize the site, determine the extent of contamination both on and off site, and address contamination in the nearby residential area.

During 1988 and 1989, extensive soil, sediment, and surface water samples were collected from the site and the surrounding residential areas. Based on the extent of contamination identified during the sampling program and an EPA action level of 500 µg/g, soil was removed from 80 residential properties by the end of 1989. This effort consisted of removal of 6 in. of soil from part or all of the yard, soil replacement, and relandscaping of the property. Homes were thoroughly cleaned, and all carpets were replaced. Despite this effort, disagreement arose as to whether remediation was required at additional homes farther from the site which also had measured soil lead concentrations above 500 µg/g. A scientifically credible basis for estimating the site contribution to the observed soil lead was needed so that a line delineating the extent of responsibility could be drawn. This provided the motivation for this study.

To address the need for source attribution at the site, a statistical methodology was developed to relate the lead concentration of residential soil samples to factors representing the former plant emissions and background urban sources such as vehicular emissions and lead-based house paint. The model allows probability estimates to be made for the contribution of each source to exceedances of the 500 µg/g threshold. The remainder of this paper describes the available data set in detail and the statistical model developed to evaluate the data. The model is critically tested to evaluate its representativeness and accuracy. The model is then applied to determine the extent of responsibility for remediation based on the predicted mean contribution and probability estimates for source attribution.

FIGURE 1. Former battery site (scale length: 240 ft = 73.2 m). Map labels: property line, Pennsylvania DER air monitors, melting pot.

* To whom correspondence should be addressed; e-mail address: [email protected]; Fax: 412-268-7813.

© 1995 American Chemical Society

VOL. 29, NO. 4, 1995 / ENVIRONMENTAL SCIENCE & TECHNOLOGY 883

FIGURE 2. Soil lead concentration in residential surface samples near the site. Legend: property line; lead concentrations over 500 ppm; lead concentrations less than 500 ppm.

2. Soil Lead Data Set

Residential soil lead data were collected as part of the off-site investigation conducted during 1988 and 1989. The initial off-site sampling was limited to properties close to the site. As data were received and evaluated, sampling efforts continued and were expanded radially away from the site until soil lead concentrations were predominantly less than 500 µg/g. An additional area beyond this point was sampled to confirm that the limit of contamination had been identified. The number of locations sampled on each residential property depended on the size of the property and the extent of open available ground. An average of five locations (minimum of three, maximum of 59) were sampled on each residential property. Soil samples typically consisted of an 18-in. (45.7 cm) vertical core stratified into three layers of 0-2, 2-10, and 10-18 in. (0-5.1, 5.1-25.4, and 25.4-45.7 cm). Where an organic root mat existed, the vegetation was removed prior to collecting the sample.

Soil sample digests were analyzed by flame atomic absorption, EPA Method 239.1. The digestion procedure followed EPA Method 3050 (SW-846). A 1-2-g sample of homogenized soil was digested in hot, reagent-grade nitric acid, followed by 30% hydrogen peroxide. Heated, dilute, reagent-grade hydrogen chloride was used as the final digestant. Of the approximately 3500 samples taken in the area surrounding the site and analyzed for total lead, nearly 2500 were from residential properties. The subset of these corresponding to surface measurements (0-2 in. (0-5.1 cm)) comprises the primary data set analyzed in this paper (1077 observations).

The soil lead data indicated a trend of decreasing soil lead concentration with distance away from the site, as shown in Figure 2. However, as also shown, individual measurements were highly variable, and some elevated concentrations appeared at greater distances. To better characterize the variation, each residential surface soil sample was classified as to whether it was (i) from a "disturbed" location where construction or soil replacement had occurred late in the period of operation or since closure of the site; (ii) close to an older home (within 5 ft of a home built prior to 1955) and therefore subject to significant effects from weathered paint; (iii) near a drainage feature such as a downspout, roof runoff line, or tree where deposition could have been enhanced or collected from a larger area; or (iv) at or near locations with evidence of other potential sources of lead, such as trash burning or coal dust. Other factors, such as proximity to roads, were considered for the model but were not included; no major highways pass through the study area.

The statistical model to distinguish the plant influence from background sources considers the aforementioned factors in conjunction with the magnitude of deposition from the plant release. The atmospheric transport and deposition model used to determine the site deposition factor for input to the statistical model is now described.

TABLE 1

Atmospheric Modeling Assumptions

                           fugitive dust cloud        melting pot stack
source type                area                       point
deposition type            Sehmel                     Sehmel
receptor grid              polar/discrete             polar/discrete
terrain elevation          yes                        yes
area option                rural                      rural
wind profile exponents     default^a                  default^a
vertical temp gradients    default^a                  default^a
regulatory default         yes                        yes
MET data                   Scranton                   Scranton
particle density           1.5 g/cm³                  1.5 g/cm³
particle size              50%, 25 µm; 50%, 50 µm     100%, 2 µm

^a Default assumptions as specified by U.S. EPA for regulatory applications in the United States. The predicted deposition from low-level area sources is very insensitive to these assumptions.

3. Atmospheric Dispersion and Deposition Model

An atmospheric dispersion and deposition model was developed to predict the fate of melting pot stack emissions and fugitive dust plumes from the facility. Since the historic emission rates from the plant are not known, the model was evaluated with unit emissions to describe the general shape or pattern of the off-site deposition profile. The modeling procedure is based on the EPA Industrial Source Complex Short Term (ISCST) model (23, 24), as modified by the California Air Resources Board (CARB) (25). The CARB model includes a deposition subroutine known as the Sehmel Deposition Program, commonly used for evaluation of dry deposition of toxic air contaminants (26). However, an important limitation of that program is that it does not account for depletion of the plume as it travels outward from the emission source, so that mass balance is not preserved. For situations where deposition is a small component of the overall mass balance (e.g., stack releases with small particle sizes), this omission is minor. However, in our application, where a low fugitive dust plume is modeled, significant plume depletion occurs, and a mass balance correction is essential.

A mass balance correction was implemented in the form of a post-processor to the CARB model. This was performed using a technique referred to as "Q reduction" or "source strength reduction". This involves calculating the total mass of lead predicted to deposit at discrete distances (usually 50-m increments) out from the source. For each incremental distance, the amount deposited is subtracted from the source strength for each succeeding downwind distance. The importance of including this correction for the emission from the site fugitive dust sources is demonstrated by the result that approximately 50% of the emitted mass from these sources is predicted to deposit within the first 350 m, with over 90% deposited by 1000 m from the site.

The basic assumptions of the atmospheric model are summarized in Table 1. These include assumptions concerning the different source types, meteorological inputs, inclusion of complex terrain effects, and the particle sizes and density for the deposition calculations. The assumed sizes and densities of the particles emitted from the area fugitive dust sources are especially important inputs to the model. The on-site conditions during operation of the plant were similar to those of an unpaved road, and particle sizes and densities for the area sources were selected to reflect these conditions (27).

Seven sources, including six area sources and one stack source for the melting pot, were evaluated with the model (Figure 3). All model predictions were made using an assumed unit emission rate of 1.0 g/s. Source area 1 is the primary location where batteries were dropped off and cracked open during the period of operation at the site. As such, the highest on-site soil lead concentrations were observed at this location, and it was believed to be the most important source location for fugitive dust lead emissions. The other areas served as disposal points for discarded battery casings. The position and relatively high elevation of area 1 also suggested that it should be a source area of primary importance. The deposition profile resulting from a unit emission from source area 1 is shown in Figure 4. As shown, the areas of highest deposition are located to the northeast and southeast of the plant. This is consistent with the predominant wind patterns at the site and the increasing terrain heights in these directions.

4. Statistical Model

In the statistical model, it is assumed that the total soil lead concentration of a sample, $L_i$, is the sum of the background concentration, $B_i$, and the site contribution, $M_i$:

$$L_i = B_i + M_i \tag{1}$$

The background and site contributions are assumed to be independent and log-normally distributed. The distribution of the total concentration is thus that of the sum of the two log-normals, with mean

$$E[L_i] = E[B_i] + E[M_i] \tag{2}$$

and variance

$$\mathrm{var}[L_i] = \mathrm{var}[B_i] + \mathrm{var}[M_i] \tag{3}$$
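The additivity of the moments in eqs 2 and 3 can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: it draws independent log-normal background and site components and compares the sample mean and variance of their sum with the sums of the component moments. The function names and the parameter values (log-means 5.0 and 4.0, log-standard deviation 0.5) are hypothetical and chosen only for illustration.

```python
import math
import random


def lognormal_moments(xi, phi):
    """Arithmetic mean and variance of a log-normal variate.

    xi and phi are the mean and standard deviation of the logarithm.
    """
    mean = math.exp(xi + 0.5 * phi ** 2)
    var = (math.exp(phi ** 2) - 1.0) * mean ** 2
    return mean, var


def simulate_total(xi_b, phi_b, xi_m, phi_m, n=200_000, seed=1):
    """Sample mean and variance of L = B + M for independent B and M."""
    rng = random.Random(seed)
    totals = [rng.lognormvariate(xi_b, phi_b) + rng.lognormvariate(xi_m, phi_m)
              for _ in range(n)]
    mean = sum(totals) / n
    var = sum((x - mean) ** 2 for x in totals) / (n - 1)
    return mean, var


# Hypothetical parameters, for illustration only.
mean_b, var_b = lognormal_moments(5.0, 0.5)   # background component B
mean_m, var_m = lognormal_moments(4.0, 0.5)   # site component M
sim_mean, sim_var = simulate_total(5.0, 0.5, 4.0, 0.5)

print("mean:", sim_mean, "vs eq 2:", mean_b + mean_m)
print("variance:", sim_var, "vs eq 3:", var_b + var_m)
```

The variance identity relies on the independence assumption stated above; the mean identity holds regardless.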

To develop the model, regression equations are formulated for $E[B_i]$, $\mathrm{var}[B_i]$, $E[M_i]$, and $\mathrm{var}[M_i]$ as functions of the relevant predictive variables which characterize a soil sample.

4.1. Soil Sample Variables Included in Model. The candidate variables considered for the model included the following: D, the indicator for disturbed sample (1 = yes, 0 = no); $S_1$-$S_7$, the deposition rates calculated at the sample location by the atmospheric transport model for a unit emission from the plant area sources ($S_1$-$S_6$) and the melting pot ($S_7$) (computed as µg m⁻² s⁻¹ deposited per g/s emitted, with resulting units of µg g⁻¹ m⁻²; that is, µg/m² deposited per g emitted); E, the indicator for enhanced deposition near structures, trees, or drainage features where lead that would have deposited over a larger area is preferentially collected (1 = yes, 0 = no); CO, the indicator for close and old, i.e., the sample is within 5 ft of a home built before 1955, used as a surrogate for lead paint effects (1 = yes, 0 = no); and T, the indicator for evidence of trash, debris, or other miscellaneous sources that could affect the sample (1 = yes, 0 = no).

The initial sample comprised 1077 soil measurements taken from 230 properties. However, 314 of these measurements were taken from disturbed locations. Because disturbance acts to modify or eliminate the historic record of deposition and the effects of disturbance cannot be modeled in an additive manner, these samples were eliminated from further analysis. The remaining 763 undisturbed samples (representing 194 properties) form the basis for the analysis of site and urban background contribution to soil lead.

FIGURE 3. Source areas for atmospheric dispersion and deposition model (scale length 240 ft = 73.2 m).

A summary of the statistical properties of the overall data set, including the subsets of disturbed and undisturbed samples and subsets of the undisturbed sample based on the indicator variables, is presented in Table 2. As indicated, the disturbed samples do indicate significantly lower concentrations compared to the undisturbed samples. Note that 48% of the samples in the site study area with CO = 1 also have E = 1, so that there is significant co-occurrence of the enhanced deposition and close-and-old indicators. Table 2 also summarizes the statistical properties of soil lead data from a background control area, used later in this paper as a basis for comparison to the inferred background distribution for the site study area.

Preliminary analysis of the study area data indicated that only $S_1$, the deposition associated with the battery dropoff and cracking location, was significantly correlated with soil lead. The other sources were not found to be significantly correlated with the observed soil lead (either individually or collectively). While the other sources have undoubtedly contributed some lead to the observed soil samples, their signal was indistinguishable from the noise in the data set, and all of the site effect is thus assumed to result from $S_1$. The indicator variable T was also not found to be significantly correlated with soil lead and was dropped from further analysis. The failure to identify significant correlation between soil lead and T could be due either to an inadequate ability to properly identify the sample locations with trash or debris or to the inconsistency or insufficient magnitude of the trash effect on soil lead. The significant variables remaining in the model thus included $S_1$, E, and CO.

4.2. Model Equations. The predictive equation for the mean soil lead concentration of a sample i is given by

$$E[L_i] = \alpha_1 + \beta_1 \mathrm{CO}_i + \beta_2 S_{1i} + \beta_3 E_i S_{1i} \tag{4}$$

which is the sum of the background mean

$$E[B_i] = \alpha_1 + \beta_1 \mathrm{CO}_i \tag{5}$$

and the site mean

$$E[M_i] = \beta_2 S_{1i} + \beta_3 E_i S_{1i} \tag{6}$$

The variance of a given sample is

$$\mathrm{var}[L_i] = \alpha_2 + \beta_4 \mathrm{CO}_i + \beta_5 E^2[M_i] \tag{7}$$

which is the sum of the background variance

$$\mathrm{var}[B_i] = \alpha_2 + \beta_4 \mathrm{CO}_i \tag{8}$$

and the site variance

$$\mathrm{var}[M_i] = \beta_5 E^2[M_i] \tag{9}$$

The background mean and variance of a sample thus depend on whether it is close and old. The site mean depends on $S_{1i}$ and whether the sample exhibits enhanced deposition. The expression $(\beta_2 + \beta_3)/\beta_2$ is the ratio of the average site soil lead contribution of an enhanced vs a nonenhanced sample, given the same deposition value $S_{1i}$.
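Equations 4-9 translate directly into a few small functions. The sketch below is illustrative only: the class and function names are hypothetical, the coefficient values are the weighted least squares estimates listed in Table 3, and the sample evaluated at the end (CO = 0, E = 1, $S_1$ = 1.0) is an assumed example rather than an observation from the data set.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Coefficients:
    """Regression coefficients of eqs 4 and 7."""
    alpha1: float  # background mean for CO = 0 (ug/g)
    beta1: float   # added background mean when CO = 1 (ug/g)
    beta2: float   # site mean per unit deposition S1 (m^2)
    beta3: float   # added site mean per unit S1 when E = 1 (m^2)
    alpha2: float  # background variance for CO = 0 ((ug/g)^2)
    beta4: float   # added background variance when CO = 1 ((ug/g)^2)
    beta5: float   # squared coefficient of variation of the site term


# Weighted least squares estimates as listed in Table 3.
WLS = Coefficients(275.0, 310.0, 418.0, 397.0, 215_900.0, 448_300.0, 1.44)


def background_mean(c, co):
    return c.alpha1 + c.beta1 * co                # eq 5


def site_mean(c, s1, e):
    return c.beta2 * s1 + c.beta3 * e * s1        # eq 6


def background_var(c, co):
    return c.alpha2 + c.beta4 * co                # eq 8


def site_var(c, s1, e):
    return c.beta5 * site_mean(c, s1, e) ** 2     # eq 9


def total_moments(c, co, s1, e):
    """Mean (eq 4) and variance (eq 7) of the total soil lead of a sample."""
    mean = background_mean(c, co) + site_mean(c, s1, e)
    var = background_var(c, co) + site_var(c, s1, e)
    return mean, var


# Hypothetical sample: not close and old, enhanced deposition, S1 = 1.0.
mean, var = total_moments(WLS, co=0, s1=1.0, e=1)
sd = math.sqrt(var)
print(mean, sd)
```

Keeping the background and site terms as separate functions mirrors the additive structure of the model, so that each component can later be given its own distribution.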


FIGURE 4. Deposition (µg m⁻² s⁻¹) calculated for unit emission (1.0 g/s) from source area 1 (scale length 240 ft = 73.2 m). Map labels: contour line.

TABLE 2

Statistical Summary of Residential Surface^a Soil Lead Data (µg/g)

                                                    standard      quantiles (%)
sample group                  no. (%)^b      mean  deviation     min      25      50      75     max

Site Study Area
total data set                1077            579        890       7     182     342     640   13300
disturbed (D = 1)             314             390        551       7      78     238     462    4760
undisturbed (D = 0)           763 (100%)      657        986      10     224     386     695   13300
unenhanced (E = 0)            624 (82%)       595        834      10     213     372     640    9340
enhanced (E = 1)              139 (18%)       935       1461      26     260     540    1015   13300
not close and old (CO = 0)    652 (85%)       626        887      10     215     373     660    9340
close and old (CO = 1)        111 (15%)       841       1427      33     291     540     845   13300
no trash (T = 0)              710 (93%)       670       1013      10     225     389     700   13300
trash (T = 1)                 53 (7%)         485        466      44     169     369     620    2850

Background Control Area
total data set                187             213        455       9      42      85     200    3050
disturbed (D = 1)             80               53         50       9      22      37      71     300
undisturbed (D = 0)           107 (100%)      332        573      14      86     170     295    3050
not close and old (CO = 0)    92 (86%)        193        174      14      80     140     240    1100
close and old (CO = 1)        15 (14%)       1183       1176      67     210     570    2100    3050

^a Residential surface measurements (0-2 in.; 0-5.1 cm). ^b Percentages of the undisturbed sample analyzed for the statistical model.

The relationship for the variance of the site concentration in eq 9 assumes a constant coefficient of variation (standard deviation divided by the mean) for the site contribution, given by $\beta_5^{1/2}$.

4.3. Model Estimation. Equations 4 and 7 for the soil lead mean and variance are estimated sequentially. First, the parameters $\alpha_1$, $\beta_1$, $\beta_2$, and $\beta_3$ are estimated by regressing the observed $L_i$ vs $\mathrm{CO}_i$, $S_{1i}$, and $E_i S_{1i}$. Then, the residual, $\epsilon_i = L_i - E[L_i]$, is computed for each sample, and the coefficients $\alpha_2$, $\beta_4$, and $\beta_5$ are estimated by regressing $\epsilon_i^2$ vs $\mathrm{CO}_i$ and $(\beta_2 S_{1i} + \beta_3 E_i S_{1i})^2$ (the latter is equivalent to $E^2[M_i]$). This method of determining $\mathrm{var}[L_i]$ is appropriate since $E[\epsilon_i]$ from the regression model in eq 4 equals 0, and

$$\mathrm{var}[L_i] = E[(L_i - E[L_i])^2] = E[\epsilon_i^2] \tag{10}$$

The relationship for $\mathrm{var}[L_i] = E[\epsilon_i^2]$ is thus determined by regressing $\epsilon_i^2$ vs the predictive variables in eq 7.
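The sequential estimation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses unweighted least squares solved through the normal equations, the helper names (`solve_linear`, `least_squares`, `fit_two_stage`) are hypothetical, and the data are synthetic, generated from coefficients resembling those of Table 3 with an assumed constant noise standard deviation of 100 µg/g.

```python
import random


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def least_squares(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)


def fit_two_stage(lead, co, s1, e):
    """Return ((alpha1, beta1, beta2, beta3), (alpha2, beta4, beta5))."""
    # Stage 1 (eq 4): regress L on CO, S1 and E*S1.
    x1 = [[1.0, c, s, ei * s] for c, s, ei in zip(co, s1, e)]
    a1, b1, b2, b3 = least_squares(x1, lead)
    # Fitted site mean (eq 6) and squared residuals of the mean model.
    site = [b2 * s + b3 * ei * s for s, ei in zip(s1, e)]
    resid2 = [(y - (a1 + b1 * c + m)) ** 2 for y, c, m in zip(lead, co, site)]
    # Stage 2 (eq 7): regress squared residuals on CO and the squared site mean.
    x2 = [[1.0, c, m * m] for c, m in zip(co, site)]
    a2, b4, b5 = least_squares(x2, resid2)
    return (a1, b1, b2, b3), (a2, b4, b5)


# Synthetic data, for illustration only (not the study data set).
rng = random.Random(0)
co = [1.0 if rng.random() < 0.15 else 0.0 for _ in range(500)]
e = [1.0 if rng.random() < 0.18 else 0.0 for _ in range(500)]
s1 = [rng.uniform(0.05, 5.0) for _ in range(500)]
lead = [275 + 310 * c + (418 + 397 * ei) * s + rng.gauss(0.0, 100.0)
        for c, s, ei in zip(co, s1, e)]
mean_coef, var_coef = fit_two_stage(lead, co, s1, e)
print(mean_coef, var_coef)
```

Because the second stage regresses squared residuals, its estimates are far noisier than those of the first stage, which is consistent with the large standard errors reported for the variance coefficients.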

TABLE 3

Estimated Coefficients for Statistical Model^a

mean (eq 4)^b
                          background factors              site factors
method                    α1 (µg/g)      β1 (µg/g)        β2 (m²)^d      β3 (m²)^d
ordinary least squares    355 (45)       277 (96)         349 (41)       108 (66)
weighted least squares    275 (34)       310 (57)         418 (148)      397 (229)

variance (eq 7)^c
                          background factors                             site factor
method                    α2 (µg/g)²             β4 (µg/g)²              β5
ordinary least squares    412 800 (275 600)      1 411 200 (686 000)     1.16 (0.43)
weighted least squares    215 900 (128 200)      448 300 (324 700)       1.44 (1.18)

^a Standard errors of the estimates shown in parentheses. ^b E[L_i] = α1 + β1 CO_i + β2 S_1i + β3 E_i S_1i. ^c var[L_i] = α2 + β4 CO_i + β5 E²[M_i]. ^d Note that the units of β2 and β3 are µg/g concentration in the soil per µg/m² deposited per g emitted, i.e., µg/g divided by µg g⁻¹ m⁻² = m².

An estimation method was sought that could account for the highly skewed, heteroscedastic (unequal variance) nature of the data while still maintaining the additive form of the model. The approach of weighted least squares regression was chosen. The primary intent of weighted least squares is to reduce the influence of observations which have a high expected variance, ideally in proportion to this variance. A variety of approaches are available for selecting the weights in weighted least squares regression (e.g., refs 28 and 29). A weighting factor $1/S_1^2$ was chosen in this study because of the strong influence of $S_1$ on the expected variance. In addition to addressing the problem of heteroscedasticity, weighting by $1/S_1^2$ has the advantage that it places higher weights on observations farther from the site. It is here that the uncertainty and controversy over the site vs background contributions is greatest, and this is where the model would be utilized to make regulatory decisions as to whether or not to remediate properties. It is thus especially critical that the model fit the observations well in this portion of the study area.

The coefficients determined by the regression analysis are shown in Table 3. The weighted least squares estimates are shown along with those determined from ordinary least squares regression. Significant differences are apparent in the coefficients estimated from the two methods. In particular, the weighted estimates tend to attribute more of the observed soil lead to the site factors and less to the background sources. While there is no scientific reason to prefer this result, it is noteworthy from a regulatory standpoint that, of the two estimates considered, the one chosen is more conservative with respect to assigning responsibility for remediation to the plant. Ultimately, the validity of the weighted least squares model is supported by the confirmation studies in section 5.
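The effect of the $1/S_1^2$ weighting can be illustrated with a deliberately reduced example. The sketch below fits a single straight line (intercept plus slope on $S_1$) by weighted least squares in closed form; it omits the CO and E terms of eq 4, the function name is hypothetical, and the five data points are invented for illustration.

```python
def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b*x.

    Minimizes sum_i w_i * (y_i - a - b*x_i)^2 and returns (a, b).
    """
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical deposition factors and soil lead values (ug/g).
s1 = [0.1, 0.5, 1.0, 2.0, 5.0]
lead = [330.0, 470.0, 720.0, 1050.0, 2600.0]

weights = [1.0 / s ** 2 for s in s1]          # weighting factor 1/S1^2
wls_fit = weighted_line_fit(s1, lead, weights)
ols_fit = weighted_line_fit(s1, lead, [1.0] * len(s1))

print("relative weight of farthest vs nearest sample:", weights[0] / weights[-1])
print("weighted fit:", wls_fit, "unweighted fit:", ols_fit)
```

With these weights, a sample with $S_1$ = 0.1 counts 2500 times as much as one with $S_1$ = 5, so the fitted intercept is governed almost entirely by the low-deposition samples far from the site.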
The weighted regression results imply a background mean of 275 µg/g and a background standard deviation of 465 µg/g for samples which are not close and old (CO = 0) and 585 and 815 µg/g, respectively, for samples which are close and old (CO = 1). The estimated site contribution to the mean soil lead of a sample with nonenhanced deposition ranges from 21 to 2128 µg/g (corresponding to a range in $S_1$ from 0.05 to 5.07), with a mean across samples of 293 µg/g. These values nearly double ($[\beta_2 + \beta_3]/\beta_2 = 1.95$) for enhanced deposition samples. A contour plot of the mean soil lead concentration attributable to the site for nonenhanced samples is shown in Figure 5.

FIGURE 5. Estimated mean soil lead concentration (µg/g) attributable to site (for nonenhanced sample, E = 0) (scale length: 240 ft = 73.2 m).

4.4. The Log-Normal Model. The comparisons and applications of the model performed in the study employ the assumption that $B_i$ and $M_i$ are log-normally distributed. The probability density function of the background soil lead of a sample is given by

$$f_{B_i}(b_i) = \frac{1}{b_i \phi_{B_i} \sqrt{2\pi}} \exp\left[-\frac{(\ln b_i - \xi_{B_i})^2}{2\phi_{B_i}^2}\right] \tag{11}$$

where $b_i$ denotes a particular value of the random variable $B_i$, and $\xi_{B_i}$ and $\phi_{B_i}$ are the parameters of the distribution, computed by

$$\xi_{B_i} = \ln E[B_i] - \tfrac{1}{2}\phi_{B_i}^2 \tag{12}$$

$$\phi_{B_i} = \sqrt{\ln(1 + \gamma_{B_i}^2)} \tag{13}$$

where $\gamma_{B_i}$ is the coefficient of variation

$$\gamma_{B_i} = \frac{\sqrt{\mathrm{var}[B_i]}}{E[B_i]} \tag{14}$$

and $E[B_i]$ and $\mathrm{var}[B_i]$ are determined from eqs 5 and 8. A similar expression describes the distribution of the site contribution, $M_i$, with parameters computed from the estimated moments in the regression model (eqs 6 and 9). The cumulative distribution functions (cdfs) for the background and site concentrations are computed in the usual manner for a log-normal distribution:

$$F_{B_i}(b_i) = \Phi[(\ln b_i - \xi_{B_i})/\phi_{B_i}] \tag{15}$$

$$F_{M_i}(m_i) = \Phi[(\ln m_i - \xi_{M_i})/\phi_{M_i}] \tag{16}$$

where $\Phi[\cdot]$ is the cdf of a standardized normal variate. The inferred parameters for the background distribution are computed from Table 3 and eqs 12-14 as $\xi_{B_i} = 4.94$ and $\phi_{B_i} = 1.16$ for a sample that is not close and old (CO = 0) and $\xi_{B_i} = 5.82$ and $\phi_{B_i} = 1.04$ for a sample that is close and old (CO = 1). These imply a geometric mean (or median) background concentration of 140 µg/g for a sample that is not close and old and of 337 µg/g for a sample that is close and old. The computed value of $\xi_{M_i}$ for the site distribution differs for each sample, as it depends on the values of $S_{1i}$ and $E_i$. The estimated value of $\beta_5$ implies a coefficient of variation of 1.20 for the site contribution, so that $\phi_{M_i} = 0.94$ for all samples.

The distribution of the total lead concentration for a sample is computed by noting that the total lead is the sum of the background and site components, which are each log-normal. Because the site distribution differs for each sample, the total distribution differs as well. The cdf for the total lead distribution of a sample is determined by calculating the exceedance probability:

$$1 - F_{L_i}(l_i) = 1 - F_{B_i}(l_i) + \int_{b_i=0}^{l_i} [1 - F_{M_i}(l_i - b_i)]\, f_{B_i}(b_i)\, db_i \tag{17}$$

The first term on the right-hand side of eq 17, $1 - F_{B_i}(l_i)$, is the probability that the background component of the sample is greater than $l_i$. The second term, given by the integral, is the probability that the background component is less than $l_i$, but the site component is big enough so that the total lead concentration is greater than $l_i$, i.e., $m_i > l_i - b_i$. This decomposition is useful as it allows an identification of the probability that the site contribution caused the exceedance of a particular value of $l_i$ (e.g., 500 µg/g) at a given location. Equation 17 is evaluated numerically.

The distribution function computed in eq 17 differs for each sample i. The predicted distribution for soil lead in the entire set of n samples, denoted by $F_L(l)$, is computed as a mixture of the n individual sample distributions, each with mixture weight 1/n:

$$F_L(l) = \frac{1}{n} \sum_{i=1}^{n} F_{L_i}(l) \tag{18}$$

Equations 17 and 18 are used in the following section to compute the fitted distribution of soil lead for the sample locations for comparison with the observed distribution.
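The calculations of this section can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the log-normal parameters are obtained from the arithmetic mean and variance by the standard moment relations, the integral in eq 17 is approximated with a simple midpoint rule (the paper states only that it is evaluated numerically), the function names are hypothetical, and the site component uses an assumed deposition value of $S_1$ = 0.5 for a nonenhanced sample.

```python
import math


def lognormal_params(mean, var):
    """Log-space parameters (location, scale) from the arithmetic mean and variance."""
    cv2 = var / mean ** 2                      # squared coefficient of variation
    phi = math.sqrt(math.log(1.0 + cv2))
    xi = math.log(mean) - 0.5 * phi ** 2
    return xi, phi


def lognormal_cdf(x, xi, phi):
    """Cdf of a log-normal variate, written with the standard normal cdf."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - xi) / phi
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lognormal_pdf(x, xi, phi):
    z = (math.log(x) - xi) / phi
    return math.exp(-0.5 * z * z) / (x * phi * math.sqrt(2.0 * math.pi))


def exceedance(level, bg, site, steps=4000):
    """Two terms of eq 17: (P[B > level], P[B <= level and B + M > level])."""
    p_background = 1.0 - lognormal_cdf(level, *bg)
    width = level / steps
    p_site = 0.0
    for k in range(steps):
        b = (k + 0.5) * width                  # midpoint of the k-th interval
        p_site += (1.0 - lognormal_cdf(level - b, *site)) * lognormal_pdf(b, *bg) * width
    return p_background, p_site


def mixture_exceedance(level, samples):
    """Eq 18 in exceedance form: average over (bg, site) parameter pairs."""
    return sum(sum(exceedance(level, bg, site)) for bg, site in samples) / len(samples)


# Background for CO = 0, from the weighted estimates in Table 3.
bg = lognormal_params(275.0, 215_900.0)
# Site component for a hypothetical nonenhanced sample with S1 = 0.5.
site_mean = 418.0 * 0.5
site = lognormal_params(site_mean, 1.44 * site_mean ** 2)

p_bg, p_site = exceedance(500.0, bg, site)
print("background parameters:", bg)
print("P[background alone exceeds 500]:", p_bg)
print("P[site contribution causes exceedance]:", p_site)
```

Returning the two terms separately keeps the attribution visible: the second value is the probability that the sample exceeds the threshold only because of the site contribution.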

5. Confirmation of Model Results

Before using the predictions of the statistical model, it is appropriate to demonstrate the adequacy of the model fit and the consistency of the model results with observed data, both at the site and elsewhere. Three comparisons are presented for this purpose: (i) the fitted mixture model for the sample locations is compared to the observed data distribution for these locations in the study community; (ii) the inferred background distribution is compared to observed soil lead data collected in an area near and similar to the study community, but upwind of the site; and (iii) the model coefficient for the site contribution ($\beta_2$) is used to estimate emission rates and ambient lead concentrations near the site during the period of plant operation for comparison to historical ambient air measurements.

Comparison of the observed and fitted mixture distributions for the sample locations provides a first check of the adequacy of the assumed model structure: is it possible to estimate parameters for the model that reproduce the observed distribution? This is not an independent confirmation of the model; rather, it is an empirical check of the goodness of fit. Indeed, so long as the underlying structural assumptions are adequate, one would expect a fitted distribution based on three explanatory variables and seven parameters estimated from the data to provide a very good match to the observed distribution. As shown in Figure 6, the fitted distribution provides a nearly perfect match to the observed distribution, except at very low percentiles where the fitted model slightly overestimates the soil lead concentration. The predicted fraction of samples with soil lead concentration above 500 µg/g is 0.391, while the observed fraction is 0.398.

FIGURE 6. Comparison of predicted and observed distribution of soil lead in study area sample. Horizontal axis: percent less than given value.

The comparison in Figure 6 demonstrates that the model accurately represents the sampled distribution of total soil lead in the study community, but what can be said about the allocation of this lead between the background and site sources? Is the inferred distribution for the background lead in the study community consistent with available information? Unfortunately, we cannot knowwhat the soil lead concentrations would have been in the study community had the battery recycling plant never been present, but we can examine data from another hopefully similar area. A background soil lead study was conducted in a control community during 1991-1992. The control community has demographic,housing, road, and traffic conditions that are similar to those of the study area, and it is located far enough upwind that neghgible contributionsfrom the plant are expected to occur (negligibledeposition is calculated from the atmospheric transport model for this location).A data set of 187 samples was collected, of which 80 were identified as disturbed (see Table 2). The distribution of the remaining 107 samples is compared to the inferred distribution for the study community background in Figure 7. Because the overall background distribution is a mixture of the background distributions for samples that are close and old and samples that are not close and old, it is necessary to note the fraction of the control community samples for which CO = 1. Fifteen of the 107 samples (14%) are close and old, compared to 15%in the study sample. The derived overall background distribution for the control community, shown in Figure 7A, is thus nearly identical to the background distribution predicted for the study sample. 
As shown in Figure 7, the derived overall background distribution (Figure 7A) and the derived distributions for samples that are close and old and those that are not (Figure 7B) are similar to those observed in the control sample, particularly with regard to the median, the overall level of variability, and the magnitude of the difference between samples with CO = 0 and CO = 1. However, some differences in the distributions are apparent. In particular, the control samples that are not close and old exhibit a similar median to that predicted by the statistical model, but a somewhat lower variance. One possible explanation for the lower variance is that the control survey encompassed a smaller, more homogeneous area than did the study data set.
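The mixture idea used for the overall background distribution can be illustrated with a minimal sketch. Only the close-and-old fractions (15 of 107 samples in the control community, 15% in the study sample) come from the text; the log-normal medians and log-scale standard deviations below are hypothetical placeholders, not the fitted values from the statistical model, and the function names are chosen here for illustration.

```python
import math

def lognormal_cdf(x, median, sigma_ln):
    """CDF of a log-normal variable with the given median and log-scale sigma."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - math.log(median)) / sigma_ln
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Fraction of control-community samples that are close and old (from the text).
P_CLOSE_OLD = 15 / 107

# Hypothetical component parameters in ug/g (NOT the paper's fitted values).
MEDIAN_CO0, SIGMA_CO0 = 150.0, 0.8   # samples that are not close and old
MEDIAN_CO1, SIGMA_CO1 = 450.0, 0.9   # samples that are close and old

def background_cdf(x, p_close_old=P_CLOSE_OLD):
    """Overall background CDF as a two-component mixture weighted by CO."""
    f1 = lognormal_cdf(x, MEDIAN_CO1, SIGMA_CO1)
    f0 = lognormal_cdf(x, MEDIAN_CO0, SIGMA_CO0)
    return p_close_old * f1 + (1.0 - p_close_old) * f0

# Background exceedance probability of 500 ug/g under the two mixing fractions.
exceed_control = 1.0 - background_cdf(500.0, 15 / 107)
exceed_study = 1.0 - background_cdf(500.0, 0.15)
print(f"control weights: {exceed_control:.3f}, study weights: {exceed_study:.3f}")
```

With mixing fractions this close (14% versus 15%), the two mixture curves differ only slightly for any reasonable choice of component parameters, which is why the derived background distributions for the two communities are nearly identical.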

The final model confirmation analysis, comparing observed and predicted ambient lead concentrations during the period of plant operation, requires a back-calculation of the implied lead emission rate from the plant. Assuming the deposition occurred over a time period of duration t and that the lead was mixed into surface soil with bulk density $\rho_b$ to an effective penetration depth of $\delta_e$, the implied emission rate from the statistical model is estimated as

$$Q = \frac{\beta_2\,\rho_b\,\delta_e}{t} \tag{19}$$

The units of eq 19 can be resolved by noting that the units for $\beta_2$ (m²) derive from the regression coefficient units for eq 4, (µg of lead/g of soil) per µg of lead deposited m⁻² (g of lead emitted from site)⁻¹ (see note c in Table 3). The value of $\beta_2$ is used (rather than $\beta_3$ or some combination of $\beta_2$ and $\beta_3$) since $\beta_2$ represents the areawide average ratio of soil lead concentration to unit deposition, without areal enhancement from rooftop runoff, etc. Prior to implementing eq 19, the effective penetration depth for the deposited lead ($\delta_e$) must be estimated. In this application, this is the depth over which the measured concentrations in the upper soil layer (0-2 in.) are assumed to apply in order to account for all of the deposited lead. Since some of the deposited lead has penetrated beyond the upper 0-2-in. layer, $\delta_e$ must be greater than 2 in. To determine the appropriate $\delta_e$, an exponential function was assumed to represent the concentration profile with depth

$$L(z) = L_n + L_{0a}\exp(-wz) \tag{20}$$

where z is the depth from the surface, $L_n$ is the natural soil lead concentration that is approached at large z, $L_{0a}$ is the anthropogenic increment to the soil lead concentration at the surface, and w is the exponential decrease rate of concentration with depth. With this profile, the mean soil lead concentration over an interval $z_1$ to $z_2$ is given by

$$\bar{L}(z_1, z_2) = L_n + \frac{L_{0a}\left[\exp(-w z_1) - \exp(-w z_2)\right]}{w\,(z_2 - z_1)} \tag{21}$$

The average measured concentrations (using all nondisturbed residential samples) in the three soil zones are given by

$\bar{L}$(zone 1; $z_1, z_2$ = 0, 2 in.) = 657 µg/g

$\bar{L}$(zone 2; $z_1, z_2$ = 2, 10 in.) = 300 µg/g

$\bar{L}$(zone 3; $z_1, z_2$ = 10, 18 in.) = 157 µg/g

These are used to determine the three parameters of eqs 20 and 21 (three equations, three unknowns) as

$L_n$ = 136 µg/g, $L_{0a}$ = 671 µg/g, w = 0.2655 in.⁻¹ (0.1045 cm⁻¹)
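The three-equation solution can be checked with a short numerical sketch. It assumes that the layer mean is the depth average of the exponential profile, i.e., $L_n + L_{0a}[\exp(-wz_1) - \exp(-wz_2)]/[w(z_2 - z_1)]$; the function names and the bisection approach are choices made here for illustration, not the authors' procedure. The decay rate w is found first from the ratio of differences between layer means, which eliminates the other two parameters.

```python
import math

ZONES = [(0.0, 2.0), (2.0, 10.0), (10.0, 18.0)]   # depth intervals, in.
MEANS = [657.0, 300.0, 157.0]                     # measured layer means, ug/g

def layer_factor(w, z1, z2):
    """Mean of exp(-w*z) over the depth interval [z1, z2]."""
    return (math.exp(-w * z1) - math.exp(-w * z2)) / (w * (z2 - z1))

def fit_profile(zones, means, w_lo=1e-3, w_hi=2.0, iterations=200):
    """Solve for (L_n, L_0a, w) from three layer-averaged concentrations."""
    m1, m2, m3 = means
    target = (m1 - m2) / (m2 - m3)   # this ratio depends on w only

    def gap(w):
        g1, g2, g3 = (layer_factor(w, a, b) for a, b in zones)
        return (g1 - g2) / (g2 - g3) - target

    # Bisection on w; the bracket [w_lo, w_hi] is an assumption of this sketch.
    lo, hi = w_lo, w_hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gap(lo) * gap(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    w = 0.5 * (lo + hi)

    # Back out the two concentration parameters from the fitted w.
    g1, g2, _ = (layer_factor(w, a, b) for a, b in zones)
    L_0a = (m1 - m2) / (g1 - g2)
    L_n = m1 - L_0a * g1
    return L_n, L_0a, w

L_n, L_0a, w = fit_profile(ZONES, MEANS)
print(f"L_n = {L_n:.0f} ug/g, L_0a = {L_0a:.0f} ug/g, w = {w:.4f} per in.")
```

The result should land close to the reported parameter values; small differences are to be expected, presumably because the layer means quoted in the text are rounded.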

The value of $L_n$ = 136 µg/g is below the inferred surface layer background lead concentration from the statistical model of $\alpha_1$ = 275 µg/g; this is appropriate since the latter includes urban background contributions from vehicular

FIGURE 7. Comparison of derived distribution for study area background to observed distribution for control area sample. (A) Total sample; (B) based on whether sample is close to an old house. Horizontal axis: Percent Less Than Given Value.

TABLE 4

Comparison of Predicted and Observed Ambient Lead Concentrations

| location | predicted concn (a) (µg/m³) | sample period | no. of samples | observed median concn (µg/m³) | observed av concn (b) (µg/m³) |
|---|---|---|---|---|---|
| nearby properties | | | | | |
| property 1 | 4.4 (3.4-5.4) | 3/25/75-5/8/75 | 47 | 1.6 | 5.2 (1.9) |
| property 2 | 4.3 (2.3-5.9) | 3/25/75-9/11/75 | 35 | 1.8 | 4.8 (1.7) |
| property 3 | 3.8 (2.5-5.9) | 3/25/75-8/28/75 | 25 | 1.6 | 4.1 (1.4) |
| distant background location | | | | | |
| location 2.5 mi. from plant | | 5/14/75-9/11/75 | 14 | 0.4 | 0.5 (0.04) |

(a) Predicted value is due to the site release alone and is the average of 4-7 values computed at different locations on property. Range of model predictions for the different locations is shown in parentheses. (b) Standard error of average concentration shown in parentheses.

emissions, leaded paint, etc. while $L_n$ is an estimate of preanthropogenic soil lead concentration. Equations 20 and 21 can now be used to compute the implied fraction of the deposited lead (above $L_n$) residing in the top 2 in. of the soil profile. This fraction is 0.41, which implies an effective penetration depth of $\delta_e$ = 2/0.41 = 4.9 in. (12.4 cm). Equation 19 is solved with this value of $\delta_e$, with a representative bulk density for soils of the type present in the study area, $\rho_b$ = 1.8 g/cm³, a 17-year period of operation (t = 17 years), and the value of $\beta_2$ from the statistical model, 418 m². This results in an estimate of Q = 0.175 g of lead/s = 15.0 kg of lead/day. The estimated daily emission is roughly equivalent to the lead content of 1.5 automotive batteries. Assuming an average plant processing rate of 500-1000 batteries per day, the estimate implies that between 0.15 and 0.3% of the lead entering the plant was released to the ambient air. While this number appears reasonable, there is no way to validate it per se, except to examine the implied values of the ambient lead concentrations near the site and to compare these to historical measurements.

Using the inferred emission rate for source area 1, the ambient lead concentration was computed and compared to observed values at selected locations where ambient data were available. Since the initial model evaluation was conducted with a unit emission rate of 1 g/s, the ambient concentrations obtained from this were simply multiplied by the factor 0.175. The predicted values are shown and compared to the observed data in Table 4. Table 4 compares predicted and observed ambient lead concentrations at three locations on residential properties


near the plant where ambient air quality measurements were collected by the Pennsylvania Department of Environmental Resources. These samples were obtained during 1975 when the plant was in full operation. The locations of the three properties are indicated in Figure 1. Since the precise location of the monitor on the property is not known, model predictions were made for 4-7 points on the property, and the average value was computed. Table 4 shows the range of model predictions for the multiple locations on each property to indicate the possible impact of this uncertainty in monitor location. Ambient measurements collected during this period at a fourth monitor 2.5 mi. southeast of the site are also shown; the average of these data, 0.5 µg/m³, is assumed to be reflective of the average ambient background concentration for the area. Ambient samples were collected over a 24-h averaging period using the analytical method now found in 40 CFR Part 50, Appendix G. Table 4 indicates very good agreement between the predicted and observed ambient lead concentrations at the three near-site locations. The predicted values fall between the median and mean values of the observed data at all three monitors. (Since the atmospheric dispersion model predicts average concentrations, the proper comparison is to the arithmetic mean of the observed data.) In each case, the predicted mean is slightly below the observed mean, with the difference ranging between 0.3 and 0.8 µg/m³. When the average ambient background lead value of 0.5 µg/m³ is added to the predicted ambient values associated with the site release, the agreement between predicted and observed average concentrations is nearly

FIGURE 8. Probability that an exceedance of 500 µg/g soil lead was due to the added site contribution (for a sample which is not close and old, CO = 0, and not enhanced, E = 0) (scale length: 240 ft = 73.2 m). Legend: source area; property line; prob("but-for") contour (example label 0.33).

exact and well within the standard errors of the observed means. Indeed, given the approximate nature of the factors estimated for eq 19, even approximate agreement between the observed and predicted average ambient concentrations would be viewed favorably. The close agreement indicated in Table 4 thus provides further corroboration for the validity of the model and the associated attribution of the background and site contributions.
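The arithmetic behind the back-calculation and the Table 4 comparison can be retraced with a brief sketch. It assumes that eq 19 has the form $Q = \beta_2 \rho_b \delta_e / t$, which is dimensionally consistent with the stated units and reproduces the reported emission rate; that form, the variable names, and the 365.25-day year are assumptions of this sketch. All numerical inputs are taken from the text and Table 4.

```python
import math

IN_TO_M = 0.0254
SECONDS_PER_DAY = 86400.0
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY   # assumed year length

# Values reported in the text.
w = 0.2655            # 1/in., decay rate of concentration with depth
beta2 = 418.0         # m^2, site coefficient from the statistical model
rho_b = 1.8e6         # g/m^3 (1.8 g/cm^3), soil bulk density
delta_e_in = 4.9      # in., effective penetration depth
t_s = 17 * SECONDS_PER_YEAR

# Share of the anthropogenic lead held in the top 2 in. of an exponential profile.
fraction_top = 1.0 - math.exp(-w * 2.0)

def emission_rate(beta2, rho_b, delta_e_m, t_s):
    """Implied emission rate in g/s (assumed form of eq 19)."""
    return beta2 * rho_b * delta_e_m / t_s

Q = emission_rate(beta2, rho_b, delta_e_in * IN_TO_M, t_s)
kg_per_day = Q * SECONDS_PER_DAY / 1000.0

# Release fraction if the daily emission equals the lead in 1.5 batteries.
release_fraction = [1.5 / n for n in (1000, 500)]

# Table 4: predicted site contribution plus background vs observed means.
predicted = [4.4, 4.3, 3.8]        # ug/m^3
observed_mean = [5.2, 4.8, 4.1]    # ug/m^3
observed_se = [1.9, 1.7, 1.4]      # ug/m^3
background = 0.5                   # ug/m^3
within_one_se = [
    abs(p + background - m) <= se
    for p, m, se in zip(predicted, observed_mean, observed_se)
]

print(f"fraction in top 2 in. = {fraction_top:.2f}")
print(f"Q = {Q:.3f} g/s = {kg_per_day:.1f} kg/day")
print("within one standard error:", within_one_se)
```

Because the deposition model was run at a unit emission rate of 1 g/s, the same Q also serves as the scale factor applied to the modeled ambient concentrations.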

6. Probability Model for Background and Site Contributions In discussions with regulatory officials, various criteria were considered for determining a "significant" site impact. One criterion involved an evaluation of the mean in Figure 5: properties within a selected mean contour would be eligible for remediation if they had soil lead readings above 500 µg/g. Another criterion involved a "but-for" determination: but for the influence of the site, the soil lead concentration would not have exceeded 500 µg/g. Since the background and site contributions to the lead of a given sample cannot be identified deterministically, this criterion was implemented by determining the probability that the site contribution caused the exceedance. If this probability was above a selected value, the property would be eligible for remediation. The probability that an exceedance of the 500 µg/g threshold is caused by the added site contribution can be estimated (using eq 17) as

$$
\begin{aligned}
\text{prob(but-for)} &= \text{prob}(B_i < 500 \mid L_i \ge 500) = \frac{\text{prob}(B_i < 500 \cap L_i \ge 500)}{\text{prob}(L_i \ge 500)} \\
&= \frac{\int_0^{500} \left[1 - F_{M_i}(500 - b_i)\right] f_{B_i}(b_i)\, db_i}{1 - F_{B_i}(500) + \int_0^{500} \left[1 - F_{M_i}(500 - b_i)\right] f_{B_i}(b_i)\, db_i}
\end{aligned}
\tag{22}
$$

The calculated probability that the exceedance is attributable to the site depends on whether a sample is close and old (when $CO_i$ = 1, the probability is lower); the value of $SI_i$ (the greater $SI_i$, the greater the probability); and whether a location exhibits enhanced deposition (when $E_i$ = 1, the calculated probability is greater). Equation 22 was evaluated numerically, and a contour map was developed to depict the attribution probability for a sample that is not close and old and not enhanced ($CO_i$ = 0 and $E_i$ = 0). As shown in Figure 8, the probability of a but-for is close to one near the plant and tapers off to between 0.1 and 0.2 near the edge of the study area. Since both the mean and the probability of the site attribution are direct functions of $SI_i$, the contours in Figures 5 and 8 are very similar in shape, and both are similar to the SI contour plot in Figure 4.
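A numerical evaluation of the but-for probability can be sketched as follows. The sketch assumes independent log-normal background (B) and site (M) contributions, as stated in the paper, and evaluates prob(B < 500 | B + M ≥ 500) by midpoint integration over the standardized log of the background. The medians and log-scale standard deviations are hypothetical placeholders rather than fitted values, and the function names are illustrative.

```python
import math

def lognormal_cdf(x, median, sigma_ln):
    """CDF of a log-normal variable with the given median and log-scale sigma."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - math.log(median)) / sigma_ln
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def but_for_probability(bg_median, bg_sigma, site_median, site_sigma,
                        threshold=500.0, n=4000):
    """prob(B < threshold | B + M >= threshold) for independent log-normals."""
    mu_b = math.log(bg_median)
    u_max = (math.log(threshold) - mu_b) / bg_sigma   # standardized ln(threshold)
    u_min = -10.0                                     # effectively minus infinity
    numerator = 0.0
    if u_max > u_min:
        h = (u_max - u_min) / n
        for k in range(n):
            u = u_min + (k + 0.5) * h                 # midpoint rule
            b = math.exp(mu_b + bg_sigma * u)         # background value, b < threshold
            phi = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
            site_exceeds = 1.0 - lognormal_cdf(threshold - b, site_median, site_sigma)
            numerator += site_exceeds * phi * h
    p_bg_exceeds = 1.0 - lognormal_cdf(threshold, bg_median, bg_sigma)
    return numerator / (p_bg_exceeds + numerator)

# Hypothetical parameters (ug/g): one background, three site-contribution levels.
BG_MEDIAN, BG_SIGMA, SITE_SIGMA = 200.0, 0.8, 0.8
for site_median in (50.0, 200.0, 1000.0):
    p = but_for_probability(BG_MEDIAN, BG_SIGMA, site_median, SITE_SIGMA)
    print(f"site median {site_median:6.0f} ug/g -> prob(but-for) = {p:.3f}")
```

In this formulation the probability rises with the size of the site contribution and is bounded above by prob(B < 500), since a sample whose background alone exceeds the threshold can never count as a but-for exceedance.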

As a result of this analysis and related sensitivity studies conducted to ensure that the results were robust to alternative assumptions, particularly in the atmospheric transport and deposition model (30), agreement was reached between the site owner and the U.S. EPA on the extent of off-site responsibility for soil removal at homes with soil lead measurements above 500 µg/g. The mean contribution from the site and the probability of causing the exceedance of 500 µg/g were considered in determining the extent of required remediation, and soil removal was implemented at an additional 20 properties in 1992. Properties where the probability was less than 0.33 that the exceedance of 500 µg/g was caused by the plant were not remediated. Also, the results allowed agreement that additional sampling farther from the site (where any exceedances of 500 µg/g soil lead were very unlikely to be attributable to the plant) could be terminated, with attention returning to the on-site remediation activities.

7. Summary and Conclusion The statistical model developed in this paper provides an effective characterization of the spatial distribution and likely source of soil lead in the residential area near a former battery recycling plant. The model scales the lead from the facility to the spatial pattern of deposition generated with a unit emission in an atmospheric dispersion model. Six area sources and the melting pot stack were evaluated with the model; the only deposition profile to be significantly correlated with the observed spatial pattern of soil lead was that associated with area source 1, the main battery dropoff and cracking location of the former plant and the on-site location with the highest soil lead concentrations. Consideration of enhanced deposition was made by noting whether a soil sample was collected near drainage features where deposition over a larger area could have collected. The urban background sources considered in the analysis included lead-based house paint, represented by a surrogate variable indicating whether the soil sample was collected within 5 ft of an older home, and other sources such as trash and coal dust, represented by an indicator variable determined from the survey observation and records. The soil lead concentration was significantly correlated with the indicator variable for lead-based house paint but not with the variable for other sources. The statistical model was formulated to predict the spatial distribution of both the mean and the variance of the soil lead data. Weighted least squares regression was used to estimate the model parameters, accounting for the heteroscedasticity of the data. The site and background contributions to the lead concentration of a sample were assumed to be log-normally distributed and independent. The resulting distribution of soil lead for the area is a mixture of the sum of the log-normal components. 
The predicted mixture distribution for the sample is shown to be nearly identical to the observed distribution of the measurements. Independent model confirmation studies were conducted by comparing the inferred background distribution for the study area to that observed in a similar, upwind area and by comparing inferred estimates of ambient lead concentration near the site to those measured during the period of plant operation. Both comparisons indicated very good agreement, providing good corroboration for the accuracy of the model and supporting its use for source attribution. The model results were used to help identify those additional properties with soil lead measurements above 500 µg/g, where the current owner of the site would pay for soil removal and relandscaping.

The statistical methodology presented here provides one method for evaluating spatial patterns of soil contamination and assigning likely responsibility for this contamination. Other procedures, such as scanning electron microscopy (e.g., ref 31), can be used to help identify particular lead compounds or isotopes in particle samples and relate these to likely sources. These methods are currently being used to further check and corroborate the results of this study at the site. The statistical methodology is generally applicable to historic sites where the spatial extent of contamination and responsibility is difficult to determine because of the presence of other background sources and because of the high level of variability of the data. The procedure can be applied in locations with multiple point, area, or line (e.g., highway) sources, using the computed deposition (or unit deposition) from each as explanatory variables in the statistical model. The statistical model is particularly effective because it accounts for site and background contributions to the mean spatial trend of the soil contaminant levels, the variability around this trend, and the probability that exceedances of a threshold value were due to the added contamination from the site.

Acknowledgments Financial support for this study was provided by Gould, Inc. The insights, encouragement, and patience of Robert Fedor and James Cronmiller of Gould are gratefully acknowledged. Technical support for this study was provided by James Taylor of NePo Associates and Paul Haggerty of Advanced GeoServices Corp. The scientific insights and critical comments of William Steuteville of the U.S. EPA, Region III, helped to provide the motivation and point the direction for this study. However, this paper has not undergone EPA peer review, and no official endorsement should be inferred. Useful comments and suggestions were also provided by Cliff Davidson, Michael Escobar, Lara Wolfson, and the anonymous reviewers.

Literature Cited

(1) ATSDR (U.S. Public Health Service, Agency for Toxic Substances and Disease Registry). ATSDR Biennial Report to Congress: October 17, 1986-September 30, 1988; U.S. Public Health Service: Atlanta, GA, 1989.
(2) NRC (National Research Council). Environmental Epidemiology, Volume 1, Public Health and Hazardous Wastes; National Research Council Committee on Environmental Epidemiology, Board on Environmental Studies and Toxicology, Commission on Life Sciences; National Academy Press: Washington, DC, 1991.
(3) U.S. EPA. Air Quality Criteria for Lead; EPA 600/8-83-028AF, BF, CF, DF, and EPA 600/8-83/028A; U.S. EPA Office of Research and Development: Research Triangle Park, NC, 1986.
(4) Sedman, R. M. Environ. Health Perspect. 1989, 79, 291-331.
(5) Marcus, A. H.; Cohen, J. Environ. Geochem. Health 1989, 9, 161-174.
(6) Wixson, B. G.; Davies, B. A. Environ. Sci. Technol. 1994, 28 (1), 26A-31A.
(7) Fairey, F. S.; Gray, J. W., III J. South Carolina Med. Assoc. 1970, 66, 79-82.
(8) Ter Haar, G.; Aronow, R. Environ. Health Perspect. 1974, 7, 83-89.
(9) Bogden, J. D.; Louria, D. B. Bull. Environ. Contam. Toxicol. 1975, 14 (3), 289-294.
(10) Jordon, L. D.; Hogan, D. J. N. Z. J. Sci. 1975, 18, 253-260.
(11) Solomon, R. L.; Hartford, J. W. Environ. Sci. Technol. 1976, 10 (8), 773-777.
(12) Mielke, H. W.; Blake, B.; Burroughs, S.; Hassinger, N. Environ. Res. 1984, 34, 64-76.
(13) Tong, S. T. Y. Environ. Manage. 1990, 14, 107-113.
(14) Francek, M. A. Environ. Pollut. 1992, 76, 251-257.
(15) Mielke, H. W.; Anderson, J. C.; Berry, K. J.; Mielke, P. W., Jr.; Chaney, R. L.; Leech, M. Am. J. Public Health 1983, 73 (12), 1366-1369.
(16) Mielke, H. W.; Adams, J. L.; Reagan, P. L.; Mielke, P. W., Jr. Environ. Geochem. Health 1989, 9, 253-271.
(17) Mielke, H. W. Appl. Geochem. 1993, 7 (Suppl. 2), 257-262.
(18) Mielke, H. W. Am. J. Public Health 1991, 81 (10), 1342.
(19) Lovering, T. G., Ed. Lead in the Environment; Geological Survey Professional Paper 975; USGS: Washington, DC, 1976.
(20) Nriagu, J. O. Lead in soils, sediments and major rock types. In The Biogeochemistry of Lead in the Environment, A; Nriagu, J. O., Ed.; Elsevier/North Holland Biomedical Press: New York, 1978; pp 15-72.
(21) U.S. EPA. Update on Soil Lead Cleanup Guidance; Memorandum from Don R. Clay, Assistant Administrator, Office of Solid Waste and Emergency Response; OSWER Directive 9355.4-02; U.S. EPA: Washington, DC, 1991.
(22) Royer, M. D.; Selvakumar, A.; Gaire, R. J. Air Waste Manage. Assoc. 1992, 42 (7), 970-980.
(23) U.S. EPA. Industrial Source Complex (ISC) Dispersion Model User's Guide, 2nd ed.; EPA-450/4-88-002a; U.S. Environmental Protection Agency Office of Air Quality Planning and Standards: Research Triangle Park, NC, 1987.
(24) U.S. EPA. Guideline on Air Quality Models (Revised) and Supplement A; EPA-450/2-78-027R; U.S. Environmental Protection Agency Office of Air Quality Planning and Standards: Research Triangle Park, NC, 1987.
(25) California Air Pollution Control Officers Association. Toxic Air Pollutant Source Assessment Manual for California Air Pollution Control Districts and Applicants for Air Pollution Control District Permits; CAPCOA: Sacramento, CA, 1987.
(26) Cross, B. E. Deposition Rate Calculations for Air Toxics Risk Assessments in California. Presented at the 81st Annual Meeting of the Air Pollution Control Association, Dallas, TX, 1988.
(27) Pinnick, R.; Fernandez, G.; Hinds, B. D.; Schaefer, R. W.; Pendleton, J. D. Aerosol Sci. Technol. 1985, 4, 99-121.
(28) Deaton, M. L.; Reynolds, M. R., Jr.; Myers, R. H. Commun. Statistics 1983, B12 (1), 45-66.
(29) Myers, R. H. Classical and Modern Regression with Applications; PWS-Kent Publishing: Boston, 1990.
(30) Advanced GeoServices. Statistical Evaluation and Source Attribution of Soil Lead Data for Marjol Battery Plant, Throop, Pennsylvania; Advanced GeoServices Corp.: Chadds Ford, PA, 1993.
(31) Hunt, A.; Johnson, D. L.; Watt, J. M.; Thornton, I. Environ. Sci. Technol. 1992, 26, 1513-1523.

Received for review April 26, 1994. Revised manuscript received December 28, 1994. Accepted January 9, 1995.⊗

ES940259F

⊗ Abstract published in Advance ACS Abstracts, February 15, 1995.
