Uncertainty in risk assessment A case
sm involving sulfir transport and health effects
M. Granger Morgan Max Henrion Carnegie-Mellon Universiry Pinsburgh, Pa. 15213 Samuel C . Morris Bmkhaven National Laboratory Upton, N. X 11973 Deborah A. L. Amaral Decision Focus Los Altos, Cal$ 94022 Environmental risk assessment often involves the use of models, the results of which are sometimes used in regulatory decisions or in the drafting of legislation. Because of limited scientific understanding, however, interpretation of models often involves uncertainty. For example, there may be uncertainty about what value to use for specific quantities or coefficients in the model, such as an oxidation rate or the slope of a health effects function. There also may be uncertainty about the actual mathematical equations that should be used to define the model. Experts frequently have different opinions about which coefficient values and model equations should be used. Because they can have important imlution, how much difference does it plications, such uncertainties must not make which health effects experts be ignored. Fortunately, a variety of one consults? techniques for dealing with them are “Objective” evidence often is inade now available (1-6). In a study of the qnate for defining the structure a policy effects of long-range transport of SUImodel should have, the value specific fur-containing air pollution from a coal- model coefficients should assume, and fired power plant (7, s), we refined and the nature and extent of associated undemonstrated a number of these tech- certainties. Under such circumstances, niques. In this example the various often the best thiig one can do is ask opinions of atmospheric scientists all experts to use the evidence available, in led to similar conclusions about the conjunction with their scientific howllong-range transport of sulfur air pollu- edge, to make subjective judgments. tion. However, the various opinions of When such judgments are included in health experts led to divergent conclu- models they should not be hidden but sions about the level of mortality that should be set forth careiidly, openly, and explicitly. could be caused by this pollution. We asked ourselves two questions: If the uncertainty is not too great it is To estimate the average annual mass appropriate to represent uncertainty balance of sulfur in the plume of a about the value of the model coefilarge power plant, how much differ- cients or factors with subjective probaence does it make which atmospheric blity distributions, that is, with a forscience experts one consults about mal mathematical statement of the oxidation rates? “odds” an expert would give that the To assess possible human mortality coefficient actually does have the variarising from exposure to this air pol- ous values he thinks it might have. 882 Environ. Sci.Technol., Vol.
ls, No. 8, 1985
l l _ l _
i
Then, either by means of analytical techniques or using a statistical procedure called stochastic simulation, the uncertainties can be carried through the model calculations and displayed in the model’s results. Stochastic simulation draws samples from the probability distributions that describe each of the uncertain coefficients. These sample values are used to m the model and get an answer. The answer is stored away, and the process is repeated many times. A probability distribution that describes the uncertainty in the model’s output can be constructed using a compilation ofalltheanswers. When experts disagree, the analyst should be careful not to combine their views too quickly. If, despite modest disagreement, the views of different experts lead to roughly the same policy conclusions, then we believe it is a p propriate to combine views. On the other hand, when different experts’ views lead to significantly different conclusions, analysts should not mask
0013-938W85/09190662~1.50/0 0 1985 American Chemical Society
or ignore these dif€erences but should instead focus on understanding and evaluating their implications.
~~
FIGURE 1
Analysis technique
The sulfur example Our analysis involved a hypothetical new 1-GW (1000-MW)sulfurscrubbed, coal-fired power plant in the upper Ohio Valley. The annual average concentration of sulfate aerosol was chosen as the air pollutant of interest Lwaw it was the only sulfur-related pollutant for which at least some of the health experts we consnlted believed they could provide quantitative estimates. This choice of pollutant clearly l i i t e d the generality of the conclusions that could be reached. The general form of the analysis is illustrated in Figure 1. The work performed was divided into three broad tasks: developing mass balance models for multiday transport, developing models to estimate population exposure.~, and developing models to estimate rates of mortality arising from thii exposure. The details of each task are reported elsewhere (7-9). Tbe atmospheric science and health effects experts who participated in this study were chosen from among the leading U.S.scientistsworking in these fields. We wanted to work with experts who represented the full range of major scientific views. However, because we did not intend to combme the results from different experts we were not concerned with obtaining a statistically representative sample of experts. That is, we did not try to make the number of our experts who had any given view proportional to the number of all experts in the world who share that view.
Mass balance models Separate mass balance models for multiday transport were developed to reflect the individual judgments about model form and coefficient values of Seven atmospheric science experts participating in the study. ' h o other experts were unable to supply a set of judgments complete enough to allow us to construct models. Although we imposed no constraints on the form of the models proposed by the experts, all models involved firstorder chemistry. For any given time and location the rate of conversion of sulfur dioxide into sulfur aerosol could be expressed in terms of the percentage of remaining sulfur dioxide converted each hour. Consequently, we were able to incorporate each of the seven specific models within a general modeling framework we called Boxw. The Boxcar modeling framework represents the plume as a string of 96 boxes. Each hour, pollution is moved from one box to the next down the
string. The height to which turbulence canses mixing at the bottom of the atmosphere-that is, the mixing layer height-varies with time of day and season. It is important to keep track of how much of the pollution is above or below this layer. To do this each of the 96 boxes in the model framework is divided into an upper and lower box. Dry depsition of sulfur dioxide and sulfate aerosol comes from the lower layer. Precipitation scavenging or rainout is modeled stochastically from both layers on a seasonal basis. The fraction of sulfur emitted by the plant as sulfate aerosol and the various coefficients required to describe the rate of oxidation of sulfur dioxide as a function of time of day, distance from the plant, and season of the year were obtained or "elicited" as subiective probability distributions from each expert. The distributions describing the experts' judgments about oxidation rates are- s ' in Figure 2; further details are available in the literature (7, 8). Coefficients of dry deposition and precipitation scavenging were
represented as subjective probability distributions developed by the authors after an extensive review of the literature and consultation with the experts. A stochastic simulation of the Boxcar model allows one to estimate the uncertainty concerning the amount of sulfur dioxide and sulfate present as a function of the time of flight of the pollution plume. An example for the case of atmospheric science expert No. 1 is shown in Figure 3. The total mass balance as a function of flight time also was computed, and the results for the case of expert No. 1 are illustrated in Figure 4. We have compared similar results obtained from the individual models d e veloped for each of the seven atmcspheric science experts (7, 8). We conclude that the uncertainty in the output of the model for any given expert is the same as, or sometimes greater than, the differences between the outputs of models of different experts. Thus, to arrive at an average of the annual mass balance as a function of plume flight time it did not matter Envimn. Sci. Technol., MI. 19. NO. 8.1985 66-
FIGURE 2
Comparison of curnulathre pmbaMllty dislributions.
0
1
Average oxidation rate, Wh Winter numb
1.o
[
0.4
&
0.4
0.2
8
0.2
rn
n
E'
5
0
1.O
0.8
1 1: .g
0
o
i
i
i
i
i
i
i
0.6
0 0
Average oxidation rate, %h
PmbaMlity density functions for total sulfur emitted from the plan(.
Fraction of sulfur as So,
1
2
3
4
5
6
7
Annual average percentaga of sulfur smiited as primary sulfate
FIGURE 3
664 Envimn. Sci. Technol.. Vol. 19. NO. 8.1985
3
2
Average oxidation rate, Wh
Fraction of sulfur as S O ; '
which member of the set we consulted, provided we asked for a tidl characterization of the uncertainty about oxidation rates associated with their judgments. A sensitivity analysis has shown that this conclusion does not depend strongly on the specific form of distribution used to characterizeloss rate coefficients (7, s). &hating population dose The Boxcar model estimates mass balance with flight time, but it does not account for tlight trajectory. To estimate population dose, it was necessary to "wed" the Boxcar model to one of the standard plume trajectory models. Because our mortality models contained thresholds, we needed to make estimates of dose that were conditional on a variety of thresholds, both in the level of background sulfate concenmtion and in the population density. We constructed a grid of population densities for the eastern US. and Canada on a map with O S 0 grid cells (8). The 1980U.S.and 1981 Canadiancen-
FIGURE 4
Mass balance as a function of plume flight time
100% loss
06-
100%
so4-2
FIGURE 5
Estimates of annual mortality from chronic exposure to sulfate air
pollution.
7
Health expert 5
7
,
,
,'Heanh expert 1
- Atmospheric expert 1 _____ Atmospheric expert 6 10'
iv Deaths&
sus data were used as the basis for the grid. A similar array was constructed for background sulfate concentration on the basis of 1978 Sulfate Regional Experiment (SURE) data (10).Because the SURE study measured rural background concentration levels only, overall background levels have undoubtedly been underestimated in our work. A set of 384 plume trajectories (four times each day for four days each month in 1976 and 1979, under meteorological conditions as they existed during those years) was produced with the Air Resources Laboratory atmospheric transport and dispersion model (11). Each trajectory was mapped onto the population density and ambient sul-
fate grids. Downwind population densities above threshold values were averaged over all trajectories at each time interval. These average population densities were combined with the corresponding 96 hourly Boxcar segments and then integrated to obtain an estimate of incremental population dose (8). Exposures were not adjusted for indoor vs. outdoor differentials because these were already considered in the development of the mortality models. Mortality models and estimates We developed models of expected mortality based on interviews with health effects experts. The interviews were similar to those we conducted
with the atmospheric scientists. Seven experts were consulted-two inhalation toxicologists, two epidemiologists, and three who have studied historical mortality records by means of regression techniques. Unlike the pollution models developed by the atmospheric scientists, the mortality models provided by the health experts for chronic steady-state exposure of the general public to specified sulfate levels were so diverse that it was impossible to construct a singlemodel framework. Separate stochastic simulation models were Constructed for each expert's judgments using general policy-modeling s o f t w a r e , called Demos, which was developed at Carnegie-Mellon University (12). One expert supplied a morbidity model, which, as reported elsewhere, was compared with the mortality models (8, 9). ' b o experts were unable to provide quantitative models. Using the procedure set forth in Figure I , we combined the three groups of models to produce estimates of mortality, An illustration of the results for a plant in Pittsburgh, Pa., is shown in Figure 5. Results from a plant in Cincinnati, Ohio, were similar (7, 8). We can set a fairly high upper boundary of a few thousand excess deaths per year on the level of mortality. However, between this upper boundary and the lower boundary of no effect, there is no agreement among our set of health effects experts concerning the likely mortality that would result from exposure to sulfate. Within even this small group, almost any answer is possibleincluding an answer that with 100% probability there is no increase in mortality-depending on which expert is consulted. In contrast to the case of the sulfur transport models, uncertainty arising from differences among the models of the health effects experts is generally larger than the uncertainty associated with the model output of any single expert. The effect of uncertainty about atmospheric processes is small compared with the uncertainty about mortality. Further refinement of atmospheric and exposure models at this stage would d o little to reduce uncertainty about average annual mortality. Comparing t h e experts There are significant differences between the two groups of experts we worked with in this study. The atmospheric scientists share similar disciplinary skills and a large, common literature. When asked about details of other experts' work, they gave specific comments that made it clear that they were intimately familiar with studies outside their immediate areas. Enviran. Sci. Technol.. Vol. 19, No. 8. 1985 665
Eliciting subjective judgments from experts Although the classical view of probability is based on I..,...".. ,,rr(ur,,r)r
,.' "I
-CYl,r,lrr.
.I....*
L l l r l r
^.+"". ,a "" a,, i""".~"'La,'L i"
^.A
(18,"
growing school of thought that views probability as r e p resenting a statement of degree of belief. The two views are not mutually exclusive. Indeed. the field of Bayesian statistics includes the former as a subset of the latter. If one adopts the view that a statement of probability can be a statement of degree of belief, then a probability distribution provides a convenient vehicle for experts to describe their judgments about how well or how poorly the value of a given coefficient is known. How should one go about asking an expert to pro. vide such a judgment? There are numerous articles on experimental psychology that suggest that in making judgments about uncertainty, persons unconsciously use a number of simple rules of thumb. Ordinarily, this serves well. But in some situations these rules can introduce significant cognitive biases. For example, there is a common tendency toward overconfidence, that is, underestimating the uncertainty in an uncertain quantity Thus, clearly the first step to take before asking an expert to provide subjective judgments for use in risk assessment is to look carefully at psychological literature to design an interview that steers clear of obvious pitfalls and that minimizes possible cognitive biases. Serious risk assessors should not be interested in the snap judgmenits of experts. Risk assessors need careful, critically c:onsidered, professional judgments, This lakes time and thought and requires an interviewing team that kncW I S the relevant scientific literature well enough to hc?Ip the expert think things through from all possible angles. .. , . . . . . For example, in prepararion lor inrerviewina me armospheric science expc?rts, we assembled file of most of the literature, prc?paredone- or two.paqe summaries of all the importanI t recent papers, and prepared simple, graphic aids summartzingthe results of most of these papers on a few I arge pages. The first several hours of each interview were devoted to an extended technical discussion in INhich we had the expert d e scribe his or her current thinking about the state of the field. We then turned to a more specific discussion of the form our model shoulId assume . ., .and the coefficients .. that we would need to a m v tnat mooei. univ aner several hours of technical'di&ussion did we gei down to the details of eliciting specific subjective probability distributions. TO elicit experts' judgments. it is important first to establish very clearly and specifically what quantity is being considered. Once this is done, the first step in the actual elicitation is to establish the outer limits of
a
..
the range of values the quantity might assume. To do *hi" ,, it ic uIIvII **tan ktal"'l.l 1- ctim.. n,F,,,,U3 ~ur.,,,,ulate the expert's thinking: "You have estimated that the maximum value this coefficient could assume is six. Now suppose that you leave this field for a few years, and when you come value back to it, you learn that the generally accepted . .. . for this coefficient is 6.5. Can you think of any plausible way this miqht happen?" There are seveial reasons to start with a discussion of the extremes rather than the expert's best estimate. Among other things, this can help offset a tendency to produce too narrow a distribution because of the operation of anchoring andadjustment, that is, making judgments by starting with some initial value, such as a first guess, and then adjusting the estimate in light of various factors that were not previously considered. However, there is considerable experimental evidence that persons often do not adjust their estimates enough, so the initial value tends to "anchor" the final estimate. The elicitation of the expert's judgment generally proceeds with questions of the form, "What is the probability that the value of the quantity is greater than [or less than] four?" Points normally are chosen in random order, and experts are asked lo try to treat each question separately In the elicitations performed for this article, we rarely obtained scattered or inconsistent results. When we did, we asked additional questions, :iometimes shifting to elicitation of the value of a coeffi(:ient given a fixed probability of occurrence. A mental aid commonly used in elicitation is the [xobability wheel. This is a disk with a colored sector t hat can bevaried from 0 ' to 360O.Subjects are asked I,o set the size of the colored section so that the chance [hat a spinner would land on the colored portion is the same as the pro&ibility that should be iltti3ched to the question being an!swered. Once enough pi,ints had been elicited, v ve showed a ._ - cumuiauve ,-.: UISLIIUULIUII rough plot of these.puinrs as a curve to the expert and asked whether it looked appro. priate. After the session was over, members of the elicitation team examined the elicited points and their own notes from the session and constructed smooth curves. These were compared, differences were resolved, and then digitized plots of probability density . .. . . . . . . . . .. , .. funciions ana cumuiaiive aisiriouiion iuncuons were sent back to the expert for further review and possible refinement. There is nothing magic about the elicitation of experts' judgments. but doing so satisfactorily requires knowledge, experience, and good judgment. It is not something that should be undertaken lightly or without adequate preparation
_.:_..
_I:_._:L...:__
Additional reading For a general introduction to the problems of making judgments that involve uncertainty and to the specific topics of heuristics and cognitive bias, we suggest the following: Kahneman, D.; Slovic, I?;Tversky, A,, Eds. "Judgmen1 under Uncertainty: Heuristics and Biases"; Cambridge University Press: Cambridge, U.K.. 1982. Fur specific discussions of expert elicitation, we s u p gest the following: Morgan, M. G.; Henrion. M.; Morris, S. C. "Expert
666 Envimn. Sci. Technol.. MI. 19. No. 8. 1985
Judgments for Policy Analysis." Brwkhaven National Laboratory Report BNL 51358; Brookhaven National Laboratory: Upton, N.Y.,1980. Stael von Holstein, C.-AS.; Matheson. J. E. "A Manual for Encoding Probability Distributions"; SRI International: Menlo Park, Calif., 1979. Boyd, D. W.; Regulinski, S. G. "Characterizing Uncertainty in Technology Cost and Performance." R e port ii14; Decision Focus: Palo Alto, Calif., 1979.
In contrast, the health effects experts have a broader range of disciplinary backgrounds. Their literature is more diverse, incomplete, and ambiguous, and many experts are unfamiliar with details of studies outside their particular areas. There is less agreement about what evidence is important or relevant; this is reflected clearly in the range of responses that were obtained from the health experts.
No substitute for science With this example w e have demonstrated a number of techniques for characterizing and dealing with uncertainty in quantitative environmental risk analysis. We believe that these and similar techniques are widely applicable. However, there is some potential for their misuse (2). The most important risk is that, because the techniques yield technical-looking results, they may be inap propriately used as a low-cost substitute for actual scientific research. The set of techniques for including uncertainty in environmental modeling discussed in this article allow one to make fuller use available scientific knowledge and i u limitations in decisions that must be made now-as we wait for science to provide more complete understanding in the future. We believe that through peer review and other professional mechanisms for
Dealing with uncertainty at EPA Savwrai ywam ego crns umcw or nlr Quality Planning and Standards began to consider using expert subjective probability distributions and other techniques for handling uncertainty in environmental models. in support of the development of revised ambient air quality standards. Staff members encountered a variety of difficulties in their first attempts, which were made during the revision of the oxidant standard. However, they persisted. They sought the help of a number 01 leading decision analysts. psychologists. and other experts and have now d e veioped a fairly sophisticated pro. gram in this area. Recently. subjective probability distributions on the health effects of lead were obtained from a group of experts. These distributions describe the experts’ judgments of the likelihood that various levels of lead in blood can give rise to various specific health end points. The officeplans to use these distri butions as one factor in its consideration of the revised air quality stand. aid for lead.
quality control, the dangers arising from the possible misuse of these techniques can be minimized. Moreover, w e believe these dangers are outweighed by the substantial contributions wider use of such techniques can make in raising the quality of quantitative environmental risk analysis. Acknowledgment We thank the following atmospheric science and air pollution health experts for participating in our study: A. Alkezweeny, M. Amdur, S.Bozzo, R. Faster, B. Ferris, J. Forrest, R. Frank, N. Gillani, D. Hegg, R. Husar. T. Larson, L. Lave, E Lipfert, L. Newman, S. Schwartz, and E Speizer. In addition we are grateful for the collaboration of W. Rish and for the assistance provided by C. Bankovilz, W. Clark, 1. Congemi, E. Coveny, C. Cunard, C. Davidson, G. Duncan, C. Fastman. B. Fischhoff, E. Fisher, L. Hamilton, S. Johnson, F! Michael, E. Morgan, S. Morris, B. Nieman, S. Rocco, W. Sholar, F! Steranchak, and H. Thode, Jr. This work was supported primarily by National Science Foundation Grant PRA-7913070. Support also came from the Health and Environmental Risk Analysis Program of the Office of Health and Environmental Research, U.S.Department of Energy. Before publication, this article was reviewed for suitability as an ES&T feature by Talbot Page, Environmental Engineering Science, California Institute of Technology, Pasadena, Calif. 91125 and Thomas H. Moss, Director, Research Administration, Case Western Reserve University, Cleveland, Ohio 44106.
Atmwphcric Transport and Di$pcrsmn Model.“ NOAA Technical Memorandum ARL-SI. Air Resources Laboratorb. National &eanic and Atmospheric A d m k s t r a tion: Silver Spring, Md.,.1980. (12) Henrion, M.; Morgan, M. 0.RiskAnal., Val. 5 No. 3. 1985.
M . GmnRer Morgan (I.) hcwds rhe Dcparrmenr of b i g i n w r i n g and Public Policy ar Carnegie-Mellon Universirv. He received his Ph.D. from rhe Deparrmenr of Applied Physics and Informarion Science or rhe Universiry of California ar San Diego in 1969. His researchfocuses on problem in technology and public policy. He is parricu k d y inreresred in rechniques for characrerizing and dealing qunnrirarively wirh uncerrainry in policy analysis.
Referenm
Max Henrion ( r ) is an assisranr professor in rhe Departments of Engineering and Public Policy and Social Science at Carnegie-Mellon Universiry. He received his Ph.D. from rhe School of Urban and Public Affairs ar Carnegie-Mellonin 1982. His research is on decision rheory and rhe design of compurer aids for decision d i n g under uncerrainry. He is currenrly involved in rhe applicarion of rhese ideas ro probl e m of environmenral policy. including acid deposition.
( I ) Morgan, M.0.et al. “A Generic ‘PreHEED’ on Characterizing and Dealing with UncerLlinN in Health and Environmental Risk Assc&ncnr,” report prepared for the Health and Environmental Risk Assessment Program, U.S. Department of Energy; Department of Engineering and Public Policy, Carnegie-Mellon University: Pittsburgh, Pa., 1983. (2) Morgan, M. 0.In “Assessment and Management of Chemical Risks”; Rodericks. 1. V.; Tardiff, R. G . . Ms.; ACS Symposium Series 239. American Chemical Society: Washington, D.C., 1984; Chapter 8, pp. 113-29. (3) “Reactor Safety Study: An Assessment of Accident Risks in U.S. Commercial Power Plants.” NUREG 751014; Nuclear Regulatory Commission: Washington. D.C.. 1975. (4) Howard. R.; Matheson. I.; North, W. Sciencc 1972, 176. 1191-202. (5) North, W.; Merkhofer. M. Compur. Ope,: Res. 1976.3, 185-207. (6) Rotty. R. “Uncertainties Associated with Future Atmosoherie CO, Levels”: Institute for Energy Analyiir. Oak Ridge Arsuciatcd Untvcrsitics Oak Ridge. Ten” 1977 (7) Morgan. M (i et 31 Risk Anal 15%. 4.
Sumuel C. Mom’s (1.) i s wirh rhe Biomedical and Environmental Assessmenr Division of Brookhaven National Laborarory. He i s an adjuncr associareprofessor in rhe Deparrmenr of Engineering and Public Policy ar Carnegie-Mellon Universiry. He received his Sc.D. from rhe Graduare School of Public Health, University of Pinsburgh, in 1973. Morris is currently srudying healrh effects in workers ar a cwl gasifcarion planr in Yugoslavia and projecring potenrial health risk of U.S. synrhericfuel developmenrs.
.
201-16.
Morgan. M. G. et al. “Technological Uncertainty in Policy Analysis.” final report to the National Science Foundation on Grant
(8)
PRA-7913070; Carncgie-Mellan University: Pittsburgh. Pa., 1982. (9) Amaral. D.A.L., Ph.D. Dissertation. Carnegie-Mellon University, Pittsburgh. Pa.. 1983.
Bankovitz. C., Brookhaven National Laboratory, Upton, N.Y.. personal communication. (11) Heffter. 1. “Air Resources Laboratories (IO)
Lkbond~A. L. Anraml (c) joined Decision Focus after a year as an American Chemical Sociery Congressional Fellow in rhe office of Sen. Mar Baucus (D-Monr.). She received her Ph. D. from rhe Department of Engineering and Public Policy ar Carnegie-Mellon Universiry in 1983. During her year as a congressional fellow she worked on the reauthorization of the Clean Air Act. At Decision Focus she has performed a decision analysis on PCBs and rransformers. Environ. Sci. Tmhnol., Vol. 19. NO. 8. 1%
667