Estimating the mean of data sets with nondetectable values

By Curtis C. Travis and Miriam L. Land
July 1, 1990

Interest in the determination of trace levels of contaminants in environmental media has increased with the recognition that even trace levels of pollutants can pose risks to human health and the environment. The analysis of trace-level environmental data is frequently hampered, however, by chemical concentrations that are below the detection limits established by analytical laboratories (1, 2). Such concentrations are generally reported as "less than detection limit" rather than as actual numerical values. In determining the mean of such data, three approaches are commonly used: assume all nondetectable points are equal to zero, assume nondetectable points are equal to half the detection limit, or assume nondetectable points are equal to the detection limit. All these methods introduce a bias and result in erroneous estimates of the mean and standard deviation (1).

A method has been known for some time that circumvents these problems (3, 4). It assumes that measured environmental data represent repeated samples from a lognormal probability distribution where only sample values above the detection limit are known. These values are often enough to define the right-hand tail of the lognormal distribution, from which it is then theoretically possible to reconstruct the entire distribution (and thus obtain knowledge of the mean and standard deviation).

To aid in this analysis, engineers have introduced a graphical approach called log-probit analysis (3, 4). In a log-probit analysis, all measurements (both detectable and nondetectable) are assumed to be samples taken from the same lognormal probability distribution. The assumption of lognormality for environmental data is fairly universal. A probit scale is designed so that samples from a lognormal distribution, when plotted on it, lie on a straight line (Figure 1). In a probit analysis involving both detectable and nondetectable values, the nondetectable values are treated as unknowns, but their percentile values are accounted for. Thus, if there were 100 samples, 30 of which were nondetectable, the first detectable data point would be plotted at the 31st percentile. If sufficient data exist, they can be used, through linear regression, to define the straight line characterizing the entire data set. The geometric mean concentration for the data set (both detectable and nondetectable values) is then determined from the 50th percentile value. Thus, a probit analysis allows the geometric mean to be extrapolated from detectable values even if it is below detection limits (provided there are sufficient detected values to define the probability distribution).

Available probit tables (4) and computerized programs (5-7) facilitate the process of calculating probit values. The geometric mean of the environmental concentrations can be estimated from the point on the regression line corresponding to the 50th percentile, and the geometric standard deviation can be estimated from the antilog of the slope of the regression line.

We briefly present a probit analysis of a heavily censored data set. Because of the extreme toxicity of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), much concern and debate have arisen about human exposure to it. Levels measured in fish taken from lakes and rivers in the United States confirm that TCDD is bioaccumulating in fish and that low-level contamination of fish is widespread. Roughly 65% of the TCDD concentrations in whole fish were below detection limits. The probit plot, however, provides an estimated geometric mean of 0.45 pg/g. This example illustrates one of the primary benefits of probit analysis: its ability to estimate geometric means that are near or below the detection limit.

Log-probit analysis provides a robust method of evaluating environmental data sets with a large percentage of values lying below the detection limit. Limitations of the method have been pointed out (1, 8, 9); however, it is easy to use, and it is less biased and more accurate than other frequently used methods (2). We therefore suggest that it become the method of choice for analyzing trace-level environmental data.
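The regression step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' program: the function names are hypothetical, natural logarithms are used, the standard normal quantile stands in for the classical probit (which is that quantile plus 5), and the plotting position rank/(n + 1) is an assumed convention chosen so that the highest point stays below the 100th percentile. With 100 samples and 30 nondetects it places the first detected value at 31/101, close to the 31st percentile mentioned in the text. The data are synthetic and lie exactly on a log-probit line, so the fit recovers the parameters used to generate them.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal; inv_cdf(p) is the z-score for percentile p


def log_probit_fit(detected, n_nondetect):
    """Regress ln(concentration) on the normal quantile, using detected values only.

    Nondetects enter only by occupying the lowest ranks.
    Returns (geometric_mean, geometric_sd).
    """
    if len(detected) < 2:
        raise ValueError("need at least two detected values to fit a line")
    n_total = len(detected) + n_nondetect
    z, y = [], []
    for i, conc in enumerate(sorted(detected)):
        rank = n_nondetect + i + 1
        # Assumed plotting position rank/(n+1), not taken from the article.
        z.append(_STD_NORMAL.inv_cdf(rank / (n_total + 1)))
        y.append(math.log(conc))
    z_bar = sum(z) / len(z)
    y_bar = sum(y) / len(y)
    sxx = sum((a - z_bar) ** 2 for a in z)
    sxy = sum((a - z_bar) * (b - y_bar) for a, b in zip(z, y))
    slope = sxy / sxx
    intercept = y_bar - slope * z_bar
    # z = 0 is the 50th percentile, so exp(intercept) is the geometric mean;
    # exp(slope) is the geometric standard deviation.
    return math.exp(intercept), math.exp(slope)


def substituted_mean(detected, n_nondetect, fill_value):
    """Arithmetic mean after replacing every nondetect with fill_value."""
    total = sum(detected) + n_nondetect * fill_value
    return total / (len(detected) + n_nondetect)


def ideal_lognormal_sample(mu, sigma, n):
    """Hypothetical test data lying exactly on the log-probit line."""
    return [
        math.exp(mu + sigma * _STD_NORMAL.inv_cdf(k / (n + 1)))
        for k in range(1, n + 1)
    ]


# Synthetic illustration only: 100 values, the lowest 65 treated as nondetects.
sample = ideal_lognormal_sample(mu=0.0, sigma=1.0, n=100)
detected = sample[65:]
n_nondetect = 65
detection_limit = detected[0]  # stand-in: smallest detected value

gm, gsd = log_probit_fit(detected, n_nondetect)
print("log-probit geometric mean:", gm)
print("log-probit geometric SD:  ", gsd)
for label, fill in (("zero", 0.0), ("DL/2", detection_limit / 2), ("DL", detection_limit)):
    print("mean with nondetects =", label, ":", substituted_mean(detected, n_nondetect, fill))
```

The three substituted means differ from one another only through the fill value, which is the arbitrariness the article objects to, whereas the log-probit fit uses the nondetects solely through their ranks. Real data will scatter around the line, so the quality of the fit should be checked before the extrapolated geometric mean is trusted.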

References

(1) Porter, P. S.; Ward, R. C.; Bell, H. F. Environ. Sci. Technol. 1988, 22, 856-61.
(2) Gilliom, R. J.; Helsel, D. R. "Estimation of Distributional Parameters for Censored Trace-Level Water Quality Data"; open file report 84-729; U.S. Geological Survey: Washington, DC, 1984.
(3) Gilbert, R. O. Statistical Methods for Environmental Pollution Monitoring; Van Nostrand Reinhold: New York, 1987; pp. 168, 181-82.
(4) Finney, D. J. Probit Analysis: A Statistical Treatment of the Sigmoid Response Curve, 2nd ed.; Cambridge University Press: London, 1952.
(5) Sette, A. et al. J. Immunol. Methods 1986, 86, 265-77.
(6) Abou-Setta, M. M.; Sorrell, R. W.; Childers, C. C. Bull. Environ. Contam. Toxicol. 1986, 36, 242-49.
(7) SAS Institute Inc. SAS User's Guide: Statistics, Version 5 Edition; SAS: Cary, NC, 1985.
(8) Gilbert, R. O.; Kinnison, R. R. Health Phys. 1981, 40, 377-90.
(9) Lambert, D. J. Am. Stat. Assoc. 1988, 83, 1226-27.

Environ. Sci. Technol., Vol. 24, No. 7, 1990, 961

Curtis C. Travis is the director of the Office of Risk Assessment at the Oak Ridge National Laboratory in Oak Ridge, TN. He has a Ph.D. in mathematics from the University of California-Davis.

Miriam L. Land, an employee of DPRA, Inc., is a subcontractor to the Office of Risk Analysis, Health and Safety Research Division at Oak Ridge National Laboratory. She has a B.A. degree in business from Graceland College and an M.B.A. and a master's degree in statistics from Kansas State University. Her primary research involves mathematical modeling of environmental data.
