Anal. Chem. 2008, 80, 4767–4772
Limitations of Maximum Likelihood Estimation Procedures When a Majority of the Observations Are Below the Limit of Detection

Ram B. Jain* and Richard Y. Wang

Centers for Disease Control and Prevention, Mail Stop F-47, 4770 Buford Highway, Atlanta, Georgia 30341

We evaluated the performance of maximum likelihood estimation procedures to estimate the population mean and standard deviation (SD) of log-transformed data sets containing serum or urinary analytical measurements with 50–80% of observations below the limit of detection (LOD). We found that maximum likelihood procedures are limited in their ability to accurately estimate the population mean and SD when the percent of censored data was large and sample size was small. The means were more likely to be underestimated and the SDs were more likely to be overestimated using these procedures. When the sample size, N, was ≤100 and the percent of observations below the LOD, P, was ≥70%, the procedure without imputations performed better than those with imputations. However, the procedure with multiple imputations performed better than or was comparable to other procedures when N was at least 100. This finding was consistent with the improved estimates of the mean and SD in a data set (N = 113) of polychlorinated biphenyl (PCB) concentrations using multiple imputations. We recommend the use of maximum likelihood procedures with multiple imputation when N ≥ 100 and P < 70%. A maximum likelihood procedure without imputation should be preferred when N < 100 and P ≥ 70%. However, it should be expected that biases for both mean and SD in these circumstances may be unacceptably high.
In a recent investigation,1 we evaluated the performance of three maximum likelihood procedures, one with a single imputation (LUBIN_1), one with five imputations (LUBIN_5), and one without any imputation (LYNN_MLM), based on earlier work by Lubin et al.2 and Lynn,3 respectively, in estimating observations below the limit of detection (LOD) when the percentage of observations below the LOD was ≤40% in the data set. We extend our previous study by evaluating the performance of these estimation procedures in data sets where the percentage of observations below the LOD is as high as 80%, using the same simulation methodology and log-transformed data from NHANES (2003–2004).4

The same four chemicals as in our previous study1 were selected for this study. After the original data were log-transformed, they were tested for normality using SAS Proc UNIVARIATE and were found to be approximately normally distributed based on the Kolmogorov-Smirnov D-statistic. Specifically, the chemicals used, with the sample size, mean, and standard deviation (SD) of the log-transformed data, were serum sodium (N = 6491, µ = 2.14 mmol/L, σ = 0.006 mmol/L), serum iron (N = 2904, µ = 1.84 µg/dL, σ = 0.22 µg/dL), urine albumin (N = 7739, µ = 0.97 µg/dL, σ = 0.55 µg/dL), and blood lead (N = 8373, µ = 0.18 µg/dL, σ = 0.29 µg/dL). Thus, wherever we mention “mean” and “SD” in this manuscript, we refer to the mean and the SD of the log-transformed data. Data sets with more than 80% of observations below the LOD were not evaluated because summary statistics, such as the mean and SD, are unlikely to be estimated with an acceptable bias in such cases, irrespective of the estimation method applied.

As in the previous study,1 performance of the three maximum likelihood procedures was evaluated by the relative percent bias averaged over 500 samples for both mean and SD and by the coverage rates for the mean and SD. Coverage rate for the mean was defined as the percent of times the sample mean was not statistically significantly different from the population mean, and the coverage rate for SD was similarly defined. Also, we used the same1 field investigation data on serum levels of polychlorinated biphenyls PCB52, PCB74, PCB99, and PCB118 to evaluate the performance of the recommended procedures (see Table 1). The details of the experimental methods for this study are discussed elsewhere,1 so we proceed directly to the results.

* Corresponding author. E-mail: [email protected]. Fax: 770-488-8150. Phone: 770-488-5002.
(1) Jain, R. B.; Caudill, S. P.; Wang, R. Y.; Monsel, E. Anal. Chem. 2008, 80, 1124–1132.
(2) Lubin, J. H.; Colt, J. S.; Camann, D.; Davis, S.; Cerhan, J. R.; Severson, R. K.; Bernstein, L.; Hartge, P. Environ. Health Perspect. 2004, 112, 1691–1696.
(3) Lynn, H. S. Stat. Med. 2001, 20, 33–45.
(4) National Center for Health Statistics. http://www.cdc.gov/nchs/about/major/nhanes/nhanes20032004/lab03_04.htm (accessed Oct 16, 2006).
10.1021/ac8003743 Not subject to U.S. Copyright. Publ. 2008 Am. Chem. Soc. Published on Web 05/20/2008

RESULTS

Results are presented in Tables 2–5, one for each chemical data set used in this study. Each table presents the average percent relative bias of the mean and SD by the expected percent of observations below the LOD, P, and sample size, N. The percent coverage rates, defined as the percent of times that the estimated means and SDs were not statistically significantly different from µ and σ, respectively, are given by N and P for LUBIN_5 in Figure 1. Coverage rates for LUBIN_1 and LYNN_MLM are not presented because they were consistently unacceptable for all N and P.
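The evaluation described above can be illustrated with a minimal Python sketch. It is not the SAS implementation used in this study: it assumes a left-censored normal likelihood for the log-transformed data, maximizes it with an EM iteration (our choice for a dependency-free example), and assumes the LOD is placed at the P-th quantile of the population distribution. The names `censored_normal_mle` and `average_percent_relative_bias` are hypothetical.

```python
import random
from statistics import NormalDist, mean, pstdev

STD_NORMAL = NormalDist()  # standard normal, used for pdf/cdf in the E-step


def censored_normal_mle(detected, n_censored, lod, max_iter=1000, tol=1e-10):
    """EM estimate of (mu, sigma) for normal data left-censored at `lod`.

    Hypothetical helper: only the detected values and the count of
    observations below the LOD are used; nothing is imputed.
    """
    if len(detected) < 2:
        raise ValueError("need at least two detected observations")
    n = len(detected) + n_censored
    sum_x = sum(detected)
    mu, sigma = mean(detected), pstdev(detected) or 1e-6
    for _ in range(max_iter):
        # E-step: first two moments of a normal truncated above at the LOD.
        alpha = (lod - mu) / sigma
        ratio = STD_NORMAL.pdf(alpha) / max(STD_NORMAL.cdf(alpha), 1e-300)
        cens_mean = mu - sigma * ratio
        cens_var = max(sigma**2 * (1.0 - alpha * ratio - ratio**2), 0.0)
        # M-step: complete-data ML updates with the expected moments plugged in.
        new_mu = (sum_x + n_censored * cens_mean) / n
        ss = sum((x - new_mu) ** 2 for x in detected)
        ss += n_censored * (cens_var + (cens_mean - new_mu) ** 2)
        new_sigma = max((ss / n) ** 0.5, 1e-12)
        converged = abs(new_mu - mu) < tol and abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return mu, sigma


def average_percent_relative_bias(mu, sigma, n, p_censored, n_samples=500, seed=0):
    """Average percent relative bias of the estimated mean and SD.

    Assumption: the LOD sits at the `p_censored` quantile of N(mu, sigma),
    so that the expected fraction of observations below it is `p_censored`.
    """
    rng = random.Random(seed)
    lod = NormalDist(mu, sigma).inv_cdf(p_censored)
    bias_mu = bias_sigma = 0.0
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        detected = [x for x in sample if x >= lod]
        est_mu, est_sigma = censored_normal_mle(detected, n - len(detected), lod)
        bias_mu += 100.0 * (est_mu - mu) / mu
        bias_sigma += 100.0 * (est_sigma - sigma) / sigma
    return bias_mu / n_samples, bias_sigma / n_samples


if __name__ == "__main__":
    # Serum iron parameters from the text; N = 100 and P = 60% chosen for illustration.
    print(average_percent_relative_bias(1.84, 0.22, n=100, p_censored=0.60))
```

The sketch raises an error when fewer than two observations are detected, which becomes likely for small N and large P; how such samples were handled in the original simulations is not stated here.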
Analytical Chemistry, Vol. 80, No. 12, June 15, 2008
Table 1. Actual and Estimated Means and Standard Deviations for Log10-Transformed Serum Polychlorinated Biphenyl (PCB) Levels by Percent Censored Values, P

                  actual              parameter estimates by procedure          percent bias by procedure
                  parameters          LYNN_MLM      LUBIN_1       LUBIN_5       LYNN_MLM        LUBIN_1         LUBIN_5
variable (units)  mean  SD    P (%)a  mean   SD     mean   SD     mean   SD     mean    SD      mean    SD      mean    SD
PCB52             0.54  0.26  50      0.60a  0.18   0.53   0.26   0.52   0.26   11.1    -30.8   -1.9    0.0     -3.7    1.0
                              60      0.63   0.16   0.53   0.25   0.54   0.24   16.7    -38.5   -1.9    -3.8    0.0     -6.3
                              70      0.68   0.13   0.53   0.25   0.56   0.23   25.9    -50.0   -1.9    -3.8    3.7     -11.3
                              80      0.74   0.10   0.53   0.25   0.57   0.22   37.0    -61.5   -1.9    -3.8    5.6     -14.9
PCB74             0.84  0.37  50      0.96   0.24   0.86   0.35   0.84   0.38   14.3    -35.1   2.4     -5.4    0.0     1.7
                              60      1.00   0.21   0.83   0.41   0.84   0.38   19.0    -43.2   -1.2    10.8    0.0     2.9
                              70      1.06   0.18   0.85   0.38   0.85   0.37   26.2    -51.4   1.2     2.7     1.2     -0.2
                              80      1.14   0.13   0.87   0.34   0.86   0.36   35.7    -64.9   3.6     -8.1    2.4     -2.5
PCB99             0.72  0.35  50      0.81   0.26   0.73   0.34   0.71   0.38   12.5    -25.7   1.4     -2.9    -1.4    8.1
                              60      0.86   0.22   0.72   0.35   0.72   0.37   19.4    -37.1   0.0     0.0     0.0     6.6
                              70      0.92   0.18   0.76   0.31   0.72   0.36   27.8    -48.6   5.6     -11.4   0.0     4.0
                              80      1.02   0.12   0.75   0.32   0.70   0.36   41.7    -65.7   4.2     -8.6    -2.8    2.8
PCB118            0.86  0.40  50      0.98   0.29   0.87   0.40   0.87   0.42   14.0    -27.5   1.2     0.0     1.2     4.4
                              60      1.04   0.25   0.92   0.35   0.89   0.40   20.9    -37.5   7.0     -12.5   3.5     -0.2
                              70      1.11   0.20   0.88   0.39   0.83   0.45   29.1    -50.0   2.3     -2.5    -3.5    13.4
                              80      1.20   0.15   0.89   0.38   0.84   0.44   39.5    -62.5   3.5     -5.0    -2.3    10.5

a Bold values are statistically significantly different from population parameters at α = 5%.
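The percent bias columns of Table 1 appear to follow the usual definition, 100 × (estimate − actual) / actual. The short sketch below reproduces two entries for PCB52 at P = 50% from the rounded values in the table; the function name `percent_bias` is ours, not the authors'.

```python
def percent_bias(estimate, actual):
    """Relative bias of an estimate, in percent of the actual value."""
    return 100.0 * (estimate - actual) / actual


# PCB52 at P = 50%: actual mean 0.54 and actual SD 0.26 (Table 1).
print(round(percent_bias(0.60, 0.54), 1))  # LYNN_MLM mean -> 11.1
print(round(percent_bias(0.18, 0.26), 1))  # LYNN_MLM SD   -> -30.8
```

A few tabulated biases (for example, 1.0 for the LUBIN_5 SD of PCB52 at P = 50%, where both rounded SDs are 0.26) cannot be recovered this way, presumably because they were computed from unrounded estimates.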
Table 2. Average Percent Relative Bias of Mean and SD for Serum Sodium (µ = 2.14 mmol/L, σ = 0.006 mmol/L) by Sample Size N and Percent Censored Observations P

                    average percent relative bias of mean   average percent relative bias of SD
procedure   P (%)   N = 20    N = 40    N = 100   N = 200   N = 20     N = 40     N = 100    N = 200
LYNN_MLM    50      -0.0178   -0.0067   -0.0011   -0.0019   1.2110     1.4335     0.3095     0.0697
LYNN_MLM    60      -0.0088   -0.0084   -0.0018   -0.0026   -2.4432    1.1193     -0.0019    0.3990
LYNN_MLM    70      -0.0104   -0.0087   -0.0058   -0.0031   -2.2180    1.2206     0.9604     0.5023
LYNN_MLM    80      -0.0041   -0.0075   -0.0033   0.0044    -1.9758    -2.4823    -1.0467    -1.4352
LUBIN_1     50      -0.0403   -0.0030   0.0010    -0.0013   4.0675     0.0089     -0.5829    -0.1713
LUBIN_1     60      -0.0019   -0.0063   -0.0003   -0.0002   -5.9087    0.0121     -0.6938    -0.3149
LUBIN_1     70      -0.0428   -0.0089   -0.0049   -0.0016   3.1133     0.9658     0.6481     0.3073
LUBIN_1     80      -0.5061   -0.0957   0.0006    0.0053    67.9408    7.0321     -1.7108    -2.0265
LUBIN_5     50      -0.0301   -0.0042   -0.0005   -0.0015   8.4778     1.3458     0.3652     0.0866
LUBIN_5     60      -0.0275   -0.0078   0.0011    -0.0015   5.2713     1.8308     -0.6763    0.2515
LUBIN_5     70      -0.1247   -0.0065   -0.0044   -0.0023   40.7731    2.1145     1.0976     0.7175
LUBIN_5     80      -0.5442   -0.0670   -0.0009   0.0039    149.6061   14.7580    -0.4404    -0.9067
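To make the "expected percent of observations below the LOD" concrete, the sketch below computes the censoring threshold that would yield each P for the serum sodium parameters in Table 2. It assumes the LOD is set at the P-th quantile of a normal distribution with the stated µ and σ; this section does not confirm that the simulations fixed the LOD in exactly this way, and `lod_for_expected_censoring` is a hypothetical name.

```python
from statistics import NormalDist


def lod_for_expected_censoring(mu, sigma, percent_below_lod):
    """LOD such that `percent_below_lod` percent of N(mu, sigma) falls below it."""
    return NormalDist(mu, sigma).inv_cdf(percent_below_lod / 100.0)


# Serum sodium, log-transformed scale: mu = 2.14, sigma = 0.006 (Table 2).
for p in (50, 60, 70, 80):
    print(p, round(lod_for_expected_censoring(2.14, 0.006, p), 4))
```

Because σ is very small relative to µ for serum sodium, the thresholds for P = 50% through 80% lie within about one SD of the mean, which is one way to see why the relative bias of the mean in Table 2 is so small compared with that of the SD.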
Average Percent Relative Bias of the Mean and SD. The absolute average relative bias of the mean was