Industry-Wide Performance in a Pilot Performance Evaluation Sample Program for Hazardous Materials Laboratories. 1. Elemental Precision and Accuracy

David Eugene Kimbrough* and Janice Wakakuwa

Southern California Laboratory, California Department of Health Services, 1449 West Temple Street, Los Angeles, California 90026-5698

This paper discusses the results of an interlaboratory study of the accuracy and precision of environmental laboratory results from 160 accredited laboratories operating in the state of California. The results from these laboratories were compared with those of a group of reference laboratories using Student's t-test and Snedecor's F-test. The mean results for the two groups were statistically identical, but the variance for the accredited laboratories was significantly higher; the higher variance was due to several sources. Also discussed are the preparation and validation of the performance evaluation samples for the evaluation of hazardous materials laboratories for elemental solid waste analysis. The samples consisted of five soils spiked with arsenic, cadmium, molybdenum, selenium, and thallium.

Introduction

The last 10 years have seen explosive growth in the field of environmental chemistry. Concomitant with this growth has been the development of laboratory accreditation programs for environmental laboratories. It is generally agreed that performance evaluation samples should be an integral part of a comprehensive laboratory accreditation program. The federal government does not accredit environmental laboratories, but almost every state either has or is developing an accreditation program for environmental laboratories. The vast majority of these programs have focused on drinking water and waste water. The Environmental Protection Agency provides performance evaluation samples, the Water Sanitation and Water Pollution samples, for the assessment of these laboratories. While significant progress has been made in laboratory accreditation and in the development of performance evaluation samples for water matrices, there has been very little progress in the field of solid waste or hazardous materials laboratories. Not surprisingly, there has also been little development of performance evaluation samples for the accreditation of hazardous materials laboratories.
This is due, in part, to the scarcity and newness of programs accrediting hazardous materials laboratories and to the relative simplicity of preparing aqueous performance evaluation samples as compared to solid matrix samples. The California Department of Health Services, through its Environmental Laboratory Accreditation Program (ELAP), is responsible for the accreditation of laboratories doing business in the

state analyzing water, waste water, and solid waste. The ELAP is mandated by California law to distribute performance evaluation samples to the laboratories it accredits. The regulation and control of solid materials that are considered hazardous because of the concentration of toxic elements within them is a matter of great concern to environmental professionals and the general public. The key component of this process is accurate and precise laboratory data for elemental concentrations. While there are hundreds of hazardous materials laboratories analyzing solid materials for elemental composition, there are virtually no published data available to determine the quality of the results produced by these laboratories. To study the preparation of solid performance evaluation samples and the overall performance of the hazardous materials laboratory industry in terms of accuracy and precision, the California Department of Health Services, through the Environmental Laboratory Accreditation Program and in association with the Southern California Laboratory, developed a pilot performance evaluation sample program for hazardous materials laboratories. Two sets of samples were sent out: one set was made up of soils spiked with five elements, and a second set was spiked with PCBs. This paper presents the data from the inorganic set.

Performance Evaluation Sample Theory

There is significant confusion about the distinctions among performance evaluation samples, laboratory control samples (LCSs), and reference materials (RMs), as well as a lack of consensus on how to prepare solid matrix performance evaluation samples. A number of issues have contributed to this lack of consensus, the most important of which is the debate over spiked performance evaluation samples vs "real world" performance evaluation samples. This debate revolves around the benefits of "true values" vs mean values, or of "real" samples vs artificial ones.
For the purposes of this pilot project a set of definitions was developed for LCSs, RMs, and performance evaluation samples. A laboratory control sample is a material used by a laboratory for quality control/quality assurance purposes for a method or set of methods. It contains the analytes of interest in concentrations within the working range of the method or methods. It is homogeneous, of the same matrix type as the samples, and is analyzed with each batch of samples. The results for each batch should

© 1992 American Chemical Society

Environ. Sci. Technol., Vol. 26, No. 11, 1992 2095

fall within established control limits. These data can be used to monitor long-term trends and method performance. It is immaterial whether the LCS is spiked or not, since only a mean value and standard deviation are needed to create a control chart and set control limits. A reference material is used to determine, on a particular matrix, the applicability of a method, to compare methods, or to assess instrument performance. It should be a "real world" sample, it must be homogeneous, and it must not be spiked. The analytes of interest should be present in measurable amounts. Performance evaluation samples are used to evaluate the performance of the entire laboratory system for a given analyte, not just the methods or instruments. This includes sample tracking, sample preparation, record keeping, method selection, method application, and data reduction. As with the other materials, the performance evaluation sample must have the analytes of interest present in concentrations that are within the linear range of the methodology. It must be homogeneous and of a matrix that approximates that of actual samples. Laboratories analyzing solid wastes cannot be evaluated using a reagent water performance evaluation sample. Although laboratories use performance evaluation samples internally for self-evaluation, the most important use of performance evaluation samples is as part of a laboratory accreditation program. The accrediting agency submits the samples blind to the laboratory, and the laboratory's performance is evaluated based on the results obtained. The material can also be used for double-blind analysis; that is, it can be submitted to the laboratory without the knowledge of its personnel. As a result, a performance evaluation sample should have a physical appearance that does not give it away as a performance evaluation sample. For laboratories analyzing soils, the performance evaluation sample should look like a soil.
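The control-chart logic described above for an LCS can be sketched in a few lines. This is a minimal illustration; the 3-standard-deviation width is a common convention, not something the text specifies, and the data are hypothetical:

```python
from statistics import mean, stdev

def control_limits(lcs_history, k=3.0):
    """Set control limits from historical LCS results.

    As noted above, only a mean and standard deviation are needed,
    so it is immaterial whether the LCS was spiked or not.
    """
    m = mean(lcs_history)
    s = stdev(lcs_history)
    return (m - k * s, m + k * s)

def in_control(result, limits):
    """True when a batch LCS result falls within the control limits."""
    lower, upper = limits
    return lower <= result <= upper

# Hypothetical LCS history for cadmium, mg/kg:
history = [48.2, 51.1, 49.5, 50.3, 47.9, 52.0, 50.6, 49.1]
limits = control_limits(history)
```

Each batch's LCS result is then checked against these limits, and the accumulated history can be plotted as a control chart to monitor long-term trends.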
Theoretically, a laboratory can be put out of business if it fails a performance evaluation sample. This leaves the accreditation program with a large window of liability. So, in addition to the above-mentioned factors, a performance evaluation sample must be legally defensible. This means performance evaluation samples must be validated prior to distribution. All of these needs are best met by using a spiked sample. Spiking allows for choice of the analytes, their concentrations, and the matrix, and it can establish a "true" value. This last point is important in increasing the performance evaluation sample's legal defensibility. Spiked samples are also easier and less expensive to prepare.

Experimental Section

(A) Experimental Design. This study had three parts. The first was to prepare the performance evaluation samples. The second was to validate the samples and the preparation procedure. The third was to distribute these samples among the accredited laboratories and examine their performance.

(B) Sample Preparation. From previous experience in the preparation of soils spiked with inorganic analytes (1, 2), it was decided to use water-soluble salts of the target elements. The use of water-soluble salts means that the most widely used acid digestion procedures will solubilize the target elements. It is also easier to make spiking materials using water-soluble salts. Strong oxidizers can attack the organic component of the soil and volatilize it, leaving the heavier silica and alumina portions. This increases the density of the soil and changes the concentration of the spiked analytes. The appearance of the soil is also altered, making it look more artificial. For performance samples to be used as double-blind checks, they should appear as natural as possible. A large amount of a local soil was collected, milled, and sieved through U.S. standard no. 10 (2 mm) sieves. It was analyzed for native amounts of the 16 elements regulated by the state: antimony, arsenic, barium, beryllium, cadmium, chromium, cobalt, copper, lead, molybdenum, nickel, selenium, silver, thallium, vanadium, and zinc. Chromium, cobalt, copper, lead, nickel, vanadium, and zinc were found to be present in excess of 5 mg/kg. Since most laboratories use EPA SW-846 method 3050 (3) as the digestion procedure, antimony, barium, and silver could not be used, as they would be poorly solubilized. Beryllium was not used because most of its water-soluble salts are either unavailable commercially or extremely toxic. Beryllium sulfate is relatively safe but has a very small mole fraction of beryllium; this would require such large amounts of beryllium sulfate that the soil matrix would be disturbed. This left arsenic, cadmium, molybdenum, selenium, and thallium as the target elements. The inorganic samples were prepared as described in Table I. The California Environmental Laboratory Accreditation Program accredits about 160 laboratories for inorganic hazardous materials analysis. Five kilograms of each sample was prepared to provide a 20-g aliquot of each sample to the individual laboratories. Spiking solutions were prepared for each of the salts as described in Table I. These solutions were diluted 1:100 and checked against standards prepared from different stock materials. All of the solutions were well within 10% of the expected values. The amount of each salt to be added to each soil was calculated and totaled as noted in Table I. The total mass of salts to be spiked was subtracted from the 5 kg of soil, so that when the salts were added the total mass would be 5 kg.
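The salt masses in Table I follow directly from this arithmetic: the analyte mass required is the target concentration times the batch mass, and dividing by the analyte's fraction of the salt mass (the MF column of Table I) gives the salt mass. A short check of the calculation, with function and variable names of our choosing:

```python
def salt_mass_g(target_mg_per_kg, batch_kg, mass_fraction):
    """Mass of salt (g) needed to bring a soil batch to the target
    analyte concentration.  mass_fraction is the analyte's fraction
    of the salt mass (the MF column of Table I)."""
    analyte_g = target_mg_per_kg * batch_kg / 1000.0  # mg -> g
    return analyte_g / mass_fraction

# 4000 mg/kg As in a 5-kg batch, spiked as As2O3 (MF = 0.757):
as2o3_g = salt_mass_g(4000, 5.0, 0.757)   # about 26.4 g, as in Table I

# The soil mass is reduced by the total salt mass so that the
# spiked batch still totals exactly 5 kg:
soil_kg = 5.0 - 62 / 1000.0               # 4.938 kg when the salts total 62 g
```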
Due to the limited solubility of ammonium molybdate and thallium sulfate, dry salts were added for the 5000 mg/kg materials. The appropriate mass of soil was placed in a plastic tray and mixed with enough deionized water to make a slurry. The slurries were then spiked with the amounts of the salts indicated in Table I. The slurries were then dried at 95 °C with frequent mixing. After being dried, the materials were again milled and sieved through a U.S. standard no. 10 sieve.

(C) Validation. A two-step validation was used. The initial validation was performed in-house at the Southern California Laboratory (SCL). The performance evaluation samples were digested seven times using the aqua regia method. The organic samples were extracted and analyzed in duplicate. All of the results were within 20% of the expected values and had a relative standard deviation of less than 30%. The samples were then validated by having at least 20 laboratories analyze them. These laboratories were either government or government-affiliated laboratories that agreed to do the work gratis (they will be referred to as reference laboratories). The samples would be considered valid if the mean value from these laboratories was within 20% of the spiked value and the percent relative standard deviation (% RSD) was less than 20% for the two higher concentrations of each analyte. It is to be expected that the percent relative standard deviation for an analyte will increase as its concentration decreases, all other things being equal. In the case of the inorganic materials, each low-concentration analyte was in a material containing a high-concentration analyte. So, for the lower concentration inorganic analytes, an analyte was considered validated if its mean value was within 20% of the spiked value and the high-concentration analyte in the same sample had a relative standard deviation of less than 20%.

Table I. Preparation of Performance Evaluation Samples

Spiking solutions (each salt dissolved to give the listed analyte concentration):

  spiking material     MF(a)   analyte mass, g   mass salt, g   liquid conc, g/L
  As2O3                0.757        100              132             100
  3CdSO4·8H2O          0.438         50              114              50
  (NH4)6Mo7O24·4H2O    0.543          5              9.2               5
  H2SeO3               0.612         50               82              50
  Tl2SO4               0.808          5             6.19               5

  (a) MF, mole fraction.

For each sample (A-E), Table I also lists the final concentration (mg/kg) of each spiked element, the mass of salt (g) or volume of spiking solution (mL, with 1:10 or 1:100 dilutions where noted) added, the total mass of all salts (34-62 g), the mass of soil (4.938-4.966 kg), and the total mass (5.000 kg).

(D) Distribution. These soils were distributed among the 160 environmental laboratories accredited by the Environmental Laboratory Accreditation Program for all five elemental analytes (these will be referred to as accredited laboratories). The laboratories were required to use only approved EPA methods. Each sample was given an identification number for tracking purposes. To minimize comparisons of results among laboratories, the ID numbers were random except for the third digit:

  inorganic sample    sample ID no.
  A                   XX0X or XX5X
  B                   XX1X or XX6X
  C                   XX2X or XX7X
  D                   XX3X or XX8X
  E                   XX4X or XX9X
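Because the pattern is cyclic, the third digit taken modulo 5 recovers the sample identity. A sketch (the example IDs are hypothetical):

```python
def sample_from_id(sample_id):
    """Recover the sample letter from a four-digit tracking ID whose
    third digit encodes the sample: 0/5 -> A, 1/6 -> B, 2/7 -> C,
    3/8 -> D, 4/9 -> E."""
    return "ABCDE"[int(sample_id[2]) % 5]

# sample_from_id("4071") -> "C"  (third digit 7, 7 % 5 = 2)
```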

Each laboratory was sent a letter dated April 15, 1991, 2 weeks in advance of shipping the samples. This letter was addressed to the laboratory director and informed him or her to prepare for the arrival of the performance evaluation samples. It also contained the deadline for the return of the results, June 22, 1991. The data sheets accompanying the samples contained all the information necessary to perform the analyses. Of the 160 laboratories that received the samples, 127 returned results.

(E) Analytical Methods. Elemental analysis was performed at the Southern California Laboratory using EPA SW-846 draft method 3055 [which in previous publications was referred to as the SCL method (1, 2, 4)] for analysis by simultaneous inductively coupled plasma atomic emission spectroscopy (ICP-M), sequential ICP (ICP-Q), and flame atomic absorption spectroscopy (FAA). For graphite furnace atomic absorption spectroscopy (GFAA), EPA SW-846 method 3050 was used. This method was the digestion procedure used by most reference laboratories and all accredited laboratories. For the elements, salts, and concentrations used in this study, methods 3050 and 3055 give identical results. Analytical methods included

EPA SW-846 methods 6010, 7061, 7130, 7131, 7480, 7481, 7740, 7840, and 7841. The reference laboratories used either these same methods or very similar ones. The exceptions were cases where chelation extraction was used with either FAA, colorimetry, or UV/visible fluorescence. For the energy-dispersive X-ray fluorescence instrument (EDXRF), the samples were ground to pass a U.S. standard sieve no. 200.

(F) Statistical Methods. Mean values were compared using the Student t-test. The variances were compared using Snedecor's F-test (5-7). Outliers were defined as results 10 times higher or lower than the spiked value. Outliers were not used to determine the means or standard deviations.

(G) Control Limits. Acceptance criteria for all the elements were ±25% of the spiked value for concentrations greater than 50 µg/g, rounded to two significant figures. For concentrations of 50 µg/g or less, the limits were ±50%, with two significant figures. For concentrations of 10 µg/g or less, the limits were ±50%, with one significant figure. Results outside these ranges are considered out of control.

Results

Table I shows how the performance evaluation samples were prepared. It lists the final concentrations, the amounts of salts, and how the spiking solutions were prepared. Table II lists the mean values for both the reference laboratories and the accredited (ELAP) laboratories. Also listed are the Student's t-test results testing the difference between the two means and the number of data points. In almost every case, both at 95% confidence and at 99% confidence, the mean values are statistically identical. Table III lists the standard deviations and relative standard deviations for both reference laboratories and accredited laboratories. The value for the F-test, com-

Table II. Comparison between Means (µg/g) of Reference Laboratories and Accredited Laboratories(a)

  analyte  sample   reference mean    N    accredited mean    N    t-test
  As       A            3970          24        4080         118    0.51
  As       B             418          24         484         121    0.07
  As       C            48.3          23        58.4         117    0.92
  As       D            12.7          23        12.2         107    0.17
  As       E            7.05          21        4.30          87    2.08
  Cd       E            4850          25        4930         127    0.15
  Cd       A             504          25         509         126    0.06
  Cd       B            50.4          25        50.7         126    0.24
  Cd       C            5.80          22        6.15         126    0.24
  Mo       D            4740          21        4490         118    0.46
  Mo       E             445          21         460         121    0.21
  Mo       A            33.9          21        40.3         119    1.05
  Mo       B            8.96          14        7.89          96    0.66
  Se       C            4500          24        4710         115    0.38
  Se       D             470          23         446         118    0.38
  Se       E            47.1          23        52.7         116    0.48
  Se       A            4.71          18        5.28          89    0.47
  Tl       B            4440          19        4140         121    0.65
  Tl       C             461          18         456         121    0.11
  Tl       D            45.2          16        58.7         121    0.11
  Tl       E            9.37          15        8.86          85    0.35

  (a) The critical value for t at 95% confidence is 1.645; at 99% it is 2.326 for N > 120.
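The t statistics in Table II can be reproduced, to within rounding, from the summary statistics alone using the pooled (Student) two-sample form. A sketch, worked against the As/sample A row:

```python
from math import sqrt

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Two-sample Student t statistic with pooled variance,
    computed from summary statistics (means, SDs, group sizes)."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return abs(m1 - m2) / sqrt(sp2 * (1.0 / n1 + 1.0 / n2))

# As, sample A: reference mean 3970 (SD 441, N = 24) vs accredited
# mean 4080 (SD 1000, N = 118); Table II reports t = 0.51.
t = pooled_t(3970, 441, 24, 4080, 1000, 118)   # about 0.53
```

Since 0.53 is well below the 95% critical value of 1.645, the two means are statistically indistinguishable, consistent with the table.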

Table III. Comparison between Standard Deviations (µg/g) of Reference Laboratories and Accredited Laboratories(a)

                       reference              accredited
  analyte  sample    SD       RSD, %        SD       RSD, %      F-test
  As       A         441      11.1         1000      24.1         5.18
  As       B        57.7      12.1          388      80.2         45.2
  As       C        37.4      67.6         52.3      89.6         1.98
  As       D        9.80      77.5         14.2       117         2.08
  As       E        7.79       111          4.3       109         2.78
  Cd       E         328      6.67          833      16.9         6.46
  Cd       A        40.9      8.12          157      31.0         14.9
  Cd       B        3.53      6.99         12.1      23.9         11.8
  Cd       C        1.13      19.5         2.35      38.1         4.29
  Mo       D         387      8.17         1120      24.9         8.40
  Mo       E          54      12.1          174      10.5         10.4
  Mo       A        9.23      27.2         20.0      50.0         4.68
  Mo       B        8.75      97.7         7.19      93.8         0.67
  Se       C         449      9.96         1320      28.0         8.67
  Se       D         111      23.7          199      44.6         3.19
  Se       E        10.1      21.4         55.8       106         30.5
  Se       A        3.88      82.5         7.18       136         3.42
  Tl       B         488      11.0          953      23.0         3.82
  Tl       C        62.3      13.5          113      24.8         3.30
  Tl       D        26.4      50.6         59.2       101         5.02
  Tl       E        13.6       145         8.06      90.1         2.85

  (a) Critical values for F were taken with a numerator equal to 120 and a denominator equal to 20.
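The F values in Table III are the ratio of the two sample variances, i.e., the squared ratio of the standard deviations. A quick check against the first two As rows:

```python
def f_statistic(sd_a, sd_b):
    """Snedecor's F statistic from summary statistics: the ratio of
    the two sample variances (here, accredited over reference, which
    is how Table III generally reports it)."""
    return (sd_a / sd_b) ** 2

# As, sample A: accredited SD 1000 vs reference SD 441;
# Table III reports F = 5.18.
f_a = f_statistic(1000, 441)    # about 5.14

# As, sample B: accredited SD 388 vs reference SD 57.7;
# Table III reports F = 45.2.
f_b = f_statistic(388, 57.7)
```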

Table IV. Rates of Out of Control Results (%)

For each element (arsenic, cadmium, molybdenum, selenium, and thallium) and each acceptance range (5300-3300 or 6300-3700, 630-370, 75-25, and 8-2 mg/kg, and <1 mg/kg), Table IV lists the percentage of results out of control, high and low, for the reference laboratories and for the accredited laboratories. Out-of-control rates were generally low in the higher concentration ranges and rose sharply, to roughly 60-80% low results, in the lowest range.

paring the differences in the variances between the two populations of laboratories, is also listed. In 15 out of 21 results, and in every case where the concentration was above 50 µg/g, the reference laboratories had statistically smaller standard deviations. Table IV lists the control limits for all the samples and analytes and the percentage of results out of control, both high and low, for both the reference laboratories and the accredited laboratories. The column marked