Effluent analysis of wastewater generated in the manufacture of 2,4,6

Environ. Sci. Technol. 1982, 16, 233-236

Effluent Analysis of Wastewater Generated in the Manufacture of 2,4,6-Trinitrotoluene. 2. Determination of a Representative Discharge of Ether-Extractable Components

Ronald J. Spanggord* and Benjamin E. Suta

SRI International, Menlo Park, California 94025

Cluster analysis is used to characterize the distribution of ether-extractable waste-stream components resulting from the production and purification of TNT. This method is compared to other methods of analysis to establish a representative mixture of components that can be used in toxicological evaluations.

The accompanying paper (1) describes the identification of 32 components found in the ether extract of a complex effluent derived from the production and purification of 2,4,6-trinitrotoluene. These identifications serve as the primary step in the development of a hazard assessment that will eventually be used to recommend water quality criteria to governmental regulatory agencies. Associated with this hazard assessment are toxicological investigations that may become formidable tasks with respect to both time and cost as the number of compounds under investigation escalates. This is particularly true when the toxicological studies reach the chronic phase of investigation. In these cases, an investigation of a mixture of components representative of the discharge becomes a feasible approach for hazard assessment.

This paper describes the use of cluster analysis to characterize the distribution of waste-stream components found in the ether extract of a complex discharge. The distribution is achieved by assigning component concentrations to clusters by using a procedure that minimizes the sum of Euclidean distances squared between all observations and their representative cluster centers. Comparisons are made of representative concentrations generated by the cluster procedure and those generated by a 90th-percentile approach to arrive at a representative mixture of components to be used in toxicological evaluations.

Experimental Section

Water samples were collected over a 12-month sampling period and analyzed for 32 components by gas chromatography as previously described (1). N-Nitrosomorpholine and N-morpholinoacetonitrile were quantified during this period but were not included as part of a representative discharge because the use of morpholine-type water additives was being discontinued. Therefore, a 30-component distribution from 54 samples was used in the characterization study.

The sampling data were highly variable, showing no consistent pattern of component outflows either by visual inspection or by mean-value determinations. To determine whether natural groupings of the observations existed, we subjected the data to a cluster analysis using a computer program developed by SRI that is similar to the profile-analysis method of Overall and Klett (2). The cluster analysis program is a stepwise routine that selectively groups samples together according to the observed concentration values of all components in the sample. This program also provides a measure of the total variability between the individual samples and their assigned cluster centers.

The clustering procedure is a multivariate extension of the process of graphing data-case variables (component concentrations) and pictorially dividing the data into clusters according to the proximity of the individual points. For the application described herein, it would be difficult to graph the data because 30 dimensions would be required to represent each of the 30 effluent components. However, the hypothetical two-dimensional model shown in Figure 1 illustrates the clustering procedure. For a 30-dimensional model, a computational routine is employed in which cluster centers, $\bar{X}_{ik}$, are selected in a manner that minimizes the sum of Euclidean distances squared, V, between all observations and their assigned cluster centers. According to the generalization of the Pythagorean theorem, the square of the distance between two points in p-dimensional space is equal to the sum of the squares of the differences in projections on p orthogonal coordinate axes. This procedure is mathematically described in eq 1, where V is the square of the Euclidean

$$V = \sum_{k=1}^{K} \sum_{j=1}^{J} \sum_{i=1}^{I} \left( X_{ijk} - \bar{X}_{ik} \right)^{2} \tag{1}$$

distance, k indicates the cluster number, j indicates the sample number, i indicates the component number, $X_{ijk}$ indicates the observation value, and $\bar{X}_{ik}$ indicates the cluster center values.

Before the cluster analysis allocation was made, all variables were transformed by subtracting the component-concentration average from the component concentration and dividing the difference by the component-concentration standard deviation. This transformation (Z scores) has the effect of altering the statistical distribution of each variable to an average of 0 and a standard deviation of 1, thus giving equal weight to each variable. This transformation was made because the 30 variables differ in their variability. Z scores prevent the variables with large variability from disproportionately influencing the cluster allocations.

The experimental design did not permit a detailed analysis of the magnitude of the various sources of data variability. This variability is primarily due to actual variations in the effluent concentrations and to a lesser extent to measurement error. No other data transformations were evaluated. The log transformation, which is more commonly used with proportion-type data, was not attempted because many observations had a value of 0.

The value of V is a measure of the total variability between the samples and their representative cluster centers. Different values of V are obtained when different numbers of clusters are selected. V has a value of 0 when the number of clusters and the number of samples are equal. The maximum value of V occurs when only one cluster is used. Statistical tests (3) can be applied for selecting the minimum number of clusters such that additional clusters will not significantly reduce the variability of V.

The cluster program can also be used to identify samples that are extremely different from other samples. These samples usually appear as clusters containing only one sample.

© 1982 American Chemical Society. Environ. Sci. Technol., Vol. 16, No. 4, 1982

[Figure 1: four scatter-plot panels (A-D), "Data Points for Two Components"; horizontal axis 2,6-dinitrotoluene (ppm); panel annotations include "Cluster 1", "2 Clusters", and "Data point for 3rd cluster".]

Figure 1. Example of the clustering procedure: (A) A mean value is calculated for all points representing 2,4- and 2,6-dinitrotoluene concentrations. Each point represents one sample. (B) The data point farthest from the mean is selected to begin another cluster, and distances between that point and all other points are minimized, yielding C. New mean values are calculated for each cluster, and the data point farthest from the mean is selected to begin the third cluster (D), and the process is repeated.
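The Z-score transformation, the variability measure V of eq 1, and the stepwise procedure illustrated in Figure 1 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the SRI program: the function names (`z_scores`, `total_variability`, `assign_and_update`, `stepwise_clusters`) are hypothetical, the sample standard deviation (`ddof=1`) is assumed because the text does not say which estimator was used, and the refinement step (alternating nearest-center assignment and mean update) is one plausible reading of "distances ... are minimized".

```python
import numpy as np


def z_scores(data):
    """Standardize each column (component) to mean 0 and standard deviation 1."""
    mean = data.mean(axis=0)
    std = data.std(axis=0, ddof=1)  # sample standard deviation (an assumption)
    std = np.where(std == 0, 1.0, std)  # constant columns stay at 0 instead of dividing by 0
    return (data - mean) / std


def total_variability(data, labels, centers):
    """V of eq 1: sum of squared Euclidean distances to the assigned cluster centers."""
    return float(((data - centers[labels]) ** 2).sum())


def assign_and_update(data, centers, max_iter=100):
    """Alternate nearest-center assignment and mean update until labels stop changing."""
    labels = None
    for _ in range(max_iter):
        sq_dist = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = sq_dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        centers = np.array([
            data[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(len(centers))
        ])
    return labels, centers


def stepwise_clusters(data, n_clusters):
    """Start from the overall mean; repeatedly seed a new cluster with the farthest point."""
    centers = data.mean(axis=0, keepdims=True)
    labels, centers = assign_and_update(data, centers)
    while len(centers) < n_clusters:
        own_sq_dist = ((data - centers[labels]) ** 2).sum(axis=1)
        seed = data[own_sq_dist.argmax()]  # sample farthest from its own cluster center
        labels, centers = assign_and_update(data, np.vstack([centers, seed]))
    return labels, centers


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in with the study's shape (54 samples x 30 components), not the real data.
    demo = rng.gamma(shape=2.0, scale=1.0, size=(54, 30))
    scaled = z_scores(demo)
    for k in range(1, 6):
        labels, centers = stepwise_clusters(scaled, k)
        print(k, round(total_variability(scaled, labels, centers), 1), np.bincount(labels))
```

Printing V and the cluster sizes for one to five clusters mirrors how the number of clusters is discussed in the text: V shrinks as clusters are added, and clusters of size one flag samples that differ strongly from the rest.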

Results

Using the above approach with 30 variables (component concentrations) from 54 samples (1620 observations), we ran the program to select cluster centers for each of the variables for the cases of five, four, three, and two clusters. With five clusters, the samples were divided into two major clusters containing 25 and 24 samples and three minor clusters containing 1, 3, and 1 samples. The component concentrations and percent composition for the major clusters appear in Table I. In the two-cluster case, two samples were readily distinguished from all others because of very large 5-amino-2,4-dinitrotoluene concentrations, which indicated that these two samples represented unusual discharges. Inspection of the samples representing minor clusters in the four- and three-cluster cases also revealed unusual concentrations for one of the variables in each cluster and showed little difference from the five-cluster case.

Discussion

The cluster analysis results indicate that, after the unusual discharge measurements have been eliminated, the remainder of the observations can be assigned to two clusters. Thus, two sets of representative samples could be used to formulate mixtures of components for toxicological evaluation. Because of the cost of conducting toxicological evaluations, it may be desirable to test only one mixture of components. We chose to average these two clusters because there is nearly an even distribution of samples between clusters and because the apparently unrepresentative samples in the three minor clusters, which together contain only five samples, would be eliminated. This averaging also allowed components with zero concentrations (below detection limits) in one cluster to be represented in the average discharge. Toxicological studies will then incorporate all components in screening tests, which may allow “supertoxic” trace components to be included in the average discharge.

Pearson et al. (4) used a 90th-percentile-concentration approach to arrive at representative concentrations of complex discharges. Using the 90th-percentile approach for the condensate components, we obtained results comparable to the average cluster analysis (Table II). However, the 90th-percentile concentration for a given component excludes the highest 10% of the observations and is by definition greater than or equal to 90% of the observations. For most underlying statistical distributions, the mean values (obtained with the cluster analysis) provide better representative concentrations than the 90th-percentile approach. The upper-percentile approach could be used, however, to approximate “worst-case” component concentration distributions, which may be of value in short-term toxicological evaluations.
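The difference between a mean-based profile and a 90th-percentile profile can be illustrated with a short Python sketch. It is only an illustration under assumptions: the helper names `relative_profile` and `mean_and_p90_profiles` are hypothetical, relative concentrations are taken to be percentages of the summed profile, and NumPy's default linear-interpolation percentile is used because the text does not state how the percentile was computed.

```python
import numpy as np


def relative_profile(values):
    """Express a vector of concentrations as percentages of their sum."""
    values = np.asarray(values, dtype=float)
    return 100.0 * values / values.sum()


def mean_and_p90_profiles(conc):
    """conc: samples x components array of concentrations (e.g., mg/L).

    Returns the relative profile built from per-component means and the one
    built from per-component 90th percentiles.
    """
    conc = np.asarray(conc, dtype=float)
    mean_conc = conc.mean(axis=0)
    p90_conc = np.percentile(conc, 90, axis=0)  # default linear interpolation (an assumption)
    return relative_profile(mean_conc), relative_profile(p90_conc)


if __name__ == "__main__":
    # Synthetic example: one steady component and one highly variable component.
    demo = np.column_stack([np.ones(10), np.arange(10.0)])
    mean_pct, p90_pct = mean_and_p90_profiles(demo)
    print(mean_pct, p90_pct)
```

In this synthetic case the variable component takes a larger share of the 90th-percentile profile than of the mean profile, which is why the upper-percentile profile behaves more like a "worst-case" mixture.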

Table I. Relative Concentrations of Condensate Components Derived from Cluster Analysis

                                      cluster 1         cluster 2             av
condensate component                 mg/L        %     mg/L        %     mg/L        %
toluene                             0.228    1.950    0.170    0.630    0.201    1.022
2-nitrotoluene                      0.008    0.069    0.009    0.033    0.008    0.044
4-nitrotoluene                      0.004    0.039   0.0123    0.044    0.008    0.043
3-nitrobenzonitrile                 0.000    0.000    0.010    0.036    0.005    0.025
4-nitrobenzonitrile                 0.000    0.000    0.002    0.007    0.001    0.005
2-amino-4-nitrotoluene              0.013    0.119    0.001    0.002    0.007    0.037
2-amino-6-nitrotoluene              0.005    0.043    0.005    0.019    0.005    0.026
3-amino-4-nitrotoluene              0.000    0.000    0.001    0.002    0.001    0.001
3-methyl-2-nitrophenol              0.006    0.053   0.0013    0.005    0.003    0.019
5-methyl-2-nitrophenol              0.011    0.096   0.0014    0.005    0.006    0.032
1,3-dinitrobenzene                  1.834   15.651    3.646   13.130    2.740   13.878
2,3-dinitrotoluene                  0.000    0.000    0.346    1.247    0.173    0.877
2,4-dinitrotoluene                  4.578   39.058   14.884   53.602    9.731   49.284
2,5-dinitrotoluene                  0.053    0.459    0.376    1.357    0.215    1.090
2,6-dinitrotoluene                  2.261   19.293    6.323   22.771    4.292   21.738
3,4-dinitrotoluene                  0.092    0.787    0.546    1.967    0.319    1.620
3,5-dinitrotoluene                  0.339    2.894    0.162    0.584    0.250    1.270
3,5-dinitroaniline                  0.000    0.000    0.058    0.210    0.029    0.148
1,5-dimethyl-2,4-dinitrobenzene     0.246    2.106    0.042    0.175    0.147    0.748
2-amino-3,6-dinitrotoluene          0.000    0.000    0.027    0.099    0.013    0.069
2-amino-4,6-dinitrotoluene          0.012    0.109    0.035    0.126    0.023    0.121
3-amino-2,4-dinitrotoluene          1.057     9.02    0.253    0.914    0.655    3.321
3-amino-2,6-dinitrotoluene          0.476    4.062    0.169    0.610    0.323    1.636
4-amino-2,6-dinitrotoluene          0.146    1.250    0.153    0.554    0.150    0.761
4-amino-3,5-dinitrotoluene          0.027    0.233    0.010    0.039    0.019    0.096
5-amino-2,4-dinitrotoluene          0.296    2.526    0.092    0.332    0.194    0.984
2,4-dinitro-5-methylphenol          0.001    0.012    0.039    0.142    0.020    0.104
1,3,5-trinitrobenzene               0.004    0.036    0.000    0.000    0.002    0.011
2,3,6-trinitrotoluene               0.006    0.055    0.300    1.080    0.153    0.776
2,4,6-trinitrotoluene               0.007    0.066    0.076    0.277    0.042    0.214
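The "av" columns of Table I can be checked with a short calculation. The sketch below assumes, from the tabulated values rather than from any statement in the text, that the average concentration is the unweighted mean of the two cluster centers and that the percent column is each concentration divided by the column total; the array names are hypothetical, and the numbers are the mg/L entries of Table I in row order.

```python
import numpy as np

# Cluster-center concentrations (mg/L) copied from Table I, rows in table order.
cluster_1 = np.array([
    0.228, 0.008, 0.004, 0.000, 0.000, 0.013, 0.005, 0.000, 0.006, 0.011,
    1.834, 0.000, 4.578, 0.053, 2.261, 0.092, 0.339, 0.000, 0.246, 0.000,
    0.012, 1.057, 0.476, 0.146, 0.027, 0.296, 0.001, 0.004, 0.006, 0.007,
])
cluster_2 = np.array([
    0.170, 0.009, 0.0123, 0.010, 0.002, 0.001, 0.005, 0.001, 0.0013, 0.0014,
    3.646, 0.346, 14.884, 0.376, 6.323, 0.546, 0.162, 0.058, 0.042, 0.027,
    0.035, 0.253, 0.169, 0.153, 0.010, 0.092, 0.039, 0.000, 0.300, 0.076,
])

# Assumed: unweighted mean of the two major cluster centers.
average = (cluster_1 + cluster_2) / 2.0

# Assumed: percent composition is the share of the summed concentrations.
percent = 100.0 * average / average.sum()

IDX_24_DNT = 12  # row of 2,4-dinitrotoluene (0-based)
print(round(average[IDX_24_DNT], 3), round(percent[IDX_24_DNT], 2))
```

Under these assumptions the 2,4-dinitrotoluene average reproduces the tabulated mg/L value, and the computed percentage agrees with the table to within a few hundredths of a percentage point, a difference consistent with rounding in the published entries.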

Another approach is to calculate average values for each of the 30 components and to use these averages to derive a representative sample. In this approach or the 90th-percentile approach, the data points are generally represented as one cluster and there is no opportunity to evaluate the natural groupings of the samples. “Outlier” observations are not considered as such by these approaches, and they are incorporated into the component distribution unless extreme value tests are applied to individual observations (3). In the cluster analysis procedure, samples are judged to be outliers on the basis of the values of all components. When an observation is found to be an outlier, all of its component values are discarded. The removal of samples containing outlier-component concentrations was found to be the primary advantage of cluster analysis in determining representative component distributions.

Several limitations to cluster analysis should be mentioned. First, spurious classifications can result when the number of clusters is large in relation to the amount of available data. In this study, at most five clusters were used with the 54 samples from a 30-component distribution, thereby creating a large data base for cluster selection. Another limitation arises when the different components have different variabilities. This tends to place more weight on the highly variable component in the determination of cluster centers. This problem was reduced through the use of Z-score transformations as previously described.

The approach described here represents a methodology that can be used as part of a hazard assessment strategy to determine a representative ratio of components in a complex industrial wastewater. Formulation of component mixtures representative of the discharge will allow more complete toxicological investigations, especially when a compound-by-compound approach becomes economically prohibitive. Also, any hazard assessment should predict the actual effect that the discharge has on the environment. The testing of representative ratios of discharged components could serve as an integral part of this assessment.

Table II. 90th-Percentile and Average Cluster Analysis Comparisons of Relative Concentrations

                                   rel 90th-percentile   av cluster analysis
condensate component                             concn             rel concn
toluene                                     0.590                      1.022
2-nitrotoluene                              0.089                      0.044
4-nitrotoluene                              0.295                      0.043
3-nitrobenzonitrile                         0.035 a                    0.025
4-nitrobenzonitrile                         0.027 a                    0.005
2-amino-4-nitrotoluene                      0.097                      0.037
2-amino-6-nitrotoluene                      0.030 a                    0.026
3-amino-4-nitrotoluene                      0.080 a                    0.001
3-methyl-2-nitrophenol                      0.035                      0.019
5-methyl-2-nitrophenol                      0.094                      0.032
1,3-dinitrobenzene                         11.803                     13.878
2,3-dinitrotoluene                          1.180                      0.877
2,4-dinitrotoluene                         43.377                     49.284
2,5-dinitrotoluene                          1.180                      1.090
2,6-dinitrotoluene                         21.541                     21.738
3,4-dinitrotoluene                          1.475                      1.620
3,5-dinitrotoluene                          1.534                      1.270
3,5-dinitroaniline                          0.171 a                    0.148
1,5-dimethyl-2,4-dinitrobenzene             1.151                      0.748
2-amino-3,6-dinitrotoluene                  0.089                      0.069
2-amino-4,6-dinitrotoluene                  0.059                      0.121
3-amino-2,4-dinitrotoluene                  4.426                      3.321
3-amino-2,6-dinitrotoluene                  3.541                      1.636
4-amino-2,6-dinitrotoluene                  1.770                      0.761
4-amino-3,5-dinitrotoluene                  0.590                      0.096
5-amino-2,4-dinitrotoluene                  2.066                      0.984
2,4-dinitro-5-methylphenol                  0.251 a                    0.104
1,3,5-trinitrobenzene                       0.451 a                    0.011
2,3,6-trinitrotoluene                       0.791 a                    0.776
2,4,6-trinitrotoluene                       1.180                      0.214

a Mean value of the observed nonzero values.

Environ. Sci. Technol. 1982, 16, 236-239

Literature Cited

(1) Spanggord, R. J.; Gibson, B. W.; Keck, R. G.; Thomas, D. W.; Barkley, J. J., Jr. Environ. Sci. Technol., preceding paper in this issue.
(2) Overall, J. E.; Klett, C. J. “Applied Multivariate Analysis”; McGraw-Hill: New York, 1972.
(3) Afifi, A. A.; Azen, S. P. “Statistical Analysis: A Computer Oriented Approach”; Academic Press: New York, 1972; pp 247-50.
(4) Pearson, J. G.; Glennon, J. P.; Barkley, J. J., Jr.; Highfill, S. W. ASTM Spec. Tech. Publ. 1979, 667, 284.

Received for review December 15, 1980. Revised manuscript received June 23, 1981. Accepted December 22, 1981. This work was performed under Contract DAMD 17-76-C-6050 from the US Army Medical Research and Development Command.

NOTES

Precision and Accuracy of a β Gauge for Aerosol Mass Determinations

William J. Courtney

Northrop Services, Inc., Research Triangle Park, North Carolina 27709

Robert W. Shaw* and Thomas G. Dzubay

Environmental Sciences Research Laboratory, US Environmental Protection Agency, Research Triangle Park, North Carolina 27711

Results of an experimental determination of the precision and the accuracy of a β-ray attenuation method for measurement of aerosol mass are presented. The instrumental precision for a short-term experiment was 25 μg for a 6.5-cm² deposit collected on approximately 1 mg/cm² Teflon filters; for a longer-term experiment the precision was 27 μg. The precision of the gravimetric determinations of aerosol deposits was 22 μg for Teflon filters weighed to 1 μg. Filter reorientation and air density changes that could adversely affect the β-ray attenuation results are discussed. β-ray attenuation results are in good agreement with gravimetric measurements on the same filter-collected aerosols. Using dichotomous samplers in Durham, NC, we collected 136 aerosol samples on Teflon filters in two size ranges. A regression line was calculated, implicitly assuming errors in both measurements of mass. The 90% confidence intervals lay within 21 μg of the regression line for mean fine fraction aerosol mass loadings of 536 μg and within 19 μg of the regression line for mean coarse fraction aerosol mass loadings of 349 μg. Any bias between gravimetric and β-gauge mass measurements was found to be less than 5%.
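A regression that allows for errors in both measured masses can be sketched with Deming regression, one common errors-in-variables method. The abstract does not say which method the authors used, so this is only an assumed stand-in: the function name `deming_regression` and its `error_variance_ratio` parameter (variance of the y errors divided by variance of the x errors, set to 1 for equal errors) are hypothetical, and the data in the demo are synthetic.

```python
import numpy as np


def deming_regression(x, y, error_variance_ratio=1.0):
    """Fit y = intercept + slope * x when both x and y carry measurement error.

    error_variance_ratio is var(errors in y) / var(errors in x); a value of 1
    gives orthogonal regression. Requires a nonzero covariance between x and y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    s_xx = ((x - x_mean) ** 2).mean()
    s_yy = ((y - y_mean) ** 2).mean()
    s_xy = ((x - x_mean) * (y - y_mean)).mean()
    d = error_variance_ratio
    slope = (s_yy - d * s_xx + np.sqrt((s_yy - d * s_xx) ** 2 + 4.0 * d * s_xy ** 2)) / (2.0 * s_xy)
    intercept = y_mean - slope * x_mean
    return slope, intercept


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    true_mass = rng.uniform(100.0, 900.0, size=136)            # synthetic loadings
    gravimetric = true_mass + rng.normal(0.0, 5.0, size=136)   # synthetic errors
    beta_gauge = true_mass + rng.normal(0.0, 5.0, size=136)
    print(deming_regression(gravimetric, beta_gauge))
```

With equal error variances in both methods, a slope near 1 and an intercept near 0 would indicate that neither measurement is biased relative to the other.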

Introduction

One of the most widely reported parameters of atmospheric aerosols has been the mass concentration deduced from particulate matter collected on filters. In such measurements mass is usually determined by weighing. Gravimetric procedures, however, are inconvenient for analyzing large numbers of samples, since such procedures require a significant amount of sample handling and are difficult to automate. Jaklevic et al. have recently described the design of a β gauge for determining the mass of aerosols collected in dichotomous samplers on thin, Teflon membrane filters (1). The β gauge is a device consisting of a β-ray source, a β-ray detector, and counting electronics and is used to deduce the mass of an object placed between the source and the detector. Mass measurements by β gauge have been reported in several previous studies on atmospheric aerosols (2-5), and this paper provides experimental justification for the assumption of equivalence of gravimetric and β-gauge measurements of mass.

In this work we report experiments designed not only to test the equivalency of a β-gauge method to the standard gravimetric determination of aerosol mass but also to determine the operational characteristics of a β gauge. The filters used were Teflon membranes, and the aerosols were collected in two size ranges by using dichotomous samplers. Results are presented for experimental determinations of precision and accuracy of gravimetric and β-gauge measurements. Studies of the effect of filter orientation and laboratory atmospheric conditions on β-gauge response are described. These experiments show that, if filter orientation is changed between the measurement of the blank filter and the loaded filter, the precision of the β-gauge mass determination suffers significantly. The effects of laboratory atmospheric conditions are also shown to be important.

Experimental Procedures

Filters. The filters were unbacked, Teflon membranes having 1-μm pore size and were used in two different types of mounts. In one, the filters were heat sealed onto supporting annular polyolefin rings; in the other they were heat sealed onto (5 × 5) cm² plastic frames. Both filter types are standard for use in Beckman automated dichotomous samplers and were obtained from Ghia Corp., Pleasanton, CA, and from Beckman Instruments, Fullerton, CA. The area of the aerosol deposits in these experiments was 6.5 cm². The filter material had an areal density of about 1 mg/cm².

Gravimetry. Mass measurements were made by using a Mettler ME22 balance with a BE22 control unit. Absolute accuracy was determined by using a new set of class S weights (6). The polyolefin-ringed Teflon filters used in this experiment had masses of approximately 100 mg; the tolerance for a 100-mg class S weight was 15 μg. Calibration of the balance over the range 0-100 mg showed that its absolute error was no more than 21 μg. A ²¹⁰Po source was used to neutralize static electrical charge on the filters during gravimetric measurements.

β-Gauge Measurements. The β-gauge system used in
