Systems Chemical Analysis of Petroleum Pollutants
F. K. Kawahara*,1 and Y. Y. Yang2
U.S. Environmental Protection Agency, Cincinnati, Ohio 45268
The application of an established mathematical treatment useful for the characterization and identification of petroleum pollutants is described. Using discriminant analysis of relevant infrared spectrophotometric data, 99% of numerous known and unknown oil samples have been correctly characterized and identified. Unknown samples include weathered crude oils, heavy residual fuel oils, and asphalts, which were all correctly identified by systems chemical analysis and by other chemical analyses such as electron capture detection gas chromatography, flame ionization detection gas chromatography, metal analysis, etc.
The discharge of petroleum products into surface waters is a continuing environmental problem; it affects the biota and can render domestic and industrial water unfit for use. Because Federal legislation (the Water Quality Improvement Act of 1970) assigns penalties for illegal discharge and assesses liabilities for cleanup and restorative costs, oil spills in the environment must be traced to their source. Tracing (or identifying) an oil spill to its source becomes difficult when a product is weathered, or a mixture of two or more products is involved, or there is no source specimen. In the last instance, the spill specimen must first be characterized before the possible origins of the spill can be investigated and the spill source identified. This procedure was illustrated by the 1970 oil spill at Aliquippa, Pa., on the Ohio River when the presence of triglycerides, organic phosphates, acids, and metallic particles in the spill specimen was related to a specific discharge of industrial products (1) at their source.
The procedure of characterization followed by identification may be clarified by the following definitions. Characterization of a petroleum product relates the features of composition and structure of a petroleum material necessary for a particular preparation. Identification of a petroleum product is the demonstration of sameness in the composition and structure found in petroleum material when compared to another under consideration.
The technique of infrared spectrophotometry has provided useful data for characterizing unknown petroleum product specimens with respect to several known groups (2-8). In the present study, infrared data were produced, then relevant infrared bands and peaks of spectra were assessed to yield absorbance values from which all possible ratios of absorbances were calculated. Analyzing such data (8) provides a highly effective method of characterization.
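The ratio calculation can be sketched in a few lines of Python. This is only an illustration, not the computer program (17) used in the study: it assumes the seven peak positions listed in the Experimental section and treats "all possible ratios" as every ordered pair of distinct peaks, which is consistent with the 42 ratios reported under Data Consideration. The function name and the absorbance readings are hypothetical.

```python
from itertools import permutations

# Peak positions (cm^-1) listed in the Experimental section.
PEAKS = (720, 810, 870, 1020, 1375, 1470, 1600)

def absorbance_ratios(absorbance):
    """Return every ordered ratio A_i / A_j (i != j) of the peak absorbances.

    absorbance: dict mapping peak position (cm^-1) to its absorbance value.
    """
    return {
        (num, den): absorbance[num] / absorbance[den]
        for num, den in permutations(absorbance, 2)
    }

# Hypothetical absorbance readings, invented purely for illustration.
example = dict(zip(PEAKS, (0.21, 0.15, 0.18, 0.12, 0.62, 0.80, 0.25)))
ratios = absorbance_ratios(example)

# Seven peaks give 7 * 6 ordered pairs, one ratio per pair.
print(len(ratios))
```

Because each value is a ratio of two absorbances from the same spectrum, a common factor such as film thickness cancels, which is why ratios suit samples whose thickness cannot be held constant.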
In this procedure, the potential for detailed identification of environmental samples is recognized. Kowalski et al. (9) applied a pattern recognition technique to the problem of determining the source of an oil spill, using its elemental composition as analyzed by neutron activation analysis, while Clark and Jurs (10) made qualitative determination of petroleum sample type from gas chromatogram fingerprints using pattern recognition techniques. Mattson et al. (7) also employed pattern recognition techniques to evaluate the capability of infrared spectrophotometry to serve as a method for identification of oils. Linear discriminant function analysis was applied to the evaluation of the infrared data. Thus, a systems chemical analysis (analytical measurements in systems chemistry) is introduced to describe, in quantitative terms, complex petroleum systems such as asphalts, heavy residual fuel oils, and weathered crude oils.
The objectives of the present study were to characterize correctly an unknown specimen from among several known groups of the petroleum products studied and to match a specific source correctly and with a high posterior probability, preferably 1.000 (perfect) or nearly so.
1 Environmental Monitoring and Support Laboratory.
2 Health Effects Research Laboratory.
EXPERIMENTAL
Apparatus. Perkin-Elmer 467 and 137 B spectrophotometers were used. Demountable liquid cells with sodium chloride windows were employed for the heavy petroleum products, and polished sodium chloride plates were used for tacky asphalt products.
Reagents and Samples. Crude oil residues dissolved in spectrophotometric grade chloroform were distilled and used as such in the analyses; spectrophotometric grade chloroform was also used to prepare asphalt specimens. Both products and No. 6 fuel oils were received from various petroleum refiners. Spill specimens collected from the environment were treated in the same manner as the known products unless otherwise specified.
Infrared Techniques. Three to seven samples randomly selected from each specimen were spectrophotometrically analyzed. Each industrial product as received was prepared for infrared spectrophotometry. Crude oil residues were prepared according to ASTM methodology (11). Neat oil specimens were transferred by means of a syringe to the infrared cell. Sandwich assemblies were necessary for the tacky asphalt samples. The sample was dissolved in chloroform, and the asphalt film cast on a sodium chloride plate. With a second plate, a sandwich assembly was formed after drying the film at 75 °C for 2 min. The film thickness for asphalts was not adjusted if the minimum point of infrared transmittance (T) for the 1375 cm⁻¹ peak lay between 18 and 66% T. For liquid petroleum products, the sample thickness was uniform because the fluid product was easily introduced into fixed cells. With asphalts, however, film thicknesses varied. To compensate for this, Potts (12) recommends using absorbance ratios. The sample thickness for demountable cells was considered satisfactory when the peaks at wavenumbers of 720, 810, 870, 1020, 1375, 1470, and 1600 cm⁻¹ indicated absorbance magnitudes in the linear part of the curve of absorbance vs. sample thickness. A thickness of 0.03 mm was satisfactory for No.
6 fuel oil and crude oil residues. If the absorbances from thin samples were rather small, the dispersion of absorbance values would be too large to show any meaningful relationship between samples. By contrast, for thick samples, a very high absorbance value at 1375 cm⁻¹ and 1460 cm⁻¹ may, at times, lie outside the linear range. Note that association effects among chemical species in the sample may also produce deviations from linearity. Furthermore, when the frequency range of a peak is narrower than the exit slit, a deviation from linearity will also occur (13, 14). The base-line method of measuring the absorption peaks as described by ASTM (15) and Potts (16) was adopted for this study.
Data Consideration. In an earlier work, in which asphalts and No. 6 fuel oils were characterized, seven infrared peaks or bands from the spectrum generated 42 ratios of infrared absorbances for characterization of asphalts and No. 6 fuel oil products by discriminant analysis. One unknown had been introduced. In the present study, 24 crude oil residues, 81 No. 6 fuel oils, and 73 asphalts were included along with 16 unknowns. With the use of a computer program (17), the same 42 ratios were calculated and entered as data input to the stepwise discriminant analysis program. The average coefficient of variation in the set of samples for these individual petroleum specimens was approximately 8%. Computer time involved in the analysis ranged from about 5 to 25 cpu on the IBM-370.
ANALYTICAL CHEMISTRY, VOL. 48, NO. 4, APRIL 1976 851
DISCRIMINANT ANALYSIS
The basic problem of discriminant analysis is to assign an individual of unknown origin to one of a finite number of groups on the basis of a set of characteristics. To solve this problem, it is necessary to be able to classify the individual data correctly. In defining groups, variables must exist and be used to establish the groups, so that a group can be matched against an unknown individual under investigation. In the present study, for example, comparison of the chemical characteristics in terms of numbers from ratios rather than in terms of spectral features presents a workable and comparative structure-identity relationship. Thus, the comparison of infrared spectra of oils with resultant positive or negative identification may be established with greater facility.
Consider a problem of classifying an observation into one of $k$ populations. A $q \times 1$ vector, $X_{ij}$, is taken
from the $j$th sample in the $i$th population, with the expected value $E(X_{ij}) = \mu_i$. Let $B$ ($q \times q$) be the "between" groups covariance matrix and $W$ ($q \times q$) the "within" groups covariance matrix. Then, if the parameters $\mu$'s are known, we have

$$B = \frac{1}{k}\sum_{i=1}^{k}(\mu_i - \bar{\mu})(\mu_i - \bar{\mu})' \quad \text{and} \quad W = \Sigma \tag{1}$$

where $\bar{\mu} = \frac{1}{k}\sum_{i=1}^{k}\mu_i$, and $\Sigma$ is the population covariance. The prime in Equation 1 refers to the "transpose of"; the row vector, for example, is the transpose of the column vector. If the parameters are unknown, as usually is the case, then Equation 1 will be

$$B = \frac{1}{k-1}\sum_{i=1}^{k}(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})' \tag{2}$$

where $\bar{X}_{i\cdot}$ is the mean vector of the $i$th sample group and $\bar{X}_{\cdot\cdot}$ is the overall mean vector.
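To minimize misclassification, Fisher (18) suggested finding the linear compound, $\lambda$ ($q \times 1$), which maximized

$$\theta = \frac{\lambda' B \lambda}{\lambda' W \lambda} \tag{3}$$

Taking partial derivatives of $\theta$ with respect to $\lambda$ and setting them equal to zero, we obtain

$$\frac{\partial \theta}{\partial \lambda} = \frac{2B\lambda(\lambda' W \lambda) - 2W\lambda(\lambda' B \lambda)}{(\lambda' W \lambda)^2} = 0$$

or

$$\frac{2(B\lambda - \theta W\lambda)}{\lambda' W \lambda} = 0$$

that is, to solve

$$(B - \theta W)\lambda = 0 \tag{4}$$

Equation 4 has a nontrivial solution only if the determinant

$$|B - \theta W| = 0 \tag{5}$$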
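As a rough numerical illustration of Equations 2 through 5, the sketch below solves the eigenproblem with NumPy. It is not the stepwise discriminant analysis program used in the study. Several pieces are assumptions and are marked as such in the comments: the pooled within-groups estimate of $W$, the use of the mean of the group means as the overall mean, the nearest-projected-mean assignment rule, and all function names and data, which are invented for the example.

```python
import numpy as np

def scatter_matrices(groups):
    """Estimate B (Equation 2), an assumed pooled W, and the group means.

    groups: list of (n_i, q) arrays, one array of observations per known group.
    """
    k = len(groups)
    means = np.array([g.mean(axis=0) for g in groups])
    # Overall mean taken as the mean of the group means, mirroring Equation 1
    # (an assumption; it equals the grand mean when group sizes are equal).
    dev = means - means.mean(axis=0)
    B = dev.T @ dev / (k - 1)
    n_total = sum(len(g) for g in groups)
    # Pooled within-groups covariance: a standard estimate, assumed here.
    W = sum((g - mu).T @ (g - mu) for g, mu in zip(groups, means)) / (n_total - k)
    return B, W, means

def discriminant_vectors(B, W, m):
    """Return the m largest theta and matching lambda with (B - theta W) lambda = 0."""
    L = np.linalg.cholesky(W)              # W = L L'
    L_inv = np.linalg.inv(L)
    S = L_inv @ B @ L_inv.T                # symmetric; same eigenvalues as W^{-1} B
    theta, U = np.linalg.eigh(S)           # eigenvalues in ascending order
    order = np.argsort(theta)[::-1][:m]    # keep the m largest
    return theta[order], L_inv.T @ U[:, order]

def assign(x, vectors, means):
    """Assumed rule: pick the group whose mean is nearest to x after projection."""
    diff = (x - means) @ vectors           # one row of m projected differences per group
    return int(np.argmin((diff ** 2).sum(axis=1)))

# Synthetic data invented for illustration: 3 groups, 10 samples each, q = 4.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0, 0, 0], [5.0, 0, 0, 0], [0.0, 5, 0, 0]])
groups = [c + rng.normal(size=(10, 4)) for c in centers]
B, W, means = scatter_matrices(groups)
theta, vectors = discriminant_vectors(B, W, m=2)
print(assign(centers[1], vectors, means))
```

The Cholesky step turns the problem into a symmetric one, so the eigenvalues come out real and ordered; each returned vector is scaled so that $\lambda' W \lambda = 1$, which makes squared distances in the projected space comparable across vectors.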
The solutions to Equation 5 are the eigenvalues of $W^{-1}B$. The corresponding eigenvectors are linear compounds, $\lambda$, that will be used for discriminating. If $m$ vectors are used, the decision rule becomes: assign any $X$ to the $i$th population, if
If w’s are unknown, the e h m a t e s , X, will be substituted. T h e probability of misclassification can be estimated if it is assumed that the X’s are g variate normal. The output of the stepwise discriminant analysis is important to the discrimination and subsequent assignment of samples: it consists of t h e posterior probability of each sample coming from each of t h e groups and the several eigenvectors. The first eigenvector accounts for the greatest proportion of the total dispersion, the second eigenvector accounts for the second greatest proportion of the total dispersion, and so on, in descending order. R E S U L T S A N D DISCUSSION Because of biological decomposition, evaporation, oxidation, solution, etc., samples of oil spills collected from t h e environment have usually undergone various degrees of weathering. The weathering effect may produce positive or negative changes in the minor absorbance peaks (19-21). Large changes in absorbance ratios calculated from minor peaks make their use unreliable. Thus, t h e appearance of new small peaks (e.g., due to autoxidation of thioethers to sulfoxides) or the elimination of small peaks by weathering should be considered to contribute only slightly, if a t all, to t h e identification of the original samples unless t h e source sample is weathered in a manner that compares exactly with t h a t of the spill sample. However, duplicating t h e weathering process is not simple. Minor peaks having too little absorbance should not be used in primary identification except in those cases where weathering is not apparent or where no changes result from weathering. Their use in t h e calculations of this study was avoided since errors associated with reading the component distances of these small peaks contribute to large variation. However, small peaks should not be completely discounted since their appearance, disappearance, or change may indicate the severity of weathering imparted to the sample. 
Certain unweathered minor peaks, when not reflecting oxidation effects, may be of value in supporting the identification. When compared with other petroleum products, such as No. 2 fuel oil, solid or semi-solid asphalt samples are not easily weathered below the surface of the mass. When peaks are too large and possess a large absorbance, relevance is reduced because of the decreased sensitivity of the instrument at this high absorbance level (16). In addition, the relationship between absorbance and sample thickness sometimes lacks linearity with these large peaks.
Three to seven samples were randomly taken from each specimen for infrared spectrophotometric analysis. It was assumed that any random variation among the samples from a specimen reflects the variability in the source and in the spill sampling from which the specimens were collected. In an attempt to identify 178 individual samples from all specimens in a single stepwise discriminant analysis operation (even using all 42 variables), many misclassifications and low probabilities of correct assignment were encountered. Two stepwise discriminant analysis operations
were, therefore, required: class characterization with reference to an oil group selected from the known groups, followed by identification among the specimens within the assigned group. This analytical method has improved not only the characterization of known and unknown petroleum products, but also the identification of unknown spills with known sources. This proves to be a great advantage, since by minimizing the number of possible sources through characterization, the possibility of a proper identification or matching of a sample to a source is greatly enhanced.
Characterization. Of the 194 samples, 178 belonged to
[Figure 1 plot: samples plotted against the FIRST EIGENVECTOR and the second eigenvector; axis ticks run from -6.048 to 14.115.]
Figure 1. Two-dimensional plot of the first two eigenvectors for each oil sample. Asterisk designates the mean of each oil group. A's, B's, C's designate asphalts, crude oil residues, and No. 6 fuel oils, respectively. W's, X's, Y's, and Z's are unknown petroleum products.
TABLE II
OIL GROUP | NO. OF SUBGROUP | NO. OF KNOWN SAMPLES | NO. OF SAMPLES CORRECTLY CLASSIFIED | NO. OF SAMPLES MISCLASSIFIED | PERCENT CORRECT CLASSIFICATION
RESULTS OF IDENTIFICATION
u\ KYOl