Mass Spectral Feature Selection and Structural Correlations Using Computerized Learning Machines
Peter C. Jurs
Department of Chemistry, The Pennsylvania State University, University Park, Pa. 16802
A feature extraction procedure based on computerized learning machines is used to investigate 630 low resolution mass spectra in the range C1-10H1-24O0-4N0-2 and also to study how substructures of molecules are reflected in their mass spectra. The information relevant to two important compositional characteristics of molecules (oxygen atom presence or absence and nitrogen atom presence or absence) has been shown to be centered in relatively restricted portions of the overall mass spectrum. Perfect recognition and high predictive abilities are shown for pattern classifiers based on fewer than 20% of the original m/e positions in the mass spectra. The correlations found for these two characteristics of small organic molecules are presented and discussed.
PREVIOUS WORK has shown that learning machines can be successfully applied to the interpretation of complex chemical data (1). Computerized systems were developed which could classify low resolution mass spectra, infrared spectra, or combinations thereof, into useful chemical categories. The learning machine method depends on representing data such as mass spectra as points in a d-dimensional hyperspace (x1, x2, . . . , xd), where xi represents the intensity of the mass spectral peak at m/e = i. A d-dimensional point (x1, x2, . . . , xd) is equivalently represented by X, a d-dimensional vector, and is said to have d components, or features. When points representing mass spectra of molecules which have common structural or compositional components are placed in the d-dimensional space, one might expect them to cluster in limited regions of the hyperspace, an expectation which is met. When spectra cluster, positioning a decision surface between the clusters yields a decision method for assigning a new pattern point to one cluster or the other, depending on which side of the decision surface it falls. This decision can be made most conveniently by using the normal vector of the decision surface, here called the weight vector, W. Given W and a point in the hyperspace, the side of the decision surface on which that point falls is determined operationally by forming the dot product of the pattern vector, X, and the weight vector, W, to obtain a scalar result, X · W = s. The sign of the scalar determines which side of the decision surface the point is on. When many such d-component patterns are scattered throughout the hyperspace, the problem becomes one of finding a useful decision surface which dichotomizes the hyperspace as desired.
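The decision rule just described can be illustrated with a minimal sketch. The function names and the three-component example vectors are hypothetical, and mapping s = 0 to the negative category is an assumption made only so that the function always returns a category.

```python
from typing import Sequence


def dot(x: Sequence[float], w: Sequence[float]) -> float:
    """Return the scalar s = X . W for a pattern vector X and a weight vector W."""
    if len(x) != len(w):
        raise ValueError("pattern and weight vector must have the same length")
    return sum(xi * wi for xi, wi in zip(x, w))


def classify(x: Sequence[float], w: Sequence[float]) -> int:
    """Return +1 or -1 according to the sign of s.

    Assumption: a point lying exactly on the decision surface (s == 0)
    is assigned to the -1 category.
    """
    return 1 if dot(x, w) > 0 else -1


# Hypothetical three-component weight vector and patterns; the patterns
# discussed in the text have up to 195 components.
w = [0.5, -1.0, 0.25]
category_a = classify([1.0, 0.0, 2.0], w)  # s = 0.5 + 0.0 + 0.5 = 1.0
category_b = classify([0.0, 3.0, 0.0], w)  # s = -3.0
```

The cost of each classification is one multiplication and one addition per feature, which is why the number of features per pattern matters for computation time.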
Feedback methods are used to locate a decision surface (actually its associated normal vector W) in this hyperspace so that the points fall into the two desired categories on opposite sides of the decision surface. The detailed workings of the learning machine classification mechanism and the methods for finding useful decision surfaces have been discussed at length previously.
(1) P. C. Jurs, B. R. Kowalski, T. L. Isenhour, and C. N. Reilley, ANAL. CHEM., 41, 1949 (1969), and references cited therein.
Feature Extraction. In investigating the classification powers of learning machines, it is extremely desirable to perform feature extraction on the original data patterns before classification, that is, to reduce the number of features per pattern as much as possible. Two reasons are as follows: (1) The length of the dot product operation depends on the number of features per pattern. In terms of the computer program implementing the learning machine procedure, the number of multiplications in forming the dot product W · X = s is a monotonically increasing function of the number of dimensions of W and X. Thus, reducing the dimensionality of W and X decreases the dot product computation time. A substantial decrease in the dimensionality of the system makes it possible to calculate these dot products on a desk calculator instead of with a computer. (2) By reducing the number of features per pattern, one can investigate which portions of the original data are most important with regard to the classification being made. That is, if the number of features per pattern can be substantially reduced with no concurrent loss in recognition ability or predictive ability of the lower dimensional pattern classifiers, then one has narrowed the locus of information placement within the original pattern. Additionally, if the pattern components have physical meaning, one can then look at the results in terms of the physical interpretation of this reduction.
Many methods have been proposed in the literature for the extraction of important features from data destined for subsequent use in pattern classification. Most of them rest on statistical treatments of the data, and they often involve calculating eigenvalues and eigenvectors of covariance matrices (2-6). These methods are not directly applicable to the type of physically meaningful data being considered here for several reasons.
The basic reason is that mass spectral peak intensities do not obviously follow simplified probability density distributions, the situation for which most of the statistical methods of feature extraction are designed. Other, nonstatistical methods must evidently be used to extract features from such complicated data sources as mass spectra. A nonstatistical method has been reported which relies on a feedback-improved procedure for selecting subsets of the original data (7). This adaptive strategy is shown to function for low dimensional data (d = 15), but would be cumbersome with the extremely high dimensionality of mass spectra (up to 195 dimensions for the data used in this work). Thus, some other strategies must be developed for feature extraction using these data.
Previous work reported a procedure for selecting features in mass spectral data (8). That method depended on quantitatively calculating a measure of the contribution each feature in the patterns was making to the overall decision process and subsequently discarding those features with lesser contributions. It was an algorithmic process, with the number of parameters to be discarded at each stage decided in advance. The algorithm had no capacity to decide when to terminate; it therefore continued discarding features even when the learning machine’s performance was thereby substantially degraded. However, the application of that feature selection method to low resolution mass spectra showed that the pattern classifier could find useful decision surfaces while using a fraction of the total data available. A new dynamic feature extraction method is presented here whose outcome depends on conditions during its execution. That is, the method discards features until it can find no more features which are not contributing substantially to the overall decision process and then terminates. The method has been successfully applied to mass spectra; before it is described, the data pool employed must be described.
(2) J. T. Tou, Pattern Recognition, 1, 3 (1968).
(3) J. T. Tou and R. P. Heydorn, “Some Approaches to Optimum Feature Extraction,” in “Computer and Information Sciences, II,” J. T. Tou, Ed., Academic Press, New York, N. Y., 1967, p 57.
(4) S. Watanabe, P. F. Lambert, C. A. Kulikowski, J. L. Buxton, and R. Walker, “Evaluation and Selection of Variables in Pattern Recognition,” in “Computer and Information Sciences, II,” J. T. Tou, Ed., Academic Press, New York, N. Y., 1967, p 92.
(5) W. G. Wee, IEEE Trans., IT-16, 47 (1970).
(6) Y. T. Chien and K. S. Fu, Information and Control, 12, 394 (1968).
(7) Y. T. Chien, “Adaptive Strategies of Selecting Feature Subsets in Pattern Recognition,” Abstract, Proc. IEEE Tech. Symp. on Adaptive Processes, Penn State University, University Park, Pa., Nov. 17-19, 1969.
(8) P. C. Jurs, B. R. Kowalski, T. L. Isenhour, and C. N. Reilley, ANAL. CHEM., 41, 690 (1969).
ANALYTICAL CHEMISTRY, VOL. 42, NO. 13, NOVEMBER 1970
Table I. Training for Oxygen Presence
[Feedbacks and predictions for the +1 and -1 weight vectors (columns two through five of the original table) are given as pairs in scan order; the scan does not preserve which value belongs to which weight vector.]
m/e positions   Feedbacks   Prediction, %   Av % prediction
119             219, 236    93.3, 92.1      92.7
69              221, 179    93.3, 93.6      93.4
51              223, 208    94.9, 94.9      94.9
43              256, 202    94.9, 94.2      94.5
40              217, 217    94.2, 92.7      93.4
38              203, 184    92.7, 93.9      93.3
37              213, 199    93.9, 94.6      94.2
31              202, 235    93.3, 94.6      93.9
Av                                          93.8
                 Oxygen present   Oxygen absent   Total
Training set     82               218             300
Prediction set   92               238             330
Table II. Training for Oxygen Presence
[Pairs are presented as in Table I.]
m/e positions   Feedbacks   Prediction, %   Av % prediction
119             208, 210    93.3, 90.0      91.7
74              174, 187    92.7, 93.3      93.0
53              187, 179    92.7, 93.3      93.0
45              281, 165    92.1, 92.7      92.4
42              221, 161    93.0, 92.1      92.5
38              223, 158    93.3, 92.1      92.7
31              210, 192    93.9, 93.0      93.5
Av                                          92.7
Data Base. The mass spectral data used in this work were drawn from the American Petroleum Institute Research Project 44 tables. The data consist of 630 low resolution mass spectra of relatively small organic molecules in the range C1-10H1-24O0-4N0-2. Only peaks with intensities greater than 1% of the largest peak in the spectrum were used; most of the spectra have 15 to 40 such peaks, and a total of 17,137 peaks occur in all. No selection whatsoever was performed; the 630 spectra include all the API spectra within the limits described. Before using the data with the pattern classifiers, all the intensities were square rooted in accordance with the procedure discussed previously. Out of the data pool of 630 spectra, 300 were randomly selected as a training set, with the remaining 330 spectra reserved for use as a prediction set. Within the entire data pool, the highest m/e position for which some spectrum exhibits a peak is m/e = 195. Thus, d = 195, and X and W would have to be 195-dimensional vectors. However, the linear decision surfaces being used treat each dimension independently with no interaction between terms. Thus, the dimensionality can be reduced to 155 because no spectrum exhibits peaks at any of the other 40 m/e positions. A further reduction results from discarding the 36 m/e positions at which fewer than 10 peaks occur in the entire data pool. There are 17,137 peaks total, of which only 111, or 0.6%, are discarded by this selective procedure, leaving 119 m/e positions. Tests have shown that the positions thus discarded would make no appreciable contribution to ease of classification if they were included.
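The data preparation described under Data Base (a relative intensity cutoff, square rooting, and removal of sparsely populated m/e positions) might be sketched as follows. This is only an illustration under stated assumptions: the representation of a spectrum as a dictionary from m/e position to raw intensity, the function name preprocess, and its parameter names are all hypothetical and are not taken from the original program.

```python
import math


def preprocess(spectra, rel_threshold=0.01, min_peaks=10):
    """Turn raw spectra into fixed-length pattern vectors.

    spectra: list of non-empty dicts mapping m/e position -> raw intensity
             (an assumed input format).
    Returns (positions, patterns), where positions is the sorted list of
    retained m/e positions and patterns holds one vector per spectrum.
    """
    cleaned = []
    for spectrum in spectra:
        base = max(spectrum.values())  # largest peak in this spectrum
        # Keep peaks above the relative cutoff, then take square roots.
        cleaned.append({mz: math.sqrt(intensity)
                        for mz, intensity in spectrum.items()
                        if intensity > rel_threshold * base})

    # Count how many spectra show a peak at each m/e position.
    counts = {}
    for spectrum in cleaned:
        for mz in spectrum:
            counts[mz] = counts.get(mz, 0) + 1

    # Discard positions at which fewer than min_peaks peaks occur.
    positions = sorted(mz for mz, n in counts.items() if n >= min_peaks)
    patterns = [[s.get(mz, 0.0) for mz in positions] for s in cleaned]
    return positions, patterns


# Hypothetical toy pool of three spectra; min_peaks is lowered to 2
# because the pool is so small.
toy = [{15: 100.0, 29: 0.5, 43: 25.0}, {15: 4.0, 43: 16.0}, {43: 9.0, 57: 1.0}]
toy_positions, toy_patterns = preprocess(toy, min_peaks=2)
```

Positions that never show a peak drop out automatically because they are never counted, which corresponds to the reduction from 195 to 155 dimensions; the min_peaks filter corresponds to the further reduction from 155 to 119.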
OXYGEN PRESENCE FEATURE SELECTION
The mass spectral data consisting of 119 features per pattern were investigated by a feature selection program based on a linear learning machine method. The sequence of operations is as follows. Two weight vectors are trained to detect oxygen presence/absence. One of the weight vectors is initialized with all components set equal to +1, and the other is initialized with all components set to -1. The two weight vectors are then trained to classify the members of the training set. Their predictive abilities are tested, and then the m/e positions not contributing to the classification are discarded by the following scheme. The components of the two weight vectors which correspond to the same m/e position are compared. If both components have the same sign, that m/e position is retained; it is said to correlate well with oxygen absence if both are negative and with oxygen presence if both are positive. The m/e positions for which the signs differ are considered ambiguous and are discarded. After all the m/e positions have been checked, the entire cycle begins anew with the spectra of reduced dimensionality. The process is repeated as long as the feature extraction routine can find m/e positions which are ambiguous.
Table I shows the results of applying the feature selection procedure to oxygen presence/absence classification. The pattern classifiers are trained to detect oxygen presence in the compound yielding each spectrum regardless of the type of oxygen group present. The populations of the training set and prediction set are given at the bottom of the table. The training set contained 82 spectra of compounds which contain oxygen and 218 which do not. The first column in Table I gives the number of m/e positions being considered at each stage of the feature selection process.
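The cycle described above (train two weight vectors from +1 and -1 starting points, compare signs, discard ambiguous positions, repeat) can be sketched as follows. The training routine here is an assumption: it uses a simple perceptron-style error correction, W ← W + yX for each misclassified pattern, as a stand-in for the feedback method of reference (1), whose details are not given in this paper. The function names, the max_passes guard, and the toy data are hypothetical.

```python
def train(patterns, labels, w0, max_passes=1000):
    """Error-correction training; returns (weight vector, number of feedbacks).

    Assumed stand-in for the feedback procedure: each misclassified
    pattern x with category y (+1 or -1) is added to W as y * x.
    """
    w = list(w0)
    feedbacks = 0
    for _ in range(max_passes):
        errors = 0
        for x, y in zip(patterns, labels):
            s = sum(xi * wi for xi, wi in zip(x, w))
            if y * s <= 0:  # wrong side of (or on) the decision surface
                w = [wi + y * xi for wi, xi in zip(w, x)]
                feedbacks += 1
                errors += 1
        if errors == 0:  # complete recognition of the training set
            break
    return w, feedbacks


def select_features(patterns, labels, positions, max_passes=1000):
    """Repeat the train / compare-signs / discard cycle until nothing is ambiguous."""
    positions = list(positions)
    patterns = [list(x) for x in patterns]
    while True:
        d = len(positions)
        w_plus, _ = train(patterns, labels, [1.0] * d, max_passes)
        w_minus, _ = train(patterns, labels, [-1.0] * d, max_passes)
        # Retain a position only if both components have the same sign;
        # a zero component is treated as ambiguous (an assumption).
        keep = [j for j in range(d) if w_plus[j] * w_minus[j] > 0]
        if len(keep) == d or not keep:
            # Nothing ambiguous remains (or everything would be discarded).
            return positions, w_plus, w_minus
        positions = [positions[j] for j in keep]
        patterns = [[x[j] for j in keep] for x in patterns]


# Hypothetical two-spectrum training set with three m/e positions.
toy_patterns = [[2.0, 0.0, 0.25], [0.0, 2.0, 0.25]]
toy_labels = [1, -1]  # +1: oxygen present, -1: oxygen absent
kept, w_plus, w_minus = select_features(toy_patterns, toy_labels, [31, 15, 28])
```

In the toy example the third position carries the same small intensity in both spectra, so its weight component stays close to its starting value in each run and ends with opposite signs; it is discarded, and the second cycle finds nothing further to remove.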
Column two gives the number of feedbacks performed while training to complete recognition of the members of the training set by the weight vector initiated with +1’s. Column four is the same parameter for the weight vector initiated with -1’s. Columns three and five give the predictive percentages exhibited by the two weight vectors on the 330 members of the prediction set for each stage of the feature selection process. Through seven iterations, the number of features per pattern was reduced from 119 to 37 m/e positions. Despite this decrease in dimensionality, the number of feedbacks performed during training remains approximately constant; the total computer time used in training thus falls because each classification involves a shorter calculation in a lower dimensional space. The average predictive ability remains high even with only 37 out of the original 195 m/e positions being considered.
Table II shows the results of another test conducted identically to that of Table I except that a somewhat different training procedure was used: the members of the training set were presented in a different sequence, thus yielding different decision surfaces. The results are comparable to those in Table I. After the two feature selection routines had been allowed to reduce the number of features per pattern to 37 and 38 m/e positions, respectively, a list of features common to both lists was compiled. Thirty-one m/e positions were selected by both feature selection routines. Then both routines were allowed to train using these 31 m/e positions. Neither routine could find any more ambiguous m/e positions and they both terminated with the results shown at the bottom of Tables I and II. It is interesting to note that in both cases the ability of the classifiers to correctly categorize complete unknowns was nearly the highest observed for any training sequence, 93.9% and 93.5%, respectively. It appears that removal of the ambiguous m/e positions from the problem does not degrade the classifier’s performance.
In order to investigate the character of the m/e positions selected by the feature extraction routine, Table III was constructed. For each m/e position selected, a count was made of the number of peaks occurring in that position among the oxygen-containing compounds’ spectra and among the non-oxygen compounds’ spectra. These two counts are given in columns two and six. Notice that even the most populated m/e positions which correlate with oxygen presence do not contain peaks for more than 75 of the 82 oxygen-containing compounds. The ratio of the two counts was then multiplied by a factor which takes into account the fact that there were 82 oxygen compounds and 218 non-oxygen compounds in the training set. The numbers in columns three and seven, then, express the ratio of the frequency of peaks in non-oxygen spectra to the frequency of peaks in oxygen spectra. Thus, a figure of 1.00 for a particular m/e position means that that peak occurs just as often in either subset of spectra, a figure between zero and 1.00 means that the peak occurs more frequently among oxygen-containing compounds, and a figure greater than 1.00 means that the peak occurs more frequently among non-oxygen compounds. The table shows that the ratio varies widely within each of the subsets, and it also shows that no simple relationship exists between peak frequencies and features selected by the feature extraction process. Some m/e positions have ratios that are not surprising; for example, the m/e positions 31, 45, 59, and 73 occur much more frequently for oxygen compounds than for non-oxygen compounds. But other m/e positions with less favorable ratios have larger weight vector components, so the feature extraction procedure is not merely counting peak frequencies. Notable for their absence are the m/e positions 16 and 32, which have not been retained by this feature selection process. The fourth and eighth columns give the average weight vector components from the four weight vectors trained using the final 31 m/e positions. A plot of peak occurrence ratios against the magnitude of the average weight vector component shows virtually no correlation. This result again demonstrates that the feature extraction procedure is not merely searching out simple linear correlations.
Table III. Peak Placement in Training Set-Oxygen Detection
       m/e with + correlations               m/e with - correlations
m/e    No.-/No.+   %-/%+   Av wt        m/e    No.-/No.+   %-/%+   Av wt
14     53/46       0.43    1.33         15     124/48      0.97    -0.86
27     207/75      1.04    0.30         17     7/7         0.38    -0.81
31     15/63       0.09    2.77         18     11/17       0.24    -1.06
37     91/22       1.55    0.86         24     6/2         1.13    -1.23
38     164/44      1.40    0.79         25     27/13       0.78    -0.47
43     159/73      0.82    0.14         30     27/54       0.19    -0.38
45     5/60        0.03    1.04         39     211/64      1.24    -0.60
46     2/21        0.04    1.08         40     204/45      1.71    -0.90
59     6/36        0.06    1.15         44     64/64       0.38    -0.20
69     131/18      2.74    0.48         52     148/9       6.18    -0.72
73     6/26        0.09    0.40         63     93/7        5.00    -0.96
83     79/11       2.70    0.58         67     120/7       6.44    -0.74
100    10/11       0.34    1.20         70     107/18      2.24    -0.59
135    10/2        1.88    1.38         84     68/8        3.19    -1.07
                                        86     21/5        1.58    -0.53
                                        91     51/1        19.20   -1.23
                                        128    12/2        2.26    -0.79
Av                 0.94                 Av                 3.12
Of the 31 m/e positions selected by the feature selection process, 14 have positive weight vector components, i.e., they correlate with oxygen presence. Table IV lists these 14 m/e positions along with the fragments which correspond to each position, taken from a published compilation of mass spectral fragments (9). Rearrangement reactions are denoted by (r). Many of the m/e positions in the list are those which would be expected, especially the series 31, 45, 59, 73. Most of the m/e positions correspond to fragments containing oxygen and are therefore not surprising. However, other m/e positions such as 14, 27, 37, and 38 have no oxygen-containing fragments and, in the case of 27, 37, and 38, even have unfavorable peak occurrence ratios as shown in Table III. These peaks appear to arise preferentially from the oxygen-containing compounds in the training set. They apparently are used by the learning machine to classify the compounds of the training set which cannot be classified on the basis of their oxygen-containing fragments alone.
Table IV. Probable Fragments of the 14 m/e Positions Correlating with Oxygen Presence
m/e    Fragments
14     CH2
27     C2H3
31     CH3O
37     C3H
38     C3H2, C2N
43     C2H3O, C3H7, CH3N2, C2H5N
45     C2H5O, C2H7N(r)
46     C2H6O
59     C3H7O, C2H3O2, C2H7N2, C2H5NO(r)
69     C4H5O, C5H9
73     C4H9O, C4H11N(r)
83     C5H7O, C6H11
100    C5H10NO, C6H14N, C6H12O(r)
135    C8H7O2, C9H11O, C8H9NO
The training set used was composed of 300 spectra, of which 82 contained oxygen in some configuration. Table V gives a profile of these 82 oxygen containers. It shows that within the limit of 10 carbon atoms allowed, there were many varied oxygen functionalities and there was a wide range of molecular sizes. Thus, it is felt that the feature selection results are real correspondences between oxygen presence and mass spectral features.
Table V. Oxygen Compounds in Training Set
Carbon No.        1   2   3   4   5   6   7   9  10
No. of spectra    3   5  10  12  16  19   5   6   6
Hydrogen No.      2   3   4   5   6   7   8  10  11  12  14  16  18  20  22
No. of spectra    2   1   3   2   9   1  15  13   2  14   8   2   2   4   4
Oxygen No.        1   2   3   4
No. of spectra   43  33   3   3
Nitrogen No.      0   1
No. of spectra   76   6
Oxygen containing group   Aldehydes   Esters   Ketones   Acids   Ethers   Alcohols
No. of spectra            5           17       15        3       26       16
(9) F. W. McLafferty, “Mass Spectral Correlations,” Advan. Chem. Ser., Vol. 40, American Chemical Society, Washington, D. C., 1963.
NITROGEN PRESENCE FEATURE SELECTION
Tables VI and VII show the results of applying the feature selection method to the problem of finding correlations between mass spectral features and nitrogen presence. The two implementations were able to reduce the number of features to 43 and 42, respectively. The two lists of features had 37 m/e positions in common. These were supplied to the routines again and each routine found one more ambiguous m/e position. Thus 36 features remain, and the learning machine can correctly identify all the patterns of the training set and 93.3% and 94.3% of the prediction set using only these m/e positions.
Table VI. Training for Nitrogen Presence
[Feedbacks and predictions for the +1 and -1 weight vectors are given as pairs in scan order; the scan does not preserve which value belongs to which weight vector.]
m/e positions   Feedbacks   Prediction, %   Av % prediction
119             171, 187    91.8, 93.0      92.4
69              150, 130    93.3, 93.3      93.3
55              145, 142    93.3, 92.7      93.0
51              140, 109    92.1, 93.3      92.7
47              162, 95     92.7, 92.1      92.4
44              143, 108    93.3, 93.0      93.2
43              120, 137    93.3, 93.3      93.3
37              128, 133    93.6, 93.9      93.7
36              134, 141    93.3, 93.3      93.3
Av                                          93.0
                 Nitrogen present   Nitrogen absent   Total
Training set     38                 262               300
Prediction set   43                 287               330
Table VII. Training for Nitrogen Presence
[Pairs are presented as in Table VI.]
m/e positions   Feedbacks   Prediction, %   Av % prediction
119             159, 184    92.1, 93.0      92.6
71              136, 150    92.4, 93.3      92.9
62              151, 145    93.0, 93.6      93.3
56              142, 141    93.0, 94.2      93.6
49              123, 132    93.0, 94.6      93.8
47              117, 131    94.2, 93.0      93.6
44              120, 115    93.6, 93.0      93.3
43              124, 121    94.6, 93.3      93.9
42              122, 126    93.9, 94.6      94.2
37              114, 125    95.2, 93.9      94.5
36              112, 147    94.6, 93.6      94.3
Av                                          93.6
Table VIII gives the peak placement information about the members of the training set for the nitrogen presence determination. Among the m/e positions with positive correlations, there are few positions with very favorable peak occurrence ratios.
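The peak occurrence ratios quoted here and in Table III follow from the two peak counts and the class populations of the training set. A minimal sketch of that arithmetic is given below; the function and argument names are hypothetical, and the counts used in the example are those read from the tables for m/e 30 (nitrogen) and m/e 14 (oxygen).

```python
def peak_occurrence_ratio(n_absent_peaks, n_present_peaks,
                          n_absent_spectra, n_present_spectra):
    """Ratio of peak frequency in the atom-absent class to that in the atom-present class.

    Equivalent to (count ratio) * (n_present_spectra / n_absent_spectra),
    the correction factor for unequal class populations.
    """
    frequency_absent = n_absent_peaks / n_absent_spectra
    frequency_present = n_present_peaks / n_present_spectra
    return frequency_absent / frequency_present


# Nitrogen training set: 38 nitrogen compounds, 262 others; m/e 30 counts 59/22.
ratio_n_30 = round(peak_occurrence_ratio(59, 22, 262, 38), 2)
# Oxygen training set: 82 oxygen compounds, 218 others; m/e 14 counts 53/46.
ratio_o_14 = round(peak_occurrence_ratio(53, 46, 218, 82), 2)
```

With these inputs the function reproduces the tabulated values of 0.39 for m/e 30 and 0.43 for m/e 14.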
The largest average weight vector component, m/e 30, has a peak occurrence ratio of only 0.39; the peaks used by the learning machine to classify spectra as nitrogen-containing compounds are evidently quite concealed. Eighteen of the 36 selected features correlated with nitrogen presence, and they are listed in Table IX. The list contains more anomalies than for the oxygen case, but similar reasoning must be applicable. The m/e position of 14 is missing from the list, but 28 has been selected this time. Whether this indicates that the original samples had N2 impurities or that the learning machine is using other information at m/e 28 is not clear, although the probability of N2 impurities for at least some samples is high. Table X gives the profile of the 38 nitrogen-containing compounds in the training set. Several different nitrogen functionalities are present, including a substantial percentage of compounds with nitrogen incorporated into rings. Thus the feature selection results are felt to accurately characterize the nature of the mass spectra of small nitrogen-containing molecules.
Table VIII. Peak Placement in Training Set-Nitrogen Detection
       m/e with - correlations               m/e with + correlations
m/e    No.-/No.+   %-/%+   Av weight    m/e    No.-/No.+   %-/%+   Av weight
12     18/11       0.24    -0.72        15     141/31      0.66    0.85
13     36/10       0.52    -1.66        27     247/35      1.02    0.75
16     8/10        0.12    -1.08        28     234/33      1.03    0.71
29     232/24      1.40    -1.53        30     59/22       0.39    1.81
31     71/17       0.69    -1.41        41     236/33      1.04    1.47
39     239/36      0.96    -1.30        44     107/21      0.74    0.86
43     207/25      1.20    -0.49        46     19/4        0.69    1.57
45     59/6        1.43    -1.03        52     135/22      0.89    0.28
53     203/18      1.64    -0.89        60     18/3        0.87    0.62
55     187/13      2.08    -0.74        64     33/15       0.32    0.55
59     40/2        2.90    -1.04        75     28/8        0.51    0.43
74     34/6        0.82    -0.81        76     18/10       0.26    1.07
85     57/1        8.26    -1.36        93     27/10       0.39    0.54
86     23/3        1.11    -0.82        94     8/10        0.12    0.81
87     23/3        1.11    -0.73        96     31/4        1.12    0.74
91     45/7        0.93    -0.89        106    10/4        0.36    0.31
92     26/7        0.54    -0.76        116    17/3        0.82    0.40
119    17/1        2.46    -1.30        121    7/3         0.34    0.46
Av                 1.62                 Av                 0.64
Table IX. Probable Fragments of the 18 m/e Positions Correlating with Nitrogen Presence
m/e    Fragments                     m/e    Fragments
15     CH3                           64     C5H4
27     C2H3                          75     C6H3, C3H7O2, C2H7N2O
28     N2, CH2N, C2H4                76     CH2NO3, C6H4
30     CH4N                          93     C6H5O, C7H9, C6H7N(r)
41     C2H3N, C3H5                   94     C6H6O(r)
44     CH2NO, C2H6N, C2H4O(r)        96     C7H12
46     C2H6O, NO2                    106    C7H6O, C7H8N, [illegible](r)
52     C3H2N, C4H4                   116    C8H6N, C9H8, C6H14NO
60     C2H6NO, C2H4O2                121    C8H9O, C7H5O2
Table X. Nitrogen Compounds in Training Set
[The carbon and hydrogen counts as recovered from the scan each sum to 37 rather than 38, so one entry is evidently missing; the pairing of counts with carbon numbers is uncertain.]
Carbon No.        2   3   4   5   6   7   8   9  10
No. of spectra    4   3   5   4  10   1   5   3   2
Hydrogen No.      4   5   6   7   8   9  11  12  13  15  17
No. of spectra    1   6   1   8   2   3   8   3   2   2   1
Oxygen No.        0   1   2
No. of spectra   33   1   4
Nitrogen No.      1   2
No. of spectra   32   6
Nitrogen containing group   1° amines   2° amines   3° amines   Ring N   NO2   Nitrile
No. of spectra              6           6           4           16       5     5
Application of this feature selection routine based on computerized learning machines shows that the information relevant to the questions of oxygen presence/absence and nitrogen presence/absence is localized in relatively few m/e positions in low resolution mass spectra. The routine employed is capable of learning to recognize all the members of its training set and a very high percentage of the prediction set while using only a fraction of the available features of the mass spectra. On closer examination, the features selected by the routine depend on the fragmentation of these small organic molecules within the mass spectrometer and show that some fragments not containing oxygen or nitrogen atoms are important nonetheless for determining their presence in the molecule. Further studies of low resolution mass spectra using the methods described here along with larger data sets should yield insights into the molecular fragmentation reactions involved in mass spectrometry. The same techniques can also be applied to feature selection for features which correlate with any substructures of molecules which are reflected in their mass spectra.
RECEIVED for review April 6, 1970. Accepted August 19, 1970.
Ion-Exchange Kinetics via Nuclear Magnetic Resonance
Lawrence S. Frankel
Rohm and Haas Company, 5000 Richmond Street, Philadelphia, Pa. 19137
A mixture of a cation-exchange resin in two different ionic forms will, when in physical contact, undergo ion exchange. The variation in the observed chemical shifts can be utilized to follow the ion-exchange reaction. At equilibrium, a random distribution in each resin bead of each type of counter-ion is observed. This method of studying ion-exchange reactions is free of ion selectivity and co-ion effects. The rate determining step is shown to be particle diffusion through the cross-linked matrix. Variation in cross-linking and cation is reported. Data for macroreticular cation-exchange resins are presented.
DIFFUSION PHENOMENA in ion-exchange resins are of great practical importance and theoretical interest (1-3). Virtually all previous reports measure the change in concentration of the external solution under batch operating conditions. During the course of our study of the nuclear magnetic resonance (NMR) spectra of ion-exchange resins, we have observed effects readily attributed to ion-exchange dynamic phenomena. The purpose of this paper is to examine the feasibility of utilizing NMR as a technique for studying ion-exchange kinetics. That this procedure has some unique advantages will be shown. The NMR spectrum of a column of ion-exchange resin will generally show sets of peaks. One peak originates from solvent or counter-ions inside the ion-exchange resin (interior peak); the other (exterior peak) is from the molecules in the volume of the column not occupied by the resin beads (void volume of column). The solid resin backbone does not contribute any peaks to the spectrum since its molecular motion is highly restricted. Several reports have appeared in the literature on the NMR spectrum of ion-exchange resins (4-10).
(1) J. Crank and G. S. Park, “Diffusion in Polymers,” Academic Press, New York, N. Y., 1968, Chapter 10.
(2) F. Helfferich, “Ion-Exchange,” Marcel Dekker Inc., New York, N. Y., 1966, Chapter 2.
(3) F. Helfferich, “Ion-Exchange,” McGraw-Hill Book Co., New York, 1962. (a) Chapter 6; (b) p 230-231; (c) p 253; (d) p 285; (e) p 306.
(4) J. E. Gordon, J. Phys. Chem., 66, 1150 (1962).
(5) D. Reichenberg and I. J. Lawrenson, Trans. Faraday Soc., 59, 141 (1963).
(6) R. H. Dinius, M. T. Emerson, and G. R. Choppin, J. Phys. Chem., 67, 1178 (1963).
(7) J. P. deVilliers and J. R. Parrish, J. Polym. Sci., Part A, 2, 1331 (1964).
(8) D. G. Howery and M. J. Kittay, Abstracts, 158th National Meeting, ACS, New York, N. Y., Sept. 1966, Division of Colloid and Surface Chemistry, No. 26.
(9) T. E. Gough, H. D. Sharma, and N. Subramanian, Can. J. Chem., 48, 917 (1970).
(10) R. W. Creekmore and C. N. Reilley, ANAL. CHEM., 42, 570 (1970).
Gordon has given an excellent general discussion (4) which indicates the scope of information that may be obtained. Generally speaking, our spectra are comparable to those of Gordon. The basic experiment examined in this paper utilizes a cation resin in two different ionic forms. If a resin mixture of this type is allowed to come into physical contact, the tendency of the system to maximize its entropy will serve as a driving force for ion exchange. At equilibrium, the cation-resin mixture has lost its chemical identity and each resin bead contains the same ratio of concentrations of the two ionic forms. The variation in the observed chemical shifts can be utilized to follow the concentration changes during the dynamic process in each ionic form of the resin. Gordon (4) has briefly noted that experiments of this type were feasible.
EXPERIMENTAL
All measurements were made on a Varian H.R. 60 spectrometer operating at 56.4 MHz. The ion-exchange resins employed are commercial products of Rohm and Haas Company and are available under the registered trademark Amberlite. The backbone consists of styrene cross-linked with divinylbenzene (DVB) which is subsequently sulfonated. The cross-linking is expressed as the nominal DVB content. Prior to use the resins were cycled, washed, and centrifuged by standard techniques (3b) to obtain the hydrated form of the resin. Data have been obtained with water and cyclohexane as exterior solvents occupying the void volume of the column. The results with water as the exterior solvent were obtained at an ambient probe temperature of 34 °C. Samples used to obtain the kinetic data were prepared by weighing a known amount of each ionic form of the hydrated resin (approximately 1 gram) in separate 10-ml vials which already contained approximately 2 ml of water. The contents of the two vials are mixed together, at which point a timing device is started and the resin mixture is thoroughly mixed. The majority of the water is then removed with a hypodermic syringe. The resin mixture is transferred to a high resolution NMR tube to which water is subsequently added and the sample is placed in the spectrometer. This procedure was found desirable because it facilitates the mechanical transfer of the resin to the NMR tube. When cyclohexane was employed as the exterior solvent, the procedure was modified. Only one 10-ml vial was utilized. One ionic form of the resin was weighed and then the vial was turned on its side and the second ionic form added. The notation Amberlite IR-120 Na,H is understood to designate the sodium and hydrogen forms of Amberlite IR-120 ion-exchange resin. The resin mixture used to obtain the kinetic data contained a mole fraction of each counter-ion of 0.50 ± 0.01. Unless