Structural Interpretation of Proton Magnetic Resonance Spectra by Computer: First-Order Spectra
Graham Beech, Roger T. Jones,1 and Keith Miller2
The Polytechnic, Wolverhampton WV1 1LY, England
A Fortran IV computer program is described which interprets essentially first-order PMR spectra in terms of molecular fragments and the probable molecular structure, using a series of logical steps which mimics the approach of a PMR spectroscopist. In addition to structural information, the program provides the most probable interpretation of multiplets (which may be overlapping, although individual peaks must be resolved). Results are presented for 30 compounds on which the program has been tested. The compounds contain the elements C, H, O, N, S, and Cl, and consist of saturated branched and straight carbon chain compounds which have a variety of substituents involving heteroatoms. Many mono-substituted and di-substituted aromatic compounds may be processed. Structures involving nonequivalent methylene protons cannot be analyzed. The molecular formula of the unknown compound is not required by the program. The relative molecular mass may be used as optional input, in which case only one molecular structure is provided as output.
There is a growing interest in the use of computers to elucidate molecular structures from spectral data. Mass spectrometry (1, 2) has been the most successful in this context. Spectrum matching techniques have been fruitful when used with data from mass spectrometry (3-6), infrared (7-9), and NMR (10). The complete structure of a molecule is most likely to be obtained by computer when data from several spectroscopic techniques are used simultaneously. This approach is being developed in at least one major industrial laboratory (11, 12), but few details have been published.

We have produced a system of computer programs, written in Fortran IV, oriented toward off-line usage. The aim has been to obtain the maximum amount of structural information from a proton magnetic resonance spectrum and, where possible, to generate the structure of the compound. Sasaki et al. (13) have attempted to derive structures using PMR spectra, with information from IR and UV spectra. Their analysis required a prior knowledge of the molecular formula, but in our approach no such knowledge is assumed. We have made use of the molar mass of the unknown compound, as derived from low resolution mass spectrometry, to reject incorrect structures. Our analysis also differs from that of Sasaki in that we have attempted to simulate the approach used by an NMR spectroscopist.

Description of Method. A schematic representation of the program showing the various steps in the analysis is shown in Figure 1. The details of each step are given below.

Identification of Groups of Lines. The spectrum is input to this section as a series of line positions (in ppm, δ, relative to tetramethylsilane) and relative peak areas. It is examined for distinct groups of lines, such that within a group no two consecutive lines are separated by more than 10 Hz below δ = 4.0 ppm, and more than 20 Hz above this value. This ensures that all components of a multiplet are contained within the same group of lines.

Interpretation of Multiplet Structure. Each group of lines is examined for possible interpretations in terms of first-order multiplets. For example, a group of three lines could arise from a triplet, or one of three cases corresponding to a doublet and a singlet, or three singlets. Possible interpretations for a group of lines (up to a maximum of eight lines per group) are systematically examined by the program using a set of nested loops. The area weighted mean chemical shifts appropriate to each interpretation are evaluated. By calculating probabilities based on line spacings and areas, the most probable interpretation for a group of lines is obtained. For example, three lines of areas $I_1$, $I_2$, and $I_3$, with separations $S_{12}$ and $S_{23}$ between the first and second, and second and third lines, respectively, have a probability of being a triplet given by
1 Present address, Albright and Wilson Ltd., Oldbury Division, U.K.
2 To whom correspondence should be addressed.

(1) L. R. Crawford and J. D. Morrison, Anal. Chem., 43, 1970 (1971).
(2) D. H. Smith, B. G. Buchanan, R. S. Engelmore, A. M. Duffield, A. Yeo, E. A. Feigenbaum, J. Lederberg, and C. Djerassi, J. Amer. Chem. Soc., 94, 5962 (1972).
(3) H. S. Hertz, R. A. Hites, and K. Biemann, Anal. Chem., 43, 681 (1971).
(4) S. R. Heller, H. M. Fales, and G. W. A. Milne, J. Chem. Educ., 49, 725 (1972).
(5) S. R. Heller, Anal. Chem., 44, 1951 (1972).
(6) S. R. Heller, H. M. Fales, and G. W. A. Milne, Org. Mass Spectrom., 7, 107 (1973).
(7) R. W. Sebasta and G. G. Johnson, Anal. Chem., 44, 260 (1972).
(8) D. S. Erley, Anal. Chem., 40, 894 (1968).
(9) D. S. Erley, Appl. Spectrosc., 25, 200 (1971).
(10) R. J. Feldmann and S. R. Heller, J. Chem. Educ., 49, 291 (1972).
(11) S. Sasaki, H. Abe, T. Ouki, M. Sakamoto, and S. Ochiai, Anal. Chem., 40, 2220 (1968).
(12) S. Sasaki, H. Abe, M. Serdai, and T. Hato, German Patent No. 1,943,828; Chem. Abstr., 73, 3137t (1970).
(13) S. Sasaki, Y. Kudo, S. Ochiai, and H. Abe, Mikrochim. Acta, 1971, 726.
ANALYTICAL CHEMISTRY, VOL. 46, NO. 6 , MAY 1974
Similarly for a quartet
The formulas for other multiplets are related to the areas and spacings in a similar manner to the above. (If the probability calculated is greater than unity, then its reciprocal is taken.) If for any interpretation P > 0.7, then no further interpretation is sought by the program for that group of lines. The relative areas of the most probable multiplets are calculated by summation of the individual peak areas for each multiplet and division by the smallest multiplet area. If each is within 10% of the nearest integer, then it is assigned the value of that integer. If any one of the values does not meet this condition, the relative areas are multiplied by 2, 3, or 4 until the condition is satisfied. The chemical shifts, multiplicities, and relative areas associated with the most probable interpretations of each group of lines are output for the user's information (see first three lines, Table III) and used as input for the next section.

Generation of Molecular Fragments. Fragments are generated by accessing the correlation table, listed in Table I and stored in the computer core. In this table, the upper and lower limits of the chemical shift refer to the resonances of the fragments listed under the heading "middle group." The chemical shift ranges are determined (14) by the environment of the middle group. This is defined in terms of the fragments listed under "peripheral groups" which are bonded to the middle group. The multiplicity of the middle group, implied by the environment of peripherals (neglecting couplings of less than 2 Hz), is also given in Table I. (The case of the middle and

(14) J. A. Pople, W. G. Schneider, and H. J. Bernstein, "High Resolution Nuclear Magnetic Resonance Spectroscopy," McGraw-Hill, New York, N.Y., 1959, p 8.
Figure 1. Schematic flow diagram of computer program. Box labels: identify groups of lines; interpret lines in terms of multiplets; calculate relative intensities; output all interpretations (shifts, couplings, multiplicities); filter list and build structure.
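The first and third steps of Figure 1 can be sketched in Python as follows. This is only an illustrative reading of the rules stated in the text, not the authors' Fortran IV code. The function names, the 60 MHz default used to convert ppm to Hz, the choice of the lower-shift line to decide which gap limit applies, and the reading of "within 10%" as relative to the nearest integer are all assumptions.

```python
from typing import List, Optional


def group_lines(shifts_ppm: List[float], spectrometer_mhz: float = 60.0) -> List[List[float]]:
    """Split line positions (ppm) into groups using the 10 Hz / 20 Hz gap rule."""
    groups: List[List[float]] = []
    for shift in sorted(shifts_ppm):
        if groups:
            previous = groups[-1][-1]
            # 1 ppm corresponds to spectrometer_mhz Hz (assumed field: 60 MHz).
            gap_hz = (shift - previous) * spectrometer_mhz
            # Assumption: the limit is chosen from the lower-shift line of the pair.
            limit_hz = 10.0 if previous < 4.0 else 20.0
            if gap_hz <= limit_hz:
                groups[-1].append(shift)
                continue
        groups.append([shift])
    return groups


def integer_areas(multiplet_areas: List[float], tolerance: float = 0.10) -> Optional[List[int]]:
    """Divide by the smallest area, then try factors 1 to 4 until all values are near integers."""
    smallest = min(multiplet_areas)
    ratios = [area / smallest for area in multiplet_areas]
    for factor in (1, 2, 3, 4):
        scaled = [ratio * factor for ratio in ratios]
        nearest = [round(value) for value in scaled]
        # Assumption: "within 10%" means within 10% of the nearest integer itself.
        if all(n > 0 and abs(v - n) <= tolerance * n for v, n in zip(scaled, nearest)):
            return nearest
    return None  # no factor up to 4 gave integer areas
```

With these assumptions, `integer_areas([30.5, 20.0, 29.0, 21.0])` fails at factor 1 (1.525 is not near an integer) and succeeds at factor 2, giving the 3:2:3:2 proton ratio expected for propyl acetate.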
Table I. Correlation Table Macrofragment Middle number group
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 2C 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
CH, CH:r CHI CH CH, CH, CH H CH CH, CH? CH CH, CH, CH? CH? CH CH CH? CH CH CH CH, CH CH.j CH CHI H CHI CH, CHI CH, CH: CH, CH, CH, CH CH, CHI CH, CH? CHI CH, CH, CH CH, CH?
Shift limits ppm Peripheral groups
C
CH, CH CHI CH, CH CH.1
s
CH, CHa CH CH CHj CH, CH, CH? CH, CH,
c
CHj CH, CH.3 C=C CH., C=C CHI
co
CIC
S
-~
-
--
-
CHI CH, CH CHI
CH, ~--
CHr
-~ -
CH C C CH CH CH? CH C CH, CH,
c
CHI CH, CH? N CH ..~
CH -
C -
CH CH?
-
C CH C
-
CH -
CH.) CH1 .-
-
~-. ~~~
CO, -~ -. Ar CH.: C-C ~~C H I CEC CH C-C c C E C -CH, CN -CH, CH, ArCH, S -N CH, CO CH, CO, -CH CO CH CO? C CO -~ CH.; CH, CO C H j CO CHI CO? --
Lower
Upper
0 1 0 6 0 7 0 9 0 9 1 0 1 0 1 0 1 0 11 11 1 1 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 4 1 4 3 1 1 5 1 6 1 6 1 7 1 7 1 8 1 9 1 9 1 9 1 9 1 9 1 9 2 0 2 0 2 1 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 3 2 3
2 1 2 1 2 1 1 5 1 1 2 2 2 2 2 2 2 1 1 2 1 2 4 2 2 1 2 3 2 2 2 2 2 3 3 3 3 3 3 3 2 2 3 3 3 3 3
2 6 1 4 1 5 9 0 9 7 1 5 4 4 4 4 0 9 9 1 9 0 1 0 6 8 7 3 8 6 8 5 8 0 0 0 0 0 2 5 9 8 0 2 0 0 1
Multiplicity
1 3 2 7 6 3 9 1 5 4 2 3 5 5 4 3 8 8 1 7 7 6 1 6
1 10 1 1 1 1 1 4 3 2 1 4 5 4 1 3 3 2 2 1 6 4 4
Macrofragment Middle number group
Shift limits/ppm Peripheral groups
48 49 50
CH, CH? CHr
CH? CH C
CN CN CN
-
51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92
CH CH CH CH CH, CH? CH? CH? CH? CHr CH? CH, CH CH, CH, CH CH CH CH CH CH, CH, CH? CH, CH CH: CHr CH? CH.j CH? CH? CH, CH CH? CH: CH, CH? CH CH? CH, CHI
CH? CHL CH, CH, CHa CHj CH? CH? CHI CH CH
CH CHI CH CH N Ar N S Ar N
AX CN CO CO?
C CHI CH? CHj CHI CH, CH
N CHI C N C CO CHI Ar CH, N CH Ar Ar C N -
CHI
S
-
--
c s CH, C H ? COc CO? -
Ar
N CO? CHI CH, CH. CO0 CH CO? Ar CH
c
CH? CH,
co
CH, CH, CH
s
CHj C1 C1
N ~
-
S -
C1 COS
-
C
0 -
c1 0 N
co
CO? CH? 0
-
-
CO-
-
c
o
-
COI
CN
-
Multiplicity
Lower
Upper
2 3 2 3 2 3
3 0 3 2 3 3
3 2 1
2 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3
2 3 3 3 3 3 3 2 3 2 3 3 2 3 3 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 3 4 4 4 4 3 3 4 1 4 4 3
4 5 4 4 4 4 3 3 3 2 2 1 7 1 1 7 3 7 6 4 2 1 1
3 3 3 3 4 4 4 4 4 4 4 4 4 5 5 6 6 7 7 7 9 9 9 0 0 1 1 1 2 2 2 2
3 2
3 3 2 3 3 1 3 3 3
2 3 4 3 3 0 4 4 4
7 0 0 0 4 0 8 9 0 9 1 1 8 0 5 8 4 3 5 8 7 7 3 9 4 7 0 0 1 1 5 0 2 2 3 8 9 1 4 4 2 6
1 7 4 3 1 1 2 1 1 2 1
3 1 1 3 1 2 1 1
Table I. (Continued) Macrofragment number
group
CO? CN N CH1 CO CH, 0 c1 CO?
CH, CHI CHI CH CH CH CH
93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125
co2 s
c H,
CH CH? CH? CHI CH, CH: CH CH CH CH CH CH CH CH HO CH, CHZ CH? CH? CH CH CH CH? CH, CH, CH, CH? CH? CH CH CH CH CH CH? CH? CH? CHJ CH, CH CH H CH CH CH CHI CH
126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 a
Shift limits/pprn Peripheral groups
0
-
COJ
N
0
-
CH,
-
0
-
N CN CH? CHj CH COL CH Ar 0 0
-
~~
-
N
0 0 N 0 OIC
co co
-
_
co co, co C H OzC co coz CO, 0 CGC 0
CHI CH C CH CO? CH, CH? CH CH? CH: CO, 0 0 CHI CHI 0 CH Ar S CHI CH?
Ar
-
O?C 0,C
-
0,c
CH CO, Ar CHI CH CO OIC
-
C1
co,
N C1 O?C -
0 CN CEC -
CHs
c
__
0 Ar
s OrC c1
C1
o c=c C s co
Multiplicity
Lower
Upper
3 3 3 2 3 4 3 3 3 3 1 3 3 3 2 3 3 3 3 5 4 4 3 3 3 3 3 3 3 3 3 3 3
4 4 4 4 4 1 4 3 4 5 0 5 5 5 8 5 5 5 5 7 1 2 5 5 6 6 6 6 6 6 7 7 7
3.8 3.8 4.0 3.6 3.9 5.1 3.8 4.3 5.0 4.1 1.4 4.0 4.0 4.0 4.5 4.9 4.1 4.5 4.7 6.1 4.9 5.0 5.5 4.0 4.7 4.2 4.2 5.0 4.3 4.2 4.0 4.2 4.2
1 1 1 7 6 3 4 1
4 3 3 3 4 3 3 3 2 3 3 4 4 4 4 3 4 4 4 4 4
0 8 8 8 0 9 9 9 2 9 8 0 0 0 0 5 0 0 0 0 1
4.4 4.5 4.6 4.8 4.5 4.4 4.5 4.7 2.8 4.4 4.7 4.5 4.5 4.3 5.0 5.5 4.7 5.5 4.5 4.6 5.0
1 2 1 3 1 4 5 3 1 3 1 1 1 6 3 1 2 1 1 4 3
3 4 1
1 1 1 7 7
5 4 4
4 4 2 1
Shift limitslppm
group
147 148 149 150 151 152 153 154 155 156 157 158 139 160 161 162 163 164 165 166 167 168
CHI CHI CH CH CH CH CHI CH, CH, CH CH? CHI CH CH CH H CH CH HO CH CH CH,
169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200
CH CH CH CH CH CH CH CH, CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH H H H CBHj C6H.i
Peripheral groups
NO, NO? -C H ? C1 c1 co, C1 CO? CO, 0 , C NO, 0
co, CH, c1 C=C 0 0 Ar
-
NO? -
0 0 0
-
Ar 0
0 0 -
CHa CH, 0
NO, 0,C -
Lower
LJpper
4.2 4.2 4.2 4.2 4.2 4.2 4.3 4.3 2.2 4.3 4.4 4.4 4.4 4.4 4.3 2.5 4.3 4.5 4.5 4.6 4.6 4.5
5.2 5.2 4.7 5.0 5.2 5.6 5.3 4.8 2.8 4.7 4.9 5.0 5.0 4.9 5.0 4.0 5.2
Multiplicity
8.5 4.8 5.3 6.1
2 1 7 4 2 2 3 1 1 6 1 1 3 4 4 1 3 1 1 7 5 1
5 2 5 5 5 7 5 6 5 3 5 3 5 4 5 4 5 5 5 6 5 6 5 9 6 3 5 7 6 1 6 0 6 1 5 7 5 7 5 8 6 1 6 0 6 0 6 1 6 1 6 2 6 5 8 3 106 13 5 8 5 8 5
7 3 2 2 1 1 1 1 1 6 4 3 2 1 4 3 3 1 1 1 2 1 1 3 1 1 1 1 1 1 1 1
L O
1 1 1 1 4
3 2 1 1 1
CH.9 C H J 0 , C 0 CH, Ar 0 CH 0 Ar CH 0 Ar Ar co Ar Ar eo, Ar N Ar AP OrC Ar CO? 0 CH:, C H ? O K c1 CHI AT 0 CH, CO CH C1 C1 Ar Ar Ar C1 C H , C1 0 CH, c1 c1 CH, C1 CO, CO, 0 , C CN CO, N 0 0 0 CH C1 co Ar CO 0 Ar 0 Ar CH, C1 NO? C1 c1 CO, Ar CO, 0 , C Ar Ar C1 COS CO 0,C ~~
---
-~
-~
-
_.
__
-
-~
-
_.
.-
4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 4 5 5 5 5 5 5 5 5 5 7 9 10 6 6
7 3 8 8 8 8 8 9 9 0 0 0 0 0 0 2 0 9 3 4 4 6 6 7 7 7 9 7 3 0 0 0
Ar = Phenyl ring.
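A lookup against the correlation table might be sketched as below. The macrofragment numbers, middle groups, peripherals, and multiplicities follow the propyl acetate example discussed in the text, but the shift limits are placeholders chosen for illustration and are not transcribed from Table I. The `Macrofragment` class, the `candidates` function, and the rule that an area is acceptable when it is a whole multiple of the proton count of the middle group are likewise assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Macrofragment:
    number: int
    middle: str
    peripherals: Tuple[str, ...]
    lower_ppm: float   # placeholder limit, not taken from Table I
    upper_ppm: float   # placeholder limit, not taken from Table I
    multiplicity: int


# Number of protons carried by each possible middle group.
PROTONS = {"CH3": 3, "CH2": 2, "CH": 1, "H": 1}

# Hypothetical excerpt of the correlation table.
TABLE = [
    Macrofragment(2, "CH3", ("CH2",), 0.6, 1.6, 3),
    Macrofragment(5, "CH2", ("CH3", "CH2"), 1.0, 2.0, 6),
    Macrofragment(27, "CH3", ("CO",), 1.9, 2.7, 1),
    Macrofragment(30, "CH3", ("COO",), 1.8, 2.4, 1),
    Macrofragment(77, "CH2", ("CH2", "Cl"), 3.3, 4.0, 3),
    Macrofragment(135, "CH2", ("CH2", "OOC"), 3.8, 4.5, 3),
]


def candidates(table: List[Macrofragment], shift: float, multiplicity: int, area: int) -> List[int]:
    """Return numbers of macrofragments matching shift range, multiplicity, and area."""
    return [
        m.number
        for m in table
        if m.lower_ppm <= shift <= m.upper_ppm
        and m.multiplicity == multiplicity
        # Assumed area rule: the area must be a whole multiple of the proton count.
        and area % PROTONS[m.middle] == 0
    ]
```

Under these placeholder limits, the triplet at 3.97 ppm with relative area 2 selects macrofragments 77 and 135, which are the two alternatives carried forward in the propyl acetate example.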
Table II. Basic Chemical Groups

Chemical group  Valency    Chemical group  Valency    Chemical group  Valency
CH3             1          Cl              1          OOC             2
CH2             2          C=O             2          NO2             1
CH              3          COO             2          S               2
C6H5            1          N               3          C≡C             2
C6H4            2          O               2          CN              1
C               4          OH              1          H               1
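The valencies of Table II are what the n - 1 compatibility test in the text relies on. A minimal Python sketch of that test is given below; the dictionary reflects one reading of Table II, with C=O written as CO as in the running text, and the function names are hypothetical. COO and OOC are treated as equivalent, as the text specifies.

```python
from typing import List

# Valencies as read from Table II (C=O is written "CO", as in the running text).
VALENCY = {
    "CH3": 1, "CH2": 2, "CH": 3, "C6H5": 1, "C6H4": 2, "C": 4,
    "Cl": 1, "CO": 2, "COO": 2, "N": 3, "O": 2, "OH": 1,
    "OOC": 2, "NO2": 1, "S": 2, "C≡C": 2, "CN": 1, "H": 1,
}


def canonical(group: str) -> str:
    """Treat OOC and COO as the same group during checking."""
    return "COO" if group == "OOC" else group


def passes_n_minus_1(group: str, other_peripherals: List[str]) -> bool:
    """A non-proton group of valency n must occur at least n - 1 times among
    the peripherals of macrofragments derived from the other multiplets."""
    needed = VALENCY[group] - 1
    found = sum(1 for g in other_peripherals if canonical(g) == canonical(group))
    return found >= needed
```

Applied to the propyl acetate lists, the CO peripheral of macrofragment 27 in list A finds no partner among the other peripherals (CH2, CH3, CH2, CH2, Cl), whereas the COO of macrofragment 30 in list D is matched by the OOC of macrofragment 135.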
peripheral groups involving equivalent protons has been accounted for by setting the multiplicity of the middle group to unity.) For clarity we shall refer to each complete fragment (i.e., middle groups and peripherals) in the correlation table as a macrofragment with an associated macrofragment number. All macrofragments are defined in terms of a set of 18 chemical groups given in Table II. Only those groups containing protons can appear as a middle group; the remainder are limited to peripheral groups. Only those macrofragments with the correct chemical shift range, implied multiplicity, and acceptable relative areas (in terms of number of protons) are selected on accessing the correlation table. For example, the spectrum of propyl acetate consists of three groups of lines. The most probable interpretations, together with the macrofragments selected by the computer, are given in Table III.

Table III. Computer Output for Propyl Acetate

                                Group 1    Group 2                              Group 3
Most probable interpretation    Triplet    Sextet and singlet                   Triplet
Probability                     0.81       0.93                                 0.93
Chemical shift(a)               0.81       1.49       1.97                      3.97
Multiplicity                    3          6          1                         3
Possible macrofrags.            2          5, 22      1, 27, 29, 30, 31         77, 85, 88, 101, 121, 129, 133, 135
Macrofrags. remaining after
  compatibility checks          2          5          8, 28, 27, 30             77, 135

(a) In ppm relative to TMS (δ scale).

A set of macrofragments is eventually generated such that there are no incompatibilities within the set. The compatibility logic steps are:
1) If a peripheral group is non-proton-containing, with a valency of n (see Table II), the group must appear at least n - 1 times amongst the peripherals of macrofragments derived from other multiplets. (This test is referred to later as the n - 1 test.)
2) If the peripheral group is proton-containing, then it should be present as a middle group in one of the macrofragments derived from another multiplet and, if such a macrofragment is found, then its peripherals must contain the middle group of the macrofragment under test.
During this and subsequent checking procedures, COO and OOC are regarded as equivalent. Any macrofragment that fails either of these tests is deleted from the list. The checks are repeated until either the total number of macrofragments for all multiplets is less than 10 or there are no further deletions from the list. The reduced list of macrofragments for propyl acetate is shown in Table III. These macrofragments cannot simply be combined to produce larger fragments, since some of them arise from the same multiplet, and therefore exist as alternatives. With reference to Table III, typical examples of permissible combinations of macrofragments are lists A, B, C, and D of Table IV. These subsidiary lists are constructed by the computer, and each list is used as input to the fragment building section.

Fragment Builder. The macrofragments in each subsidiary list are checked for compatibility. For example, with list A and list D (Table IV).
List A is rejected, since it contains only one CO peripheral and, to build a structure, another CO group is necessary as a peripheral of another macrofragment. List D is accepted, since all the macrofragments are compatible. When a compatible list of fragments is found, a macrofragment containing an end group is selected (i.e., macrofragment 2). The middle group is placed in the (5,5) position of a 15 x 15 array called JIG. The peripheral group(s) are placed around the middle group (Figure 2A), and the direction of any further building depends on whether a particular element of the array is already occupied. Before the insertion of more chemical groups into the array, those with free valency sites are identified. If this group is C6H4, CH3, CH2, or CH, then one of the remaining macrofragments having an identical species as its middle group is selected. This macrofragment must also have the
Table IV. Typical Permissible Combinations of Macrofragments for Propyl Acetate

List   Macrofragment   Middle group   Peripheral group(s)
A      2               CH3            CH2
       5               CH2            CH3, CH2
       27              CH3            CO
       77              CH2            CH2, Cl
B      2               CH3            CH2
       5               CH2            CH3, CH2
       27              CH3            CO
       135             CH2            CH2, OOC
C      2               CH3            CH2
       5               CH2            CH3, CH2
       30              CH3            COO
       77              CH2            CH2, Cl
D      2               CH3            CH2
       5               CH2            CH3, CH2
       30              CH3            COO
       135             CH2            CH2, OOC
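Since macrofragments from the same multiplet are alternatives, the subsidiary lists amount to picking one macrofragment per multiplet in every possible way. A short Python sketch, assuming the alternatives that appear in Table IV (one candidate each for the first triplet and the sextet, two each for the singlet and the second triplet), is shown below; the function name is hypothetical and the original program may enumerate the lists differently.

```python
from itertools import product
from typing import List

# One inner list of alternative macrofragment numbers per multiplet,
# taken from the entries that appear in Table IV.
ALTERNATIVES = [[2], [5], [27, 30], [77, 135]]


def subsidiary_lists(alternatives: List[List[int]]) -> List[List[int]]:
    """Pick one macrofragment per multiplet in every possible way."""
    return [list(choice) for choice in product(*alternatives)]


for label, members in zip("ABCD", subsidiary_lists(ALTERNATIVES)):
    print(label, members)
```

The four combinations come out in the same order as lists A to D of Table IV, because `itertools.product` varies the last multiplet fastest.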
Figure 2. Fragment building in the case of propyl acetate. Panels A to D show successive contents of the JIG array; panel D shows the completed chain CH3-CH2-CH2-OOC-CH3.
middle group already in JIG as one of its peripherals. In the case of propyl acetate, only macrofragment 5 of list D is acceptable on this basis; the alternative, macrofragment 135, does not contain the necessary CH3 as a peripheral. The selected macrofragment is added to JIG to give Figure 2B. A repeat of this process results in the selection of macrofragment 135 and gives Figure 2C. When a non-proton-containing species is found with a free valency site in JIG, the peripheral groups of the remaining macrofragments are examined to find an identical group. This results in macrofragment 30 being selected, giving Figure 2D when added to JIG. The resulting structure has no free valencies, and at this stage the masses of the chemical groups in JIG are summed and compared with the integer molar mass. If the two masses are equal, a message is printed stating that the structure is consistent with the data supplied, and the analysis is terminated at this stage. If the correct structure is not obtained, the relative areas of the multiplets are doubled and the program is repeated
Table V. Compounds Tested

Compound
CH3CH2C6H5
ClCH2CH2CO2H
CH3CH2CH2O2CCH3
C6H5CH(CH3)O2CCH3
(CH3O)2CHCH2COCH3
CH(CO2CH2CH3)3
CH3CH2C(CH3)(CH2CO2H)2
CH3CH2CH(CO2H)2
C6H5CH2O2CCH2CH3
CH3O2CCH2CH2CO2H
CH3OCH2CO2CH2CH3
C6H5CH(OH)CH2CH3
C6H5CH2CH2O2CH
(CH3CH2O)2CHCO2CH2CH3
CH3CH(Cl)CO2CH3
CH3CH2SCH2C6H5
ClCH2CH2CH2C6H5
CH3CH2CH(Cl)CO2H
ClCH2CH2O2CCH3
CH3CH2OCH2CO2CH2CH3
CH3OCH2CN
CH3CH(OH)COCH3
(CH3)2NCH2CO2CH2CH3
Cl2CHCO2CH2CH3
C6H5CH2CH2OCH3
CH3COCH2CH2COCH3
C6H5CH(CH3)NH2
CH3C6H4CH(CH3)2
C6H5CH2N(H)CH2CH3
C6H5CH2N(CH2CN)2

Number of structures
With mass check
Without mass check
Execution time/=@
1 1 1 1
1
6
2 1 1
6 7
1
5
1 1
1 1 1 1
1 1 1
1 1 1
1 1
1 1 1 1 1 1 1 1 1
1 1 1 1 1 1
4 5 1 2 1 1 2 3 1 1 1 3 1 2 1 6 5 1 3 1
11
6 46 6 12 6 9 24 18 7 7 11 7 10 8 6 6 20 14 6 39 5 26 24 5 10 27 53
(a) Using an ICL 1903A (48 K, 24-bit word) computer, equivalent to an IBM 370/135. Program storage requirements: core, 26 K; disk, 67 K.
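The mass check that separates the two "number of structures" columns of Table V can be sketched as follows. The group masses are nominal integer masses computed from C = 12, H = 1, N = 14, O = 16, S = 32, and Cl = 35; how the original program tabulated them is not stated, so the table, the function names, and the use of "CO" for the C=O group are assumptions.

```python
from typing import List

# Nominal integer masses of the Table II groups (assumed tabulation).
GROUP_MASS = {
    "CH3": 15, "CH2": 14, "CH": 13, "C": 12, "C6H5": 77, "C6H4": 76,
    "Cl": 35, "CO": 28, "COO": 44, "OOC": 44, "N": 14, "O": 16,
    "OH": 17, "NO2": 46, "S": 32, "C≡C": 24, "CN": 26, "H": 1,
}


def structure_mass(groups: List[str]) -> int:
    """Sum the masses of the chemical groups placed in a finished structure."""
    return sum(GROUP_MASS[g] for g in groups)


def consistent_with_mass(groups: List[str], molar_mass: int) -> bool:
    """True when the built structure reproduces the integer molar mass."""
    return structure_mass(groups) == molar_mass


propyl_acetate = ["CH3", "CH2", "CH2", "OOC", "CH3"]
print(structure_mass(propyl_acetate))  # 102
```

The same check accepts 2,5-hexanedione (CH3, CO, CH2, CH2, CO, CH3) at a molar mass of 114 and would reject the half-molecule obtained before the relative areas are doubled.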
from the generation of fragments step. This allows "dimer" molecules, such as 2,5-hexanedione, to be analyzed. The fragment building subroutine also computes the ratio of the relative area of the multiplet to the number of protons in the middle group of the macrofragment selected for that multiplet. This macrofragment is duplicated $I - 1$ times, where $I$ is the ratio obtained. For example, ethyl malonic acid, CH3CH2CH(CO2H)2, has an acidic OH of intensity 2; and since the middle group of macrofragment 198 (H-O2C) contains only one proton, the macrofragment is duplicated when building the structure. The macrofragments for C6H5 and C6H4 are selected without peripheral groups. The latter are added in two
ways. (i) If a macrofragment has an aromatic group as a peripheral (e.g., CH2CH2C6H5, whose middle group is the central CH2), then the middle group becomes a peripheral of the aromatic group. (ii) A non-proton-containing group which fails the n - 1 test (mentioned earlier) becomes an aromatic peripheral. In the case of C6H4, (i) and/or (ii) may apply. Thus for anisole, CH3OC6H5, the macrofragments selected would include (a) CH3, O and (b) C6H5 (middle groups CH3 and C6H5). In this case O would be inserted as a peripheral of the macrofragment (b).
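The duplication rule described for the fragment building subroutine (a macrofragment is repeated I - 1 times, where I is the ratio of multiplet area to middle-group protons) can be illustrated with a few lines of Python. The proton counts, the function name, and the decision to raise an error for a non-integer ratio are assumptions made for this sketch.

```python
# Protons carried by each possible middle group (assumed lookup).
PROTONS = {"CH3": 3, "CH2": 2, "CH": 1, "H": 1, "OH": 1, "C6H5": 5, "C6H4": 4}


def extra_copies(relative_area: int, middle_group: str) -> int:
    """Return I - 1, where I = relative area / protons in the middle group."""
    ratio, remainder = divmod(relative_area, PROTONS[middle_group])
    if ratio < 1 or remainder != 0:
        # Assumption: a non-integer ratio means the macrofragment does not fit.
        raise ValueError("area is not a whole multiple of the proton count")
    return ratio - 1


# Ethyl malonic acid: acidic proton signal of intensity 2, middle group H.
print(extra_copies(2, "H"))  # 1, so macrofragment 198 is used twice
```

A methyl triplet of area 3 needs no duplication, whereas an area of 6 for a CH3 middle group would call for one extra copy.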
RESULTS AND DISCUSSION

Table V shows typical types of molecule for which the program is applicable. These are open chain structures with essentially first-order spectra. No problems have been encountered with spectra involving intermingled multiplets (subject to the restriction on the total number of lines within a group). The overlapping of lines within a multiplet, due to accidentally equal couplings, is within the scope of the program, but the exact overlapping of lines arising from different multiplets is not. This limitation is not serious, since changes in solvent often resolve overlapping bands and the shift ranges in the correlation table are sufficiently wide to accommodate solvent shifts in most cases.

The parent ion mass is an optional input. In the absence of this information, the program may produce more than one structure which is compatible with the spectrum. Checking of the molar mass has been found to be an efficient way of selecting the correct structure from such a set of alternatives. For each of the compounds listed in Table V, only one structure was produced when the molar mass was included as input.

The computer time required for analysis, shown in Table V, depends on the complexity of the multiplet structure and the number of macrofragments obtained on accessing the correlation table. The program may be extended by incorporation of a more comprehensive correlation table and the provision for additional optional input derived from other spectroscopic techniques and from qualitative elemental analysis. Further details of the program described in this paper are available from the authors on request.
ACKNOWLEDGMENT

The authors express their thanks to the Computer Staff at the Polytechnic, Wolverhampton, and in particular to G. J. Smith for her help with the programming.

Received for review March 13, 1973. Accepted November 5, 1973.