J. Chem. Inf. Comput. Sci. 1983, 23, 197-203
ACKNOWLEDGMENT

We thank the National Institutes of Health (Grant No. RR-00612-12) for their generous financial support. Computer resources were provided by the SUMEX facility at Stanford University under National Institutes of Health Grant RR0785.

REFERENCES AND NOTES

(1) Part 44 of the series "Applications of Artificial Intelligence for Chemical Inference". For part 43 see: Lindley, M. R.; Shoolery, J. N.; Smith, D. H.; Djerassi, C. Org. Magn. Reson., in press.
(2) Cone, M. M.; Venkataraghavan, R.; McLafferty, F. W. "Molecular Structure Comparison Program for the Identification of Maximal Common Substructures". J. Am. Chem. Soc. 1977, 99, 7668-7671.
(3) Varkony, T. H.; Shiloach, Y.; Smith, D. H. "Computer-Assisted Examination of Chemical Compounds for Structural Similarities". J. Chem. Inf. Comput. Sci. 1979, 19, 104-111.
(4) Gund, P.; Wipke, W. T.; Langridge, R. "Computer Searching of a Molecular Structure File for Pharmacophoric Patterns". In "Proceedings International Conference on Computers in Chemical Research and Education"; Elsevier: Amsterdam, 1973; pp 5-33.
(5) Lesk, A. M. "Detection of Three-Dimensional Patterns in Chemical Structures". Commun. ACM 1979, 22, 219-224.
(6) Rohrer, D. C.; Perry, H. "FITMOL". In "Public Procedures: A Program Exchange for PROPHET Users"; Wood, J. J., Ed.; Bolt, Beranek, and Newman, Inc.: Cambridge, MA, 1978.
(7) Cohen, N. C. "Beyond the 2-D Chemical Structure". In "Computer-Assisted Drug Design"; Olson, E. C., Christoffersen, R. E., Eds.; American Chemical Society: Washington, DC, 1979; pp 377-381.
(8) Marshall, G. R.; Barry, C. D.; Bosshard, H. E.; Dammkoehler, R. A.; Dunn, D. A. "The Conformational Parameter in Drug Design: The Active Analog Approach". In "Computer-Assisted Drug Design"; Olson, E. C., Christoffersen, R. E., Eds.; American Chemical Society: Washington, DC, 1979; Chapter 9, pp 205-226.
(9) Avidon, V. V.; Pomerantsev, I. A.; Golender, V. E.; Rozenblit, A. B. "Structure-Activity Relationship Oriented Languages for Chemical Structure Representation". J. Chem. Inf. Comput. Sci. 1982, 22, 207-214.
(10) Nakanishi, K.; Kubo, I. "Studies on Warburganal, Muzigadial, and Related Compounds". Isr. J. Chem. 1977, 16, 28-31.
(11) Wenger, J. C.; Smith, D. H. "Deriving Three-Dimensional Representations of Molecular Structure from Connection Tables Augmented with Configuration Designations Using Distance Geometry". J. Chem. Inf. Comput. Sci. 1982, 22, 29-34.
(12) Allinger, N. L. "Conformational Analysis. 130. MM2. A Hydrocarbon Force Field Utilizing V1 and V2 Torsional Terms". J. Am. Chem. Soc. 1977, 99, 8127-8134.
(13) Harary, F. "Graph Theory"; Addison Wesley: Reading, MA, 1971.
(14) Smith, D. H.; Carhart, R. E.; Crandell, C. W.; Venkataraghavan, R. "Constructive Perception of Shared Three-Dimensional Substructures". "Abstracts of Papers"; 186th National Meeting of the American Chemical Society, Washington, DC, Aug 28-Sept 2, 1983; American Chemical Society: Washington, DC, 1983; CINF No. 3.
(15) Richards, M.; Whitby-Stevens, C. "BCPL - the Language and its Compiler"; Cambridge University Press: Cambridge, 1979.
Carbon-13 Nuclear Magnetic Resonance Spectral Interpretation by a Computerized Substituent Chemical Shift Method†

H. N. CHENG* and S. J. ELLINGSEN
Hercules Incorporated, Research Center, Wilmington, Delaware 19899

Received December 30, 1982

A FORTRAN computer program (called CSHIFT) is developed for the rapid estimation of the 13C NMR chemical shifts of aliphatic organic compounds. The method is based on additive 13C shift relationships, using empirical substituent chemical shift parameters. Examples are given that illustrate its use.
INTRODUCTION

It is generally recognized that 13C NMR spectroscopy is a very powerful tool for organic structure determination. A major task in the interpretation of 13C NMR spectra is to estimate the chemical shifts of compounds known or suspected to be present. Two approaches are generally used: (1) look up the chemical shifts in spectral libraries of either the compound in question or, if not available, compounds with similar structures; (2) calculate the 13C shifts by using empirical substituent chemical shift rules.

For the first approach the spectral collections of Sadtler,1 Bremser,2 Breitmaier,3 and Stothers,4 among others, are very useful. In the last few years, many computer-assisted structure-determination methods have been developed.5 Some of the earliest, the CNMR program6 of Chemical Information Systems and its variants, have been generally available for several years. Recently the Stanford group has developed an array of sophisticated methods.7 Several other groups are also very active in advancing this important area.8-15

In the second approach, there exist empirical rules such as those formulated by Grant and Paul,15 Lindeman and Adams,16 and Carman et al.17 for hydrocarbons, by Eggert and Djerassi18 and Sarneski et al.19 for amines, by Roberts20 and Ejchart21 for alcohols, and by Hagen and Roberts for carboxylic acids,22 along with numerous others observed for other functional groups.23,24 Clerc and Pretsch have devised general additive rules for 28 functional groups.25 Dubois has used a topological parameter to model the alkyl environment.26 Levy and Nelson27 and Ejchart28,29 have proposed substitution methods whereby the 13C shifts are first estimated for the hydrocarbons, and heteroatoms are substituted later. Although these rules have varying accuracy, they serve as good starting points for spectral interpretation, especially when simple analogues cannot be located in the spectral libraries. A drawback to this approach is that it is labor-intensive and occasionally prone to arithmetic error.

One way to facilitate the application of substituent chemical shift rules is to computerize them. One such effort was made by Clerc and Sommerauer.30 In this work we have modified and extended the Clerc-Pretsch rules and computerized them using a different approach. Our program (called CSHIFT) was written in a high-level language (FORTRAN IV) and has many special features. It is applicable to aliphatic carbons carrying 30 functional groups, including the 28 listed by Clerc and Pretsch.25 It can also handle alicyclic compounds, although its accuracy tends to be lower for them.

†Hercules Research Center Contribution No. 1762.
0095-2338/83/1623-0197$01.50/0 © 1983 American Chemical Society

METHOD
In the Grant-Paul scheme,15 the 13C shifts are thought of as arising from empirical additive parameters that are characteristic of the neighboring atoms as in eq 1,

$$\delta = k_0 + n_\alpha k_\alpha + n_\beta k_\beta + n_\gamma k_\gamma + n_\delta k_\delta + S \tag{1}$$

where $k_0$ is a base value, $n_i$ is the number of carbons in the ith position ($i = \alpha, \beta, \gamma, \delta$), $k_i$ is the corresponding empirical parameter, and $S$ is a steric correction term. Clerc and Pretsch25 generalized these rules by providing substituent chemical shift values for 28 common organic functional groups. The steric correction parameters were also generalized. We have revised their values for the halogens since it appears that the α effect depends strongly on whether the halide is primary, secondary, or tertiary. From the haloalkanes reported in the literature the following α-substituent values (in ppm) are obtained:

             F       Cl      Br      I
Primary     70.1    31.0    18.9    -7.2
Secondary   69.0    35.0    27.9     3.8
Tertiary    66.0    43.0    36.9    20.8
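Equation 1 is a plain sum of counts times parameters, so it is easy to check by machine. The following Python sketch is illustrative only and is not the authors' FORTRAN code; the function name `additive_shift` and the dictionary layout are assumptions made here. It uses the base value k0 = -2.3 ppm and the carbon parameters 9.1, 9.4, -2.5, and 0.3 ppm (α, β, γ, δ) from Table I, and applies them to n-pentane, for which no steric correction is needed.

```python
# Illustrative sketch of eq 1; names and data layout are assumptions,
# not taken from the CSHIFT program.
K0 = -2.3  # base value k0 in ppm

# Carbon substituent parameters (ppm) for the alpha..delta positions.
CARBON = {"alpha": 9.1, "beta": 9.4, "gamma": -2.5, "delta": 0.3}


def additive_shift(counts, params=CARBON, steric=0.0, k0=K0):
    """Return k0 + sum(n_i * k_i) + S for one carbon (eq 1)."""
    total = k0 + steric
    for position, n in counts.items():
        total += n * params[position]
    return round(total, 1)


# n-Pentane, CH3-CH2-CH2-CH2-CH3: neighbor counts for C1, C2, and C3.
pentane = {
    "C1": {"alpha": 1, "beta": 1, "gamma": 1, "delta": 1},
    "C2": {"alpha": 2, "beta": 1, "gamma": 1},
    "C3": {"alpha": 2, "beta": 2},
}

for carbon, counts in pentane.items():
    print(carbon, additive_shift(counts))
```

The three results, 14.0, 22.8, and 34.7 ppm, agree with the calculated values listed for this compound in Table III.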
Table I. Substituent Chemical Shift Parameters (in ppm) for 13C NMR To Be Used with Eq 1 with k0 = -2.3

substituent   code          α (b)                               β       γ       δ
C             C             9.1                                 9.4     -2.5    0.3
O             O             49.0                                10.1    -6.0    0.3
N             N             28.3                                11.3    -5.1    0.3
N+            N+            30.7                                5.4     -7.2    -1.4
S             S             11.0                                12.0    -3.0    -0.5
[illegible]   [illegible]   22.1                                9.3     -2.6    0.3
F             F             70.1 (1,2); 69.0 (3); 66.0 (4)      7.8     -6.8    0.0
Cl            CL            31.1 (1,2); 35.0 (3); 43.0 (4)      10.0    -5.1    -0.5
Br            BR            18.9 (1,2); 27.9 (3); 36.9 (4)      11.0    -3.8    -0.7
I             I             -7.2 (1,2); 3.8 (3); 20.8 (4)       10.9    -1.5    -0.9
NH3+          NH3+          26.0                                7.5     -4.6    -0.1
CN            CN            3.1                                 2.4     -3.3    -0.5
NO2           NO2           62.0                                4.4     -4.0    0
OO            OO            55.0                                2.7     -4.0    0
C=NOH syn     CNOS          11.7                                0.6     -1.8    0.0
C=NOH anti    CNOA          16.1                                4.3     -1.5    0.0
SCN           SCN           21.0                                7.2     -4.0    0.3
S(O)          SO            31.1                                9.0     -3.5    0.0
SO3H          SO3H          38.9                                0.5     -3.7    0.2
CHO           CHO           29.9                                -0.6    -2.7    0.0
C(O)          CO            22.5                                3.0     -3.0    0.0
COOH          COOH          20.1                                2.0     -2.8    0.0
COO-          COO-          24.5                                3.5     -2.5    0.0
COCl          COCL          33.1                                2.3     -3.6    0.0
C(O)O         COO           22.6                                2.0     -2.8    0.0
OC(O)         OCO           54.5 (1,2,3); 62.5 (4)              6.5     -6.0    0.0
CONH          CON           22.0                                2.6     -3.2    -0.4
NHCO          NCO           28.0                                6.8     -5.1    0.0
C=C (a)       C=C           21.5                                6.9     -2.1    0.4
C≡C           C3C           4.4                                 5.6     -3.4    -0.6

(a) Steric correction parameters (Table II) apply to these substituents.
(b) Number(s) in parentheses denotes the number of non-H substituents on the carbon in question.
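Several α entries in Table I depend on how many non-hydrogen substituents the carbon in question carries, and a small lookup makes that rule explicit. The sketch below is hypothetical Python: the names `ALPHA_BY_SUBSTITUTION` and `alpha_value` are assumptions, not CSHIFT identifiers. It covers only the table entries that carry parenthesized counts, and it assumes that the count includes the functional group itself, so that a primary halide RCH2X has a count of 2.

```python
# Hypothetical lookup for the alpha values in Table I that carry
# parenthesized counts of non-hydrogen substituents.
ALPHA_BY_SUBSTITUTION = {
    # code: list of (allowed counts, alpha value in ppm)
    "F":   [((1, 2), 70.1), ((3,), 69.0), ((4,), 66.0)],
    "CL":  [((1, 2), 31.1), ((3,), 35.0), ((4,), 43.0)],
    "BR":  [((1, 2), 18.9), ((3,), 27.9), ((4,), 36.9)],
    "I":   [((1, 2), -7.2), ((3,), 3.8), ((4,), 20.8)],
    "OCO": [((1, 2, 3), 54.5), ((4,), 62.5)],
}


def alpha_value(code, n_non_h):
    """Return the alpha parameter for `code` on a carbon that bears
    `n_non_h` non-hydrogen substituents (the group itself included)."""
    for counts, value in ALPHA_BY_SUBSTITUTION[code]:
        if n_non_h in counts:
            return value
    raise ValueError(f"no alpha value for {code} with {n_non_h} substituents")


# A secondary bromide (R2CHBr) has three non-hydrogen substituents.
print(alpha_value("BR", 3))
```

Keeping the counts next to the values mirrors the layout of the table, so a revised parameter only has to be changed in one place.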
It is of interest that the α effect stays fairly constant from primary to secondary to tertiary for fluorine but increases for the rest of the halogens. The rules are strictly valid only for alkanes with monosubstituted halogens. When multiple halogens are added to the alkane structure in close proximity, the incremental chemical shifts cease to be additive, and the rules must take on more complex forms.31

Two additional functional groups are added to the list. The first is alkanesulfonic acids (RSO3H). The additive parameters have been given by Freeman and Angeletakis.32 The second is the peroxy (and hydroperoxy) functionalities (ROOR'). The 13C shifts have been found33 for a long-chain secondary hydroperoxide to occur at 86.2, 33.2, and 25.9 ppm, giving 56.3, 3.3, and -4.0 as the α, β, and γ parameters. More recent studies34,35 provided shifts for additional compounds; the numbers used for this work are 55.0, 2.7, and -4.0.

For amides, alkyl substituents can occur on both sides of the functional group (i.e., RCONHR'). Clerc and Pretsch25 originally only supplied the rules for alkyl carbons on the C=O side (i.e., R group). A new set of values is devised here for the carbons on the N side (i.e., R'). For this group, α = 28.0, β = 6.8, γ = -5.1, and δ = 0. In addition, the values for thiocyanate and thiol are revised. Minor adjustments are also applied to several other functional groups. The complete list of substituent parameters is given in Table I, and the steric corrections in Table II.

Table II. Steric Correction Parameters (a)

i             j = 1    j = 2    j = 3    j = 4
primary        0.0      0.0     -1.1     -3.4
secondary      0.0      0.0     -2.5     -7.5
tertiary       0.0     -3.7     -9.5    -15
quaternary    -1.5     -8.4    -15

(a) Designation = S(i, j), where i = the carbon in question, and j = number of nonhydrogen substituents directly attached to the α substituent (applicable only to α substituents marked with footnote a in Table I).

In the use of the steric correction term (S) in eq 1, Clerc and Pretsch25 recommended that one only counts the number of nonhydrogen substituents on the most branched α substituent. However, in Grant and Paul's original formulation,15 the steric correction should be used on all α substituents. In our extensive computations, we noticed that Grant and Paul's approach to steric correction produces somewhat better results. We have therefore followed Grant and Paul's approach in this work.

The operation of the rules is best shown in an example. The base value k0 is taken to be -2.3 ppm, and the 13C shifts for the carbons in choline chloride, (CH3)3N+CH2CH2OH Cl-, will be computed. C1 is an N-methyl carbon, C2 is the N-CH2 carbon, and C3 is the CH2OH carbon.

C1                  C2                  C3
base      -2.3      base      -2.3      base      -2.3
α-N+      30.7      α-N+      30.7      α-C        9.1
3β-C      27.3      α-C        9.1      α-OH      49.0
γ-C       -2.5      3β-C      27.3      β-N+       5.4
δ-OH       0        β-OH      10.1      3γ-C      -7.5
S(1,4)    -3.4      S(2,4)    -7.5      S(2,2)     0
calcd     50.8      calcd     67.4      calcd     53.7

In the application of these rules, caution is needed in dealing with some functionalities containing more than one atom. For example, in methyl acetate, there are two possible ways to count the neighboring substituents (see below). The same problem exists for C=C, C≡C, and C(O)N.

[Structure diagrams: possibility 1 and possibility 2 for counting the neighboring substituents in methyl acetate]

For the substituent values given in Table I, possibility 2 is actually the correct one to use. The second atom in the group (O in this case) counts as a β atom, although its contribution is not figured in the calculation. (This is because this contribution has already been incorporated in the COO group contribution.)

PROGRAM CSHIFT

Figure 1. Logic used in the program CSHIFT for choline chloride. The calculation proceeds as follows: C1-α(N+)-β1(C)-β2(C)-β3(C)-γ(C)-δ(OH)-C2-α(N+) etc.
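The hand calculation for choline chloride is a convenient reference case for any implementation of these rules. The Python sketch below is a minimal illustration and not the CSHIFT code: the names `PARAMS`, `STERIC`, and `shift` are assumptions, and the steric term is simply summed over every α atom in the Grant-Paul manner. The layout of the primary row of the steric table (0.0, 0.0, -1.1, -3.4) is also an assumption, chosen to be consistent with S(1,4) = -3.4 in the worked example.

```python
# Sketch of eq 1 with Grant-Paul style steric corrections summed over all
# alpha substituents. Function and variable names are assumptions.
K0 = -2.3

# (alpha, beta, gamma, delta) parameters in ppm, copied from Table I.
PARAMS = {
    "C":  (9.1, 9.4, -2.5, 0.3),
    "O":  (49.0, 10.1, -6.0, 0.3),
    "N+": (30.7, 5.4, -7.2, -1.4),
}

# Table II: STERIC[i][j], i = non-H substituents on the carbon in question,
# j = non-H substituents on the alpha atom. The blank cell is left out.
STERIC = {
    1: {1: 0.0, 2: 0.0, 3: -1.1, 4: -3.4},
    2: {1: 0.0, 2: 0.0, 3: -2.5, 4: -7.5},
    3: {1: 0.0, 2: -3.7, 3: -9.5, 4: -15.0},
    4: {1: -1.5, 2: -8.4, 3: -15.0},
}


def shift(shells, alpha_branching):
    """shells: four lists of group codes (alpha, beta, gamma, delta).
    alpha_branching: non-H substituent count of each alpha atom."""
    total = K0
    for position, codes in enumerate(shells):
        total += sum(PARAMS[code][position] for code in codes)
    i = len(shells[0])  # branching of the carbon in question
    total += sum(STERIC[i][j] for j in alpha_branching)
    return round(total, 1)


# C3 of choline: alpha = C2 and O, beta = N+, gamma = three methyl carbons;
# the steric terms are S(2,2) and S(2,1), both zero.
c3 = shift([["C", "O"], ["N+"], ["C", "C", "C"], []], alpha_branching=[2, 1])

# C1 of 2-methylpentane: one alpha carbon bearing three non-H substituents.
c1 = shift([["C"], ["C", "C"], ["C"], ["C"]], alpha_branching=[3])
print(c3, c1)
```

The first value, 53.7 ppm, matches the hand calculation for C3. The second, 22.3 ppm, is the calculated value listed in Table III for the methyl carbons of 2-methylpentane.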
Structuring of the Program. In writing a FORTRAN computer program for the modified Clerc-Pretsch rules, we sought to incorporate all the necessary calculations in the program so that the user needs only to feed in the molecular structure to obtain the chemical shifts with no further work. After some preliminary investigation, we decided to use an algorithm with a treelike logic. An example is provided by the choline chloride example we used earlier (Figure 1). Thus, we start with the given molecular structure (the trunk of the tree) and examine each carbon in turn (the branches). For each carbon, whenever neighboring atoms are found, the subroutines corresponding to α, β, γ, and δ atoms are called, and appropriate values are added to the chemical shift. A simplified block diagram of the algorithm is given in Figure 2.

Figure 2. Schematic diagram for the program CSHIFT. [Flowchart; legible box labels: input structure; steric correction; add γ effect; add δ effect; sum α, β, γ, δ, and steric effects; print shift; i = i + 1]

The organization of the program consists of the following routines.
(1) The main program reads in the molecular structure and initiates the computation one carbon at a time.
(2) Subroutine ALPHA deciphers how many and what kind of α atoms are present and adds the α-substituent values and the steric correction terms to the calculated chemical shifts. This routine also contains the "dictionary" with all structural symbols and parameter values.
(3) Subroutine BETA deciphers how many and what kind of β atoms are present.
(4) Subroutine GAMMA deciphers how many and what kind of γ atoms are present.
(5) Subroutine DELTA deciphers how many and what kind of δ atoms are present.
(6) Subroutine SWITCH contains features to handle some special cases (vide infra).

Two versions of program CSHIFT are available. The first was designed for batch operation. The structure needed is coded on cards, and the computed chemical shifts are obtained on the computer printout. The computer used for testing is a Perkin-Elmer Model 7/32. The second version was written for a Nicolet 1280 interactive computer with a raster CRT accessory. The structure can be entered directly on the keyboard with provisions for corrections or amendments. The computed chemical shifts are either displayed on the raster screen or printed on the keyboard. Needless to say, the second version is more convenient to use and is recommended for routine applications.

Operational Details. One advantage of the Nicolet version of CSHIFT is the conversational nature of the program and the simple input procedures. Basically one enters the molecular structure into the terminal exactly as one draws on paper. The only restriction is that the format for each atom or bond is A4, such that an entry of C (for carbon) or an asterisk (for single bond) must be followed by three spaces, e.g.

2-propanol:

    C   *   C   *   C
            *
            O

alanine:

    NH3+*   C   *   COOH
            *
            C

For simplicity, all hydrogens are omitted, and the structures are made up with the codes given in the second column of Table I. Since we attempt to depict three-dimensional structures in two dimensions, crowding of atoms and bonds is sometimes unavoidable. This may lead to ambiguity in deciphering the intended structure. To obviate this difficulty, one can use any number of asterisks as a single bond.

[Example structure drawn with multiple asterisks per bond; layout not legible in this copy]

The only qualification is that for each bond the asterisks must be either all horizontal or all vertical. An additional advantage of this feature is that cyclic structures with an odd number of carbons can be easily depicted, e.g.

cyclohexane:

    C   *   C   *   C
    *               *
    C   *   C   *   C

cyclopentane:

    C   *   C   *   C
    *               *
    C   *   *   *   C
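The A4 input convention and the treelike search can be illustrated with a short Python sketch. This is not a transcription of the FORTRAN routines; the function names `parse_grid`, `neighbours`, and `shells` are hypothetical. It assumes that each four-character field holds one code, that a run of asterisks in one direction is a single bond, and that each atom is counted once, at its shortest distance from the carbon in question.

```python
# Hypothetical sketch: read a structure typed in four-character fields
# and list the group codes found one to four bonds from a chosen atom.

def parse_grid(lines, width=4):
    """Split each input line into fixed-width fields (FORTRAN A4 style)."""
    grid = {}
    for r, line in enumerate(lines):
        for c in range(0, len(line), width):
            code = line[c:c + width].strip()
            if code:
                grid[(r, c // width)] = code
    return grid


def neighbours(grid, pos):
    """Atoms bonded to pos, following runs of '*' in a straight line."""
    found = []
    for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
        r, c = pos[0] + dr, pos[1] + dc
        if grid.get((r, c)) != "*":
            continue
        while grid.get((r, c)) == "*":
            r, c = r + dr, c + dc
        if (r, c) in grid:
            found.append((r, c))
    return found


def shells(grid, start, depth=4):
    """Codes of the alpha, beta, gamma, and delta atoms around start."""
    seen = {start}
    frontier = [start]
    result = []
    for _ in range(depth):
        nxt = []
        for p in frontier:
            for q in neighbours(grid, p):
                if q not in seen:
                    seen.add(q)
                    nxt.append(q)
        result.append(sorted(grid[q] for q in nxt))
        frontier = nxt
    return result


propanol = parse_grid([
    "C   *   C   *   C",
    "        *",
    "        O",
])
print(shells(propanol, (0, 0)))  # a methyl carbon of 2-propanol
print(shells(propanol, (0, 2)))  # the central carbon
```

For the methyl carbon the sketch finds one α carbon and, at the β position, one carbon and the oxygen; for the central carbon all three neighbors are α atoms. A breadth-first search is used here for brevity, whereas the paper describes separate subroutines that call one another in turn.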
Additive shift rules are usually less accurate for alicyclic carbons than for carbons in linear molecules. Nevertheless, the computer program calculates them just like linear molecules. Of course the rules do not work well for ring sizes less than or equal to 4. As noted in Table II, the steric correction term applies only to the structures shown below.

[Structure diagrams not legible in this copy]

The subroutine ALPHA can specifically trap for these groups. The correction term is not used for the other functional groups. Furthermore, provision has been made in the case of COO, CON, C=C, and C≡C
Figure 3. Program listing for the Nicolet version of CSHIFT.

[FORTRAN listing; most statements are not legible in this copy. The legible parts are:
- the header comment: program CSHIFT, Nicolet version, written by H. N. Cheng and S. Ellingsen, Hercules Research Center, Wilmington, Delaware 19899;
- the main program, which prompts for a title (72 characters maximum) and for the structure in A4 format, offers the options echo structure (E), re-enter structure (R), change line (L), change single entry (S), and initiate calculation (C), calls ALPHA for every carbon entry, and finally offers exit (E), enter new structure (N), or make changes on old structure (C);
- subroutines ALPHA, BETA, GAMMA, and SWITCH, with calls to DELTA;
- the symbol dictionary SUB, with 31 entries (the group codes of Table I and the bond symbol *);
- the parameter array ADD(31,4), holding the α, β, γ, and δ values of Table I, and the steric array LC(4,4), holding the values of Table II;
- the comment in SWITCH: if KS = 26 or 28, change to 27 or 29, and vice versa;
- the final statement SHIFT = -2.3 + A + B + G + D + LT, after which the position and the shift in ppm are printed.]
to count the second atom in the group in the computation. (This is actually done by leapfrogging to the second subroutine in the hierarchy, e.g., BETA calling DELTA, skipping GAMMA.)

A problem in the computer approach is the asymmetric groups, e.g., esters and amides:

    C1-C(=O)-O-C2     ≡    C2-O-C(=O)-C1
    C1-C(=O)-NH-C2    ≡    C2-NH-C(=O)-C1

If one inputs the ester structure as COO, the calculation is fine for C1. However, for C2, a different set of rules would apply because we must calculate from right to left and the substituent rules for OCO must be used. This problem is taken care of in the subroutine SWITCH, where the direction of the computation is specifically taken into account.

Program Listing. The program listing for the Nicolet version is given in Figure 3. Readers interested in the batch version may write to the authors for the program listing and input instructions.
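The direction-dependent treatment of esters and amides described above can be sketched as a simple code swap. The Python below is a hypothetical illustration and does not reproduce the SWITCH subroutine: the names `REVERSED_CODE` and `effective_code` are assumptions, as is the convention that a group is read forward from the carbon on the side where the typed code begins. Only the pairing of COO with OCO and of CON with NCO is taken from the paper.

```python
# Hypothetical illustration of direction-dependent group codes; the real
# SWITCH subroutine is not reproduced here.
REVERSED_CODE = {"COO": "OCO", "OCO": "COO", "CON": "NCO", "NCO": "CON"}

# Alpha parameters (ppm) from Table I for the four asymmetric codes.
# OCO uses the value for carbons with one to three non-H substituents.
ALPHA = {"COO": 22.6, "OCO": 54.5, "CON": 22.0, "NCO": 28.0}


def effective_code(code_as_typed, reading_forward):
    """Return the code as seen from the carbon in question.

    reading_forward is True when the carbon lies on the side where the
    typed code starts (left of a horizontal group), False otherwise.
    """
    if reading_forward or code_as_typed not in REVERSED_CODE:
        return code_as_typed
    return REVERSED_CODE[code_as_typed]


# Ester typed as  C1 * COO * C2 : C1 sees COO, C2 sees OCO.
for carbon, forward in (("C1", True), ("C2", False)):
    code = effective_code("COO", forward)
    print(carbon, code, ALPHA[code])
```

With this convention the acyl-side carbon receives the COO α value of 22.6 ppm and the alkoxy-side carbon the OCO value of 54.5 ppm, whichever way the ester was typed.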
RESULTS AND DISCUSSION

The computer printout for choline chloride is given in Figure 4. An intentional error is introduced in the input structure, and provisions for corrections are illustrated.

The program as presently written can handle molecular structures with 16 × 40 structural elements. Since structural elements include atoms and bonds, a total of 8 × 20 atoms can be handled. Molecules larger than this can be broken up into smaller pieces and calculated piecewise. Alternatively, the dimension statements in the program may be changed.

A comparison of calculated vs. observed 13C shifts for a variety of compounds is provided in Table III. Four classes of compounds are noted: acyclic hydrocarbons (class I), hydrocarbons with one substituent (class II), hydrocarbons with two substituents (class III), and cyclic compounds (class IV). The observed shift values are mostly taken from Johnson and Jankowski.36 For the vast majority of organic compounds, the modified Clerc and Pretsch rules as given in the CSHIFT program work
Table III. Comparison of Calculated vs. Observed (in Parentheses) 13C Shifts

Class I. Acyclic Hydrocarbons (shift, ppm, by carbon position)

compd                        1              2              3              4              5              6
CH3CH2CH2CH2CH3              14.0 (13.5)    22.8 (22.2)    34.7 (34.1)
CH3CH(CH3)CH2CH2CH3          22.3 (22.7)    28.2 (27.9)    41.6 (41.9)    20.3 (20.8)    14.3 (14.3)
CH3CH(CH3)CH(CH3)CH2CH3      19.8 (20.0)    31.8 (31.9)    40.0 (40.6)    27.2 (26.8)    11.8 (11.6)    17.0 (14.5)

Class II. Hydrocarbons with 1 Substituent (shift, ppm, by carbon position)

compd                              1               2              3               4               5              6              7              8
CH3C(O)CH2CH2CH3                   27.4 (29.25)                   48.1 (45.10)    16.4 (17.60)    13.5 (13.4)
CH3CH2CH2CH2Cl                     44.8 (44.6)     35.3 (35.2)    20.2 (20.4)     13.2 (13.4)
CH3CH2CH2CH2CH(CH2CH3)CH2OH        67.4 (65.1)     40.6 (42.1)    30.6 (30.3)     30.3 (29.3)     23.4 (23.2)    14.0 (14.1)    24.0 (23.5)    11.8 (11.1)
CH3CH=CHCH2CH2CH2CH2CH3 (trans)    17.0 (17.8)                                    33.0 (32.7)     30.0 (29.9)    32.6 (31.7)    23.2 (22.8)    14.0 (14.1)

Class III. Hydrocarbons with 2 Substituents (shift, ppm, by carbon position)

compd                              1               2
SH